I Built an App That Will Translate Objects to Another Language in Realtime

I built an app that will translate objects to another language in realtime using Machine Learning. For example, point it at a cup and it will translate it to taza in Spanish.

App demo

Over the past few weekends, I built an app that will detect any object you point it at and translate its name into another language. For example, if you point your phone at a cup, it will show taza in Spanish. This all happens offline and in realtime. You can download the app for free on Android.

Is this all really offline?

Yes, using Google’s ML Kit for Android. The model I am using (efficientnet_lite_4) is bundled with the app. However, on startup you will be prompted to download the language model you want to use, because I could not find a way to bundle it with the app. After that, all the classification and translation is done offline.
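For the curious, here is roughly what that one-time download looks like with ML Kit’s on-device Translation API. This is a sketch, not the app’s actual code; the English-to-Spanish pair and the Wi-Fi condition are my assumptions.

```kotlin
import com.google.mlkit.common.model.DownloadConditions
import com.google.mlkit.nl.translate.TranslateLanguage
import com.google.mlkit.nl.translate.Translation
import com.google.mlkit.nl.translate.TranslatorOptions

// Configure an English -> Spanish translator (the language pair is an example;
// in the app you pick the target language on first launch).
val options = TranslatorOptions.Builder()
    .setSourceLanguage(TranslateLanguage.ENGLISH)
    .setTargetLanguage(TranslateLanguage.SPANISH)
    .build()
val translator = Translation.getClient(options)

// The language model is downloaded once (this is the prompt you see on startup).
// After it succeeds, translation runs fully offline.
val conditions = DownloadConditions.Builder()
    .requireWifi()
    .build()
translator.downloadModelIfNeeded(conditions)
    .addOnSuccessListener { /* model ready; translation now works offline */ }
    .addOnFailureListener { e -> /* surface the download error to the user */ }
```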

How well does it work?

Eh… good, but not great. It’s very accurate at recognizing objects that are geometrically unambiguous. For example, cars, bananas, and computer keyboards are all easy. But for other things, it returns results that are either only somewhat related or plainly wrong. I am using the efficientnet_lite_4 model, which is trained to detect objects from 1,000 different classes. Of course, there are far more than 1,000 different things in the world, so this is a limitation in itself.

As for the translation, that works much better. I just feed it the label (in English) from whatever was detected, and it outputs the text in the other language.
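That step is a single asynchronous call. A minimal sketch, assuming a `translator` configured as above; the label "cup" and the `onResult` callback are placeholders for whatever the classifier produced and however the app displays it.

```kotlin
// Feed the English label from the classifier straight into the translator.
// `translator` is the ML Kit Translator configured earlier; `onResult` is a
// hypothetical callback that updates the overlay text.
fun translateLabel(label: String, onResult: (String) -> Unit) {
    translator.translate(label)
        .addOnSuccessListener { translated ->
            onResult(translated) // e.g. "cup" -> "taza" for Spanish
        }
        .addOnFailureListener { e -> /* fall back to showing the English label */ }
}
```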

I have only tested this on my Galaxy S10 Plus. The performance on it is pretty good, considering all of what’s happening under the hood. Though, my phone does get a little warm after using the app for a couple of minutes.

This Was Surprisingly Easy to Build

I have very little knowledge of machine learning and AI. I know what it is and the basics, but I have never actually used TensorFlow or any related tools. I also barely understand the math behind it (I did take linear algebra in college, though). I’m a web developer who can tell you everything about JavaScript and Node.js, but not much about AI. Despite my ignorance, Google has made it extremely easy to take advantage of machine learning and AI on your mobile device thanks to their new ML Kit. Essentially, I just followed the instructions posted here. As for the input image, I took it directly from a byte array of the camera frame. To work with the camera, I used this awesome CameraView library. It handled most of the complexity and allowed me to add overlays easily.
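To give a flavor of how little code this takes, here is a sketch of classifying a camera frame with a bundled TFLite model via ML Kit’s custom image labeling API. The asset file name, threshold, and frame parameters are my assumptions, not the app’s actual values.

```kotlin
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.label.ImageLabeling
import com.google.mlkit.vision.label.custom.CustomImageLabelerOptions

// Bundle efficientnet_lite_4 as a TFLite asset (file name is an assumption).
val localModel = LocalModel.Builder()
    .setAssetFilePath("efficientnet_lite_4.tflite")
    .build()

val labeler = ImageLabeling.getClient(
    CustomImageLabelerOptions.Builder(localModel)
        .setConfidenceThreshold(0.5f) // ignore low-confidence guesses
        .setMaxResultCount(3)
        .build()
)

// frameBytes comes straight from the CameraView frame callback (NV21 format).
fun classifyFrame(frameBytes: ByteArray, width: Int, height: Int, rotation: Int) {
    val image = InputImage.fromByteArray(
        frameBytes, width, height, rotation, InputImage.IMAGE_FORMAT_NV21
    )
    labeler.process(image)
        .addOnSuccessListener { labels ->
            // labels[0].text is the English class name that gets translated
        }
        .addOnFailureListener { e -> /* skip this frame */ }
}
```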

Speaking of overlays, that bounding box was the hardest part of the app. It took me a while to understand how I had to rotate the dimensions and scale them to the rectangular bounds that ML Kit was producing for the object. That alone took a few days of trial and error.
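The core of that coordinate juggling is scaling a box from the analysis frame’s space into the preview view’s space. A simplified sketch: it assumes the frame has already been rotated upright and the preview fills the view with no letterboxing, which is less than what the app actually handles, but it shows the essential scaling step.

```kotlin
// A minimal stand-in for ML Kit's Rect, in float coordinates.
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Map a bounding box from the analysis frame's coordinate space into the
// preview view's coordinate space by scaling each axis independently.
fun mapToView(
    box: Box,
    frameWidth: Int, frameHeight: Int, // frame dimensions after rotating upright
    viewWidth: Int, viewHeight: Int
): Box {
    val scaleX = viewWidth.toFloat() / frameWidth
    val scaleY = viewHeight.toFloat() / frameHeight
    return Box(
        box.left * scaleX, box.top * scaleY,
        box.right * scaleX, box.bottom * scaleY
    )
}
```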

Future Work

Right now, the app has a lot of limitations. For starters, if you want to change the language from the one you selected, you will need to go into the app settings and clear the data. That was a little laziness on my part, because support for other languages was a last-minute addition. I’m currently learning Spanish, so that was my focus. I also want to improve the classification and detection of objects. This might require me to modify the model, add my own training data, and re-train. I’m not sure yet.

Still, it was a fun thing to build, and I was very surprised that it actually works on some level.