What about recognition on mobile platforms?

It so happened that on every freelance marketplace I joined, my first project always involved recognition, so I have a lot of experience building apps with this kind of functionality, and today I want to share it with you.

How to recognize?

In fact, there are not that many ways to do recognition on mobile devices.

There are three options:

Use a ready-made library and just feed it images
Use some kind of API or do the recognition on a server
Write your own text recognition library

1. Take a library and feed it pictures

There is nothing extraordinary about this: there are plenty of libraries for the purpose, some better, some worse. For myself, I noted these:

• Tesseract

This library is written in C++, and it is not on this list because it is good. My advice: avoid it. There are many articles online praising this library, and its syntax really is simple and convenient, but my personal experience is that it is terrible. Here is why: mobile devices limit the load you can put on the processor, and if you feed it a high-resolution picture, the phone simply hangs; even multithreading does not help. It is just not adapted for mobile. The recognition accuracy is rock bottom too: if it understands half of the text, that is already a good result. If you do not believe me, you can check it yourself in under an hour.

Accuracy: 4/10
Speed: 4/10
Simplicity: 8/10
Load: 2/10

Overall impression: horrible, steer well clear of it

• OpenCV

A good, fully working library. I used it only once: I needed to recognize rectangles, and it handled the task completely. I also know that it can do a lot of other things. There are plenty of tutorials, and the syntax is simple, with nothing superfluous. Overall, I am quite pleased with it.

Accuracy: 8/10
Speed: 6/10
Simplicity: 8/10
Load: 6/10

Feature: just a good library, nothing less
Overall impression: not bad overall

• Mobile Vision

My favorite among the libraries. It recognizes text, detects faces and emotions, and reads QR codes. In a word, the functionality is good. Most importantly, its hardware requirements are low, because it was written specifically for mobile devices. I really liked that you can hook it up to the camera and, instead of taking a photo and then displaying the results, scan the camera stream on the fly, recognize things, and draw overlays on top of the preview based on the results, all without any drop in FPS. Another important criterion is accuracy, and here it is somewhere around 95%. There are downsides, though. Last month, Google released a buggy version of Play Services, and text recognition stopped working on many devices (it simply crashed), but now everything seems to be fine.

Site Link: https://developers.google.com/vision/

Accuracy: 10/10
Speed: 10/10
Simplicity: 7/10
Load: 9/10

Feature: can be hooked up to the live camera, thanks to fast processing without loss of quality
Overall impression: excellent

2. Network requests

The idea is simple: we send the photo over the network and get the answer back as JSON. Photos can be sent in different ways: as a link, as a base64 string, or in the body of a POST request. The drawbacks: you need constant Internet access, and you cannot process the camera stream.
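To make the base64 option more concrete, here is a minimal Java sketch of turning image bytes into a JSON body. The class name `RecognitionPayload` and the JSON field name `image` are made up for illustration; every real service defines its own request schema, so treat this as a sketch under those assumptions rather than any particular API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: wrap raw image bytes into a JSON body for a recognition endpoint.
// The field name "image" is hypothetical; a real API dictates its own schema.
final class RecognitionPayload {

    // The standard Base64 encoder emits no line breaks and only characters
    // that are safe inside a JSON string, so no extra escaping is needed.
    static String toBase64(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    // Builds the body that would then be sent with a POST request.
    static String toJsonBody(byte[] imageBytes) {
        return "{\"image\":\"" + toBase64(imageBytes) + "\"}";
    }

    public static void main(String[] args) {
        // Stand-in bytes; in an app these would come from the camera or a file.
        byte[] fakeImage = "not a real photo".getBytes(StandardCharsets.UTF_8);
        System.out.println(toJsonBody(fakeImage));
    }
}
```

Keep in mind that base64 makes the payload roughly a third larger than the raw file, so it may be worth compressing or downscaling the photo before sending it.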

• Run your own server with a recognizer

Strictly speaking, this is a separate topic, but the option is worth considering, because you do not depend on someone else's API and are not exposed to constant price changes.

Accuracy: depends on the library
Speed: depends on the library + request time
Simplicity: 1/10 (building your own recognition server is not a 5-minute job)
Load: 10/10 (minimal, only a network request)

Feature: you depend on no one and can customize it for yourself
Overall impression: a pain, but quite doable

• Cloud Platform Vision

I liked this one. It has the widest range of functions: OCR, logo recognition, face detection, label detection, and so on. I especially want to single out landmark recognition: there is simply nothing comparable, and the accuracy is high. It can recognize a decrepit church in some backwater town, although it does misfire sometimes. You can feed it links and base64 strings.

The library for making requests to Cloud Vision is admittedly not very good, but it will do as an example for writing your own client: https://github.com/GoogleCloudPlatform/java-docs-samples/tree/master/vision
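If you do write your own client, the request itself is just a JSON document sent by POST to https://vision.googleapis.com/v1/images:annotate. Below is a minimal Java sketch of building that body for both input styles (base64 and link), assuming the public REST format as I understand it: the image goes in `content` or `source.imageUri`, followed by a list of `features`. The class and method names are made up for illustration, authentication is left out, and the inputs are assumed to contain no characters that need JSON escaping, so check the official reference before relying on it.

```java
// Sketch: build the JSON body for a Cloud Vision images:annotate request.
// Class and method names are hypothetical; inputs are not JSON-escaped.
final class VisionRequestBody {

    // Image passed inline as a base64 string.
    static String fromBase64(String base64Image, String featureType) {
        return wrap("{\"content\":\"" + base64Image + "\"}", featureType);
    }

    // Image passed as a link that the service can download.
    static String fromLink(String imageUrl, String featureType) {
        return wrap("{\"source\":{\"imageUri\":\"" + imageUrl + "\"}}", featureType);
    }

    // One request with one feature, e.g. TEXT_DETECTION or LANDMARK_DETECTION.
    private static String wrap(String imageJson, String featureType) {
        return "{\"requests\":[{\"image\":" + imageJson
                + ",\"features\":[{\"type\":\"" + featureType + "\"}]}]}";
    }

    public static void main(String[] args) {
        System.out.println(fromLink("https://example.com/church.jpg", "LANDMARK_DETECTION"));
        System.out.println(fromBase64("YWJj", "TEXT_DETECTION"));
    }
}
```

The response comes back as JSON as well, so a real client would also need a JSON parser to pull out the annotations.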

Accuracy: 10/10
Speed: request time
Simplicity: 7/10
Load: 10/10 (minimal, only a network request)

Feature: recognizes just about everything; landmark detection
Overall impression: quite cool, but pricing ...

• Kairos

Few people know about this excellent API for face recognition. It gives accurate results and is easy to use. They also have an SDK for making requests, but it is paid, although you can dig it up on GitHub (I found it). Prices are relatively affordable, so it is quite usable.

Accuracy: 10/10
Speed: request time
Simplicity: 9/10 with the SDK, 6/10 without
Load: 10/10 (minimal, only a network request)

Feature: A wide range of functions in face recognition
Overall impression: quite cool if you only need to recognize faces.

3. Write your own library for recognition

Impressive and cool, but you will have to sink a lot of time into it.

Accuracy: Depends on you
Speed: Depends on you
Simplicity: -1/10
Load: Depends on you

Feature: for when you want to show off

I hope this list helps someone. Post in the comments any libraries that are not on the list, whether they are good or not.

Source: https://habr.com/ru/post/345268/
