I first encountered OCR technology sometime in 1997, when I bought my first scanner, a handheld black-and-white Genius ScanMate 256 (still working, by the way). The scanner came with the Direct OCR program on a 3-inch floppy disk (funny how all these names surface from somewhere in the subconscious), which tried with all its might to prove that you could enter text from a book into the computer quickly and almost without errors. Well, the proof was not very convincing. FineReader, which I met later, did it better. The topic of recognition interested me, and I spent quite a lot of time reading popular science articles about OCR technologies.
In 2001, I was preparing a thesis on web technologies and thought for a long time about where to apply what I knew. Since I was interested in OCR, I decided to combine the web and text recognition. FineReader had to handle the recognition itself. With friends, we "disassembled" FineReader into separate DLLs and figured out how to call individual functions of these libraries, how to pass them binary image data, and how to get the recognized text back. A simple web interface was built on top of all this to upload images, run recognition and get results.
The first limitation for us at that time was the ridiculous bandwidth of the Internet. An A4 page scanned at 200 dots per inch and saved in TIFF (the only format FineReader accepted) could take up several megabytes in grayscale, and if someone scanned it in color by mistake or out of ignorance, the size grew three to four times. Back then such a huge file was hard to send and process even over a local network, and sending it over the public Internet was close to impossible.
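To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes an uncompressed image, 8 bits per pixel for grayscale and 24 bits for color, and a hypothetical 56 kbit/s dial-up link; the function names and the link speed are my own illustration, not something from the original project.

```python
# Rough uncompressed size of a scanned A4 page (TIFF header overhead ignored).
A4_WIDTH_IN = 8.27    # 210 mm
A4_HEIGHT_IN = 11.69  # 297 mm


def raw_scan_bytes(dpi: int, bytes_per_pixel: int) -> int:
    """Pixel count of an A4 page at the given resolution times bytes per pixel."""
    width_px = round(A4_WIDTH_IN * dpi)
    height_px = round(A4_HEIGHT_IN * dpi)
    return width_px * height_px * bytes_per_pixel


def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time, with no protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


gray = raw_scan_bytes(dpi=200, bytes_per_pixel=1)   # 8-bit grayscale
color = raw_scan_bytes(dpi=200, bytes_per_pixel=3)  # 24-bit RGB

ASSUMED_LINK_BPS = 56_000  # hypothetical dial-up speed, an assumption

print(f"grayscale: {gray / 1e6:.1f} MB, color: {color / 1e6:.1f} MB")
print(f"grayscale upload: {transfer_seconds(gray, ASSUMED_LINK_BPS) / 60:.0f} min")
```

Under these assumptions the grayscale page comes out at a little under 4 MB and the color page at three times that, which matches the "several megabytes" and "three to four times" figures above.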
The second factor was cost. At such upload speeds, every scanned page was expensive to send. We also took into account that most people used pirated copies of text recognition programs, obtained for free or for pennies.
The third factor was demand. For a person to use an online text recognition service, at least three conditions must hold: they have a scanner, they have Internet access, and they are unable to recognize the text themselves. It was hard to imagine a large number of such "ham-fisted" and "clueless" users.
The project was implemented but shelved as unpromising.
Two years ago, I suggested to my colleagues at work that we consider reviving the project. The situation had changed: the Internet had become faster (mp3 files have long been larger than a scanned page in JPG format), scanners are almost everywhere (and you can even just photograph the text), and users try not to burden themselves with all sorts of programs and use services instead. FineReader has an API, and FLASH makes it possible to build a fairly convenient web interface for managing uploads and recognition. But we did not reach a common opinion and, one might say, missed the opportunity to build a useful and sought-after service that could have been profitably sold to ABBYY or Google.
Now ABBYY has already launched an online version of FineReader for text recognition. It supports 6 languages, including Russian; understands documents written in several languages at once; accepts input in TIFF (including multi-page files), JPEG, BMP, PNG, PCX, GIF and DjVu; and outputs Microsoft® Word, Excel®, Rich Text Format, TXT and searchable PDF.
And the other day, the well-known Google Docs API made it possible to try the same thing on its demo page. Google allows you to upload a high-resolution image (up to 10 megabytes) in JPG, PNG or GIF format. Recognition takes about two minutes. Only the Latin alphabet is supported.
Related Links:
After digging through search engines, I found a few more online text recognition services (some of them created literally this year). Here are some of them:
- OnlineOCR (28 languages, including Russian; input in multi-page TIFF, JPEG/JPG, BMP, PCX, PNG, GIF and multi-page PDF, files up to 20 MB; output in PDF, MS Word, MS Excel, HTML, RTF, TXT)
- Free OCR (6 languages, no Russian; input in PDF (first page only), JPG, GIF, TIFF or BMP, files up to 2 MB; output in text format)
- OCR Terminal (6 languages, no Russian; input in PNG, JPEG, GIF, BMP, multi-page TIFF and PDF; output in DOC, TXT, RTF, PDF)
- A small list of free and commercial optical recognition systems online
P.S. I would also like to note how convenient the EverNote system is, and the fact that it includes text recognition, even for text on very dirty and crooked photos taken with the left foot in the dark :)
P.P.S. I would like to hear feedback on how these services work from Habr readers. Are there any among you who have used recognition in the online FineReader, Google Docs or other services? I will add your reviews (and, even better, recognition examples and technical limitations) to the post.
Updated: moved to Services.