Introduction to the tessnet2 OCR library (C#)

Just the other day I needed to recognize plain text in an image. I had no desire to implement my own algorithm: I am familiar with the theory and know that it is far from simple, so I decided to survey the ready-made libraries first. After a few Google searches I concluded that I would not find anything more suitable for my purposes than the tessnet2 library. I read Habr regularly and know that it has plenty of articles on OCR theory, so I was surprised to find nothing about tessnet2.

tessnet2 is based on Tesseract OCR

The Tesseract OCR engine was one of the top three engines at the 1995 UNLV Accuracy Test. Between 1995 and 2006 it received only minor refinement, yet it is probably still one of the most accurate open-source OCR engines available. The available code reads a binary, grayscale, or color image and outputs text. The built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones.

How to use tessnet2:
1. Download the library and add a reference to Tessnet2.dll in your .NET project.
2. Download the language data you need (I only needed English: tesseract-2.00.eng.tar.gz) and extract it into the tessdata folder. The tessdata folder must sit next to the application's executable.

The following code is enough to read text from an image:

    using System;
    using System.Collections.Generic;
    using System.Drawing;

    Bitmap image = new Bitmap("eurotext.tif");
    tessnet2.Tesseract ocr = new tessnet2.Tesseract();
    // Restrict recognition to digits; drop this line for ordinary text
    ocr.SetVariable("tessedit_char_whitelist", "0123456789");
    // Point the engine at the correct tessdata location and language
    ocr.Init(@"c:\temp", "eng", false);
    List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
    // Print each recognized word together with its confidence value
    foreach (tessnet2.Word word in result)
        Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

I was very pleased with the result, and it reminded me that a few months earlier I had hooked a CAPTCHA-solving service into one of my projects. I'll say right away that nothing good came of it: the project needed speed, and such services cannot provide it. The results are usually poor as well, which is understandable, because they pay from $1 per 1000 correctly entered CAPTCHAs, which is dismal to say the least. So, as an experiment, I decided to try the library on that same example.
The input is a CAPTCHA that asks you to perform a simple arithmetic operation on two numbers and enter the answer. That sounds easy, but every character has a different color and the background is dynamic; sometimes even I, a human, have trouble making out what is written.
First, here are the program's results; after that I will explain how it all works:

screenshot1

The screenshots clearly show that the library cannot recognize anything while the heap of lines is in the way, and that background which was not completely removed sometimes interferes as well. So I developed my own small algorithm for cleaning the picture. There is nothing grand about it: I step in a few pixels from the edge, walk around the resulting rectangle and collect the colors I find there, and I also collect the colors after the first digit and before the equals sign (the latter is more of a hack, but since this article is about something else, I left it that way). All that remains is to paint over every pixel whose color ended up in my collection and is not white.
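The color-collection step can be sketched roughly as follows. This is not the author's code, only a minimal illustration under stated assumptions: the image is held as a plain `int[,]` array of ARGB values rather than a Bitmap, the collected colors are repainted white, and the names `CaptchaCleaner`, `CollectBorderColors`, `PaintOver` and the `margin` parameter are hypothetical. The extra sampling after the first digit and before the equals sign is left out.

```csharp
using System;
using System.Collections.Generic;

static class CaptchaCleaner
{
    // Opaque white in ARGB (assumed background color).
    const int White = unchecked((int)0xFFFFFFFF);

    // Collects every non-white color found on the rectangle that lies
    // `margin` pixels inside the image edge. pixels[y, x] holds ARGB values.
    public static HashSet<int> CollectBorderColors(int[,] pixels, int margin)
    {
        int height = pixels.GetLength(0), width = pixels.GetLength(1);
        var colors = new HashSet<int>();
        // Image too small for the inset rectangle: nothing to collect.
        if (width <= 2 * margin || height <= 2 * margin) return colors;

        for (int x = margin; x < width - margin; x++)
        {
            colors.Add(pixels[margin, x]);               // top side
            colors.Add(pixels[height - 1 - margin, x]);  // bottom side
        }
        for (int y = margin; y < height - margin; y++)
        {
            colors.Add(pixels[y, margin]);               // left side
            colors.Add(pixels[y, width - 1 - margin]);   // right side
        }
        colors.Remove(White);
        return colors;
    }

    // Repaints every pixel whose color is in `noise` with white
    // (assumption: white is the desired clean background).
    public static void PaintOver(int[,] pixels, HashSet<int> noise)
    {
        for (int y = 0; y < pixels.GetLength(0); y++)
            for (int x = 0; x < pixels.GetLength(1); x++)
                if (noise.Contains(pixels[y, x]))
                    pixels[y, x] = White;
    }
}
```

The idea behind sampling near the edge is that lines and background cross that rectangle while the digits sit in the middle, so colors found there are likely noise. A digit that happens to share a color with a line would be erased too, which is one reason the approach is only a heuristic.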

Of all the code involved, the routine for flood-filling a region of a Bitmap is probably the only part that is useful more generally:

    using System.Collections.Generic;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Runtime.InteropServices;

    void FloodFill(Bitmap bitmap, int x, int y, Color color)
    {
        // Copy the pixels into a managed array of 32-bit ARGB values
        BitmapData data = bitmap.LockBits(
            new Rectangle(0, 0, bitmap.Width, bitmap.Height),
            ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
        int stride = data.Stride / 4; // row length in pixels
        int[] bits = new int[stride * data.Height];
        Marshal.Copy(data.Scan0, bits, 0, bits.Length);

        LinkedList<Point> check = new LinkedList<Point>();
        int floodTo = color.ToArgb();
        int floodFrom = bits[x + y * stride];
        bits[x + y * stride] = floodTo;

        if (floodFrom != floodTo)
        {
            // Breadth-first walk over 4-connected neighbours of the same color
            check.AddLast(new Point(x, y));
            while (check.Count > 0)
            {
                Point cur = check.First.Value;
                check.RemoveFirst();

                foreach (Point off in new Point[] {
                    new Point(0, -1), new Point(0, 1),
                    new Point(-1, 0), new Point(1, 0) })
                {
                    Point next = new Point(cur.X + off.X, cur.Y + off.Y);
                    if (next.X >= 0 && next.Y >= 0 &&
                        next.X < data.Width && next.Y < data.Height)
                    {
                        if (bits[next.X + next.Y * stride] == floodFrom)
                        {
                            check.AddLast(next);
                            bits[next.X + next.Y * stride] = floodTo;
                        }
                    }
                }
            }
        }

        // Write the modified pixels back and release the bitmap
        Marshal.Copy(bits, 0, data.Scan0, bits.Length);
        bitmap.UnlockBits(data);
    }

For anyone who wants to experiment, I have attached the source code.

Summary

We have met a rather interesting library, tessnet2, tested it in real conditions, and obtained quite good results on difficult images (CAPTCHAs). There are errors, of course, but very few. Moreover, for this type of CAPTCHA you can add a regular-expression check to confirm that the recognized text matches the expected format.
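Such a format check might look like the sketch below. The exact CAPTCHA format is not given in the article, so the pattern here is an assumption: one digit, one of the operators +, - or *, another digit, and an optional equals sign. The names `CaptchaAnswer` and `Solve` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaptchaAnswer
{
    // Assumed format: digit, operator (+, - or *), digit, optional '='.
    static readonly Regex Expression =
        new Regex(@"^([0-9])([+\-*])([0-9])=?$");

    // Returns the computed answer, or null when the recognized text does
    // not match the assumed format and should be rejected.
    public static int? Solve(string recognized)
    {
        // OCR output often contains stray spaces between characters.
        Match m = Expression.Match(recognized.Replace(" ", ""));
        if (!m.Success) return null;

        int a = int.Parse(m.Groups[1].Value);
        int b = int.Parse(m.Groups[3].Value);
        switch (m.Groups[2].Value)
        {
            case "+": return a + b;
            case "-": return a - b;
            default:  return a * b; // only '*' can reach this branch
        }
    }
}
```

A null result signals that recognition went wrong somewhere, so the program can request a new CAPTCHA instead of submitting a wrong answer.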

Source: https://habr.com/ru/post/112599/
