Optical Character Recognition with .NET

As an example, I took an ad from a popular site that displays phone numbers as images.




Here is the number itself:



First of all, I need a dictionary of all the characters that can appear in such images, so I will start not with this phone number but with training. To do this, I found two ads on the same site whose phone numbers together contained all 10 possible digits and glued them into one image:



Each character stands out because it does not merge with the background, and identical characters are always drawn in exactly the same way. First, remove the transparency:

    void RemoveAlphaChannel(Bitmap src)
    {
        for (int y = 0; y < src.Height; y++)
            for (int x = 0; x < src.Width; x++)
            {
                var pxl = src.GetPixel(x, y);
                // Replace fully transparent pixels with opaque white
                if (pxl.A == 0)
                    src.SetPixel(x, y, Color.FromArgb(255, 255, 255, 255));
            }
    }


Cut off the excess:

    private Bitmap CropImage(Bitmap sourceBitmap)
    {
        // Corners of the rectangle that frames the useful area
        var upperLeft = GetCorner(sourceBitmap, true);
        var lowerRight = GetCorner(sourceBitmap, false);
        var width = lowerRight.X - upperLeft.X;
        var height = lowerRight.Y - upperLeft.Y;

        Bitmap target = new Bitmap(width, height);
        using (Graphics g = Graphics.FromImage(target))
        {
            // Copy the framed area of the source into the new bitmap
            g.DrawImage(sourceBitmap,
                new Rectangle(0, 0, target.Width, target.Height),
                new Rectangle(upperLeft, new Size(width, height)),
                GraphicsUnit.Pixel);
        }
        return target;
    }

I will not describe the GetCorner method in detail. In short, it compares colors pixel by pixel and returns the upper-left or lower-right point of the rectangle that frames the useful area.
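Since GetCorner is not shown, here is a minimal sketch of the idea it is described as implementing: scanning every pixel and tracking the extremes of the non-background area. This is not the author's code. To stay self-contained it works on a hypothetical `bool[,]` mask (`true` means "ink") instead of a Bitmap, and the names `BoundsSketch` and `FindBounds` are made up for illustration.

```csharp
using System;

public static class BoundsSketch
{
    // Returns the smallest rectangle that contains every ink cell.
    // Right and Bottom are inclusive. Returns null if there is no ink at all.
    // The mask is indexed as ink[y, x] (row first, then column).
    public static (int Left, int Top, int Right, int Bottom)? FindBounds(bool[,] ink)
    {
        int height = ink.GetLength(0), width = ink.GetLength(1);
        int left = width, top = height, right = -1, bottom = -1;

        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                if (!ink[y, x]) continue;
                left = Math.Min(left, x);
                top = Math.Min(top, y);
                right = Math.Max(right, x);
                bottom = Math.Max(bottom, y);
            }

        if (right < 0) return null; // blank image
        return (left, top, right, bottom);
    }
}
```

With inclusive bounds like these, the width of the framed area is `Right - Left + 1`, and the height is `Bottom - Top + 1`.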

Next, split the resulting image into characters and add them to the collection. I used an algorithm that bites off the leftmost character on each iteration:

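    // Note: IsEmptyPixel and the CropImage(Bitmap, Rectangle) overload are not shown in the article.
    private void CropChars(Bitmap bitmapPattern, string stringPattern)
    {
        var croped = CropImage(bitmapPattern);
        RemoveAlphaChannel(croped);
        int cntr = 0;
        for (int x = 0; x < croped.Width; x++)
        {
            for (int y = 0; y < croped.Height; y++)
            {
                // Reached the bottom of a column without meeting a character pixel
                // (or reached the last column): everything to the left is one character
                if ((y == croped.Height - 1 && x > 0) || (x == croped.Width - 1 && x > 0))
                {
                    var rect = new Rectangle(0, 0, x, croped.Height);
                    // Add the character to the dictionary only if it is not there yet
                    if (_charInfoDictionary.FirstOrDefault(c => c.Char == stringPattern[cntr]) == null)
                        _charInfoDictionary.Add(new CharInfo(CropImage(croped, rect), stringPattern[cntr]));
                    ++cntr;
                    if (croped.Width - x <= 1)
                        return;
                    // Cut the processed character off and start again from the left edge
                    croped = CropImage(croped, new Rectangle(x, 0, croped.Width - x, croped.Height));
                    x = 0;
                }
                // A character pixel in this column: move on to the next column
                if (!IsEmptyPixel(croped.GetPixel(x, y)))
                {
                    break;
                }
            }
        }
    }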
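The segmentation idea behind this method (characters are separated by columns that contain no character pixels) can also be written as a single pass. The sketch below is a simplified variant, not the author's algorithm: it reports column ranges instead of re-cropping the image, and it again assumes a hypothetical `bool[,]` ink mask. The names `ColumnSplitSketch` and `SplitByEmptyColumns` are illustrative.

```csharp
using System.Collections.Generic;

public static class ColumnSplitSketch
{
    // Returns [Start, End) column ranges, one per run of adjacent columns
    // that contain at least one ink cell. The mask is indexed as ink[y, x].
    public static List<(int Start, int End)> SplitByEmptyColumns(bool[,] ink)
    {
        int height = ink.GetLength(0), width = ink.GetLength(1);
        var ranges = new List<(int Start, int End)>();
        int start = -1; // -1 means "currently between characters"

        for (int x = 0; x < width; x++)
        {
            bool columnHasInk = false;
            for (int y = 0; y < height && !columnHasInk; y++)
                columnHasInk = ink[y, x];

            if (columnHasInk && start < 0)
            {
                start = x;              // a character begins here
            }
            else if (!columnHasInk && start >= 0)
            {
                ranges.Add((start, x)); // the character ended at the previous column
                start = -1;
            }
        }

        if (start >= 0)
            ranges.Add((start, width)); // a character touching the right edge
        return ranges;
    }
}
```

Each returned range can then be paired with the character at the same index of the training string, which is the role stringPattern plays above.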


There are two key points here:

1. stringPattern is the string "8929520-51-488926959-74-93", each character of which corresponds to the graphical representation of a character in the image.

2. The entity that describes a character:

    public class CharInfo
    {
        // Brightness sequence of the downscaled character image
        public int[] _hsbSequence;

        // Size of the downscaled image, i.e. the number of points
        // taken horizontally and vertically for comparison
        private const int XPoints = 4;
        private const int YPoints = 4;

        // The character itself
        public char Char { get; set; }

        // The image of the character
        public Bitmap CharBitmap { get; private set; }

        public CharInfo(Bitmap charBitmap, char letter)
        {
            Char = letter;
            CharBitmap = charBitmap;

            // Shrink the character image to XPoints x YPoints
            Bitmap resized = new Bitmap(charBitmap, XPoints, YPoints);
            _hsbSequence = new int[XPoints * YPoints];
            int i = 0;

            // Store brightness * 10, because GetBrightness returns
            // a value from 0.0 (black) to 1.0 (white)
            for (int y = 0; y < YPoints; y++)
                for (int x = 0; x < XPoints; x++)
                    _hsbSequence[i++] = (int)(resized.GetPixel(x, y).GetBrightness() * 10);
        }

        /// <summary>
        /// Compares this character with another one point by point
        /// </summary>
        /// <param name="charInfo">The character to compare with</param>
        /// <returns>The number of matching points</returns>
        public int Compare(CharInfo charInfo)
        {
            int matches = 0;
            for (int i = 0; i < _hsbSequence.Length; i++)
            {
                if (_hsbSequence[i] == charInfo._hsbSequence[i])
                    ++matches;
            }
            return matches;
        }
    }
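To make the comparison easier to follow in isolation, here is a small sketch of the same two steps on plain arrays: quantizing brightness values into integer levels and counting the positions where two signatures agree. It mirrors the logic of the constructor and Compare above, but the names `SignatureSketch`, `Quantize` and `CountMatches` are hypothetical, and the brightness values are passed in directly instead of being read from a Bitmap.

```csharp
using System;

public static class SignatureSketch
{
    // Maps brightness values in [0.0, 1.0] to integer levels 0..10 by truncation
    public static int[] Quantize(float[] brightness)
    {
        var levels = new int[brightness.Length];
        for (int i = 0; i < brightness.Length; i++)
            levels[i] = (int)(brightness[i] * 10);
        return levels;
    }

    // Counts the positions at which the two signatures hold the same level
    public static int CountMatches(int[] a, int[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Signatures must have the same length");

        int matches = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] == b[i])
                matches++;
        return matches;
    }
}
```

Because the levels are compared for exact equality, two slightly different brightness values on either side of a level boundary count as a mismatch. That is acceptable here because, as noted earlier, identical characters on this site are always drawn identically.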

Now, back to the number in the ad: all that remains is to build a similar collection (with one difference: the character stored in each element is a space, since it is not known yet) and compare each element against the dictionary.

    public IEnumerable<CharInfo> Recognize(Bitmap bitmap)
    {
        RemoveAlphaChannel(bitmap);
        // The single-argument CropChars overload that returns the characters is not shown in the article
        var charsToRecognize = CropChars(bitmap);
        List<CharInfo> result = new List<CharInfo>();
        foreach (var charInfo in charsToRecognize)
        {
            CharInfo closestChar = null;
            int maxMatches = 0;
            // Pick the dictionary entry with the largest number of matching points
            foreach (var dictItem in _charInfoDictionary)
            {
                var matches = dictItem.Compare(charInfo);
                if (matches > maxMatches)
                {
                    maxMatches = matches;
                    closestChar = dictItem;
                }
            }
            result.Add(closestChar);
        }
        return result;
    }
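The selection step can be sketched separately from the image handling. The example below assumes signatures are plain `int[]` arrays stored in a dictionary keyed by character. `ClassifySketch` and `Classify` are illustrative names, not part of the article's code.

```csharp
using System.Collections.Generic;

public static class ClassifySketch
{
    // Returns the dictionary character whose signature agrees with the unknown
    // one in the most positions, or null if no entry agrees in any position.
    public static char? Classify(int[] unknown, IReadOnlyDictionary<char, int[]> dictionary)
    {
        char? best = null;
        int bestMatches = 0;

        foreach (var entry in dictionary)
        {
            int matches = 0;
            for (int i = 0; i < unknown.Length && i < entry.Value.Length; i++)
                if (unknown[i] == entry.Value[i])
                    matches++;

            // Strictly greater: on a tie the earlier entry is kept
            if (matches > bestMatches)
            {
                bestMatches = matches;
                best = entry.Key;
            }
        }
        return best;
    }
}
```

As in Recognize above, the result is empty (null) when nothing matches at all, so calling code may want to handle that case before reading the character.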


As a result, we get a collection of characters in which the machine has matched and correctly filled in all the digits.

    StringBuilder sb = new StringBuilder();
    foreach (var charInfo in charsToRecognize)
        sb.Append(charInfo.Char);
    textBox1.Text = sb.ToString();



Recognizing letters of the alphabet is somewhat more complicated, because some letters consist of several separate parts and require a more complex algorithm for finding the bounding rectangle, but the comparison algorithm itself stays the same.

P.S. As for third-party libraries, I found several at the time (I no longer remember the names of the others) and chose the MODI library from Microsoft for my purposes (it shipped as part of MS Office). It recognized text perfectly. One drawback: only one recognition procedure could run within a single process, and it simply refused to be parallelized across several threads.

Source: https://habr.com/ru/post/239803/

