
What do monks have in common with optical text recognition and goat cheese?

If you answered "ABBYY FineReader", you are right. Some time ago, Father Gregory, the abbot of St. Gregory Palamas Monastery, asked the American office of ABBYY for help with an unusual problem. The monastery keeps an archive of old documents written in Greek with the polytonic system of diacritics, and the archive needed to be digitized. On learning of this, our American colleagues presented the abbot with a boxed copy of ABBYY FineReader 10 Professional Edition. What this system is, and why Father Gregory needed FineReader in particular, is explained below.

Greek is one of the world's oldest written languages and has a rich history (Wikipedia is a good place to start for details). Until 1982, written Greek used the polytonic system, in which marks placed above and below the letters (diacritics) denoted accents and breathings. It looks like this:


[Image: a sample of polytonic Greek text]

Since modern spoken Greek has no aspiration and does not distinguish between types of accent, a monotonic system with a single accent mark has been officially used in writing since 1982.

Recognizing documents in polytonic Greek is not difficult in principle, since most modern fonts contain the accented characters. The main thing for Father Gregory was to find a convenient program that would make the monks' digitization work as simple as possible. The choice fell on ABBYY FineReader 10, which supports modern monotonic Greek with its single accent mark. In addition, to handle the non-standard accented characters, the monks could use the template editor in ABBYY FineReader 10 Professional Edition, which trains the program to recognize non-standard characters (we described this feature in detail here).

The Greek polytonic system has seven diacritical marks. Most of them, and various combinations of them, can be used with the Greek vowels. In total this gives a little over two hundred possible combinations of letters and marks. All that remained was to train FineReader to recognize the individual polytonic marks and their combinations. The program is now trained, and the monks are ready to start work.
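
To get a feel for how the marks multiply into combinations, here is a minimal Python sketch. It is not how FineReader is trained; it only uses the Unicode character database as a stand-in and lists every lowercase vowel plus marks that Unicode encodes as a single precomposed character. The function name `precomposed_combinations` and the constants are made up for this illustration, and because capital letters are left out, the count it produces need not match the figure quoted above.

```python
import itertools
import unicodedata

# Illustrative constants (not from the original article).
VOWELS = "αεηιουω"
BREATHINGS = ("", "\u0313", "\u0314")        # none, psili (smooth), dasia (rough)
DIAERESIS = ("", "\u0308")                   # none, dialytika
ACCENTS = ("", "\u0301", "\u0300", "\u0342") # none, oxia, varia, perispomeni
IOTA_SUBSCRIPT = ("", "\u0345")              # none, ypogegrammeni


def precomposed_combinations():
    """Map each single-code-point vowel+marks combination to its Unicode name."""
    found = {}
    for vowel, breathing, diaeresis, accent, iota in itertools.product(
        VOWELS, BREATHINGS, DIAERESIS, ACCENTS, IOTA_SUBSCRIPT
    ):
        marks = breathing + diaeresis + accent + iota
        if not marks:
            continue  # a bare vowel is not a diacritic combination
        # NFC merges the base letter and marks into one character when
        # Unicode defines a precomposed form; otherwise marks stay separate.
        composed = unicodedata.normalize("NFC", vowel + marks)
        if len(composed) == 1:
            found[composed] = unicodedata.name(composed)
    return found
```

Calling `precomposed_combinations()` returns a dictionary whose keys are characters such as ᾅ (alpha with rough breathing, acute accent and iota subscript), and `len()` of the result gives the number of lowercase combinations found. Combinations that the alphabet does not use, such as an iota subscript under epsilon, have no precomposed form and are filtered out automatically.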

We hope that, thanks to FineReader 10, the monks will be able to preserve one of the monastery's main treasures, its ancient Greek texts, and continue their usual life of prayer, teaching and work. In gratitude, ABBYY employees received fruit grown on the monastery grounds and carefully picked by the monks, along with the best goat cheese and smoked salmon they have ever tasted.

Alisa Rakhmanova,
OCR

Source: https://habr.com/ru/post/123168/

