
ABBYY FlexiCapture Engine 9.0: How to embed DataCapture technology into an application (programmer's view)

Not so long ago, my colleagues told Habr's readers about the new version of ABBYY FlexiCapture Engine 9.0, a product for developers that allows our users to embed data capture technology, that is, extraction of data from images, into their own software solutions.

In that article we talked about what this technology is and what it can do, and praised the new API and the examples. In this article I would like to complete the picture and show what working with the product looks like from a programmer's point of view: to give you a chance to "touch" the API and to talk about some "goodies" that make it easy and natural to integrate our product into most types of applications (desktop, server, cloud, etc.).

We will start with the basic scenario of extracting data from images using FlexiCapture Engine 9.0 (this scenario is well suited for a simple desktop application), and then discuss what can be changed in it to adapt it to the requirements of the application being developed.
In promotional materials it looks like this:


In the code it looks like this (read the comments):
  [C#]
  // Load FCEngine
  IEngineLoader engineLoader = new FCEngine.InprocLoader();
  IEngine engine = engineLoader.Load( serialNumber, "" );

  // Create a processor and attach a document definition to it
  IFlexiCaptureProcessor processor = engine.CreateFlexiCaptureProcessor();
  processor.AddDocumentDefinitionFile( sampleFolder + "Invoice_eng.fcdot" );

  // Add the input images to the processor
  processor.AddImageFile( sampleFolder + "Invoices_1.tif" );
  processor.AddImageFile( sampleFolder + "Invoices_2.tif" );

  // Recognize the next document (the two pages form a single document)
  IDocument document = processor.RecognizeNextDocument();
  Debug.Assert( document != null );                    // a document was produced
  Debug.Assert( document.DocumentDefinition != null ); // its type was detected
  Debug.Assert( document.Pages.Count == 2 );           // both pages were matched to it

  // Access the extracted data
  string invoiceNumber = document.Sections[0].Fields[0].Value.AsString;
  // ... and so on
Now let's describe the individual steps of this scenario in more detail; they roughly correspond to the commented pieces of the code:

Loading FlexiCapture Engine


Loading the Engine object corresponds to loading and initializing the executable modules and, as a rule, is performed once, at startup or on first use.

There are several loading options. You can load the Engine directly into the main process, as in the example above, which is convenient for simple desktop applications. To improve reliability, the Engine can also be loaded into a separate worker process, which is recommended for server solutions. When loading into a separate process, you can control the priority and lifetime of that process. You can also create a pool of worker processes in which thread-safe Engine instances run in parallel, completely independently of each other, and are periodically recycled. (The examples include a ready-made implementation of such a pool.)

Further use of the Engine is practically independent of the loading method, so the same code can be reused, and the loading method itself can be an easily switchable option.
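For example, the choice of loading method can be hidden behind a tiny factory. The following is a minimal sketch: InprocLoader and IEngineLoader.Load come from the example above, while the out-of-process loader class name (OutprocLoader) is an assumption made for illustration only; check the reference for the actual class.

  [C#]
  IEngine LoadEngine( bool useSeparateProcess, string serialNumber )
  {
      // The out-of-process loader name below is a placeholder; only InprocLoader
      // is taken from the example at the beginning of the article
      IEngineLoader engineLoader = useSeparateProcess
          ? (IEngineLoader)new FCEngine.OutprocLoader()   // hypothetical class name
          : (IEngineLoader)new FCEngine.InprocLoader();
      // The rest of the code does not depend on how the Engine was loaded
      return engineLoader.Load( serialNumber, "" );
  }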

Creating and configuring the processor


The FlexiCaptureProcessor is a lightweight configurable handler that receives images as input and turns them into documents with extracted data at the output. The number of such objects in one process is limited only by available resources. Processors can be created anew before each use or cached (the latter saves the time spent loading document definitions).
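A minimal sketch of such caching, using only the calls from the basic example (the field and method names here are just illustration):

  [C#]
  private IFlexiCaptureProcessor cachedProcessor;

  IFlexiCaptureProcessor GetProcessor( IEngine engine, string sampleFolder )
  {
      if( cachedProcessor == null ) {
          // Document definitions are loaded from disk only once
          cachedProcessor = engine.CreateFlexiCaptureProcessor();
          cachedProcessor.AddDocumentDefinitionFile( sampleFolder + "Invoice_eng.fcdot" );
      }
      return cachedProcessor;
  }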

To work, the processor needs a set of document definitions. Document definitions can be loaded ready-made from files on disk, as in the example, or from a stream of bytes in memory, which allows you to keep these objects in arbitrary storage (application resources, databases, shared network storage, etc.). You can also create document definitions from scratch on the fly, based on either a "flexible" or a "fixed" description of the geometric layout. (As a rule, it is recommended to use ready-made templates created and debugged with the FlexiCapture visual tools. Creating or modifying definitions programmatically may be needed, for example, to simplify maintenance when you have to work with families of similar forms whose geometry changes from year to year while the semantics of the data stays more or less the same.)

Using processors, you can organize multi-level processing, where one processor performs a primary classification of incoming documents by some simple features and hands the result over to one of several more specialized processors, which determine the exact document type and extract the data. This approach can improve performance when working with a large number of different document types. In addition, the specialized processors can work in parallel, further increasing the throughput of the whole solution.
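A rough sketch of such routing might look like this. Only the calls from the basic example are certain; the DocumentDefinition.Name property, the specializedProcessors dictionary (mapping a detected definition name to a preconfigured processor) and imagePath are assumptions made for illustration.

  [C#]
  // classifierProcessor is configured with simple "classification" definitions;
  // specializedProcessors maps a definition name to a preconfigured processor
  IDocument coarse = classifierProcessor.RecognizeNextDocument();
  if( coarse != null && coarse.DocumentDefinition != null ) {
      // Route by the detected type (the Name property is assumed here)
      IFlexiCaptureProcessor specialized = specializedProcessors[ coarse.DocumentDefinition.Name ];
      specialized.AddImageFile( imagePath );
      IDocument document = specialized.RecognizeNextDocument();
      // ... extract fields from 'document' as in the basic example
  }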

Input Images


The processor takes a sequence of images as input. In the simplest case these may simply be links to files, as in the example above — the links are added to the processor's internal queue. You can also create a custom queue (source) of images and attach it to the processor's input.

On demand, a custom queue can supply the processor with links to image files, in-memory streams corresponding to such files, or already loaded images (in the internal FlexiCapture Engine format). The latter option also allows arbitrary preprocessing of the images before they are handed to the processor (using the built-in tools or your own algorithms). A custom queue can also wait until the next image becomes available (more on this later).

In some cases, the image queue interface can be mapped directly onto objects of the target application that represent, say, an image store in a workflow system or a custom implementation of a "document batch". Such objects can simply implement this interface as part of their own logic.
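As a rough sketch of the idea (the interface such a source implements and the call that attaches it to a processor are not shown in this article, so the member names and the attachment call here are placeholders; see the reference and the Code Snippets example for the actual API):

  [C#]
  // NOTE: the member names and the attachment call below are placeholders
  class FolderImageSource /* : custom image source interface from the API */
  {
      private readonly System.Collections.Generic.Queue<string> files;

      public FolderImageSource( string folder )
      {
          files = new System.Collections.Generic.Queue<string>(
              System.IO.Directory.GetFiles( folder, "*.tif" ) );
      }

      // Returns a link to the next image file, or null when there are no more images;
      // this is also the place where you could block and wait for new images to arrive
      public string GetNextImageFile()
      {
          return files.Count > 0 ? files.Dequeue() : null;
      }
  }

  // Attaching the source to a processor (hypothetical call):
  // processor.SetCustomImageSource( new FolderImageSource( imagesFolder ) );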

Processing loop


In the example given at the beginning of the article, exactly one document is recognized, which is a fairly common task. In the more general case, however, you run a processing loop here: at each iteration the next sequence of images is taken from the queue and either exactly one finished document corresponding to that sequence is produced, or an error is reported (no access to the image, a corrupted image, etc.).
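A sketch of such a loop, built only from the calls shown in the basic example. It is assumed here that RecognizeNextDocument returns null when the input queue is exhausted, and HandleResult stands for your own result handling; check the reference for the exact end-of-queue and error reporting contract.

  [C#]
  while( true ) {
      IDocument document = processor.RecognizeNextDocument();
      if( document == null )
          break;                       // assumed to mean "no more images" (or an error - see the examples)
      if( document.DocumentDefinition == null ) {
          // The document type was not detected - handle as appropriate
          continue;
      }
      HandleResult( document );        // your own code: save, export, verify, ...
  }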

Together with a custom image queue, such a processing loop implements a pull model of image acquisition: the processor "pumps" images from the queue and can be suspended while the next image is being loaded.

One image queue can be shared by several processors working in parallel, which makes it easy to parallelize processing: the processors "pump" images from the queue as they appear, without idle time or "busy waiting". An implementation of this approach can be found in the examples.
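A sketch of this parallelization using standard .NET tasks. RunProcessingLoop is the loop from the previous sketch, sharedImageSource and processorCount are illustrative variables, and the call that attaches the shared image source is the same placeholder as above.

  [C#]
  var workers = new System.Threading.Tasks.Task[ processorCount ];
  for( int i = 0; i < processorCount; i++ ) {
      // Each worker gets its own processor configured with the same definitions
      IFlexiCaptureProcessor processor = engine.CreateFlexiCaptureProcessor();
      processor.AddDocumentDefinitionFile( sampleFolder + "Invoice_eng.fcdot" );
      // processor.SetCustomImageSource( sharedImageSource );   // hypothetical call, see above
      workers[ i ] = System.Threading.Tasks.Task.Run( () => RunProcessingLoop( processor ) );
  }
  System.Threading.Tasks.Task.WaitAll( workers );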

Working with results


The processor produces documents that contain a tree-like structure of nodes with the extracted data, plus pages with geometric markup showing where on the image the data was found.

Data can be retrieved from the document directly and saved independently to files or a database (including the relevant fragments of the images), or passed on for further processing. You can also export the data to file formats using the built-in export tools.
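A minimal sketch of walking the extracted data and appending it to a text file. Only Value.AsString is taken from the basic example; the Name property and the Count properties of the Sections and Fields collections are assumed by analogy with Pages.Count and should be checked against the reference, and resultsFile is an illustrative variable.

  [C#]
  var line = new System.Text.StringBuilder();
  for( int i = 0; i < document.Sections.Count; i++ ) {      // Count is assumed by analogy with Pages.Count
      var section = document.Sections[ i ];
      for( int j = 0; j < section.Fields.Count; j++ ) {
          var field = section.Fields[ j ];
          // The Name property is assumed; Value.AsString comes from the basic example
          line.Append( field.Name ).Append( '=' )
              .Append( field.Value != null ? field.Value.AsString : "" ).Append( ';' );
      }
  }
  System.IO.File.AppendAllText( resultsFile, line.ToString() + System.Environment.NewLine );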

You can also save the document as an intermediate result to a file or to a stream in memory, to marshal it or process it further elsewhere and at another time. Such further processing may be, for example, data verification using the built-in verification tools.

The built-in verification tools help optimize manual verification of large amounts of data by grouping data together. Alternatively, it is often possible and convenient to verify the data programmatically (traversing the document and checking the data against a database, dictionaries, checksums, etc.).
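For example, a simple programmatic check on the invoice number from the basic example might look like this; the format pattern and the NeedsManualReview call are application-side assumptions, not part of the Engine API.

  [C#]
  string invoiceNumber = document.Sections[0].Fields[0].Value.AsString;
  bool looksValid = System.Text.RegularExpressions.Regex.IsMatch( invoiceNumber ?? "", @"^\d{6,10}$" );
  if( !looksValid )
      NeedsManualReview( document, "Invoice number failed the format check" );   // your own handling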

Custom link resolution mechanism


In the example discussed at the beginning of the article, we used links to document definition files and image files in the form of file system paths. Using links is often convenient and natural, and it allows loading of the corresponding objects to be postponed until they are actually needed. Links are also easier to marshal when passing data between processes.

By overriding the link resolution mechanism, you can completely rebuild the way links are handled in the context of a given processor to suit your needs. A link is treated simply as a string understood by this mechanism: it can be a URL, an identifier in a database, or a combination of these with a protocol prefix. Overriding the link resolution mechanism only requires the ability to return, for a given string, the path to a local (temporary) file on disk or a stream of bytes in memory. In addition, the protocol for working with links defines how the lifetime of the returned objects is controlled (so that temporary files are deleted in a timely manner and in-memory objects are released).
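A rough sketch of the idea. The method names (ResolveToLocalFile, ReleaseLocalFile), the attachment call and the db:// link format are placeholders chosen for illustration, and LoadBytesFromDatabase stands for your own data access code; the real extension point is described in the reference.

  [C#]
  class DatabaseLinkResolver /* : link resolution interface from the API */
  {
      // Given a link string such as "db://images/12345", materialize the object
      // as a temporary local file and return its path
      public string ResolveToLocalFile( string link )
      {
          byte[] bytes = LoadBytesFromDatabase( link );      // your own data access code
          string tempPath = System.IO.Path.GetTempFileName();
          System.IO.File.WriteAllBytes( tempPath, bytes );
          return tempPath;
      }

      // Called when the Engine no longer needs the object, so that the
      // temporary file is deleted in a timely manner
      public void ReleaseLocalFile( string link, string localPath )
      {
          System.IO.File.Delete( localPath );
      }
  }

  // processor.SetLinkResolver( new DatabaseLinkResolver() );   // hypothetical call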

Conclusion


Much of what is described in this article is covered by the examples supplied with the product. This allows developers using our technologies to start from an already proven, working solution and grow it to fit their needs. The examples include the following types of applications:
  1. Desktop application (a single processor loaded directly into the user's process, images as a list of files on disk)
  2. High-performance multithreaded processing server (a pool of processors that take images from a shared queue)
  3. Web service (a pool of processors that handle concurrent client requests in parallel)
There is also a special example (Code Snippets), written by programmers specifically for programmers, containing many ready-made short snippets that show how to use the product in exactly the form (without translation errors :)) the developers intended and tested.

There is also an excellent reference that describes all the objects and methods and their use in detail.

You can read more about ABBYY FlexiCapture Engine 9.0 on the ABBYY website.

Alexey Kalyuzhny
Product Development Department

Source: https://habr.com/ru/post/125347/

