
Comparative testing of Smart IDReader on 5 computing systems with Elbrus processors

Smart IDReader is an application that recognizes identity documents on a variety of platforms. Its recognition modes let you extract the document holder's data from a video stream, photos, or document scans.



Today we want to tell you how we tested Smart IDReader on a family of Russian-made computing systems, Elbrus. What are we going to test? How does document recognition perform on the new Elbrus-8.4 machine? If you are interested, read on.


Our previous articles on recognition on Elbrus:
https://habrahabr.ru/company/smartengines/blog/304750/
https://habrahabr.ru/company/smartengines/blog/317672/
https://habrahabr.ru/company/smartengines/blog/329858/
https://habrahabr.ru/company/smartengines/blog/340918/


What are we going to test?




The review involves 5 different devices based on Elbrus processors:


Elbrus 101-PC


Elbrus 101-PC is a compact workstation based on the Elbrus-1C+ microprocessor, with a quiet nettop-format system unit. The Elbrus-1C+ processor has a built-in 3D graphics accelerator that supports OpenGL 2.1 and OpenCL 1.2 and consumes no more than 25 W, which makes it well suited for embedded systems and portable terminals.





Elbrus 401-PC


Elbrus 401-PC is a personal computer based on the Elbrus-4C microprocessor, which has repeatedly appeared in our reviews.





Elbrus-4.4 server


The Elbrus-4.4 server is a 4-processor server based on Elbrus-4C, equipped with 96 GB of RAM. Such a powerful server can solve complex computational problems, run various server applications, or simply store data.





Elbrus 801-PC


Elbrus 801-PC is a workstation based on the Elbrus-8C microprocessor, released last year. Elbrus-8C has an improved architecture: it executes up to 25 operations per clock cycle and runs at up to 1300 MHz. In our sample, the clock frequency was reduced to 1200 MHz at the time of the experiments.





Elbrus-8.4


And, finally, a new development from MCST and the Bruk INEUM: a 4-processor server module based on Elbrus-8C with a 10-Gigabit Ethernet interface, M10GE / E, developed in-house by MCST, which provides trusted communication between nodes. It is intended for use as a node for storing, processing, and transmitting data, or for any other task you can think of.


Characteristics of Elbrus-8.4


| Parameter | Value |
| --- | --- |
| Microprocessor | Elbrus-8C (189110) |
| Number of cores per microprocessor | 8 |
| Maximum microprocessor clock frequency, GHz | up to 1.3 |
| Number of microprocessors in the computing device | 4 |
| RAM size | up to 256 GB with error correction (ECC) |
| Cooling system | built-in, air |
| I/O channels | 3 Gigabit Ethernet ports, 3 PCI Express slots, 1 RS-232 port, 4 USB ports, 1 VGA video connector |
| Power supply | 220 ± 22 V, 50 ± 1 Hz |
| Power consumption, W, not more than | 500 |
| Operating temperature range | −10 °C to +50 °C |




In our sample, the clock frequency was reduced to 1200 MHz at the time of the experiments.


Here are the characteristics of all the tested machines side by side:


| Machine | Elbrus 101-PC | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4 |
| --- | --- | --- | --- | --- | --- |
| CPU | Elbrus-1C+ | Elbrus-4C | Elbrus-4C | Elbrus-8C | Elbrus-8C |
| Number of general-purpose cores | 1 | 4 | 16 | 8 | 32 |
| Clock frequency, MHz | 985 | 800 | 750 | 1200 | 1200 |
| Operations per clock cycle (per core) | up to 25 (8 integer, 12 floating-point) | up to 23 | up to 23 | up to 25 (8 integer, 12 floating-point) | up to 25 (8 integer, 12 floating-point) |
| Process technology | 40 nm | 65 nm | 65 nm | 28 nm | 28 nm |
| Storage device | 120 GB SSD mSATA 3.0 | 120 GB SSD mSATA 2.0 | 500 GB HDD 3.5'' SATA 2.0 | 120 GB SSD mSATA 3.0 | 2 TB HDD 3.5'' SATA 3.0 |
| RAM with error correction (ECC) | 16 GB | 24 GB | 96 GB | 32 GB | 128 GB |
| Number of transistors (per processor) | 375 million | 986 million | ~986 million | 2.73 billion | ~2.73 billion |
| L1 cache (per core) | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions |
| L2 cache (per core) | 2 MB | 2 MB | 2 MB | 512 KB | 512 KB |
| L3 cache (shared) | - | - | - | 16 MB | 16 MB |

The width of the SIMD instructions for all processors was 64 bits.


Tested documents


We decided to look at the recognition of 6 fairly different types of documents:


Passport of the Russian Federation




Biometric passport of the Russian Federation




Driving license of the Russian Federation




UK driving license




German ID cards




Certificate of incapacity for work (sick leave)




As you can see, for driving licenses and ID cards there are several samples that differ from one another. This is in fact a rather typical situation: after new standards are released, both updated and old-style documents remain in use for some time. In addition, documents may differ between issuing regions or between categories of citizens, for example adults and minors. Therefore, before recognizing a driving license or an ID card, Smart IDReader determines which particular type the document belongs to.


Performance evaluation


To assess the performance of Smart IDReader, we measured the net recognition time for one scan or photo, excluding both the loading of the image from a file and the loading of configuration files. The document in the image may be rotated arbitrarily. The recognition time was averaged over 100 images of each document type.
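The measurement procedure described above can be sketched roughly as follows. This is only an illustration of the idea (time the recognition call alone, on images that are already in memory, and average the results); the `recognize` callable is a hypothetical stand-in and not the actual Smart IDReader API.

```python
import time
from statistics import mean


def mean_recognition_time(images, recognize):
    """Return the average time of recognize(image), in seconds.

    `images` are assumed to be already loaded into memory, so file
    loading is excluded from the measurement. `recognize` is a
    hypothetical stand-in for the real recognition call.
    """
    if not images:
        raise ValueError("at least one image is required")
    timings = []
    for image in images:
        start = time.perf_counter()
        recognize(image)  # only this call is timed
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    # 100 dummy "images"; summing the bytes merely simulates some work.
    fake_images = [bytes(1000) for _ in range(100)]
    average = mean_recognition_time(fake_images, sum)
    print(f"{average:.6f} s per image")
```

Loading the configuration once, before the loop, keeps it out of the measurement in the same way.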


Our application was compiled for the Elbrus architecture from source code with the lcc 1.21.19 compiler and was run in native mode. Parallelization used the maximum available number of threads via the tbb library.


First, we ran sequential recognition (time per image, in seconds):


| Document | Elbrus 101-PC | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4 |
| --- | --- | --- | --- | --- | --- |
| Passport of the Russian Federation | 3.87 | 1.90 | 1.80 | 1.21 | 1.09 |
| Biometric passport of the Russian Federation | 3.33 | 1.85 | 1.80 | 1.10 | 1.05 |
| Driving license of the Russian Federation | 4.24 | 2.12 | 1.81 | 1.24 | 1.09 |
| UK driving license | 2.26 | 1.08 | 1.03 | 0.69 | 0.66 |
| German ID cards | 2.32 | 1.22 | 1.13 | 0.77 | 0.72 |
| Sick leave | 7.59 | 3.40 | 2.65 | 1.97 | 1.49 |

In a more visual form:



As you can see, we have not been sitting idle: since our last article, the recognition time for the RF passport has dropped by a factor of 1.5 on both the 401-PC and the 801-PC and is now under 2 seconds. However, using more than 4 threads gives no significant speedup for any document except the sick leave certificate: only 12 text fields are recognized in a passport, and even fewer, 7, in driving licenses and ID cards. As a result, only a modest part of the algorithm is parallelized, which is, of course, a drawback when recognizing individual document images on multi-core systems. The sick leave certificate contains many more fields, so the speedup from the 401-PC to Elbrus-4.4 and from the 801-PC to Elbrus-8.4 is more noticeable.

It is also worth noting that the single-core 101-PC is only about twice as slow as the 401-PC. This is because the 101-PC contains the new revision of the Elbrus-1C+, which executes up to 25 operations per clock cycle and runs at 985 MHz, whereas the 401-PC contains the older Elbrus-4C, which performs up to 23 operations per clock cycle (and, well, there is no escaping Amdahl's law :)).
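To make the reference to Amdahl's law concrete, here is a small sketch of the formula itself. The parallel fractions used below are purely illustrative assumptions; we did not measure what share of the recognition algorithm is actually parallelized.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup according to Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0..1).
    workers: number of cores or threads executing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Illustrative fractions only, not measurements of Smart IDReader.
    for fraction in (0.5, 0.9):
        for cores in (4, 16, 32):
            bound = amdahl_speedup(fraction, cores)
            print(f"p={fraction:.1f}, cores={cores:2d}: at most {bound:.2f}x")
```

When only half of the work is parallel, the bound stays below 2x no matter how many cores are added, which matches the modest gains seen here beyond 4 threads.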


However, on multi-core systems, you can run Smart IDReader in server mode: run several document recognition processes in parallel. In this mode, we will be able to fully load all the processor cores and more realistically assess the performance of the respective devices.


Each recognition call was parallelized in the same way as in the previous experiment, but here the processing time included loading an image from a file.
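The server-mode measurement can be pictured with the following minimal sketch. It rests on several assumptions: the average time per image is taken to be the total wall-clock time divided by the number of images, `process_one` is a hypothetical function that loads one file and recognizes it, and the internal multithreading of each recognition call is left out.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def average_time_under_load(paths, process_one, workers,
                            executor_cls=ProcessPoolExecutor):
    """Process all paths concurrently; return (seconds per image, results).

    process_one(path) is a hypothetical callable that loads one image
    from a file and recognizes it, so file loading is included in the time.
    With the default process pool it must be a picklable top-level function.
    """
    paths = list(paths)
    if not paths:
        raise ValueError("at least one path is required")
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(process_one, paths))
    elapsed = time.perf_counter() - start
    # Assumption: "average time per image" means wall time / image count.
    return elapsed / len(paths), results
```

Under this definition the figure reflects throughput rather than latency: it can be much smaller than the time a single image spends in recognition.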


Results with the Elbrus machines fully loaded (average time per image, in seconds):


| Document | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4 |
| --- | --- | --- | --- | --- |
| Passport of the Russian Federation | 1.27 | 0.36 | 0.43 | 0.11 |
| Biometric passport of the Russian Federation | 1.13 | 0.36 | 0.42 | 0.11 |
| Driving license of the Russian Federation | 1.79 | 0.47 | 0.64 | 0.16 |
| UK driving license | 0.93 | 0.26 | 0.32 | 0.08 |
| German ID cards | 0.99 | 0.26 | 0.37 | 0.10 |
| Sick leave | 2.22 | 0.66 | 0.86 | 0.22 |

Results in the form of a diagram:



These results show that the Elbrus-based server modules fully live up to their declared characteristics: on highly parallel tasks they are 3-4 times faster than the workstations built on the same processors. At the same time, the Elbrus-4.4 server is still roughly 20-30% more powerful than the Elbrus 801-PC workstation. The comparison of the 401-PC and the 801-PC also brought no surprises: the 801-PC is almost 3 times faster than its predecessor thanks to a higher clock frequency and a significantly improved architecture. The same ratio holds for Elbrus-4.4 and Elbrus-8.4.
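The ratios quoted above can be recomputed directly from the full-load table. The short script below does just that; the timings are copied from the table, while the document labels and helper names are our own shorthand.

```python
# Full-load times in seconds, copied from the table above.
COLUMNS = ("401-PC", "4.4", "801-PC", "8.4")
FULL_LOAD_SECONDS = {
    "RF passport": (1.27, 0.36, 0.43, 0.11),
    "RF biometric passport": (1.13, 0.36, 0.42, 0.11),
    "RF driving license": (1.79, 0.47, 0.64, 0.16),
    "UK driving license": (0.93, 0.26, 0.32, 0.08),
    "German ID cards": (0.99, 0.26, 0.37, 0.10),
    "Sick leave": (2.22, 0.66, 0.86, 0.22),
}


def speedup(slower, faster):
    """Per-document ratio of the `slower` column to the `faster` column."""
    i, j = COLUMNS.index(slower), COLUMNS.index(faster)
    return {doc: times[i] / times[j] for doc, times in FULL_LOAD_SECONDS.items()}


if __name__ == "__main__":
    pairs = (("401-PC", "4.4"), ("801-PC", "8.4"),
             ("801-PC", "4.4"), ("401-PC", "801-PC"))
    for slower, faster in pairs:
        ratios = speedup(slower, faster)
        low, high = min(ratios.values()), max(ratios.values())
        print(f"{slower} vs {faster}: {low:.2f}x to {high:.2f}x")
```

The per-document spread is wider than the rounded figures in the text suggest, so the quoted ranges are best read as approximate.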



We are very grateful to MCST and the Bruk INEUM and to their employees for the opportunity to test the new Elbrus-8.4 server, and we wish them many more worthy developments to delight us with!


Happy upcoming holidays to everyone! We wish you professional success, health, and happiness in the new year 2018!



Source: https://habr.com/ru/post/345758/

