
In designing server infrastructure to handle search queries, each company has its own know-how. Microsoft, for example, has been actively experimenting in recent years with FPGAs (Field-Programmable Gate Arrays).
In the Bing search engine, the ranking infrastructure is divided into three parts: feature extraction, free-form expressions, and machine-learning scoring.
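The three stages above can be sketched as a simple pipeline. This is a hypothetical illustration only: every function, feature name, and weight below is an assumption made for the example, not Microsoft's actual implementation.

```python
# Toy sketch of a three-stage ranking pipeline:
# feature extraction -> free-form expressions -> machine-learned scoring.
# All names and weights are illustrative assumptions.
import math

def extract_features(query, doc):
    # Stage 1: raw signals from the query/document pair.
    terms = set(query.lower().split())
    words = doc.lower().split()
    hits = sum(1 for w in words if w in terms)
    return {"term_hits": hits, "doc_len": len(words)}

def free_form_expressions(f):
    # Stage 2: hand-written arithmetic combinations of raw features.
    return {"hit_density": f["term_hits"] / max(f["doc_len"], 1)}

def ml_score(f, e):
    # Stage 3: a toy logistic model with made-up weights standing in
    # for a trained scoring model.
    z = 2.0 * e["hit_density"] + 0.01 * f["term_hits"] - 1.0
    return 1.0 / (1.0 + math.exp(-z))

def rank(query, docs):
    scored = []
    for d in docs:
        f = extract_features(query, d)
        e = free_form_expressions(f)
        scored.append((ml_score(f, e), d))
    return [d for _, d in sorted(scored, reverse=True)]

print(rank("fpga search", ["fpga accelerates search ranking",
                           "unrelated text"]))
```

Splitting the pipeline this way is what makes FPGA offloading attractive: each stage is a fixed dataflow that can be pipelined in hardware.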

Since 2011, the Catapult project has been migrating these systems to FPGAs. The process was not easy: over the years, Microsoft experimented with three FPGA models and had to design a custom motherboard for each of them.
In June 2014, Microsoft reported that it had moved 1,632 servers in one of its data centers to the Catapult platform (that is, to FPGAs). This kept the search engine's performance at the same level while halving the number of servers.
The work has continued in the same direction, and Microsoft has now described the latest changes to Bing's machine-learning subsystem.
First, Microsoft switched to the new high-performance Altera Arria 10 FPGAs, which offer higher floating-point performance (roughly a threefold gain in energy efficiency compared with GPUs).

Second, a new convolutional neural network design has been developed for Altera Stratix V FPGAs. This network is now used in computer-vision tasks such as pattern recognition and image classification, including for the Bing search engine.
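The core operation such FPGA designs accelerate is the 2D convolution itself. A minimal pure-Python sketch of a single convolution pass follows; the kernel values (a simple vertical-edge detector) are arbitrary illustrative numbers, not part of Microsoft's design.

```python
# Minimal 2D convolution: slide a kernel over an image and accumulate
# element-wise products. This inner multiply-accumulate loop is exactly
# the kind of regular arithmetic that maps well onto FPGA logic.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A 4x4 image containing a vertical edge, and a 3x3 edge-detecting kernel.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(conv2d(image, kernel))  # every 3x3 window straddles the edge: [[3.0, 3.0], [3.0, 3.0]]
```

A real network stacks many such convolution layers with learned kernels; the FPGA advantage comes from pipelining these multiply-accumulates at fixed, predictable latency.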

According to results published in a scientific paper, on the standard ImageNet 1K and ImageNet 22K classification benchmarks, Microsoft's neural networks outperform previous FPGA designs by roughly a factor of three. In these two tests, the Catapult server with a Stratix V D5 classifies 134 and 91 images per second, respectively.

At the same time, energy efficiency, measured in joules per image, improved significantly compared with various GPUs optimized for this task. Microsoft's servers should therefore run more efficiently and more cheaply than servers built on standard GPUs.
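The joules-per-image metric is simply board power divided by classification throughput. The throughput figures (134 and 91 images/s) come from the text above; the 25 W power draw below is an assumed illustrative value, not a published figure.

```python
# Energy per image = power (W = J/s) / throughput (images/s).
# The 25 W figure is an ASSUMED example value, not Microsoft's published number.
def joules_per_image(power_watts, images_per_second):
    return power_watts / images_per_second

for name, rate in [("ImageNet 1K", 134), ("ImageNet 22K", 91)]:
    print(f"{name}: {joules_per_image(25, rate):.3f} J/image")
```

This is why the comparison with GPUs is framed in joules per image rather than raw throughput: a GPU may classify faster, but at a much higher power draw.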