Introduction
We develop deep neural networks for analyzing photos, videos, and text. Last month we bought a very interesting device for one of our projects: the Intel Movidius Neural Compute Stick.

This is a specialized device for neural network computation. In essence, it is an external graphics card tailored to neural networks, very compact and inexpensive (~$83). We want to share our first impressions of working with Movidius. If you are interested, read on.
Computing power of the device
In terms of computing, neural networks are extremely demanding: they need GPUs for training, and real-world use also calls for GPUs or powerful CPUs. The Movidius NCS lets you run deep neural networks on devices that were not originally designed for it, for example Raspberry Pi, DJI Phantom 4, or DJI Spark. We are talking only about the prediction stage (inference with a pre-trained network): training neural networks on Movidius is not yet supported.
The chip's performance is about 100 gigaFLOPS (1 gigaFLOPS = 10^9 FLOPS). This roughly corresponds to the top-end supercomputers of the early 1990s; today's top machines are on the order of hundreds of petaFLOPS (1 petaFLOPS = 10^15 FLOPS).
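To make the unit prefixes concrete, here is a small sketch of the arithmetic. The 100 gigaFLOPS figure is the one quoted above; the value of 100 petaFLOPS is only an illustrative stand-in for "hundreds of petaFLOPS", not a measurement of any particular machine.

```python
# Unit prefixes expressed as plain FLOPS.
GIGA = 10**9
PETA = 10**15

ncs_flops = 100 * GIGA             # ~100 gigaFLOPS, the figure quoted for the chip
supercomputer_flops = 100 * PETA   # illustrative assumption for "hundreds of petaFLOPS"

# How many times faster the assumed supercomputer is than the stick.
ratio = supercomputer_flops // ncs_flops

print(f"NCS: {ncs_flops:.0e} FLOPS")
print(f"Assumed supercomputer is about {ratio:,} times faster")
```

Under these assumptions the gap is about six orders of magnitude.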
For reference: FLOPS is the number of floating-point (FP) operations performed per second. To go deeper into the topic, I recommend the Intel article.
The hardware is based on the Myriad 2 chip, which includes 12 specialized programmable vector processors. The components of the SoC are connected by a high-speed, low-latency internal interconnect. Myriad 2 is positioned as a coprocessor alongside an application processor in mobile devices, or as a stand-alone processor in wearable or embedded electronics.
The Myriad 2 processor itself
In the USB-stick form factor (the Neural Compute Stick), however, it can be used to embed neural networks in drones, for example together with a Raspberry Pi.
Installing the SDK and running the first program on the NCS
What we need
- Intel Movidius NCS. You can find out where it is sold at the link. We bought ours on Amazon.
- Ubuntu 16.04 LTS or Raspbian OS. Officially only these are supported, though in principle you can try other Linux distributions.
- The SDK from the company's official repository. We will download it from the console below.
- A binary graph file with the neural network weights, exported from TensorFlow or Caffe. The current version of the Movidius SDK supports only TensorFlow and Caffe model formats. Since we will run the standard example, we will not have to build the graph ourselves.
Preparation
Plug the Movidius into a USB 3.0 port. Then run the following in the console:
$ git clone https://github.com/movidius/ncsdk.git
$ cd ncsdk
$ sudo make install
These commands will install:
- NCS Libraries → /usr/local/lib
- NCS Toolkit binaries → /usr/local/bin
- NCS Include files → /usr/local/include
- NCS Python API → /opt/movidius
They also add the path to the Movidius Python library to PYTHONPATH.
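After installation, a coarse sanity check can confirm that the Python API directory exists and that the SDK module is visible on PYTHONPATH. This is only a sketch: the helper name check_ncsdk_install is hypothetical, and it assumes the SDK's Python package is importable as mvnc, which may differ between SDK versions.

```python
import importlib.util
import os


def check_ncsdk_install(dirs=("/opt/movidius",), module_name="mvnc"):
    """Return a dict mapping each check to True or False.

    dirs        -- directories expected to exist after installation
    module_name -- assumed name of the SDK's Python package
    """
    report = {path: os.path.isdir(path) for path in dirs}
    # find_spec returns None when the module cannot be found on the import path.
    found = importlib.util.find_spec(module_name) is not None
    report["import " + module_name] = found
    return report


if __name__ == "__main__":
    for check, ok in check_ncsdk_install().items():
        print(("OK      " if ok else "MISSING ") + check)
```

If the import check fails in a fresh terminal, it may simply mean that the shell has not yet picked up the updated PYTHONPATH.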
Run an example
In the same folder, run the command to build the examples:
$ make examples
To prepare the standard example, an implementation of inception_v1 trained on ImageNet, run the following commands:
$ cd examples/tensorflow/inception_v1
$ make all
The last command takes the network description and the already trained weights and compiles them into a binary graph, which we can then run on the Myriad 2 VPU.
Now let's turn to the test script run.py. In brief, the script loads the compiled graph onto the NCS, sends it a test image, and prints the most probable categories.
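The overall flow of such a script can be sketched as follows. This is not the contents of run.py; it is a minimal illustration using the call names of the first-generation NCSDK Python API as we understand them (EnumerateDevices, Device, OpenDevice, AllocateGraph, LoadTensor, GetResult, DeallocateGraph, CloseDevice). The function name run_inference is hypothetical, and the SDK module is passed in as a parameter so that the sketch can be read and exercised without a stick attached.

```python
def run_inference(mvnc, graph_blob, tensor):
    """Run one forward pass on the first NCS found and return the raw output.

    mvnc       -- the SDK module, e.g. obtained via `from mvnc import mvncapi as mvnc`
    graph_blob -- bytes of the compiled graph file produced by `make all`
    tensor     -- the preprocessed image; the SDK expects a float16 NumPy array
    """
    devices = mvnc.EnumerateDevices()
    if not devices:
        raise RuntimeError("No NCS device found")

    device = mvnc.Device(devices[0])
    device.OpenDevice()
    try:
        graph = device.AllocateGraph(graph_blob)     # upload the network to the stick
        try:
            graph.LoadTensor(tensor, "user object")  # send the image
            output, _user_obj = graph.GetResult()    # wait for the prediction
        finally:
            graph.DeallocateGraph()
    finally:
        device.CloseDevice()
    return output
```

The try/finally blocks are a design choice of this sketch: they release the graph and the device even if inference fails, so the stick is not left open after an error.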
When we built the example with make all, the console printed some useful information. For example, the Detailed Per Layer Profile shows how quickly data passes through each layer of the network, which is handy for debugging and optimization.
Run the script:
$ python3 run.py
The test image is loaded onto the NCS, passes through Inception, and the recognition result is printed to the console (a probability distribution over the 1000 + 1 categories of the ImageNet dataset).
Console output
Number of categories: 1001
Start download to NCS...
*******************************************************************************
inception-v1 on NCS
*******************************************************************************
674 mouse, computer mouse 0.99512
663 modem 0.0037899
614 joystick 0.00031853
528 desktop computer 0.00021553
623 lens cap, lens cover 0.0001626
*******************************************************************************
Finished
Test picture
We fed this photo to the Movidius and ran it through Inception. The network is ~99% confident that the picture shows a computer mouse (thanks to our hint :)), the modem comes second with a confidence close to 0%, and so on. The network is right, so congratulations on your first neural network successfully run on this device!
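The ranked list in the console output is simply the output vector sorted by probability. A minimal sketch of that step is shown below; the helper top_k is hypothetical, and the four-class vector is made-up toy data standing in for the real 1001-element output.

```python
def top_k(probabilities, labels, k=5):
    """Return the k most probable (index, label, probability) triples."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    return [(i, labels[i], probabilities[i]) for i in ranked[:k]]


# Toy stand-in for the network output: four invented classes.
labels = ["background", "modem", "computer mouse", "joystick"]
probabilities = [0.001, 0.004, 0.99, 0.005]

for index, label, p in top_k(probabilities, labels, k=3):
    print(index, label, p)
```

With the real output, labels would be the 1001 ImageNet category names and k would be 5, as in the console listing.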
Conclusion
In the end I would like to list the main advantages and disadvantages of the device.
First, the bad news:
- The device officially supports work only with Raspbian OS or Ubuntu 16.04 LTS.
- The device and its SDK currently support neural network weight files only in the Caffe and TensorFlow formats.
- Only predictions (inference) can be made on the device, and models cannot be trained.
Good news:
- You can run neural networks on the Raspberry Pi!
- A very simple Python / C API.
- Low power consumption (1 W), the device is powered by USB.
- Very fast for such a compact device: for example, preprocessing a ~800x800 photo and running it through Inception_v1 takes ~120-130 milliseconds.
- There is a collection of ready-to-run open-source models (the so-called Model Zoo).
- Interestingly, you can connect several NCS at once, which will work out of the box in parallel mode. However, we have not tested it yet.
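As a rough sanity check on the latency figure in the list above, per-image latency can be converted to throughput. The sketch assumes images are processed strictly one after another on a single stick, with no pipelining or overlap.

```python
def images_per_second(latency_ms):
    """Convert per-image latency in milliseconds to throughput, assuming no overlap."""
    return 1000.0 / latency_ms


for latency_ms in (120, 130):
    rate = images_per_second(latency_ms)
    print(f"{latency_ms} ms per image -> {rate:.1f} images/s")
```

So 120-130 ms per image corresponds to roughly 8 images per second under this assumption.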
Intel's suggested setup: several Movidius sticks used together to speed up computing
Of course, this device has competitors.
One of them, and the most promising so far, is the Gyrfalcon Technology Laceli, which has 28 times higher performance and 90 times better energy efficiency. The only obstacle to buying it is that the device has not yet reached the market.
Another competitor that has long been on the market is the NVIDIA Jetson TX2. Differences:
- Very different price categories ($559 vs. $83)
- Different compute resources (two Denver 2 CPU cores, four ARM Cortex-A57 cores, and a 256-core Pascal GPU versus a single Myriad 2)
- Different form factor: the Jetson is much bigger, the NCS is compact
- Both devices solve the same problem: deploying neural networks on board something, such as a car or a UAV.
If there is interest, we will soon write another article about using the Jetson TX2 for neural networks. Thank you for your attention and have a nice day :)
P.S. Intel has announced a competition for optimizing neural networks for the Intel Movidius Neural Compute Stick. Registration is open until January 26, and the competition ends on March 15.