While TensorFlow is busy conquering the world, competing for an audience with such major players of the machine learning and deep neural network market as Keras, Theano and Caffe, other, less ambitious projects are quietly trying to carve out a niche of their own. Today I wanted to write about one of these projects, given the complete lack of information about it on Habrahabr. So,
tiny-dnn is a completely dependency-free C++11 implementation of deep learning, intended for use under limited computing resources, on embedded systems and IoT devices. Details under the cut.
What the developers offer
- Speed, even without a GPU. This is achieved through TBB and SSE/AVX vectorization: 98.8% accuracy on MNIST in 13 minutes of training.
- Portability. More on this below.
- Ability to import Caffe models. You will need protobuf for that, though.
- Predictable performance. A simple threading model plus no garbage collector (though did that need saying, given it's a library for C++ programmers?).
Overall there is nothing revolutionary here, but nothing particularly to complain about either. Everything is done in C++ and quite straightforwardly. The model can be saved to and loaded from a file (again, without any dependencies). For the complete feature list, see the GitHub link.
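To illustrate, here is my own minimal sketch of saving and loading, not taken from the project docs; it assumes the save/load member functions of network that were present in the version I tried (older releases did the same thing via stream operators):

#include "tiny_dnn/tiny_dnn.h"
using namespace tiny_dnn;

int main() {
    // build something trivial just to have a network to serialize
    auto net = make_mlp<activation::relu>({ 2, 8, 2 });

    // write the architecture and weights to a file - no protobuf or other deps needed
    net.save("xor.model");

    // ...and restore it later into a fresh network object
    network<sequential> restored;
    restored.load("xor.model");

    return 0;
}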
Installation and build
The library is distributed as source code. It could hardly be any other way, since all the code lives in header files, so there is literally nothing to compile into a library. A debatable decision, but in pursuit of extreme portability and simplicity, why not?
To run the examples or tests you will need cmake. About a dozen build options are available, including USE_OPENCL, currently an experimental feature that may bring a significant speed-up in the future. The complete list is worth a look on the project page as well.
Experience with MS Visual Studio
After running "cmake .", a *.sln file for MS Visual Studio 2015 is generated (the Community Edition is perfectly adequate). The solution contains two projects: the library itself and the tests. You can add your own project right here, and to start using the library, write:
#include "tiny_dnn/tiny_dnn.h"
but remember to add the directory containing tiny-dnn to the "Include directories". In my case there was also a build problem, namely:
error C4996: 'std::copy::_Unchecked_iterators::_Deprecate': Call to 'std::copy' with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators'
To fix it, I had to add the _SCL_SECURE_NO_WARNINGS define in the project settings (C/C++ -> Preprocessor).
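As an alternative workaround (my preference, not something the library requires), the same macro can be defined directly in code, as long as it comes before the first standard library include:

// must be defined before any MSVC standard headers are pulled in,
// otherwise the checked-iterator warning is already triggered
#define _SCL_SECURE_NO_WARNINGS
#include "tiny_dnn/tiny_dnn.h"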
Usage example
Examples can be found on the project page, but they all revolve around the done-to-death MNIST: they need data, parsing functions (which are actually built into the library), and so on. Personally, I wanted the banal "Hello, world" of neural networks: XOR. Googling did not turn up anything quickly, which led to the desire to do it myself and share it in this article. Here is what came out:
#include <cassert>
#include <iostream>
#include <vector>

#include "tiny_dnn/tiny_dnn.h"

using namespace tiny_dnn;
using namespace tiny_dnn::activation;

network<sequential> construct_mlp() {
    //auto mynet = make_mlp<tan_h>({ 2, 8, 2 });
    auto mynet = make_mlp<relu>({ 2, 8, 2 });
    assert(mynet.in_data_size() == 2);
    assert(mynet.out_data_size() == 2);
    return mynet;
}

int main(int argc, char** argv) {
    auto net = construct_mlp();

    // the XOR truth table: four input pairs and their class labels
    std::vector<label_t> train_labels {0, 1, 1, 0};
    std::vector<vec_t> train_numbers{ {0, 0}, {0, 1}, {1, 0}, {1, 1} };

    adagrad optimizer; // use gradient_descent?
    net.train<mse>(optimizer, train_numbers, train_labels, 4, 1000); // batch size 4, 1000 epochs

    for (auto& tn : train_numbers) {
        auto res_label = net.predict_label(tn); // index of the most probable class
        auto res = net.predict(tn);             // raw outputs of the last layer
        std::cout << "In: (" << tn[0] << "," << tn[1] << ") Prediction: " << res_label << std::endl;
    }

    std::cin.get();
    return 0;
}
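As an aside, make_mlp is just a convenience helper. The same 2-8-2 perceptron can be assembled layer by layer with the << operator; here is a rough sketch for the API version used above (layer class names have shifted between tiny-dnn releases, so treat it as an illustration rather than gospel):

// roughly what make_mlp<relu>({ 2, 8, 2 }) expands to:
// two fully connected layers with ReLU activation, 2 -> 8 -> 2
network<sequential> construct_mlp_explicit() {
    network<sequential> net;
    net << fully_connected_layer<relu>(2, 8)
        << fully_connected_layer<relu>(8, 2);
    return net;
}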
First impressions
They are mixed. On the one hand, everything is very simple and is guaranteed to run anywhere; on the other hand, on a trivial XOR the network somehow takes a long time to converge: after 100 epochs, for example, the output is correct but extremely uncertain, with values around 0.15; after 1000 epochs it is something like 0.8. A similar model seems to converge faster in TensorFlow, but don't quote me on that :)
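If you want to poke at this yourself, the obvious knobs are the ones already hinted at in the example above; the following tweaks are pure speculation on my part, not a recipe (adagrad exposes its learning rate as the alpha field):

// variations on the example above, not guaranteed to converge any better
auto net = make_mlp<tan_h>({ 2, 8, 2 });   // try tan_h instead of relu

adagrad optimizer;
optimizer.alpha = 0.1;                     // a larger learning rate

// train for more epochs on the same train_numbers / train_labels as before
net.train<mse>(optimizer, train_numbers, train_labels, 4, 3000);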
Further, the demo project itself takes longer to build than one would like. This appears to be a consequence of the header-only approach. In a larger project it might not even be noticeable; in a small project with a single C++ file it very much is. By the way, perhaps MS's solution with so-called precompiled headers would help...
Oddly enough, I liked the way the library pushes you toward C++11 syntax and features. I don't program that much in "bare" C++ (more often Qt), yet the test example worked on the first attempt. Comments are welcome, by the way.
Link to GitHub
Link to a useful presentation