Hi, Habr! We offer a translation of the post "Getting Started with Deep Learning" by Matthew Rubashkin of Silicon Valley Data Science, covering the advantages and disadvantages of existing Deep Learning technologies and how to choose a framework given the specifics of your task and the abilities of your team.
Here at SVDS, our R&D team studies various Deep Learning technologies, from recognizing images of trains to deciphering human speech. Our goal was to build a continuous pipeline: processing data, building a model, and assessing its quality. However, when we started investigating the available technologies, we could not find a suitable guide for launching a new Deep Learning project.
Nevertheless, credit is due to the community of enthusiasts developing and improving open-source Deep Learning technologies. Following their example, we want to help others evaluate and select these tools by sharing our own experience. The table below shows our criteria for choosing one or more Deep Learning frameworks:
The resulting rating is based on a combination of our experience applying these technologies to image and speech recognition and on published comparative studies. Note that this list does not include all available Deep Learning tools; more of them can be found here. In the coming months, our team is eager to test DeepLearning4j, Paddle, Chainer, Apache Singa, and Dynet.
We now turn to a more detailed description of our assessment system:
Programming Languages: When starting with Deep Learning, it is best to use the programming language you are most comfortable with. For example, Caffe (C++) and Torch (Lua) have Python wrappers for their APIs, but using these tools is recommended only if you are fluent in C++ or Lua, respectively. By comparison, TensorFlow and MXNet support many programming languages, which makes them usable even if you are not proficient in C++. Note: Unfortunately, we have not yet had the opportunity to test PyTorch, the new Python wrapper for Torch released by Facebook AI Research (FAIR) in January 2017. This framework was created so that Python developers can build neural networks in Torch more easily.
Study Guides: The Deep Learning tools differ seriously in the quality and quantity of their tutorials and training materials. Theano, TensorFlow, Torch, and MXNet have excellent written tutorials that are easy to understand and apply in practice. On the other hand, we could not find beginners' tutorials for Microsoft's CNTK or Intel's Nervana Neon, which are otherwise fairly advanced tools. We also found that the degree of GitHub community involvement is a reliable indicator not only of a toolkit's future development, but also of how quickly you can get a bug fixed via StackOverflow or GitHub Issues. It is worth noting that TensorFlow is currently the King Kong of the Deep Learning world in terms of the number of tutorials, self-study materials, developers, and users.
CNN modeling capabilities: Convolutional neural networks (CNNs) are used for image recognition, recommender systems, and natural language processing (NLP). A CNN consists of a set of layers that transform the input data into scores over a set of previously known classes. (For more information, read Eugenio Culurciello's overview of neural network architectures.) CNNs can also be used for regression problems, for example computing the steering angle of self-driving cars. We evaluated frameworks on their CNN modeling capabilities: the availability of built-in neural network layers, and of tools and functions for connecting those layers to each other. The ease of building models in TensorFlow, with architectures such as InceptionV3, as well as Torch's easy-to-use temporal convolution layers, set these frameworks apart from the rest.
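To make the "set of layers that transform the input" concrete, here is a minimal sketch (not tied to any of the frameworks above) of the core CNN operation, a single-channel 2D convolution with "valid" padding, written out explicitly in NumPy:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take a dot product at each
    position -- the forward pass of one convolutional layer, without
    padding or stride tricks."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 vertical-edge filter applied to a 5x5 image containing a vertical edge
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
kernel = np.array([[1, 0, -1]] * 3, dtype=float)
response = conv2d_valid(image, kernel)
print(response.shape)  # (3, 3): the filter responds strongly at the edge
```

A real framework layer adds multiple channels, padding, stride, and learned kernels, but the sliding-window dot product is the same.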
RNN modeling capabilities: Recurrent neural networks (RNNs) are used for speech recognition, time series prediction, image captioning, and other tasks that require processing sequential information. Since built-in RNN models are not as common across frameworks as CNNs, when implementing Deep Learning it is important to look at other open-source projects that use the technology you need. For example, Caffe has minimal RNN modeling capabilities, while Microsoft CNTK and Torch have rich documentation and built-in models. TensorFlow also has some RNN material, and TFLearn and Keras include many RNN examples built on TensorFlow.
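What makes an RNN suited to sequential data is that each hidden state depends on the previous one. A minimal sketch of the classic Elman-style recurrence (the weight names and sizes here are illustrative, not from any particular framework):

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, bh):
    """Run a simple tanh RNN over a sequence: each step mixes the
    current input with the previous hidden state."""
    h = np.zeros(Whh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
seq = rng.standard_normal((6, 4))        # 6 time steps, 4 input features
Wxh = rng.standard_normal((3, 4)) * 0.1  # input-to-hidden weights
Whh = rng.standard_normal((3, 3)) * 0.1  # hidden-to-hidden (recurrent) weights
bh = np.zeros(3)
states = rnn_forward(seq, Wxh, Whh, bh)
print(states.shape)  # (6, 3): one 3-dimensional hidden state per time step
```

Framework RNN layers (and LSTM/GRU variants) wrap exactly this kind of loop, plus gating and efficient batching.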
Architecture and an easy-to-use modular interface: To create and train new models, a framework must have an easy-to-use, modular interface. TensorFlow, Torch, and MXNet have one, creating an intuitive development environment. By comparison, frameworks such as Caffe require considerable effort to create a new network layer. We also found that TensorFlow in particular is easy to debug both during and after training, thanks to the TensorBoard GUI.
Speed: Torch and Nervana showed the best results in convolutional neural network performance tests. TensorFlow posted comparable results, while Caffe and Theano lagged far behind the leaders. In turn, Microsoft CNTK proved best at training recurrent neural networks (RNNs). The authors of another study, comparing Theano, Torch, and TensorFlow, chose Theano as the winner for RNN training.
Support for multiple GPUs: Most Deep Learning algorithms require an enormous number of FLOPs (floating-point operations). For example, DeepSpeech, Baidu's speech recognition model, requires about 10 exaFLOPs for training, which is more than 10^18 calculations! Given that NVIDIA's Pascal TITAN X, one of the leaders on the GPU market, can perform roughly 11 teraFLOPS (11·10^12 floating-point operations per second), training on a sufficiently large dataset would take more than a week on a single machine. To reduce model training time, it is necessary to use multiple GPUs across multiple systems. Fortunately, most of the tools above provide this capability. In particular, MXNet is considered to have one of the most optimized engines for multi-GPU work.
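The "more than a week" figure follows directly from the two numbers above; the back-of-the-envelope arithmetic, assuming ideal sustained throughput on a single GPU:

```python
# Estimate single-GPU training time from the article's figures:
# ~10 exaFLOPs of total training compute, ~11 teraFLOPS sustained throughput.
total_flops = 10e18          # ~10 exaFLOPs for DeepSpeech-scale training
gpu_flops_per_sec = 11e12    # ~11 teraFLOPS for one Pascal TITAN X
seconds = total_flops / gpu_flops_per_sec
days = seconds / 86400       # 86400 seconds per day
print(f"{days:.1f} days on a single GPU")  # ~10.5 days, i.e. over a week
```

Real utilization is well below peak, so the actual time would be even longer, which is exactly why multi-GPU support matters.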
Keras Compatibility: Keras is a high-level library for rapid implementation of Deep Learning algorithms, and it is great for analysts just getting started in the field. Keras currently supports two backends, TensorFlow and Theano, and will receive official support within TensorFlow in the future. Moreover, according to its author, Keras will continue to exist as a user interface that can be used with multiple backends.
Thus, if you are interested in Deep Learning development, I would recommend first assessing the abilities of your team and the requirements of your project. For example, for image recognition with a team of Python developers, the best option would be TensorFlow, given its extensive documentation, high performance, and excellent prototyping tools. On the other hand, for an RNN-based project with a team of Lua professionals, the best option is Torch, with its incredible speed and its recurrent-network modeling capabilities.
Biography: Matthew Rubashkin, along with a solid background in optical physics and biomedical research, has experience in software development, database management, and predictive analytics.
In our Deep Learning program, which starts on April 8 for the second time, we use two libraries: Keras and Caffe. As noted, Keras is a higher-level library and allows rapid prototyping. This is a plus: students in the program do not have to spend time learning the ins and outs of, say, TensorFlow, and can instead immediately try different things in practice.
Caffe is also quite understandable, if not especially convenient, to use, and it works in production solutions as well. The review does not mark it as the fastest, but if you follow the link to the original speed analysis, you can see the note: "It may seem that TensorFlow and Chainer are faster than Caffe, but in fact this may not be the case, since those frameworks were tested with cuDNN, while Caffe ran on its default engine."