Deep Learning Hardware

Deep learning demands a great deal of computing power. Still, there is no point in spending money on hardware straight off a magazine cover that will end up in the trash. The choice deserves a careful approach.

Let's look at examples of hardware choices that make sense when getting started with deep learning, and touch on a little theory along the way.

Photo: Berkeley Center for New Media / CC
When building a deep learning system, it is definitely worth considering a GPU. The performance gain it gives is too great to ignore. Experts recommend looking at models such as the GTX 680 (for a limited budget), the GTX Titan X (an expensive option for convolutional networks), or the GTX 980 (the best value for money).

To choose the right CPU, you need to understand what role it plays in deep learning. It performs very few calculations here, but it still reads and writes variables, executes instructions, launches calls to the GPU, and passes parameters to it.

Most deep learning libraries (like most applications) use just one thread, so additional CPU cores bring little benefit. They may come in handy, however, if you work with frameworks that rely on parallel computing, such as MPI.

The size of the CPU cache does not play a large role in data exchange between the CPU and the GPU. In deep learning, data sets are usually too large to fit in the cache, and the data for each new mini-batch has to be read from memory, so you need constant access to RAM in any case.
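A quick back-of-the-envelope check shows why the cache is of little help. The sketch below is only an illustration: the 8 MiB cache, the 224x224 RGB image shape, and the mini-batch size of 128 are assumed values, not figures from this article.

```python
def tensor_bytes(num_samples, values_per_sample, bytes_per_value=4):
    """Size in bytes of a dense array of samples (32-bit values by default)."""
    return num_samples * values_per_sample * bytes_per_value

# Hypothetical numbers, chosen only for illustration.
CACHE_BYTES = 8 * 1024**2          # assumed 8 MiB CPU cache
VALUES_PER_IMAGE = 224 * 224 * 3   # assumed image shape (height x width x channels)
BATCH_SIZE = 128                   # assumed mini-batch size

batch_bytes = tensor_bytes(BATCH_SIZE, VALUES_PER_IMAGE)
fits_in_cache = batch_bytes <= CACHE_BYTES

print(f"mini-batch: {batch_bytes / 1024**2:.1f} MiB, fits in cache: {fits_in_cache}")
```

Under these assumptions a single mini-batch is already several times larger than the cache, let alone the whole data set.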

Clock frequency is not always the best measure of performance here. Even when the CPU shows 100% load, it may not be doing useful work but simply waiting on cache misses (memory runs at a lower frequency than the CPU). The processor frequency determines the maximum frequency of the RAM, and together these two frequencies determine the processor's memory bandwidth.

Memory bandwidth matters only when you copy large amounts of data. It determines how quickly a mini-batch can be rewritten and placed in memory so that the transfer to the GPU can be initiated.
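To get a feel for the numbers, here is a rough estimate of how long one mini-batch copy takes at theoretical peak memory bandwidth. The memory configuration (dual-channel DDR3-1600 with a 64-bit bus per channel) and the mini-batch shape are assumptions for illustration; real copies are slower than the theoretical peak.

```python
def peak_bandwidth_bytes_per_s(transfers_per_s, channels, bus_bytes=8):
    """Theoretical peak: transfers per second x bus width in bytes x channels."""
    return transfers_per_s * bus_bytes * channels

def copy_time_ms(num_bytes, bandwidth_bytes_per_s):
    """Lower bound on the time needed to move num_bytes, in milliseconds."""
    return 1000.0 * num_bytes / bandwidth_bytes_per_s

# Assumed memory: DDR3-1600 (1600 million transfers/s), two channels.
bandwidth = peak_bandwidth_bytes_per_s(1600e6, channels=2)

# Assumed mini-batch: 128 images of 224x224x3 32-bit values.
batch_bytes = 128 * 224 * 224 * 3 * 4

print(f"peak bandwidth: {bandwidth / 1e9:.1f} GB/s")
print(f"one mini-batch copy: at least {copy_time_ms(batch_bytes, bandwidth):.2f} ms")
```

A few milliseconds per mini-batch is small next to the time a typical training step takes, which is consistent with bandwidth mattering only when a lot of data is copied.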

With direct memory access (DMA), the task is just to make sure the transfer is set up correctly, which lets you get by with cheaper, slower memory. The transfer can then be done asynchronously and without loss of performance, so the frequency of the RAM does not matter.
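The idea of overlapping data loading with computation can be sketched in plain Python. This is not DMA itself (that is handled by the hardware and the driver); it is only a CPU-side analogy in which a background thread prepares the next mini-batches while the current one is being processed. The names `prefetch` and `load_batches` are hypothetical helpers invented for this example.

```python
import queue
import threading

def prefetch(batches, depth=2):
    """Yield items from `batches` while a background thread loads ahead."""
    buf = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def worker():
        for batch in batches:
            buf.put(batch)   # blocks while the buffer is full
        buf.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is sentinel:
            return
        yield item

def load_batches(n):
    """Hypothetical loader standing in for reading mini-batches from storage."""
    for i in range(n):
        yield [i] * 4

# The "training step" here is just a sum; loading runs ahead in the background.
results = [sum(batch) for batch in prefetch(load_batches(5))]
print(results)
```

A buffer depth of 2 mirrors the common practice of keeping the current and the next mini-batch ready. A production version would also need to forward exceptions raised in the worker thread.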

The amount of RAM, by contrast, is very important. It should be no less than the amount of GPU memory. With plenty of RAM you avoid a lot of problems, save time, and can be more productive by tackling more complex problems.

Photo: UnknownNet Photography / CC

As for storing large amounts of data, the picture is fairly clear: most likely you will keep part of it on a hard disk or SSD, part in RAM, and part (two mini-batches) in GPU memory. For example, if you work with a large set of 32-bit data, you definitely need an SSD, since hard drives at 100-150 MB/s are too slow to keep up with the GPU. Many people buy SSDs just for convenience, but for deep learning an SSD is necessary only if the input data is high-dimensional and cannot be compressed effectively.
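Whether a hard drive can keep up comes down to simple arithmetic: the read rate you need is the number of samples consumed per second times the size of one sample. In the sketch below only the 100-150 MB/s hard drive range comes from the text; the consumption rate of 500 samples per second and the 224x224x3 sample shape are assumptions for illustration.

```python
HDD_MB_PER_S = (100, 150)  # hard drive range quoted in the text

def required_mb_per_s(samples_per_s, values_per_sample, bytes_per_value=4):
    """Read rate needed to feed uncompressed 32-bit samples, in MB/s."""
    return samples_per_s * values_per_sample * bytes_per_value / 1e6

# Assumed workload: the GPU consumes 500 samples/s of 224x224x3 values.
need = required_mb_per_s(samples_per_s=500, values_per_sample=224 * 224 * 3)
hdd_keeps_up = need <= HDD_MB_PER_S[1]

print(f"required: {need:.0f} MB/s, hard drive keeps up: {hdd_keeps_up}")
```

With small or well-compressed inputs the same calculation can easily come out in the hard drive's favor, which is why an SSD is not always required.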

Moving on to cooling. Poor cooling can hurt performance more than poor hardware does. During operation, modern GPUs raise their clock frequency and power consumption to the maximum, but as soon as the GPU temperature reaches 80 °C, the clock speed is reduced so that the temperature threshold is not exceeded. This approach gives the best performance while keeping the card from overheating.

In deep learning, the temperature threshold is reached within a few seconds of the algorithm starting. The easiest and cheapest way around this limit is therefore to flash BIOS firmware with new, more sensible fan settings that keep both temperature and noise at an acceptable level.
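To see whether your cards are running into the threshold, you can poll their temperatures during training. The sketch below assumes NVIDIA GPUs with the `nvidia-smi` tool installed; the 80 °C limit is the figure mentioned above, and actual limits vary by card. The helper names are made up for this example.

```python
import subprocess

THROTTLE_C = 80  # threshold mentioned in the text; actual limits vary by card

def parse_temperatures(csv_text):
    """Parse 'index, temperature' lines into a {gpu_index: temp_c} dict."""
    temps = {}
    for line in csv_text.strip().splitlines():
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps

def hot_gpus(temps, limit=THROTTLE_C):
    """Return the indices of GPUs at or above the limit."""
    return sorted(i for i, t in temps.items() if t >= limit)

def read_temperatures():
    """Query nvidia-smi; requires an NVIDIA driver to be installed."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temperatures(out)

# Example with canned output, so the sketch also runs without a GPU.
sample = "0, 62\n1, 81\n"
print(hot_gpus(parse_temperatures(sample)))
```

On a machine with NVIDIA GPUs, `hot_gpus(read_temperatures())` would report the cards that are currently at or above the limit.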

Water cooling is also an option. It noticeably lowers the temperature of a single GPU even at maximum load, so the temperature threshold is never reached. Maintenance does not take much effort and should not be a problem. The only question is money.

The motherboard must have enough PCIe slots to attach all your GPUs. PCIe 2.0 is sufficient for a single GPU, but PCIe 3.0 offers a better price/performance ratio. Choosing a motherboard is generally simple: pick one that supports the hardware you need.
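For a sense of scale, the theoretical one-direction bandwidth of a x16 slot can be computed from the per-lane signalling rate and the line encoding of each PCIe generation. The mini-batch size used for the transfer-time estimate is an assumed value, and real-world throughput is lower than these peak figures.

```python
# Per-lane signalling rate (transfers/s) and line-code efficiency per generation.
PCIE = {
    "2.0": (5e9, 8 / 10),      # 5 GT/s, 8b/10b encoding
    "3.0": (8e9, 128 / 130),   # 8 GT/s, 128b/130b encoding
}

def pcie_bytes_per_s(gen, lanes=16):
    """Theoretical peak bandwidth in one direction, in bytes per second."""
    rate, efficiency = PCIE[gen]
    return rate * efficiency / 8 * lanes  # divide by 8: bits -> bytes

# Assumed mini-batch: 128 images of 224x224x3 32-bit values.
batch_bytes = 128 * 224 * 224 * 3 * 4

for gen in PCIE:
    bw = pcie_bytes_per_s(gen)
    ms = 1000.0 * batch_bytes / bw
    print(f"PCIe {gen} x16: {bw / 1e9:.2f} GB/s, mini-batch in at least {ms:.1f} ms")
```

Keep in mind that on many boards the number of lanes per slot drops when several GPUs are installed, so the `lanes` argument is worth checking against the motherboard manual.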

Large monitors that let you work productively and keep an eye on everything are among the best investments. The difference between one monitor and three is huge, so this is worth considering.

Source: https://habr.com/ru/post/265167/
