AMD video adapters can be used for more than their intended purpose (games and graphics work). Everyone knows that OpenCL can accelerate general-purpose computing on GPUs, and today we will talk about the security issues that come with such impressive computing power.

One of the most popular graphics cards for GPGPU is the AMD R9 280X. At a modest price of about 220-230 dollars, it offers three gigabytes of memory and 2048 stream processors based on the GCN v1.0 architecture, which adds up to about 3.4 TFLOPS in single precision and about 870 GFLOPS in double precision. Performance figures may vary slightly depending on the vendor's version of the card and the clock frequencies "wired" into the BIOS.
For comparison, one scandalously famous video card (the one with "3.5 GB of the 4 GB declared") costs a hundred dollars more. It shows an impressive 3.8-4 TFLOPS for 32-bit floating-point numbers, but a laughable ~120-130 GFLOPS for FP64.
Let's go back to GPGPU. Perhaps one video card will not be enough for your task, and you will install two, three, or even four, since modern motherboards and power supplies allow it. What if that is still not enough? This is where the killer feature of OpenCL technology, Virtual OpenCL, comes onto the scene: it lets you combine many accelerators installed in several computers into one high-performance cluster.
Virtual OpenCL
VCL is available for free and works with any hardware that supports the OpenCL 1.0 or 1.1 standard. It lets you combine various devices into one computing network and make its power available to any application that can work with OpenCL.
As an example of this technology in use, I would like to talk about a monstrous password-cracking farm consisting of 25 AMD GPUs.
Password cracker
Brute-forcing a password is often limited by the computing power of the machine performing the search. Even setting aside the various protections against brute-force attacks (such as a CAPTCHA or deleting / encrypting information after the nth failed login attempt), searching for a standard 8-character password can take a long time. Using only lowercase Latin letters, you have to go through 26^8 (208 827 064 576) options, and if you add digits, special characters and both letter cases, the number of possible combinations exceeds 72^8 (722 204 136 308 736). Generating 720 trillion passwords is perhaps not that difficult, but of course no one stores the passwords themselves in plaintext; their hashes are stored instead.
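The two figures above are plain powers, so they are easy to check. The sketch below assumes the 72-character alphabet is made up of 26 lowercase letters, 26 uppercase letters, 10 digits and 10 special characters; that breakdown and the function name `keyspace` are illustrative assumptions, not taken from the original.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


# Lowercase Latin letters only.
lowercase_only = keyspace(26, 8)

# Assumed mix: 26 lowercase + 26 uppercase + 10 digits + 10 specials = 72.
mixed = keyspace(26 + 26 + 10 + 10, 8)

print(f"{lowercase_only:,}")  # 208,827,064,576
print(f"{mixed:,}")           # 722,204,136,308,736
```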
Calculating hash collisions (searching for values whose hash matches the desired one) is a much more resource-intensive task than it might seem at first glance: it is a special case of this problem that the participants of the Bitcoin network solve. Habr user mark_ablov wrote a terrific article on mining Bitcoins with pen and paper, in which he examined every stage of the calculation in detail and showed how vulnerable Bitcoin is to the hardware capabilities of powerful clusters.
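To make the idea of "searching for a value whose hash matches the desired one" concrete, here is a minimal CPU-only sketch. It uses MD5 from Python's standard `hashlib` purely as an illustration; the function name `brute_force` and the three-letter target are assumptions made for the example, and a real cracker such as Hashcat does the same work on GPUs at vastly higher speed.

```python
import hashlib
import itertools
import string


def brute_force(target_hex, alphabet, max_len):
    """Try every string up to max_len characters and return the first
    one whose MD5 digest equals target_hex, or None if none matches."""
    for length in range(1, max_len + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            word = "".join(candidate)
            if hashlib.md5(word.encode()).hexdigest() == target_hex:
                return word
    return None


# The attacker only sees the hash, never the password itself.
target = hashlib.md5(b"gpu").hexdigest()
found = brute_force(target, string.ascii_lowercase, 3)
print(found)
```

Even this toy search has to hash up to 26 + 26^2 + 26^3 candidates; at eight characters and a 72-character alphabet the same loop would be hopeless on a single CPU core.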
Modern passwords are stored in a form that cannot easily be "solved" by building a special-purpose chip, so hardware that works crudely but efficiently takes over, delivering a huge number of FP32 / FP64 operations per second. This is where AMD technology, OpenCL and VCL farms come in handy.
Back when Bitcoin was "mined" on video cards, enthusiasts built special farms from a large number of accelerators.
A couple of years ago, roughly the same kind of hardware, split across several server cases, was shown at a computer security conference in Oslo. In experienced hands, there are quite a few ways to use such a box.
The GPU cluster runs on Linux, and the video cards are combined by VCL, which presents all of them to the host as one large system for executing OpenCL tasks.
The farm can compute up to 350 billion password hash guesses per second using the NTLM algorithm, which has been used in Microsoft Windows since the days of Windows Server 2003. To go through every eight-character password (the most popular length among both ordinary users and in the corporate segment) containing Latin letters in both cases, digits and special characters, this monster needs only about five and a half hours.
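The quoted run time can be sanity-checked with simple arithmetic. The sketch below is an estimate under assumptions: the function name is illustrative, and the 95-character alphabet (all printable ASCII characters) is an assumption that happens to reproduce a figure of about five and a half hours; with the 72-character alphabet used earlier in the text, the same rate would exhaust the keyspace in roughly half an hour.

```python
def hours_to_exhaust(alphabet_size, length, guesses_per_second):
    """Worst-case time, in hours, to try every password of the given length."""
    return alphabet_size ** length / guesses_per_second / 3600


NTLM_RATE = 350e9  # guesses per second, as quoted for the 25-GPU farm

# Assumption: all 95 printable ASCII characters.
hours_95 = hours_to_exhaust(95, 8, NTLM_RATE)

# The 72-character alphabet used earlier in the article.
minutes_72 = hours_to_exhaust(72, 8, NTLM_RATE) * 60

print(f"95 chars: {hours_95:.1f} h")      # about 5.3 h
print(f"72 chars: {minutes_72:.0f} min")  # about 34 min
```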
As you may have guessed, the main difficulty for an attacker comes from the exponent, that is, the password length. Adding one character increases the search space by almost two orders of magnitude: 72^9 = 51 998 697 814 228 992 against 72^8 = 722 204 136 308 736. In this case, one extra character increases the number of variants 72 times. Put simply, the longer and more complex your password is, the harder it is to find by brute force.
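The effect of one extra character can be reproduced directly. This is a small sketch using the 72-character alphabet from the text; the variable names are illustrative.

```python
import math

ALPHABET = 72  # alphabet size used in the text

# Keyspace for a few password lengths.
for length in range(8, 11):
    print(length, f"{ALPHABET ** length:,}")

# Each additional character multiplies the search space by the alphabet size.
ratio = ALPHABET ** 9 // ALPHABET ** 8
orders_of_magnitude = math.log10(ratio)

print(ratio)                          # 72
print(round(orders_of_magnitude, 2))  # 1.86
```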
Performance
The capabilities of such a GPU farm are really impressive, and it shows good results even on "hard" hashing algorithms: MD5 (180 billion guesses per second), SHA1 (63 billion guesses per second) and LM (20 billion guesses per second). For the so-called slow hash algorithms the results are also quite good: bcrypt (05) and sha512crypt reached 71,000 and 364,000 guesses per second, respectively.
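To see what these rates mean in practice, the sketch below converts a few of them into the worst-case time needed to exhaust the 72^8 keyspace used earlier. The rates are the ones quoted in the text; the function name, the choice of keyspace and the year length of 365.25 days are assumptions made for the illustration.

```python
KEYSPACE = 72 ** 8  # eight characters, 72-character alphabet

# Guesses per second quoted for the 25-GPU farm.
RATES = {
    "NTLM": 350e9,
    "MD5": 180e9,
    "SHA1": 63e9,
    "sha512crypt": 364_000,
    "bcrypt (05)": 71_000,
}


def seconds_to_exhaust(rate, keyspace=KEYSPACE):
    """Worst-case number of seconds to try every candidate at `rate`."""
    return keyspace / rate


SECONDS_PER_YEAR = 365.25 * 24 * 3600

for name, rate in RATES.items():
    seconds = seconds_to_exhaust(rate)
    if seconds < SECONDS_PER_YEAR:
        print(f"{name:12s} {seconds / 3600:8.1f} hours")
    else:
        print(f"{name:12s} {seconds / SECONDS_PER_YEAR:8.1f} years")
```

Under these assumptions the fast hashes fall within hours, while the slow ones would still take decades for a full search of the same keyspace.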
Optimization and scalability
The experiments with Password Cracker were conducted long ago, when VCL was still a fairly "raw" product. The collaboration between the author of this mega-farm and the creators of VCL led to an improved load balancer. A special script made it possible to improve how Hashcat works on VCL, so today you can run the code not on 25 but on at least 128 GPUs while maintaining linear performance growth.
In June 2012, Poul-Henning Kamp, author of the md5crypt() function, which is widely used in FreeBSD and Linux, asked the community to stop using his function. There was even an article about it on Habr. The author feared a situation in which an attacker could reach more than 1 million checks per second on hardware available in regular stores. Password Cracker on 25 GPUs exceeded Poul-Henning Kamp's fears 77 times, and the ability to scale it 5 or more times makes hashes even more vulnerable to brute force: if the "standard" eight-character password is now found in 6-8 hours, then on 128 GPUs such a search can be cut to about an hour.
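The scaling claim can be checked with a short calculation. The sketch below assumes perfectly linear scaling, as the text describes, and takes the md5crypt rate as 77 million checks per second (77 times the feared 1 million); the function name is illustrative.

```python
def scaled_hours(hours_on_base, base_gpus, target_gpus):
    """Search time after moving to target_gpus, assuming linear scaling."""
    return hours_on_base * base_gpus / target_gpus


BASE_GPUS, TARGET_GPUS = 25, 128

speedup = TARGET_GPUS / BASE_GPUS     # 5.12
low = scaled_hours(6, BASE_GPUS, TARGET_GPUS)
high = scaled_hours(8, BASE_GPUS, TARGET_GPUS)
md5crypt_rate_128 = 77e6 * speedup    # checks per second on 128 GPUs

print(f"speedup: {speedup:.2f}x")
print(f"6-8 hours becomes {low:.2f}-{high:.2f} hours")
print(f"md5crypt: {md5crypt_rate_128 / 1e6:.0f} million checks per second")
```

Under these assumptions a 6-8 hour search shrinks to somewhat more than an hour, which matches the rough estimate in the text.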
It won't affect me
Perhaps no one will ever hack your company, and at home you keep nothing valuable, compromising or important. But no one is immune from leaks at major firms: relatively recently, the LinkedIn social network "lost" six and a half million password hashes. If a farm built on AMD video accelerators (and not on professional hardware) were used to "solve" those passwords, about 90% of them could be processed in a reasonable time.
A long and complex password (provided it is used properly, is not stored in plaintext, and so on) is half the protection against such powerful computing systems. Of course, there are other approaches (like "salted" hashes), but it is not always possible to change an existing algorithm or a product already in use, whereas raising the minimum password length to 13 or 20 characters is as easy as it gets.
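For readers unfamiliar with "salted" hashes, here is a minimal sketch of the idea. The text does not name a specific scheme, so PBKDF2-HMAC-SHA256 from Python's standard library is swapped in as one readily available example; the function names, the 16-byte salt and the iteration count are assumptions chosen for illustration, not recommendations from the original.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; makes every guess slower


def hash_password(password, salt=None):
    """Return (salt, digest) using a fresh random salt unless one is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")

print(hash_a != hash_b)                                  # same password, different hashes
print(verify_password("correct horse", salt_a, hash_a))  # True
```

Because every stored hash has its own salt, an attacker cannot test one guess against all leaked hashes at once, and the iteration count lowers the number of guesses per second the farm can make.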