Hacker, hack yourself

We have come across many stories about how exploits caused irreparable damage to communities or even led to their collapse. When we started working on the Discourse project, we remembered the lessons we learned from these stories. We set a goal to create an open source program that would provide security to all its communities by default - even if there are thousands or millions of them.

At the same time, we also place great value on portability, that is, the ability to get your data into and out of Discourse at will. That is why Discourse, unlike other forum software, defaults to a Creative Commons license. Even an ordinary user on Discourse can freely export and download their posts directly from their profile page.

Forum owners can create backups and restore entire site databases directly from the admin panel in a web browser. Automatic backups are made weekly by default. I'm not considered the world's leading backup expert for nothing!

From many years of experience, we know that combining security with data portability is a challenge. Make no mistake, a download of the entire database is exactly what hackers will aim for as soon as they get a foothold in your system. For them it is, you could say, the grand prize.

To reduce this danger, we have gradually tightened the restrictions around backups in Discourse in several ways:

The administrator password must be at least 15 characters;
Administrative actions such as creating or downloading a backup are formally logged;
The keys for downloading a backup are single-use and are sent to the administrator's email address, to confirm that the person really has access to that address.

Information security is all about defense in depth, so all of these hardening steps help ... but we still need to assume that some Internet villains have gotten hold of your database. And then what? Well, first let's look at what is actually in it.

1. Cookies

Cookies are what allow the browser to identify the user. They are usually stored as hashes rather than as the actual cookie values, so even with access to the hash it is impossible to impersonate the user. Moreover, many modern web frameworks cycle cookies rapidly, so in any case they stay valid for no more than 10-15 minutes.

2. Email addresses

Of course, users have reason to be concerned if their email address falls into the hands of hackers, but in reality very few people treat their address as anything particularly secret these days.

3. All posts and topic contents

For the sake of argument, let us assume that the site is completely open to the public and nothing secret is posted on it. In that case we do not have to worry about leaks of trade secrets or other sensitive information (at least for now), because all the posts were public anyway. Otherwise, the situation would deserve a separate post; maybe someday I'll write it.

4. Password hashes

All that remains is password hashes. And this ... this is a really serious problem.

When a hacker gains access to your database, they can crack password hashes with large-scale offline attacks, using the full power of any cloud service they can afford. And having cracked a hash, they can log in as the corresponding user for all eternity ... well, or at least until that user changes the password.

That is why, if you find out (or even just suspect) that your database has fallen into the wrong hands, the first thing you should do is reset the passwords of all users.

And what if you do not know? Should you be proactive and reset all passwords every 30 days, following the example of the worst IT departments of the world's mega-corporations? That badly damages the user experience and brings pathologies of its own. In reality, you will most likely not learn about the breach until it is too late to do anything about it. So it is crucial to slow the hackers down and buy yourself time to react and deal with the situation.

So you have, in fact, only one way to protect users, and everything depends on how resistant the password hashes stored in the database are to attack. That resistance is determined by two factors:

Hashing algorithm. It should be as slow as possible and, ideally, designed to be slow on GPUs in particular. You will see why in about five paragraphs.
Work factor, or number of iterations. The higher the better; the only limit is the risk of a DoS attack.

I have read guidelines saying that the overall work factor should be set high enough that hashing a password takes at least 8 milliseconds on the target platform. As it turns out, Sam Saffron, one of my Discourse co-founders, made the right call back in 2013 when he chose, on NIST's recommendation, PBKDF2-HMAC-SHA256 with 64 thousand iterations. We measured, and the process does indeed take about 8 milliseconds using our current Ruby login code on our current servers (fairly high-end ones, Skylake 4.0 GHz).
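
A measurement like this can be sketched in a few lines. The snippet below is not Discourse's Ruby login code; it is a minimal Python illustration that times PBKDF2-HMAC-SHA256 through the standard library's `hashlib.pbkdf2_hmac`. The function names, the sample password and the set of iteration counts are assumptions made for the example, and the timings will differ from machine to machine.

```python
import hashlib
import os
import time


def hash_password(password: str, salt: bytes, iterations: int = 64_000) -> bytes:
    """Derive a 32-byte PBKDF2-HMAC-SHA256 hash of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def time_hash(iterations: int, runs: int = 5) -> float:
    """Return the average time in milliseconds for one hash at this work factor."""
    salt = os.urandom(16)  # random salt, as a real system would generate per user
    start = time.perf_counter()
    for _ in range(runs):
        hash_password("correct horse battery staple", salt, iterations)
    elapsed = time.perf_counter() - start
    return elapsed / runs * 1000.0


if __name__ == "__main__":
    # Illustrative work factors; only 64,000 comes from the article.
    for iterations in (1_000, 64_000, 256_000):
        print(f"{iterations:>7} iterations: {time_hash(iterations):.1f} ms per hash")
```

The cost grows roughly linearly with the iteration count, which is why the work factor is the main knob for slowing an attacker down, and also why setting it too high invites DoS attacks on the login endpoint.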

But that was four years ago. How secure are the password hashes in our database today? And in four years? In ten? We are building software for the long haul, and we want to be sure we are making the right decisions to keep everyone safe down the road. So, in keeping with the principle of "designing for evil", the time has come to put on Darth Vader's helmet and play the villain: let's crack our own hashes!

For this we will use the most powerful GPU available today, the GTX 1080 Ti. For reference: on a GTX 1080, PBKDF2-HMAC-SHA256 reaches 1180 kH/s, and on a 1080 Ti, 1640 kH/s. In just one generation of video cards, the ability to attack hashes grew by almost 40%. Think about those numbers.

To begin with, a small hello-world test to make sure everything works. I downloaded hashcat, went to the demo version of our site at try.discourse.org and created a new account there with the password 0234567890. I checked the database and saw that the following values had been generated for the new user in the hash and salt columns:

hash
93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=
salt
ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=

Hashcat requires the following input file format: for each hash, one line is allocated, which contains the hash type, the number of iterations, salt and hash (encoded in base64); all this is separated by colons:

type iter salt hash
sha256:64000:ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=:93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=
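
For a table with many rows, lines in this format could be produced by a small script. The sketch below is only an illustration of the colon-separated layout described above; the helper names `hashcat_line` and `parse_hashcat_line` are made up for this example and are not part of hashcat or Discourse.

```python
import base64


def hashcat_line(salt_b64: str, hash_b64: str,
                 iterations: int = 64_000, digest: str = "sha256") -> str:
    """Join the four fields with colons: type, iterations, salt, hash."""
    return f"{digest}:{iterations}:{salt_b64}:{hash_b64}"


def parse_hashcat_line(line: str):
    """Split a line back into its fields and decode the base64 parts to bytes."""
    digest, iterations, salt_b64, hash_b64 = line.strip().split(":")
    return digest, int(iterations), base64.b64decode(salt_b64), base64.b64decode(hash_b64)


if __name__ == "__main__":
    # The salt and hash values shown above for the test account.
    line = hashcat_line(
        "ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=",
        "93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=",
    )
    print(line)
    digest, iterations, salt, pw_hash = parse_hashcat_line(line)
    print(digest, iterations, len(salt), len(pw_hash))
```

Splitting on colons is safe here because the base64 alphabet does not contain a colon.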

Let's feed this file to hashcat and see what happens.

Note that we deliberately set a tiny amount of work: only three characters need to be guessed. Of course, it didn't take long! See the password at the very end? We did it.

sha256:64000:ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=:93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=:0234567890

Now we are convinced that everything works, and we can get down to business. Let's go from simple to complex. How long will it take to brute-force the simplest possible Discourse password, consisting of 8 digits, which is only 10^8, that is, one hundred million different combinations?

Hash.Type........: PBKDF2-HMAC-SHA256
Time.Estimated...: Fri Jun 02 00:15:37 2017 (1 hour, 0 mins)
Guess.Mask.......: ?d?d?d?d?d?d?d?d [8]

Even with the most powerful GPU, the result was ... well, so-so. Do not forget: in this test we are working with a single hash, so an hour is needed for each row in the table (that is, for each user). And the bad news for the attacker does not end there: for quite some time now, Discourse has not allowed eight-character passwords. How long will it take for longer digit-only passwords?

?d?d?d?d?d?d?d?d?d [9]
Fri Jun 02 10:34:42 2017 (11 hours, 18 mins)

?d?d?d?d?d?d?d?d?d?d [10]
Tue Jun 06 17:25:19 2017 (4 days, 18 hours)

?d?d?d?d?d?d?d?d?d?d?d [11]
Mon Jul 17 23:26:06 2017 (46 days, 0 hours)

?d?d?d?d?d?d?d?d?d?d?d?d [12]
Tue Jul 31 23:58:30 2018 (1 year, 60 days)
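
These estimates can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below assumes a constant guess rate derived from the first test (10^8 candidates in about one hour) and an exhaustive search of the whole keyspace; the function names are invented for the example, and hashcat's own estimates differ somewhat because its measured speed is not exactly constant.

```python
# Assumed rate: 10^8 guesses in roughly one hour, as in the 8-digit test above.
GUESSES_PER_SECOND = 10**8 / 3600


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords for a mask of `length` symbols."""
    return alphabet_size ** length


def estimate_seconds(alphabet_size: int, length: int,
                     guesses_per_second: float = GUESSES_PER_SECOND) -> float:
    """Time to try every candidate at a constant guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


if __name__ == "__main__":
    for length in range(8, 13):
        hours = estimate_seconds(10, length) / 3600
        print(f"{length} digits: {hours:,.0f} hours ({hours / 24:,.1f} days)")
```

Each extra digit multiplies the keyspace, and therefore the time, by ten, which matches the overall pattern in the hashcat output above.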

But digit-only passwords are child's play! What if we take real passwords that contain at least lowercase letters, or even a combination of digits, uppercase letters and lowercase letters?

Guess.Mask.......: ?l?l?l?l?l?l?l?l [8]
Time.Estimated...: Mon Sep 04 10:06:00 2017 (94 days, 10 hours)

Guess.Mask.......: ?1?1?1?1?1?1?1?1 [8] (-1 = ?l?u?d)
Time.Estimated...: Sun Aug 02 09:29:48 2020 (3 years, 61 days)

The brute-force approach of "let's try every possible combination of letters and numbers" no longer looks like such a brilliant idea, even with a top-end GPU. But what if we cut the time by a factor of eight simply by putting eight video cards in one machine? That is well within the budget of a small company or a wealthy individual. Unfortunately, when the processing time is 38 months, an eightfold reduction doesn't change much. Let's instead imagine an attack by a nation state with enough money to throw thousands (cutting the time to 1.1 days) or even tens of thousands (cutting it to 2.7 hours) of GPUs at the task. In that case ... yes. Even with a minimum password length of ten characters, you are in big trouble.
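
The scaling figures follow from simple division. The sketch below assumes perfectly linear scaling across GPUs (an idealization) and takes the single-GPU estimate of 3 years, 61 days from the hashcat output above; the 1.1-day and 2.7-hour figures correspond to roughly 1,000 and 10,000 cards. The names are illustrative only.

```python
# Single-GPU estimate from the hashcat output: 3 years, 61 days.
SINGLE_GPU_DAYS = 3 * 365 + 61  # 1156 days, about 38 months


def days_with_gpus(gpu_count: int, single_gpu_days: float = SINGLE_GPU_DAYS) -> float:
    """Estimated days to finish, assuming the work splits evenly across GPUs."""
    return single_gpu_days / gpu_count


if __name__ == "__main__":
    for gpus in (1, 8, 1_000, 10_000):
        days = days_with_gpus(gpus)
        print(f"{gpus:>6} GPUs: {days:10.2f} days ({days * 24:10.1f} hours)")
```

With eight cards the attack still takes about 144 days, which is why the jump to thousands of GPUs is what really changes the picture.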

If we want Discourse to stand up to nation-state attacks, we need to come up with something. Hashcat has a convenient benchmark mode; there is a list of the strongest hashes it knows (that is, the ones that take the most time), benchmarked on a rig with 8 Nvidia GTX 1080 GPUs. Of the ones I recognized, bcrypt, scrypt and PBKDF2-HMAC-SHA512 stand out.

The results of this small hashcat experiment convinced me that we had not made any gross mistakes in how we store hashes in the database. But I wanted to be absolutely sure, so I hired a security researcher and penetration tester who (after signing a non-disclosure agreement) tried to crack the password hashes of two live and very popular Discourse forums.

“I was provided with two sets of password hashes from two different Discourse communities, containing 5,909 and 6,088 hashes respectively. Both used the PBKDF2-HMAC-SHA256 algorithm with a work factor of 64k. Using hashcat, my machine with an Nvidia GTX 1080 Ti GPU computed hashes at a rate of ~27,000 per second.

The same list of requirements applies to the passwords of all communities on Discourse:

User password must be at least 10 characters long.
The administrator password must be at least 15 characters long.
Users can choose between two login methods: entering a username and password, or signing in through third-party services (Google, Facebook, Twitter, and so on). If they choose the second option, the system automatically generates a strong 32-character password for the account. It is impossible to tell whether a password was set by a human or generated automatically.

Using lists of popular passwords and masks, in three weeks I managed to crack 39 of the 11,997 hashes: 25 from the ████████ community and 14 from the ████████ community.”

This security researcher regularly conducts audits of this kind, so instead of plain brute force he used wordlists together with effective patterns and masks that he knew from previous password-cracking work. As a result, he recovered the following list of passwords (one turned out to be a duplicate):

007007bond
123password
1qaz2wsx3e
A3eilm2s2y
Alexander12
alexander18
belladonna2
Charlie123
Chocolate1
christopher8
Elizabeth1
Enterprise01
Freedom123
greengrass123
hellothere01
I123456789
Iamawesome
khristopher
l1ghthouse
l3tm3innow
Neversaynever
password1235
pittsburgh1
Playstation2
Playstation3
Qwerty1234
Qwertyuiop1
qwertyuiop1234567890
Spartan117
springfield0
Starcraft2
strawberry1
Summertime
Testing123
testing1234
thecakeisalie02
Thirteen13
Welcome123
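
Many of these passwords follow the same recipe: a common word or name, optionally capitalized, followed by a few digits. The sketch below shows how a combined wordlist-and-mask attack might generate such candidates. It is purely an illustration and not the researcher's actual tooling or rules; the function name, the three-digit suffix limit and the sample words are all assumptions.

```python
from itertools import product
import string


def hybrid_candidates(words, max_digits=3, min_length=10):
    """Yield each word, plain and capitalized, followed by 0..max_digits digits.

    Candidates shorter than min_length are skipped, mirroring the
    10-character minimum for user passwords.
    """
    for word in words:
        for variant in (word, word.capitalize()):
            for n in range(max_digits + 1):
                for digits in product(string.digits, repeat=n):
                    candidate = variant + "".join(digits)
                    if len(candidate) >= min_length:
                        yield candidate


if __name__ == "__main__":
    candidates = set(hybrid_candidates(["charlie", "strawberry", "welcome"]))
    print(len(candidates))
    print("Charlie123" in candidates, "strawberry1" in candidates)
```

A few thousand candidates per dictionary word is a tiny search space compared with exhaustive brute force, which is why passwords built this way fall first even when the hash is slow.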

If you invest 8 times more resources and double the time spent, it is conceivable that a highly motivated attacker, or one with a good set of wordlists and masks, could eventually recover 39 x 16 = 624 passwords, that is, about 5% of all users. That is acceptable, but higher than I would like. We are committed to adding hash types to the table of hashes in future versions of Discourse, so that we can switch to a more secure (in other words, slower) password hashing scheme within a year or two.

This exercise gave me a deeper understanding of what can happen in the worst-case scenario: a database leak combined with a professional attack on the password hashes. Now I can recommend our software more confidently and vouch for the quality of the work our engineers have done to make it safe for everyone. So if you, like me, have doubts about whether everything is as secure as it should be, it is time to test that assumption. Do not wait until you are hacked - hacker, hack yourself!

Source: https://habr.com/ru/post/330302/
