
Unique Carnegie Mellon University Password Database Study



A recent study of the Carnegie Mellon University password database revealed several interesting correlations between users' demographic characteristics and the strength of the passwords they choose. What makes the study unique is that every account examined belongs to a staff member or student of Carnegie Mellon University. The passwords protected access to important data and functions on the university site, so the length and complexity requirements at registration were fairly strict. The university database stored detailed personal data on every user, and the authentication server logs recorded password entry speed along with successful and failed login attempts. In total, about 40,000 passwords from active and disabled accounts were examined.

Ordinary password databases that surface after leaks from hacked sites contain a lot of garbage, such as throwaway accounts of casual visitors with passwords like “12345” or “password”. They also hold very little information about the users, usually only a login or an email address.

As the graph above shows, the strongest passwords belonged, as expected, to the staff and students of the computer science and engineering departments. Next came the humanities and the arts. Those who studied politics and business had the most vulnerable passwords. The probability of guessing a computer science student's password is 45% lower than for a business school student's password. Perhaps it is no accident that the antagonism between sysadmins and accountants so often features in jokes and anecdotes. Another pattern: men's passwords were 8% stronger than women's.

Along with demographic and professional factors, the properties of the passwords themselves were also examined. Everyone knows that a password is better the longer it is and the more digits, uppercase letters, and special characters it contains. Now, however, it is clear how much better. Adding one lowercase letter to a password reduces the probability of guessing it to 70% (taking the probability of guessing the original password as 100%). Special characters and uppercase letters reduce it to 56% and 46%, respectively. The position of special characters and digits within the password also matters a great deal. An uppercase letter at the start of the password offers little advantage. Digits and special characters at the end do not work very well either. It is best when they are spread throughout the password. These patterns are clearly visible in the study's diagrams.
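As a rough illustration of how these figures compound, the sketch below treats each added character as a multiplier on the guessing probability. It reads the 56% and 46% figures as applying to special characters and uppercase letters respectively, and it assumes the factors multiply independently; the article states neither of these outright. The function and constant names are hypothetical.

```python
# Multipliers taken from the article: each added character of a class scales
# the guessing probability to this fraction of its previous value.
FACTORS = {
    "lower": 0.70,   # one extra lowercase letter
    "symbol": 0.56,  # one extra special character (assumed reading)
    "upper": 0.46,   # one extra uppercase letter (assumed reading)
}


def relative_guess_probability(added):
    """Return the guessing probability relative to the original password (1.0).

    `added` maps a character class to the number of extra characters.
    Assumption: the per-character factors multiply independently.
    """
    result = 1.0
    for char_class, count in added.items():
        result *= FACTORS[char_class] ** count
    return result


# Two extra lowercase letters: 0.70 * 0.70, roughly 49% of the original.
print(f"{relative_guess_probability({'lower': 2}):.0%}")
# One lowercase and one uppercase letter: 0.70 * 0.46, roughly 32%.
print(f"{relative_guess_probability({'lower': 1, 'upper': 1}):.0%}")
```

Under these assumptions a single uppercase letter helps about as much as two lowercase letters, which is consistent with the ordering of the figures above.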



Of particular interest is how the data was collected and processed. How did the plaintext password database and the users' personal data fall into the researchers' hands? Strictly speaking, they did not. The study was conducted with the assistance of the university's security service under fairly strict conditions. For historical reasons, the passwords of the university server's users were stored not as hashes but as encrypted records, with the encryption key held by the security service. The researchers arranged to conduct the study before the university switched to a more modern scheme based on salted hashes.
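For readers unfamiliar with the difference, here is a minimal sketch of salted password hashing using only the Python standard library. It is not the university's actual scheme, which the article does not describe; the iteration count and function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # hypothetical work factor, tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for storage; the password itself is never stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each account
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("12345", salt, stored))  # False
```

Unlike an encrypted record, a salted hash cannot be turned back into the password by anyone holding a key, which is why a study like this one is only possible on a reversible store.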

The password database was decrypted on a separate computer that was not connected to any network, and only a few security officers had physical access to it. Passwords were kept only in RAM, with swap disabled. The researchers had to write scripts that produced statistical indicators without ever seeing the data itself. Every line of their code, as well as all output, was carefully checked to ensure that no sensitive data left the system. Debugging scripts this way was very difficult and slow.

Once the work was finished, all the original data was thoroughly destroyed. The statistics themselves were chosen so that nobody could identify a narrow group of users with particularly weak passwords and mount a targeted attack against their accounts. This is why, in some cases, small groups were merged into an anonymous “other” category.
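The merging of small groups can be sketched as follows. The threshold, the function name, and the sample counts are all made up for illustration; the article does not say what cutoff the researchers used.

```python
from collections import Counter

MIN_GROUP_SIZE = 50  # hypothetical threshold, not taken from the study


def merge_small_groups(counts, min_size=MIN_GROUP_SIZE):
    """Fold every group smaller than min_size into a single 'other' bucket."""
    merged = Counter()
    for group, n in counts.items():
        key = group if n >= min_size else "other"
        merged[key] += n
    return dict(merged)


# Made-up department counts, for illustration only.
sample = {"computer science": 900, "business": 700, "music": 12, "drama": 8}
print(merge_small_groups(sample))
# {'computer science': 900, 'business': 700, 'other': 20}
```

Note that in this toy example the merged bucket is still below the threshold, so a real pipeline would also need to decide what to do when “other” itself remains small.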

Source: https://habr.com/ru/post/201556/

