
A little bit about the privacy of real Git repositories.


Introduction


Hello, dear readers. On today's agenda is a small test of the top ≈100 thousand most popular sites on the Internet (ranked by Alexa Rank traffic statistics). The test is narrowly focused: for each site we check whether a Git repository exists and can be read without authentication, directly from the web, at the site's own URL. As a reminder, such a security hole often lets an attacker read the current source code on the server, obtain sensitive information (config files, the structure of the system, and so on) and, later, gain some level of access to the server. A paradise for villains of every kind :)
I ran exactly the same test for myself about 100 days ago. Today we will repeat it, see what has changed, and discuss what to do about it.
Naturally, we will reuse the list of sites from the first test.
Those who are interested, read on.

* All information described in the article is provided solely for research and informational purposes.

What's happening?


First, keep in mind that the number of sites is far from small, so checking them by hand is out of the question. The solution is to write an automated utility for the check.
In practice, a sufficient condition is very simple:
We consider a Git repository open and accessible from the web without authorization if its config file can be read at http(s)://sitename.com/.git/config (amusingly, this file sometimes also contains credentials for the Git server, but we do not need those here).
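The condition above can be sketched in Python roughly as follows. This is a minimal illustration, not the author's utility: the function names, the 5-second timeout and the "[core] section" heuristic are all assumptions made for the example. Checking the body rather than only the HTTP status helps with sites that answer 200 with an HTML page for every URL.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, Optional


def config_url(host: str, scheme: str = "http") -> str:
    """Build the URL of the repository config for a host."""
    return f"{scheme}://{host}/.git/config"


def looks_like_git_config(body: str) -> bool:
    """Heuristic (an assumption of this sketch): a real .git/config is
    INI-like and contains a [core] section, and it is not an HTML page."""
    if "<html" in body.lower():
        return False
    return re.search(r"^\s*\[core\]", body, re.MULTILINE) is not None


def http_fetch(url: str, timeout: float = 5.0) -> Optional[str]:
    """Return the first 4 KiB of the response body, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            return resp.read(4096).decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError, ValueError):
        return None


def is_repo_exposed(host: str,
                    fetch: Callable[[str], Optional[str]] = http_fetch) -> bool:
    """True if /.git/config is readable and looks like a Git config."""
    body = fetch(config_url(host))
    return body is not None and looks_like_git_config(body)
```

The `fetch` parameter is injectable so the logic can be exercised without network access; only run the real check against sites you own or are authorized to test.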
The key point is that many developers block directory listing for /.git/ but forget to block access to the files and directories inside it. So if we can read the config, we can almost always read /.git/index as well (the list of all tracked files) and, from there, all the available sources: they are stored in /.git/objects/ and only need to be converted from blob objects back to the original files. To do this, you can use any git-dumper (for example, this one), or write your own.
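To make the "converting blob objects" step concrete, here is a small sketch of how a loose Git object is laid out: the file under /.git/objects/ is zlib-compressed and starts with a header of the form `<type> <size>` followed by a NUL byte. The helper names are hypothetical, and the sketch assumes loose objects with SHA-1 names; packed objects (pack files) are not handled.

```python
import hashlib
import zlib
from typing import Tuple


def object_path(sha1_hex: str) -> str:
    """Relative path of a loose object: the first two hex characters
    of the hash are the directory, the remaining 38 are the file name."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-character SHA-1 hex digest")
    return f".git/objects/{sha1_hex[:2]}/{sha1_hex[2:]}"


def decode_loose_object(raw: bytes) -> Tuple[str, bytes]:
    """Inflate a loose object and split its header from its content."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content


# Build a blob locally the way Git stores it, then decode it again.
original = b"<?php $db_password = 'example'; ?>\n"
stored = b"blob " + str(len(original)).encode("ascii") + b"\x00" + original
sha1_hex = hashlib.sha1(stored).hexdigest()   # the object's name
raw_object = zlib.compress(stored)            # the bytes on disk

obj_type, content = decode_loose_object(raw_object)
```

The example round-trips an object built in memory, so it needs neither a repository nor a network connection.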

Tests and analysis


Using this information, I wrote a utility (the main code can be viewed here) that performs the check described above. The results:
Test #1
Date: December 11, 2016
Number of sites tested: 99991
Open Git repositories: 639 (0.64% of the total)

Test #2
Date: March 21, 2017
Number of sites tested: 99991 (the same list of sites as the first time)
Open Git repositories: 599 (0.60% of the total)

Notably, the utility ran for only about 16 minutes on a home laptop (with a 20 Mb/s Internet connection), which is not much.
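A run of about 16 minutes over 99991 sites works out to roughly a hundred checks per second, which suggests the requests were issued concurrently. The article does not say how the utility achieves this, so the sketch below is only one possible approach, using a thread pool; `scan`, `required_rate` and the default of 200 workers are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def required_rate(sites: int, minutes: float) -> float:
    """Average number of sites that must be checked per second."""
    return sites / (minutes * 60)


def scan(hosts: Iterable[str],
         check: Callable[[str], bool],
         workers: int = 200) -> List[str]:
    """Run check(host) in a thread pool and return the hosts it flags.

    The check is I/O-bound (one HTTP request per host), so threads spend
    most of their time waiting and a large pool is reasonable.
    """
    host_list = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with host_list.
        flags = list(pool.map(check, host_list))
    return [host for host, flagged in zip(host_list, flags) if flagged]


rate = required_rate(99991, 16)  # about 104 sites per second
```

Here `check` is any function that takes a host name and returns True for an exposed repository; it should handle its own network errors so that one failing host does not abort the whole scan.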
So, over 100 days the number of "open" repositories in my sample dropped by 40, which is about 6% of the initial number.
Have many developers come to their senses? It does not look like it: at this pace, the problems in this sample would only be fixed after 4-5 years.
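The figures above can be reproduced with a few lines of arithmetic. The only assumption added here is that repositories keep being closed at the same linear pace, which is a simplification used just to get an order of magnitude.

```python
from datetime import date

TESTED = 99991
OPEN_FIRST = 639    # December 11, 2016
OPEN_SECOND = 599   # March 21, 2017

days_between = (date(2017, 3, 21) - date(2016, 12, 11)).days  # 100

share_first = 100 * OPEN_FIRST / TESTED     # about 0.64 %
share_second = 100 * OPEN_SECOND / TESTED   # about 0.60 %

closed = OPEN_FIRST - OPEN_SECOND           # 40 repositories
closed_share = 100 * closed / OPEN_FIRST    # about 6.3 % of the initial number

# Linear extrapolation: how long until the remaining repositories are closed?
closed_per_day = closed / days_between
years_left = OPEN_SECOND / closed_per_day / 365   # about 4.1 years

print(f"{days_between} days, {closed} closed ({closed_share:.1f}%), "
      f"about {years_left:.1f} years left at this pace")
```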
Overall, of course, the percentage of open repositories is small. On the other hand, scale the sample up to, say, one million sites and that is already thousands of sites with the same hole: about 6,000 at the measured rate of 0.6%, and probably closer to 10,000 in practice.
Keep in mind that these are the most popular sites by Alexa Rank, which means they ought to be well protected. Presumably, the further down the list you go, the more often open repositories turn up.
The sites found include some with a very large audience (more than 1 million unique visitors per day), as well as resources of various educational institutions (including leading Russian universities, some of which train specialists in web security) and organizations. Even the site of a well-known archiver suffers from this.
(* Image: example of recovered source code. Note the SQL dumps and the authorization list.)

Attack vector variant


To help readers appreciate how dangerous this oversight is, let's sketch one possible scenario for compromising a server:
  1. An open Git repository is found.
  2. The site's source code is obtained by dumping the repository.
  3. Among the sources there are files with names like config.php / database.php.
  4. These files contain sensitive information, namely the credentials for the site's database (say, MySQL).
  5. phpMyAdmin is also found on the site, and the attacker connects to the database with the credentials from the previous steps.
  6. A login/password pair for the admin panel is found in the database (probably after cracking the password hash).
  7. Through the admin panel, malicious code is executed on the server, a shell is uploaded, and so on.
  8. The outcome can be anything; it depends on the attacker's goals and on the access obtained.

And this is not even the easiest scenario, since it requires a certain combination of circumstances.
Sadly, developers often like to keep database backups right in the repository. Then all that remains is to download the backup, extract the sensitive information, and use it against the server.
Also, by studying the source code you can find other vulnerabilities (for example, SQL injection) or paths to executable files that allow the resource to be administered. Or you can simply "leak" all the available sources. There are plenty of options.

The most interesting part is that almost the entire process (from collecting site URLs to obtaining the source code) can be automated. Moreover, such solutions already exist, and attackers are successfully monetizing your resources.

Naturally, I am not publishing the lists of checked sites or the result files. If you wish, you can take a list of top sites and test them yourself.

How to protect yourself?


To sum up, here are the main steps for keeping your Git repository private:
  1. Completely block access to the repository directory from the web (if it is exposed there), including reading of files and subdirectories (how to do this depends on the web server). Make sure that files such as /.git/config, /.git/index and others cannot be read. Even without these files, access to /.git/objects/ is enough to pull the source code by enumerating object addresses, so close everything off from prying eyes and hands.
  2. Use, for example, .gitignore to keep files with sensitive information (backups, configs, and the like) out of the repository. They are not needed there and pose a serious security threat. With this in place, even an attacker who gets into your repository will not be able to do anything significant (apart from seeing your bad coding style and walking off with your source code).
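As a rough self-check for the second step, the sketch below flags tracked files whose names suggest sensitive content. The pattern list and the function name are illustrative assumptions, not a complete policy; the input is meant to be the list of paths printed by `git ls-files` in your own repository.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List, Sequence

# Illustrative patterns only (an assumption of this sketch); extend as needed.
SENSITIVE_PATTERNS = (
    "config.php", "database.php", ".env",
    "*.sql", "*.sql.gz", "*.bak", "*.dump",
)


def find_sensitive(paths: Iterable[str],
                   patterns: Sequence[str] = SENSITIVE_PATTERNS) -> List[str]:
    """Return the paths whose file name matches any sensitive pattern.

    Matching is done on the lower-cased base name, so 'Dump.SQL'
    is caught by '*.sql'.
    """
    hits = []
    for path in paths:
        name = PurePosixPath(path).name.lower()
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            hits.append(path)
    return hits


tracked = ["index.php", "app/config.php", "backup/dump_2017.SQL", "README.md"]
flagged = find_sensitive(tracked)  # ['app/config.php', 'backup/dump_2017.SQL']
```

Note that .gitignore only affects files that are not yet tracked: a file that was already committed has to be removed from the index (for example with `git rm --cached`), and it still remains in the repository history.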

By following these rules you can avoid the situations described above. Do not forget about other kinds of attacks on web applications (and not only on them), but we will not cover those in this article, since they are not related to repositories.

Peter was with you.
Thanks for your attention.

Source: https://habr.com/ru/post/324530/

