Following up on the work that the company 2Parts and Anton Isaikin did half a year ago, we (aldonin and dasm32) decided to scan the 1,000,000 most popular sites on the modern Web, from google.com to wordpress.com.
We wrote the scanner in Perl. Its first version did not use threads at all. But after 3 days it had scanned only 25% of the sites, a paltry 250,000, so the question of improving performance became urgent :)
After a little but invaluable help from our comrades at perlmonks.org, multithreading was put to full use, and the rest of our list was checked in just one day.
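To illustrate why threads helped so much: 250,000 sites in 3 days is roughly one site per second, and almost all of that time is spent waiting on the network. The sketch below is not our Perl scanner; it is a minimal Python illustration of the same idea, and the names `fetch_status` and `scan`, the HEAD request, and the pool size of 50 are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def fetch_status(host, timeout=10):
    """Return the HTTP status for http://<host>/.svn/entries, or None on failure.

    Hypothetical helper: urlopen raises on 4xx/5xx responses, timeouts and
    connection errors, and all of those are reported as None here.
    """
    request = Request(f"http://{host}/.svn/entries", method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except Exception:
        # A scanner should survive any single bad host.
        return None


def scan(hosts, check=fetch_status, workers=50):
    """Run check(host) on a pool of worker threads and return {host: result}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() yields results in input order, so zip() pairs them correctly.
        return dict(zip(hosts, pool.map(check, hosts)))
```

Calling `scan(["example.com", "example.org"])` would probe both hosts concurrently. A 200 status alone does not prove that an entries file is exposed, since some servers answer 200 for every path, so the response body still has to be inspected.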
The results, of course, surprised us. We found about 4,500 sites (more precisely, 0.43%) with the above "vulnerability." The percentage was even slightly higher than the one reported by 2 Companions and Anton Isaikin. Among them were many large and popular portals and services, whose names and addresses we will not publish, following the principle set by the authors who first exposed this negligence on the part of many webmasters and administrators. During the whole scan, our server received only one angry letter, from a German site, which, by the way, complained that we “load” their web server :)) With one pathetic request. Oh well.
Although almost every home now has access to the global network, and IT community news has long been international, our foreign colleagues apparently had not heard of a danger that lets even an ordinary user get at the holy of holies: the inner workings of other people's web projects, large and small. We were rather surprised to find such carelessness even among experienced, “battle-hardened” creators of web services.
Clearly, for most Russian IT professionals, for example, reading information in a foreign language is not much of a problem, especially with so many electronic dictionaries available, yet not all of them were ready for such a “check” either: the scan found approximately 80 large projects in the .ru zone with “doors” open to their source code.
Statistics
As expected, the zone with the most SVN metadata open to curious eyes was .com, which accounts for a good half of the vulnerable sites. The distribution of sites by domain zone can be seen in this chart.

The sites were also analyzed by PR (PageRank, Google's well-known link-based ranking).

As it turned out later, McAfee SiteAdvisor detected malicious scripts on around 43x of the 4373 sites.
A bit about the "vulnerability"
With a specially crafted request in the browser, we can get a list of the project's files along with their owners and last-modified times, as well as the source code of the site's pages.

Although it does not always work :)

How an attacker uses the information obtained is known only to him.
Perhaps he will simply take the list of users and start guessing the password to the admin area of your site, armed with logins you may have used more than once. Maybe he will use access to the sources to fetch config files such as config.inc.php, where many popular content management systems like to store the database connection details, or just download the entire site and calmly search it for vulnerabilities on his own computer, without disturbing your server with suspicious requests. Or maybe he will take the source code and launch a clone of your service on the Web... Who knows what other ways of using these "goodies" one could come up with?
How do we protect ourselves from this? We will not engage in shameless copy-pasting and will simply refer the concerned reader to the pioneer post.
The necessary protection recommendations are at the bottom of it.
This does not apply to you? If you don’t use SVN on your website, then yes. Otherwise, try opening vash-site.ru/.svn/entries (with your own domain in place of vash-site.ru) and check that your entries file does not “shine” to the whole world.
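The same check can be scripted. The sketch below is only a rough illustration, not the code used for our scan: the function names are made up for the example, and the test for what "looks like" an entries file is a heuristic based on the assumption that older working copies store it either as XML containing a `wc-entries` element or as plain text that starts with a format number followed by a `dir` entry.

```python
from urllib.request import urlopen


def entries_url(site):
    """Build the URL to probe, e.g. 'example.com' -> 'http://example.com/.svn/entries'."""
    site = site.strip().rstrip("/")
    if "://" not in site:
        site = "http://" + site
    return site + "/.svn/entries"


def looks_like_svn_entries(body):
    """Heuristic (an assumption, not a full parser): does body resemble an entries file?"""
    text = body.decode("utf-8", errors="replace")
    if "<wc-entries" in text[:500]:
        # Assumed XML-style layout of the file.
        return True
    lines = [line.strip() for line in text.splitlines()]
    # Assumed plain-text layout: a format number first, then a 'dir' entry.
    return bool(lines) and lines[0].isdigit() and "dir" in lines[1:4]


def is_exposed(site, timeout=10):
    """Fetch the start of <site>/.svn/entries and apply the heuristic above."""
    try:
        with urlopen(entries_url(site), timeout=timeout) as response:
            return looks_like_svn_entries(response.read(4096))
    except Exception:
        # 404s, timeouts and refused connections all count as "not exposed".
        return False
```

Run `is_exposed("vash-site.ru")` with your own domain. A result of `True` means the file is most likely readable by anyone; `False` is not a guarantee of safety, because the heuristic may miss layouts it does not know about.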
Finally, we want to stress that we did not set out to “download the sources of sites”, so we did not retrieve a single source file, and the entries files were not saved either. We are now gradually drawing the attention of the owners of the scanned sites to their mistake. Because the notification process has only just begun, we, alas, will not show you examples of vulnerable sites. But if the owners agree, links can be provided on request.
Sincerely, aldonin and dasm32.