
Typos bring Google $500 million per year

It is all quite simple: so-called typosquatters register domains that are misspellings of popular site names in order to collect accidental traffic, and then place contextual advertising on them, usually Google AdWords. At the Financial Cryptography and Data Security conference, researchers from Harvard presented a study (PDF) in which they tried to estimate the size of this market. The authors also suggest that Google provides technical assistance to domainers and shares the profits with them.

According to their estimates, the Web contains at least 938,000 domains that are misspellings of the 3,264 largest sites in the .com zone (only names of at least five letters were considered). Each popular site has an average of 281 typo domains. Such "erroneous" domains make up about 1.16% of all domains in the .com zone.

A little about the research methodology. Typos were generated according to the Damerau-Levenshtein model: each substitution of a letter, omission of a letter, insertion of an extra letter, or transposition of two adjacent letters yields a word at a distance of one step from the original. For the study, the researchers generated a list of domains within two steps of the originals. In addition, web-specific typos were added (for example, the letters www prepended to the name of each site). For the 3,264 largest sites this produced 1,910,738 candidates. A random sample of 2,195 sites was then checked manually to determine the confidence percentage. Based on that check, the estimate of the number of typosquatting domains was reduced to 937,918.
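The four single-step edits described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function names `edits1` and `typo_candidates`, the choice of alphabet, and the handling of the "www" prefix are assumptions made for the example.

```python
import string
from typing import Set

# Assumed alphabet: characters allowed in a domain label.
ALPHABET = string.ascii_lowercase + string.digits + "-"


def edits1(label: str) -> Set[str]:
    """Return all strings at Damerau-Levenshtein distance 1 from label."""
    splits = [(label[:i], label[i:]) for i in range(len(label) + 1)]
    # A letter is missing.
    deletes = {a + b[1:] for a, b in splits if b}
    # Two adjacent letters are swapped.
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    # One letter is replaced by another.
    replaces = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    # An extra letter is typed.
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    variants = deletes | transposes | replaces | inserts
    variants.discard(label)  # some edits reproduce the original spelling
    # Drop empty labels and labels that start or end with a hyphen.
    return {v for v in variants if v and v[0] != "-" and v[-1] != "-"}


def typo_candidates(domain: str) -> Set[str]:
    """Generate distance-1 typo domains plus the 'www' prefix typo."""
    label, _, tld = domain.partition(".")
    candidates = {f"{v}.{tld}" for v in edits1(label)}
    candidates.add(f"www{label}.{tld}")  # e.g. a forgotten dot after www
    return candidates


if __name__ == "__main__":
    found = typo_candidates("sportsbook.com")
    print(len(found), "saportsbook.com" in found)
```

Candidates at distance two could be obtained by applying `edits1` again to every distance-1 result, at the cost of a much larger list.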

As part of the study, a crawler was launched that visited 284,914 domains from the list of presumed typosquatting domains. It turned out that contextual advertising was placed on 80% of the reachable sites, while the remaining 20% used a redirect.


The large share of blocked requests is explained by the fact that tens of thousands of typosquatting domains are hosted on the same servers, so the crawler's access was cut off by ordinary DDoS protection. The vast majority of these domains then open normally from other IP addresses. As for the "unclassified" domains, these are mainly websites that rely on JavaScript, which the crawler cannot process properly.

What kind of contextual advertising is placed on typosquatters' domains? In 36% of cases it is advertising for the original, correctly spelled site. The bulk of the rest consists of links to its competitors.

The researchers also identified 1,250 Google partner program identifiers under which ads are served on these domains. The identifier can be seen in the ad URL after the “client=” parameter. It turned out that some of these identifiers occur much more often than others.
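Reading the partner identifier out of an ad URL and counting how often each one occurs can be sketched as follows. This is a minimal illustration using Python's standard library; the helper names and the sample URLs on `ads.example` are hypothetical and do not come from the study.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlparse


def extract_client_id(ad_url: str) -> Optional[str]:
    """Return the value of the 'client' query parameter, or None if absent."""
    query = urlparse(ad_url).query
    values = parse_qs(query).get("client")
    return values[0] if values else None


def count_client_ids(ad_urls: Iterable[str]) -> Counter:
    """Count how many ad URLs carry each partner identifier."""
    ids = (extract_client_id(url) for url in ad_urls)
    return Counter(i for i in ids if i is not None)


if __name__ == "__main__":
    # Hypothetical ad URLs, invented for this example.
    sample = [
        "http://ads.example/afs?client=partner-a&q=hotels",
        "http://ads.example/afs?q=cars&client=partner-b",
        "http://ads.example/afs?client=partner-a",
        "http://ads.example/page",
    ]
    for client_id, hits in count_client_ids(sample).most_common():
        print(client_id, hits)
```

Sorting the counts with `most_common()` is what makes it visible that a handful of identifiers account for most of the ad placements.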



Google's five largest partners cover 63% of the market, and the top 10 cover 76%.

Among affiliate programs, the most popular are Commission Junction (905 domains from the sample), LinkShare (652), and Performics (Google Affiliate Network, 290).

As for redirects, 75 legitimate websites were identified that collect traffic from typosquatting domains. For example, the image hosting service Pict.com receives traffic from 128 domains that are misspellings of its competitors' names. Likewise, the well-known casino Bet365.com collects traffic from domains that are misspellings of the competitor name Sportsbook (saportsbook.com, sxportsbook.com, and another 326 variants).

via New Scientist

Source: https://habr.com/ru/post/84675/

