
How to view 20 million domain names and be satisfied

Friends, welcome! Below is the story of how 20 million domain names were analyzed and what came of it. The results can be viewed by downloading the CSV file or by restoring the database dump in PostgreSQL.




If you wish, you can play around with the source here or directly with containers, using


docker-compose.yml

version: "2"
services:
  app:
    image: danieljust/domain-finder-v1
    tty: true
    ports:
      - "3000:3000"
  rabbit:
    image: rabbitmq:3
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example
      POSTGRES_USER: postgres
      POSTGRES_DB: postgres

Instructions can also be found on GitHub.
Enjoy reading!


Disclaimer!


Nothing you see or read in this article is a call to, or promotion of, domaining, let alone cybersquatting. Everything was done out of curiosity and, as they say, “for fun”.


Introduction


Many companies that want to rebrand, or that are just entering the wider market, want to pick a beautiful domain.
Out of curiosity, it was decided to treat short domains of 1-3 characters as the beautiful ones.


Terminology



The structure of the table in the database


id  sldlength  tld    domain   price   roubleprice  available  definitive
1   1          actor  1.actor  20,000  1199520      True       True

id - record identifier
sldlength - length of second-level domain
tld - top level domain
domain - the actual domain name
price - price in dollars
roubleprice - price in rubles
available - flag indicating domain availability
definitive - a flag indicating whether the availability flag was verified with the registry

Result


Along the way, some interesting domain names turned up; they are listed in the table below.


domain      roubleprice
2.pizza     47981
0.fail      23991
a.xyz       1199520
ab.xyz      299880
ad.money    11876
as.mba      2400
as.guru     11996
at.network  23991
js.army     47981

2.pizza - perfect for a new pizzeria;
0.fail - for something super-reliable;
a.xyz, ab.xyz - for those who want to be closer to Google;
ad.money - for advertising space;
as.guru, as.mba - for consulting firms;
at.network - for companies involved in network administration;
js.army - proletarians of all countries, unite.

Most two-character domains that were still free come with a steep price.
Among two-character names in country top-level domains, four free domains were found (all in the Czech zone), each for the modest sum of 1,000 rubles.
Among three-character names in country top-level domains, many more were free and affordably priced.
Free names in generic top-level domains outnumber those in country domains many times over (country domains make up only 4% of the total number of free domain names).




The path to the results


Stage 1. Start


The set of possible SLD characters was taken to be -1234567890abcdefghijklmnopqrstuvwxyz (37 characters in total).
The number of strings of length $k$ over an alphabet of $n$ characters is the number of arrangements with repetition, $n^k$.
For lengths 1 to 3 this gives a total of $37 + 37^2 + 37^3 = 52059$ options.
Since an SLD cannot begin or end with a hyphen, we exclude such cases and get $36 + 36^2 + 36 \cdot 37 \cdot 36 = 49284$.
But this is only the beginning.
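
The counts above can be checked by brute force. The following is a minimal Python sketch, not the project's own code; the function names are made up for illustration, and only the alphabet and the hyphen rule come from the text.

```python
from itertools import product

# The 37-character alphabet from the text: hyphen, digits, lowercase letters.
ALPHABET = "-1234567890abcdefghijklmnopqrstuvwxyz"


def all_strings(max_len=3):
    """Yield every string of length 1..max_len over ALPHABET."""
    for length in range(1, max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)


def is_valid_sld(s):
    """An SLD may not begin or end with a hyphen."""
    return not (s.startswith("-") or s.endswith("-"))


total = sum(1 for _ in all_strings())
valid = [s for s in all_strings() if is_valid_sld(s)]
print(total, len(valid))  # 52059 49284
```

Enumerating rather than using the closed formula also produces the actual list of candidate SLDs needed for the later stages.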


Stage 2. Selecting an API


Many sites let you check through a web interface whether a given domain is taken.
Manual entry is not feasible for a task of this size, so an API is needed.
During the search, the following options were encountered and rejected:


  1. provide us with your passport data, and we will give you access to the API;
  2. pay us once (from 5 to 15 dollars) and get lifetime access to the API;
  3. pay for access to the API once a month;
  4. pay about $0.01 for each API request.

But the soul wanted to contribute something useful to the open-source world, and as free of charge as possible.
The solution was the GoDaddy API.


Its advantages:


  1. free;
  2. allows you to process up to 500 domains in one request;
  3. has API documentation.

Its cons:


  1. limited number of requests per minute;
  2. the responses from the server do not always match what the UI offers.

For example, an API response may say that a domain is taken and cannot be bought, while the same domain name is available for purchase through the UI.


How can you be confident that a domain is available?


Correspondence with technical support revealed that a final availability check is carried out when the purchase of the selected domain is confirmed.
Judging by observations, the definitive flag makes it possible to conclude that a domain name is taken.


Stage 3. Choice of tools and solution preparation


Using the GoDaddy API, you can get the list of TLDs in which domain names can be purchased.
Only TLDs consisting of a single word were kept (*.com.ru and the like were removed), which left 400 TLDs. Simple arithmetic gives $49284 \cdot 400 = 19713600$ domains to check.
The GoDaddy API can handle up to 500 domains in one request, but limits the number of requests per minute.
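
As a rough illustration of this step, here is a Python sketch of the TLD filtering and the resulting workload. The helper name and the sample TLD list are hypothetical; only the figures 49284, 400 and 500 come from the text.

```python
import math


def single_label_tlds(tlds):
    """Keep TLDs made of a single label, dropping entries such as 'com.ru'."""
    return [t for t in tlds if "." not in t.strip(".")]


# Hypothetical sample, not the real list returned by the API.
sample = ["com", "com.ru", "pizza", "org.uk", "xyz"]
print(single_label_tlds(sample))  # ['com', 'pizza', 'xyz']

SLD_COUNT = 49284   # valid SLDs of 1-3 characters
TLD_COUNT = 400     # single-word TLDs that were kept
BATCH_SIZE = 500    # maximum domains per API request

domains = SLD_COUNT * TLD_COUNT
requests_needed = math.ceil(domains / BATCH_SIZE)
print(domains, requests_needed)  # 19713600 39428
```

So even with full batches of 500, checking everything takes roughly 39 thousand requests, which is why the per-minute request limit shapes the design.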
In accordance with the above, the algorithm of the program was as follows:


  1. split all the domains to be checked into chunks of 5,000 domains;
  2. put the chunks into a RabbitMQ queue;
  3. take one chunk from the queue;
  4. split it into batches of 500 domains and send 10 requests;
  5. process the responses and store information about free domains in the database;
  6. wait 20 seconds;
  7. if there are messages left in the queue, repeat steps 3-6.
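
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's code: an in-memory `queue.Queue` stands in for RabbitMQ, `check_batch` is a hypothetical wrapper around the availability API, and `save` is a hypothetical function that writes one row to the database.

```python
import queue
import time

CHUNK_SIZE = 5000    # domains per queue message (step 1)
BATCH_SIZE = 500     # domains per API request (step 4)
PAUSE_SECONDS = 20   # pause after each chunk (step 6)


def chunked(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run(domains, check_batch, save, sleep=time.sleep):
    """Run steps 1-7; check_batch and save are hypothetical callables."""
    work = queue.Queue()                           # stand-in for RabbitMQ
    for chunk in chunked(domains, CHUNK_SIZE):     # steps 1-2
        work.put(chunk)

    while not work.empty():                        # step 7
        chunk = work.get()                         # step 3
        for batch in chunked(chunk, BATCH_SIZE):   # step 4: up to 10 requests
            for result in check_batch(batch):      # step 5
                if result["available"]:
                    save(result)
        sleep(PAUSE_SECONDS)                       # step 6


# Demo with a fake checker and no real waiting.
def fake_check(batch):
    return [{"domain": d, "available": d.startswith("a")} for d in batch]


free = []
run(["a.xyz", "b.xyz", "ab.xyz"], fake_check, free.append, sleep=lambda s: None)
print([r["domain"] for r in free])  # ['a.xyz', 'ab.xyz']
```

Passing the checker, the storage function and the sleep function as parameters keeps the loop testable without network access; a real run would replace them with an HTTP client, a database insert and `time.sleep`.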

For convenience, PostgreSQL and RabbitMQ were run as Docker containers.


Stage 4. Data analysis


Once the script had finished, the next task was to extract something interesting and useful from the collected data.
The data is available in domains.sql and domains.csv.
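
For readers who want to explore the CSV themselves, here is a small Python sketch. It assumes, without having verified the file, that domains.csv has a header row with the column names from the table structure described earlier, integer prices, and booleans written as True/False. The inline sample uses values from the tables in this article.

```python
import csv
import io


def load_rows(fileobj):
    """Read rows and convert the numeric and boolean columns."""
    rows = []
    for row in csv.DictReader(fileobj):
        row["sldlength"] = int(row["sldlength"])
        row["roubleprice"] = int(row["roubleprice"])
        row["available"] = row["available"] == "True"
        rows.append(row)
    return rows


def cheapest(rows, n=5):
    """Return the n rows with the lowest rouble price."""
    return sorted(rows, key=lambda r: r["roubleprice"])[:n]


def most_expensive(rows, n=5):
    """Return the n rows with the highest rouble price."""
    return sorted(rows, key=lambda r: r["roubleprice"], reverse=True)[:n]


# Inline sample standing in for open("domains.csv", newline="").
sample = io.StringIO(
    "sldlength,tld,domain,roubleprice,available\n"
    "1,xyz,a.xyz,1199520,True\n"
    "2,mba,as.mba,2400,True\n"
    "1,pizza,2.pizza,47981,True\n"
)
rows = load_rows(sample)
print([r["domain"] for r in cheapest(rows, 2)])  # ['as.mba', '2.pizza']
print(most_expensive(rows, 1)[0]["domain"])      # a.xyz
```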


Below, filtering means looking up the found SLDs in the list of the most frequent English letter combinations taken from this source.
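
A minimal Python sketch of such filtering is given below. The list of letter combinations here is a tiny made-up subset for illustration, not the list from the source mentioned above, and the function names are hypothetical.

```python
def split_domain(domain):
    """Split 'ab.xyz' into ('ab', 'xyz')."""
    sld, _, tld = domain.partition(".")
    return sld, tld


def filter_common(domains, common_combinations):
    """Keep domains whose SLD appears in the list of common combinations."""
    common = {c.lower() for c in common_combinations}
    return [d for d in domains if split_domain(d)[0].lower() in common]


# Illustrative subset only; the real list is much longer.
COMMON = ["th", "he", "in", "at", "as", "the", "and"]

found = ["at.network", "js.army", "as.guru", "xt1.company"]
print(filter_common(found, COMMON))  # ['at.network', 'as.guru']
```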


image
image


From the pair of graphs above, we can conclude that the number of free domain names containing commonly used combinations of letters of the English alphabet tends to zero.


The five most expensive domain names


domain     roubleprice
ads.cloud  11,906,200
vod.cloud  11,852,400
usa.cloud  11,852,400
seo.cloud  11,852,400
vip.cloud  11,852,400

Top 5 Cheapest Domain Names


domain        roubleprice
xt1.company   590
xt1.casa      590
xsz.company   590
xt1.click     590
xt1.business  590

Conclusion


That's all, folks!


This trip across the Internet turned up many fun domains. Most importantly, new companies should not despair: interesting domain names are still free, you just have to spot them.



Source: https://habr.com/ru/post/334992/

