
This post fits best in the "Spam and antispam" hub, but overall it still belongs in the off-topic hub "I am promoting".
Recently, I had to collect the email addresses of schools in order to invite them to take part in interscholastic competitions. Habr has had several posts about collecting email addresses from commercial and non-commercial sites, but I have not seen a single truly effective and civilized automatic or semi-automatic method, although the need comes up from time to time. 99% of the tools are email generators and "databases for sale", or buggy desktop software that you don't want to use.
A lyrical digression. The line between spam and antispam is a very thin one, so I will give a definition right away: a civilized (or delicate) method is one that respects the people who will receive the mailing. Compiling the list by hand is the best option, but the pace of modern life forces us to automate everything we can, because the task of any mailing is to inform a large number of people in the shortest possible time.
A couple of weeks ago, I was approached by the developer of the spider-post.com service, which solves this problem. He suggested that I test the resource and post a review on Habr. I agreed, because the topic interests me and I had not found any similar tools. I would be glad to see links to other services in the comments.
I will pass all your questions on to the service's developers. Their answers will appear in the comments.
A compromise solution to the task of collecting email addresses looked like this to me:
- select sites related to your business according to certain criteria;
- collect the email addresses from them;
- check the addresses for validity;
- manually remove the addresses that look doubtful;
- send a trial mailing with an offer to subscribe on an ongoing basis.
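As a rough illustration of the "collect" and "check for validity" steps above, here is a minimal Python sketch. It is my own assumption of how such steps could look, not a description of how Spider Post works: the function names `extract_emails` and `looks_valid` are hypothetical, and the check is purely syntactic (it does not verify that a mailbox actually exists).

```python
import re

# Deliberately simple pattern; it does not cover the full RFC 5322 grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique, lower-cased addresses in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            result.append(address)
    return result


def looks_valid(address):
    """Cheap syntactic check; anything it rejects goes to manual review."""
    local, _, domain = address.partition("@")
    if not local or not domain:
        return False
    # Consecutive dots or a dot at either end of the local part are suspicious.
    if ".." in address or local.startswith(".") or local.endswith("."):
        return False
    labels = domain.split(".")
    if len(labels) < 2:
        return False
    if any(not label or label.startswith("-") or label.endswith("-")
           for label in labels):
        return False
    # Require an alphabetic top-level domain of at least two letters.
    tld = labels[-1]
    return tld.isalpha() and len(tld) >= 2


if __name__ == "__main__":
    page = "Write to Info@School1.example or info@school1.example."
    for address in extract_emails(page):
        print(address, looks_valid(address))
```

Addresses that fail `looks_valid` are not necessarily wrong; in this sketch they are simply the ones I would look at by hand first.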
Spider Post uses a similar approach.

- you choose a region and set up lists of key phrases that describe your business;
- the service uses search engines to select sites matching the specified parameters and collects lists of email addresses. The finished list is available within a few hours. According to the developer, the service analyzes what comes after the "@" and checks whether the site and the address are live, how old the resource is, and whether it is commercial;
- after that, the lists can be downloaded and cleaned manually (the report also includes the website addresses, which makes cleaning more efficient).
I tested the service on several topics I know something about (high schools, phosphors, perimeter security). The results and conclusions are below.
Screenshot of the completed order page:

Detailed results information:

The impression is mixed.
- I used highly specialized queries to minimize the chance of garbage getting into the final list.
- In every case, the resulting email database contained tens of thousands of addresses, a large number of them "strange-looking".
- Combing through such a document by hand is simply unrealistic, and most B2B markets can hardly boast that many participants or, accordingly, that many email addresses.
Some suggestions for the developers. Features I would like to see added:
- The ability to use the query language of search engines, which would make it possible to narrow the set of sites selected.
- Collection of additional information: besides the site address, its category and description from Yandex.Catalog or from the search results.
- The ability to specify which sites addresses should be collected from (for example, for my school task there are federal resources).
These simple additions would reduce the share of garbage in the lists and simplify further cleaning.
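Until the service offers something like the last suggestion, a similar effect can be approximated when cleaning the downloaded report. Below is a minimal Python sketch under my own assumptions: the report is treated as a list of `(site, email)` pairs (the real report format may differ), and the names `host_of`, `is_allowed` and `filter_records` are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(site):
    """Extract the host name from a site address, with or without a scheme."""
    # urlsplit only fills in the hostname when a scheme or "//" is present.
    if "//" not in site:
        site = "//" + site
    return (urlsplit(site).hostname or "").lower()


def is_allowed(site, allowed_domains):
    """True if the site's host is an allowed domain or one of its subdomains."""
    host = host_of(site)
    return any(host == domain or host.endswith("." + domain)
               for domain in allowed_domains)


def filter_records(records, allowed_domains):
    """Keep only (site, email) pairs whose site is on the allow-list."""
    allowed = [domain.lower().lstrip(".") for domain in allowed_domains]
    return [(site, email) for site, email in records
            if is_allowed(site, allowed)]


if __name__ == "__main__":
    report = [
        ("http://school1.edu.example/contacts", "info@school1.edu.example"),
        ("shop.example", "sales@shop.example"),
    ]
    print(filter_records(report, ["edu.example"]))
```

Matching on the host rather than on the address after the "@" is a deliberate choice in this sketch: a school may well publish a mailbox on a public mail service, and it is the site that tells you the address is relevant.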