
Reading Habr by mail

Everyone reads Habr, and so do I. But one day the proxy blocked access to it, so I decided to build something that would still let me read Habr articles.
The result is a service running on a home computer that checks Habr for new articles, sends their titles by mail, and sends the full articles on request. All communication with the service goes through mail. Details are under the cut.

The service runs on a home computer. It starts four tasks, each repeated at a fixed interval:
  1. (Every 15 minutes) Checking the hubs for new articles. The numbers and titles of new articles are stored in the database.
  2. (Every 5 minutes) Sending subscribers the numbers and titles of new articles that have not been sent yet.
  3. (Every 2 minutes) Fetching requests from the test mailbox. Each request is saved in the database.
  4. (Every 2 minutes) Processing the requests and sending the full article to the sender's address.
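
The four-task schedule above can be sketched roughly as follows. This is only an illustration, not the service's actual code: the names `PeriodicTask`, `run_due`, `make_tasks` and `serve_forever` are hypothetical, and the task bodies are placeholders that just record their own name.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PeriodicTask:
    name: str
    interval_s: int                 # how often the task repeats, in seconds
    action: Callable[[], None]      # placeholder for the real work
    next_run: float = 0.0           # time at which the task is due next


def run_due(tasks: List[PeriodicTask], now: float) -> List[str]:
    """Run every task that is due at `now`; return the names of tasks that ran."""
    ran = []
    for task in tasks:
        if now >= task.next_run:
            task.action()
            task.next_run = now + task.interval_s
            ran.append(task.name)
    return ran


def make_tasks(log: List[str]) -> List[PeriodicTask]:
    """Build the four tasks with the intervals from the article (bodies are stubs)."""
    return [
        PeriodicTask("check_hubs", 15 * 60, lambda: log.append("check_hubs")),
        PeriodicTask("send_titles", 5 * 60, lambda: log.append("send_titles")),
        PeriodicTask("fetch_mail", 2 * 60, lambda: log.append("fetch_mail")),
        PeriodicTask("process_requests", 2 * 60, lambda: log.append("process_requests")),
    ]


def serve_forever(tasks: List[PeriodicTask]) -> None:
    """Poll once per second and run whatever is due (never returns)."""
    start = time.monotonic()
    while True:
        run_due(tasks, time.monotonic() - start)
        time.sleep(1)
```

Calling `serve_forever(make_tasks([]))` would start the loop. Keeping the "what is due" decision in `run_due` separate from the sleeping loop makes the schedule easy to check without waiting for real time to pass.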

More about each task:
  1. There are a lot of hubs, and downloading the articles for all of them takes a very long time, so a small list of only 47 hubs was selected for testing. During the first task, the service downloads the article list pages of these hubs, parses them, and puts new articles in the database (not the article itself, only its title and number).
  2. During the second task, the service selects from the database the numbers and titles of new articles (those that have not been sent yet) and sends them to all subscribers.
  3. During the third task, new letters are downloaded from a dedicated mailbox, habrpost@mail.ru. The subject and the sender of each letter are stored in the database.
  4. During the fourth task, the received letters are analyzed. Depending on the subject of the letter, the result of the task is one of the following:
    • The sender is subscribed to the newsletter with the titles of new articles
    • The sender is unsubscribed from the newsletter
    • The sender receives an article by mail. If the article has already been downloaded before, the saved copy is sent. If it has not been downloaded yet, the article's HTML is downloaded together with all its images and CSS styles. All of this is packed into an archive
    • The sender receives a freshly downloaded article by mail (the article is completely re-downloaded, even if it was downloaded before)
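
The difference between the last two outcomes (reuse a saved copy versus force a fresh download) can be sketched like this. It is a minimal sketch under assumptions: a plain dictionary stands in for the service's database, and `get_article_archive` and the `download` callback are hypothetical names, not the author's code.

```python
from typing import Callable, Dict


def get_article_archive(
    article_id: int,
    cache: Dict[int, bytes],
    download: Callable[[int], bytes],
    force_reload: bool = False,
) -> bytes:
    """Return the archive for an article, downloading it only when needed.

    `cache` maps article numbers to already built archives.
    `download` is assumed to fetch the HTML, images and CSS and pack them
    into an archive, returning the archive bytes.
    """
    if force_reload or article_id not in cache:
        # Not saved yet, or the sender explicitly asked for a fresh copy.
        cache[article_id] = download(article_id)
    return cache[article_id]
```

With this shape, a plain article request passes `force_reload=False` and a re-download request passes `force_reload=True`; both end by mailing the returned archive to the sender.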


And now the most interesting part. I would appreciate feedback. You can test the service. Here is how to send it commands:

Description:
- Write a new letter to habrpost@mail.ru;
- The command is written in the subject line;
- The command consists of digits only;

Command Description:
1 - subscribe to the newsletter (receive the titles of new articles by mail);
0 - unsubscribe;
123456 - get article No. 123456 by mail;
123456 1 - re-download article No. 123456 and receive it by mail;
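
For illustration, the subject-line commands listed above could be parsed along these lines. This is a sketch, not the service's real parser: `Command` and `parse_subject` are hypothetical names, and returning `None` for anything outside the four listed forms is an assumption, since the article does not say how invalid subjects are handled.

```python
import re
from typing import NamedTuple, Optional


class Command(NamedTuple):
    action: str                      # "subscribe", "unsubscribe", "get" or "reload"
    article_id: Optional[int] = None


# One number, optionally followed by a second number, digits and spaces only.
_SUBJECT_RE = re.compile(r"\s*([0-9]+)(?:\s+([0-9]+))?\s*")


def parse_subject(subject: str) -> Optional[Command]:
    """Turn a mail subject into a Command, or None if it is not a known command."""
    match = _SUBJECT_RE.fullmatch(subject)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    if second is None:
        if first == "1":
            return Command("subscribe")
        if first == "0":
            return Command("unsubscribe")
        return Command("get", int(first))
    if second == "1":
        return Command("reload", int(first))
    # Assumption: any other second number is treated as an invalid command.
    return None
```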
P.S. On October 10 I stopped the service in order to analyze how it had performed. The results can be found in another article.

Source: https://habr.com/ru/post/238795/

