How hard is it to build a full-fledged email marketing service? What do you need to plan for? What pitfalls await an inquisitive developer along the way?

Let's try to figure it out together. Over several articles I will describe how I built my own email-sending service in a year, what lessons I learned, and what I plan to do with it next.
I should say right away that this article covers only the technical side of the matter.
Briefly about myself
I have been writing Python for 5 years, mainly with Django and PostgreSQL, and I can write JavaScript at the jQuery + KnockoutJS level. In my spare time I freelance and work on my own Internet projects, one of which I am going to describe here. I have been working on this project for about a year.
Objective of the project
At the very beginning I set a fairly simple goal: to create a working solution for sending transactional emails and newsletters that could track opens, link clicks, delivery failures, and spam complaints. I planned to use this solution in my other projects, because Yandex Mail for Domain (PDD), which I had used before, lacked these features, and I needed them.
At that point there was no thought of offering this solution as SaaS to everyone on the Internet.
Tasks
- Understand how event tracking works in email newsletters.
- Come up with a solution that works under moderate load (2-3 million emails per month). Why exactly 2-3 million? I believe a project like this needs that volume to pay for itself (time spent plus material resources such as servers).
- Implement a convenient interface for analyzing mass and transactional mailings.
Below I will describe in more or less detail how I approached each of these tasks.
Technology
I decided to use technologies I already know: Python, Django, PostgreSQL, KnockoutJS, LESS, and py.test.
Along the way I also got a good grasp of Celery and microservice architecture.
With that, I suggest we finish the introduction and move on to the most interesting part: practice.
How does email tracking work?

When you send mail, you probably want to know whether your emails reached the recipients, whether they were read at all, whether the recipients were interested, whether they clicked the links, and what they did on the site afterwards: whether they placed an order, called, and so on.
You can answer these questions only with tracking or with an analytics system such as Yandex.Metrica (or by asking your recipients in person).
Tracking email opens
Today the standard way to track opens is to embed a special pixel in the email. You can find this pixel in most of the promotional emails in your mailbox if you look at the message source. It might look something like this:
<img src="http://api.mailhandler.ru/message/track/<UNIQUE_EMAIL_ID>/OPENED/" width="1px" height="1px" border="0"/>
Clearly, when the URL in the image's src attribute is requested, an event should be recorded indicating that the email with id equal to UNIQUE_EMAIL_ID was opened.
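Adding such a pixel to an outgoing email can be sketched roughly as follows. This is only an illustration, not the service's actual code: the helper name `add_tracking_pixel` and its parameters are assumptions, and the URL layout simply mirrors the example above.

```python
import re
from html import escape

# Matches a closing </body> tag in any letter case.
BODY_CLOSE = re.compile(r'</body\s*>', re.IGNORECASE)


def add_tracking_pixel(html_body, api_url, unique_id):
    """Return html_body with a 1x1 tracking image added.

    Hypothetical helper: the URL layout mirrors the example above.
    """
    src = '%s/message/track/%s/OPENED/' % (api_url.rstrip('/'), unique_id)
    pixel = '<img src="%s" width="1" height="1" border="0"/>' % escape(src, quote=True)

    # Put the pixel just before the last </body> when the tag is present,
    # otherwise append it to the end of the markup.
    matches = list(BODY_CLOSE.finditer(html_body))
    if not matches:
        return html_body + pixel
    pos = matches[-1].start()
    return html_body[:pos] + pixel + html_body[pos:]
```

Placing the image at the very end of the body keeps it out of the visible layout of the email.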
However, it is not that simple. Very often the src points to some PHP script, and the author forgets that the mail service really does expect valid image headers, and the image itself, in response. If the mail service is disappointed in your pixel for this reason, it will simply cut it out of the email, and you will never know whether the recipient opened it.
To avoid this, you must send the correct response headers and return a valid image to the client. An implementation with Django REST Framework might look something like this:
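import os

from django.conf import settings
from rest_framework.response import Response
from rest_framework.views import APIView

# JPEGRenderer and BaseManager are the project's own classes (imports omitted).


class TrackMessageView(APIView):
    renderer_classes = [JPEGRenderer]

    @property
    def pixel(self):
        # Read the pixel bytes and close the file right away.
        path = os.path.join(settings.STATIC_ROOT, 'site/img/pixel.jpg')
        with open(path, 'rb') as f:
            return f.read()

    def get(self, request, *args, **kwargs):
        manager = BaseManager()
        message = manager.get_message_by_unique_id(self.kwargs['unique_id'])
        if message:
            # Record the "opened" event, then return the image itself.
            manager.track_message(message)
            return Response(self.pixel, status=201)
        return Response(status=404)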
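Outside of Django REST Framework, the same idea (record the open, then return a real image with honest headers) can be sketched as a plain WSGI application. This is a minimal sketch under assumptions: `make_pixel_app` and the `on_open` callback are hypothetical names, the path layout follows the pixel URL shown earlier, and the `Cache-Control` header is my own addition, not something the article prescribes.

```python
def make_pixel_app(pixel_bytes, on_open, content_type='image/jpeg'):
    """Build a WSGI app that records an open and serves the pixel.

    on_open is a hypothetical callback that receives the email id.
    """
    def app(environ, start_response):
        # Expected path: /message/track/<unique_id>/OPENED/
        parts = environ.get('PATH_INFO', '').strip('/').split('/')
        if len(parts) != 4 or parts[:2] != ['message', 'track'] or parts[3] != 'OPENED':
            start_response('404 Not Found', [('Content-Length', '0')])
            return [b'']

        on_open(parts[2])
        headers = [
            ('Content-Type', content_type),
            ('Content-Length', str(len(pixel_bytes))),
            # Assumption: forbid caching so that repeated opens reach the server.
            ('Cache-Control', 'no-store'),
        ]
        start_response('200 OK', headers)
        return [pixel_bytes]

    return app
```

The point is that the response declares an image content type and a length that match the bytes actually sent.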
Tracking link clicks

I think an inquisitive reader will have no trouble implementing this type of tracking. In short, each link in the email is replaced with a link to a special redirect service, which records a "link clicked" event. You can also add a unique identifier to each link, which lets you build a "heat map" of the email. This is a very useful feature, for example for A/B testing.
The implementation in Python looks quite simple:
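import re
from urllib.parse import quote

from django.conf import settings

REDIRECT_URL_TEMPLATE = '%s/message/redirect/%s/'
# Matches http(s) URLs inside href="..." or href='...'.
HREF_REGEXP = r'(?<=href=(\"|\'))(http|https)([^\"\']+)(?=(\"|\'))'

...

def replace_links(message):
    redirect_url = REDIRECT_URL_TEMPLATE % (settings.API_URL, message.unique_id)

    def to_redirect(match):
        # URL-encode the original link so that its own query string
        # survives inside the ?next= parameter.
        original = match.group(2) + match.group(3)
        return '%s?next=%s' % (redirect_url, quote(original, safe=''))

    message.html_body = re.sub(HREF_REGEXP, to_redirect, message.html_body)
    ...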
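The "heat map" idea mentioned above needs each link to carry its own identifier. A possible sketch, assuming a hypothetical `link` query parameter and a hypothetical helper name `replace_links_with_ids` (neither comes from the service itself), numbers the links in order of appearance:

```python
import re
from urllib.parse import quote

# http(s) URLs inside href="..." or href='...'
HREF_RE = re.compile(r'(?<=href=["\'])https?://[^"\']+(?=["\'])')


def replace_links_with_ids(html_body, redirect_url):
    """Rewrite every http(s) link and number it in order of appearance.

    Returns the new markup and a {link_id: original_url} mapping.
    """
    links = {}

    def to_redirect(match):
        link_id = len(links) + 1
        links[link_id] = match.group(0)
        # "&amp;" because the URL is written into an HTML attribute.
        return '%s?link=%d&amp;next=%s' % (
            redirect_url, link_id, quote(match.group(0), safe=''))

    return HREF_RE.sub(to_redirect, html_body), links
```

With the mapping stored next to the email, click events that arrive with a `link` value can be counted per position in the layout.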
Tracking undeliverable emails
This is where things get more interesting.
Every time a mail server cannot deliver your email, it sends a non-delivery report to the sender's address with a description of the reason (sometimes detailed, sometimes not). To process these incoming reports, I pipe incoming mail to a Python handler script via /etc/aliases.
An example fragment of such a report:
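A handler wired up this way receives the raw message on standard input. The sketch below shows only that entry point; the alias name, the script path, and the function names are assumptions made for illustration.

```python
#!/usr/bin/env python3
# Hypothetical /etc/aliases entry (run `newaliases` after editing):
#   bounces: "|/usr/local/bin/handle_bounce.py"
import email
import sys
from email import policy


def read_message(stream):
    """Parse one raw email from a binary stream such as sys.stdin.buffer."""
    return email.message_from_binary_file(stream, policy=policy.default)


def main(stream=None):
    """Read the piped message and return its sender and subject."""
    if stream is None:
        stream = sys.stdin.buffer
    message = read_message(stream)
    # A real handler would analyse the report here.
    return message['From'], message['Subject']
```

When installed as the alias target, the script would finish by calling `main()`.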
Final-Recipient: rfc822; ****@****.ru
Original-Recipient: rfc822; ****@****.ru
Action: failed
Status: 4.4.1
Diagnostic-Code: X-Postfix; connect to ****.ru[xx.xx.xxx.xxx]:25: Connection refused
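Because a fragment like this is a block of ordinary header lines, Python's standard email package can read it. A minimal sketch, assuming the fragment has already been extracted from the report (the function name `parse_delivery_report` and the returned keys are hypothetical):

```python
import email


def parse_delivery_report(fragment):
    """Return the per-recipient fields of a delivery report as a dict."""
    fields = email.message_from_string(fragment)
    recipient = fields.get('Final-Recipient', '')
    return {
        # "rfc822; user@example.ru" -> "user@example.ru"
        'recipient': recipient.split(';', 1)[-1].strip(),
        'action': fields.get('Action', '').strip().lower(),
        'status': fields.get('Status', '').strip(),
        'diagnostic': fields.get('Diagnostic-Code', '').strip(),
    }
```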
The script tries to work out, more or less intelligently, why the email could not be delivered, and creates either a Soft-Bounce event (the email cannot be delivered at the moment, but you can try again) or a Hard-Bounce event (the email will never be delivered, for example because the mailbox does not exist).
Here it is worth noting how you are actually expected to react to such events under the rules of mail services such as Mail.ru, Yandex, and others.
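The script's actual rules are not shown here. One common starting point is the first digit of the Status field: in enhanced status codes (RFC 3463), class 4 means a transient failure and class 5 a permanent one. The sketch below uses only that rule; the function name is hypothetical, and a real handler would probably also look at the diagnostic text.

```python
import re

# Enhanced status code: class.subject.detail, e.g. 4.4.1
STATUS_RE = re.compile(r'^([245])\.\d{1,3}\.\d{1,3}$')


def classify_bounce(status):
    """Map a status code such as '4.4.1' to 'soft', 'hard', or None."""
    match = STATUS_RE.match(status.strip())
    if not match:
        return None  # not a well-formed status code
    # Class 2 is success, so it is not a bounce at all.
    return {'2': None, '4': 'soft', '5': 'hard'}[match.group(1)]
```

For the report fragment shown earlier, status 4.4.1 would be classified as a soft bounce.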
«Subscription-based mailing services must unconditionally remove subscribers from their database, or take measures to suspend mailings to them, when an address generates the SMTP error "550 user not found" (keeping the recipient database valid is a necessary condition for maintaining a good sender reputation).»
Link to the list of rules
Thus, I had to provide a way to "switch off" subscribers whose addresses cannot receive mail. In the end I decided to disable such a subscriber in every subscriber list in the service.
Well, that seems to cover tracking.
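The "switch off everywhere" decision can be illustrated with a small in-memory sketch. The data layout and the function name are assumptions made for the example; the real service keeps subscribers in a database.

```python
def disable_everywhere(subscriber_lists, address):
    """Mark address as inactive in every list; return how many lists changed.

    subscriber_lists is a hypothetical in-memory stand-in for the database:
    {list_name: {email: is_active}}.
    """
    address = address.strip().lower()
    changed = 0
    for subscribers in subscriber_lists.values():
        # Only touch lists where the address is present and still active.
        if subscribers.get(address):
            subscribers[address] = False
            changed += 1
    return changed
```

Marking the subscriber inactive instead of deleting the record keeps the history and makes it possible to re-enable the address later.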
Some statistics
At the moment about 150,000 emails per month are sent through my service. Is that a lot or a little? Probably little, given the volume I set as a target in the tasks above.
Of them:
- 20% are opened (a fairly high rate, largely thanks to transactional mail)
- 13% get link clicks
- 9% result in a hard or soft bounce
P.S.
In the following articles I will describe how I process this data, go into the subtleties of using Celery in a project like this, and explain what I plan to do with the service next.
Thanks for your attention!