Today I want to talk about how we made it so that in Yandex.Mail you do not have to turn on the display of images in each email, as you do in many other services, and more generally about how we keep users safe when displaying the body of an email. This is not as obvious as it may seem.
Our users receive about one hundred million emails per day and read about thirty million of them through the web interface. This is a huge field of opportunity for attackers, so we have built multi-level protection.

Emails are scanned for spam, phishing, viruses, malicious content and links. Below, we describe the protection mechanisms that an email passes through before it is displayed in the user's web interface.
Spam Defense
More than 90% of all email on today's Internet is advertising or malicious spam. Spamooborona, our spam defense system, effectively blocks attempts to deliver spam, including unsolicited mailings that do not merely offer you something to buy but actively try to attack you or your computer. There are several attack vectors here. The most common are phishing and the distribution of malware: viruses, trojans, bots.
Every day Spamooborona stops 50-70 thousand phishing emails. Phishers both attack Yandex accounts directly and use Yandex addresses to try to steal accounts on other services. Interestingly, attacks on accounts in online games and game stores, such as World of Tanks or Steam, are very frequent.
We analyze the text of emails and also use technologies such as DMARC to identify and combat phishing emails.
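To make the DMARC step more concrete, here is a minimal sketch of how a receiver might turn a domain's published DMARC record plus SPF and DKIM results into a decision. It is not Yandex's implementation: the function names are hypothetical, the SPF and DKIM verdicts are assumed to be computed elsewhere, and the "organizational domain" is approximated as the last two labels, whereas a real implementation would use the Public Suffix List.

```python
def parse_dmarc_record(txt: str) -> dict[str, str]:
    """Parse a DMARC TXT record such as 'v=DMARC1; p=reject; adkim=s'."""
    tags: dict[str, str] = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def is_aligned(from_domain: str, auth_domain: str, strict: bool) -> bool:
    """Check identifier alignment between the From domain and an authenticated domain."""
    from_domain = from_domain.lower().rstrip(".")
    auth_domain = auth_domain.lower().rstrip(".")
    if strict:
        return from_domain == auth_domain

    # Assumption: naive organizational domain (last two labels only).
    def org(domain: str) -> str:
        return ".".join(domain.split(".")[-2:])

    return org(from_domain) == org(auth_domain)


def dmarc_disposition(record: str, from_domain: str,
                      spf_domain: str, spf_pass: bool,
                      dkim_domain: str, dkim_pass: bool) -> str:
    """Return 'pass', or the policy requested by the domain owner (none/quarantine/reject)."""
    tags = parse_dmarc_record(record)
    if tags.get("v") != "DMARC1":
        return "none"  # no usable policy published
    spf_ok = spf_pass and is_aligned(from_domain, spf_domain, tags.get("aspf", "r") == "s")
    dkim_ok = dkim_pass and is_aligned(from_domain, dkim_domain, tags.get("adkim", "r") == "s")
    if spf_ok or dkim_ok:
        return "pass"
    return tags.get("p", "none")
```

A message that claims to be from `bank.example` but only authenticates for an unrelated domain fails alignment, so the owner's `p=` policy applies.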
Antivirus
The next level of protection is the good old antivirus. In 2014, real live computer viruses in the form of executable files are sent by email fairly rarely: we see no more than a few thousand such emails per day. However, each such file is potentially very dangerous, so it is vital for us to keep our users' computers from becoming infected. For many years we have been helped in this by Dr. Web, whose server antivirus we run on a fairly large cluster of thirty machines, fully scanning all incoming email.
The second phase of antiphishing
Spamooborona filters emails when they arrive at our incoming mail servers (MXs). An important and understandable drawback of this mode is that sometimes a threat becomes widespread, and by the time we have good signatures to detect it, a certain number of such malicious emails have already landed in users' mailboxes. For such situations, we have another level of checks, performed each time a particular email is displayed in the Yandex.Mail web interface.
Immediately before an email is shown to the user, its text is checked against a short list of strings and regular expressions. The key feature of this list is that it can be edited instantly, which matters a great deal when an attack is happening right now. Every second, Yandex.Mail accepts up to several tens of thousands of emails, and the delays that inevitably come with updating large databases are very harmful here.
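As an illustration of the display-time check, here is a minimal sketch of a small in-memory list of strings and regular expressions that can be changed at any moment and is consulted just before rendering. The class and method names are hypothetical, and the way rules are distributed to servers is left out entirely.

```python
import re


class DisplayTimeFilter:
    """A short, instantly editable list of substrings and regular expressions."""

    def __init__(self) -> None:
        self._substrings: set[str] = set()
        self._patterns: dict[str, re.Pattern[str]] = {}

    def add_string(self, text: str) -> None:
        # Substrings are matched case-insensitively.
        self._substrings.add(text.lower())

    def add_regex(self, pattern: str) -> None:
        # Compile once when the rule is added, not on every email.
        self._patterns[pattern] = re.compile(pattern, re.IGNORECASE)

    def remove_regex(self, pattern: str) -> None:
        self._patterns.pop(pattern, None)

    def is_suspicious(self, body: str) -> bool:
        lowered = body.lower()
        if any(s in lowered for s in self._substrings):
            return True
        return any(p.search(body) for p in self._patterns.values())
```

Because the list is short and lives in memory, a rule added during an attack takes effect on the very next email that is opened, including emails that were delivered before the rule existed.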
Sanitizer
Modern email is HTML. However much we, advocates and guardians of standards purity, resisted progress, users made their choice. Alas, this also means that all the expressive richness now available in HTML turns into the need to scan the markup very carefully and to prevent attempts to use "active" elements to attack webmail users. HTML was designed without considering situations where one document is safely embedded inside another, and that is exactly what has to happen when an HTML email is displayed in the web interface (that is, in fact, inside another HTML page).
We called this component simply the sanitizer. It parses HTML at the level of characters rather than element objects, since many attacks on web interfaces use markup that is invalid by the language standard in order to bypass the simplest checks. HTML today is no longer a single language but a whole family, so the sanitizer can also separately parse CSS inside certain elements and attributes. The result of the sanitizer's work is a simplified version of the email that can be safely inserted into another HTML page without fear that a script will run with access to the entire DOM of the web interface, or that styles from this inner block will suddenly affect elements outside it.
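The general idea of such a component can be illustrated with a small allowlist filter. This is only a sketch and differs from the sanitizer described above in important ways: it relies on the tolerant tokenizer in Python's standard `html.parser` instead of a custom character-level parser, it drops all styling rather than parsing CSS, and the tag and attribute lists are hypothetical. It should not be treated as production-grade protection.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists; a real sanitizer would be far more detailed.
ALLOWED_TAGS = {"a", "b", "i", "p", "br", "div", "span", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}
VOID_TAGS = {"br"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text content is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, and every other attribute
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif not self._skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize(html_text: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>')` returns `<p>Hi <b>there</b></p>`: the event handler and the script are gone, while the harmless formatting survives.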
Displaying the email
For quite some time now, the Yandex.Mail web interface has worked only over the secure HTTPS protocol. In addition to fully encrypting the traffic between our servers and the user's browser, we also authenticate requests with an additional secure cookie that is never available without encryption. Thanks to it, even if your ISP steals your session while you are browsing without encryption, for example while you are logged in and viewing Yandex.Afisha or simply visiting any site that uses the Yandex.Metrica counter, it will not be able to get into your mailbox. To do so, it would need the additional cookie, which can be obtained only by decrypting HTTPS traffic.
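The two-cookie idea can be sketched as follows. The cookie names `session_id` and `secure_session_id` and both helper functions are hypothetical and are not Yandex's actual names; the only standard mechanism relied on is the `Secure` cookie attribute, which tells the browser to send a cookie over HTTPS connections only.

```python
import hmac
from http.cookies import SimpleCookie


def build_login_cookies(session_token: str, secure_token: str) -> list[str]:
    """Return Set-Cookie header values for a freshly logged-in user."""
    cookies = SimpleCookie()
    # Ordinary session cookie: may also travel over plain HTTP.
    cookies["session_id"] = session_token
    cookies["session_id"]["httponly"] = True
    # Additional cookie: the browser sends it only over HTTPS.
    cookies["secure_session_id"] = secure_token
    cookies["secure_session_id"]["secure"] = True
    cookies["secure_session_id"]["httponly"] = True
    return [morsel.OutputString() for morsel in cookies.values()]


def may_open_mailbox(request_cookies: dict[str, str],
                     expected_session: str, expected_secure: str) -> bool:
    """The mailbox requires both cookies; a sniffed session_id alone is not enough."""
    session_ok = hmac.compare_digest(request_cookies.get("session_id", ""), expected_session)
    secure_ok = hmac.compare_digest(request_cookies.get("secure_session_id", ""), expected_secure)
    return session_ok and secure_ok
```

An eavesdropper on an unencrypted connection can observe at most `session_id`, so `may_open_mailbox({"session_id": stolen}, ...)` returns `False`.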
Checking for malicious links, with unwinding of shortened links
We do not stop there: we continue to protect our users even after the email is already on screen and can be read. The most dangerous emails try to lead the user out of the protected Yandex.Mail interface and do something bad to them out there. Therefore, at the moment of a click on any external link in an email, one more step is triggered: the link is checked against the malicious link database of Yandex's large web antivirus. We have already written a little about this before, but it bears repeating that our web antivirus constantly “indexes” (just like the Yandex.Search crawler) millions of pages on the Internet in search of viruses and other malware, building the most complete and up-to-date map of the infected Internet in the world. Even if you are reading an old email containing a link that was perfectly safe yesterday but today leads to an infected page, we will warn you about it and do our best to prevent infection.
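A click-time check with unwinding of shortened links might look roughly like the sketch below. Everything here is an assumption made for illustration: `resolve_redirect` stands in for whatever service knows where a URL redirects (it returns `None` when there is no redirect), and `infected_hosts` stands in for the web antivirus database, reduced to a plain set of host names.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit


def expand_chain(url: str,
                 resolve_redirect: Callable[[str], Optional[str]],
                 max_hops: int = 5) -> list[str]:
    """Follow redirects (for example, from a URL shortener) up to max_hops times."""
    chain = [url]
    while len(chain) <= max_hops:
        target = resolve_redirect(chain[-1])
        if target is None or target in chain:  # end of chain or a redirect loop
            break
        chain.append(target)
    return chain


def is_link_dangerous(url: str,
                      resolve_redirect: Callable[[str], Optional[str]],
                      infected_hosts: set[str]) -> bool:
    """Check every hop of the chain, not only the link as written in the email."""
    for hop in expand_chain(url, resolve_redirect):
        host = (urlsplit(hop).hostname or "").lower()
        if host in infected_hosts:
            return True
    return False
```

Since the check runs at the moment of the click rather than at delivery, a link whose destination became infected after the email arrived is still caught.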