
Parsing email in Java

For my latest project, written 80% in Java, I had to add a module: a parser for every message passing through the server. The reasons for the module's existence are rather strange, but I would like to share some details.



Given:


A Postfix mail server with Dovecot as the delivery service, on CentOS. And, of course, the JVM.



Message structure


What an email is, its constituent parts, their approximate structure, headers and MIME types are all described in plain language on Wikipedia.

More interesting is the structure of a message's file name on the server. Here is an example name of a new message (not yet read or requested by a client):


1348142977.M852516P31269.mail.example.com,S=3309,W=3371 




The name is made up of flags separated by commas. For a newly created message they record where and when the message arrived, plus its size.





Of these, only the delivery time (the first ten digits, a Unix timestamp) was useful to me. It often differs from the time in the message header, though, so I used the time in the name only to filter messages in a directory.
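Filtering by the time in the file name can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the class name `MaildirTimeFilter` and its methods are hypothetical, and it assumes every message file name starts with the delivery time in Unix epoch seconds followed by a dot, as in the example above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: filters message files by the timestamp in their names.
class MaildirTimeFilter {

    /** Delivery time: the digits before the first dot, read as Unix epoch seconds. */
    static Instant deliveryTime(String fileName) {
        int dot = fileName.indexOf('.');
        return Instant.ofEpochSecond(Long.parseLong(fileName.substring(0, dot)));
    }

    /** True if the name starts with one or more digits followed by a dot. */
    static boolean hasTimestamp(String fileName) {
        int dot = fileName.indexOf('.');
        return dot > 0 && fileName.chars().limit(dot).allMatch(Character::isDigit);
    }

    /** Regular files in dir delivered at or after the given moment, oldest first. */
    static List<Path> deliveredSince(Path dir, Instant since) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> hasTimestamp(p.getFileName().toString()))
                    .filter(p -> !deliveryTime(p.getFileName().toString()).isBefore(since))
                    .sorted((a, b) -> deliveryTime(a.getFileName().toString())
                            .compareTo(deliveryTime(b.getFileName().toString())))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        String name = "1348142977.M852516P31269.mail.example.com,S=3309,W=3371";
        System.out.println(deliveryTime(name)); // prints 2012-09-20T12:09:37Z

        // Optional: pass a mailbox directory to list messages from the last hour.
        if (args.length > 0) {
            Path dir = Paths.get(args[0]);
            deliveredSince(dir, Instant.now().minusSeconds(3600)).forEach(System.out::println);
        }
    }
}
```

The timestamp is read only from the name, so no file has to be opened. Any time that matters for business logic should still come from the message headers.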



Additional / client flags


The mail client (hereinafter, the client) can add flags to the message name. The start of the client flags is marked by the ":" character.



As soon as the client requests new messages from the server, the transport is asked to move each requested message to the “read” directory and to append an info marker to its name (one of two possible), separated from the flags that follow by a comma.



Even though the message on the server is already in the “read” directory, the user still sees it as new, because clients read the flags, not the message's location.

Only when the user actually opens the message (or performs some other action on it) and the “S” (Seen) flag is added to its name does it become visually “read”. Other actions on the message, as you would expect, add their own flags; see the notes.



Example:

A new message arrives on the server for our mailbox; its name will look something like this:



 1348142977.M852516P31269.mail.example.com,S=3309,W=3371 


We open (God forbid) Outlook, which requests the list of new messages and tells the server to move them to the "read" directory, adding the marker:



 1348142977.M852516P31269.mail.example.com,S=3309,W=3371:2, 


Next, in the open Outlook we click on the new message, which adds the S flag:



 1348142977.M852516P31269.mail.example.com,S=3309,W=3371:2,S 


Then we reply to it and delete it:



 1348142977.M852516P31269.mail.example.com,S=3309,W=3371:2,SRT 


As we can see, flags are listed without separators.
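Decoding the flags from a name like the ones above takes only a few lines. The sketch below is an illustration under stated assumptions: the class `MaildirFlags` and the enum constant names are hypothetical, only the `:2,` marker is handled, and the six letters (P, R, S, T, D, F) are the standard ones from the Maildir specification at cr.yp.to/proto/maildir.html. Any other letter, such as a client-specific flag, is skipped.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper: reads client flags from a Maildir file name.
class MaildirFlags {

    /** Standard flag letters from the Maildir specification. */
    enum Flag {
        PASSED('P'), REPLIED('R'), SEEN('S'), TRASHED('T'), DRAFT('D'), FLAGGED('F');

        final char letter;

        Flag(char letter) {
            this.letter = letter;
        }
    }

    /** Returns the known flags that follow the ":2," marker, or an empty set. */
    static Set<Flag> parse(String fileName) {
        Set<Flag> flags = EnumSet.noneOf(Flag.class);
        int marker = fileName.lastIndexOf(":2,");
        if (marker < 0) {
            return flags; // no client part in the name yet
        }
        // Flags are single letters with no separators between them.
        for (char c : fileName.substring(marker + 3).toCharArray()) {
            for (Flag f : Flag.values()) {
                if (f.letter == c) {
                    flags.add(f);
                }
            }
            // Unknown letters (client-specific flags) are ignored.
        }
        return flags;
    }

    public static void main(String[] args) {
        String name = "1348142977.M852516P31269.mail.example.com,S=3309,W=3371:2,SRT";
        System.out.println(parse(name)); // prints [REPLIED, SEEN, TRASHED]
    }
}
```

Searching for the `:2,` marker rather than a bare `S` matters here, because the base name already contains `S=3309`, which is a size field and not the Seen flag.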



Notes: some clients let you configure whether a message is moved to the "read" directory. Clients also sometimes add flags that are not in the documentation “for their own needs”; I did not pay much attention to those.

More useful information about flags: cr.yp.to/proto/maildir.html



And a little Java


I used javax.mail to work with messages. It kindly provides the abstract class javax.mail.Message, although in this case I limited myself to javax.mail.internet.MimeMessage.

The module runs on the server, so we access the messages locally (checks and exception handling are omitted in the code):



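    import java.io.FileInputStream;
    import javax.mail.Session;
    import javax.mail.internet.MimeMessage;

    // for simplicity, build the session from the system properties
    Session session = Session.getDefaultInstance(System.getProperties());
    FileInputStream fis = new FileInputStream(pathToMessage);
    MimeMessage mimeMessage = new MimeMessage(session, fis);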


Now we can read the message headers, which are expected to be in ASCII. If a header is not found, null is returned. For example:



    String messageSubject = mimeMessage.getSubject();
    String messageId = mimeMessage.getMessageID();


To get the list of recipients there is the getRecipients method, which takes a Message.RecipientType as its argument. The method returns an array of Address objects. For example, to list the recipients of the message:



    // getRecipients returns null when the header is absent; that check is omitted here
    for (Address recipient : mimeMessage.getRecipients(Message.RecipientType.TO)) {
        System.out.println(recipient.toString());
    }


To find the sender(s) of the message there is the getFrom method, which also returns an array of Address objects. It reads the “From” header; if that is absent, it reads the “Sender” header; if “Sender” is absent too, it returns null.



    for (Address sender : mimeMessage.getFrom()) {
        System.out.println(sender.toString());
    }


Next we parse the message body (in most cases we need the text and the attachments). It can be composite (a MIME multipart message), or it can contain just a single text/plain block. If the body consists of an attachment only (no text), it is still marked as a multipart message. According to the MIME specification (RFC 2045), the format of the message body (and of each of its parts) is given in the Content-Type header.



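    // a composite message: text and/or attachments
    if (mimeMessage.isMimeType("multipart/mixed")) {
        // getContent() returns the content according to its type;
        // the declared return type is Object, so cast it to Multipart
        Multipart multipart = (Multipart) mimeMessage.getContent();
        // walk through every part of the body
        for (int i = 0; i < multipart.getCount(); i++) {
            BodyPart part = multipart.getBodyPart(i);
            // the text can be "text/plain" or "text/html" (the latter needs HTML parsing);
            // only the plain-text variant is handled here:
            if (part.isMimeType("text/plain")) {
                System.out.println(part.getContent().toString());
            }
            // check whether the part is an attachment
            else if (Part.ATTACHMENT.equalsIgnoreCase(part.getDisposition())) {
                // the attachment's file name; it may be encoded, so decode it
                String fileName = MimeUtility.decodeText(part.getFileName());
                // get the InputStream
                InputStream is = part.getInputStream();
                // then save it to a file, a database, wherever ....
            }
        }
    }
    // a message that contains plain text only
    else if (mimeMessage.isMimeType("text/plain")) {
        System.out.println(mimeMessage.getContent().toString());
    }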




That's all. I hope the material proves useful.

There is also a useful FAQ on javax.mail at oracle.com.



UPD: As noted in the first comment, parts of the message body can be nested inside each other. The comments there also show two ways to traverse them.

Source: https://habr.com/ru/post/153415/


