
TamTam: how we made a new messenger


Hi, Habr! My name is Yuri Buyanov, and I am a developer of the TamTam messenger. Today I want to tell you a little about how it was created and how it works on the inside. TamTam is a new Mail.Ru Group messenger that grew out of the OK Messages application. In 2016 we made a standalone messenger for Odnoklassniki, aimed at people who message a lot on the social network and find a separate application more convenient.


The experiment proved successful, so at the beginning of this year we decided to spin OK Messages off from the social network as a standalone messenger under its own TamTam brand, but with a ready-made starting audience. In the first weeks after launch, tens of thousands of channels appeared in TamTam, and the audience kept communicating as actively as it had in OK Messages. This was made possible by the application's speed and by several technical features, which I will describe in more detail.


Challenges that prompted ideas


Let's start with the difficulties: they are what gave us the ideas that were later realized in the product and eventually became its advantages. This is first of all about the messenger working fast and reliably.


TamTam's starting audience is scattered across the world, including regions with patchy mobile coverage (and sometimes no fixed-line Internet at all). In some CIS countries outside the big cities, a 2G connection is effectively the only window onto the Internet.


It was also important that not every potential TamTam user rushes out each year to buy a new iPhone or the hottest new Samsung. According to our statistics, the most popular iOS device among our users is the iPhone 5s, and on Android it is the inexpensive Galaxy models of 2014–2015. At the same time, TamTam's audience is rather young: 28% of the daily audience are people aged 27–34, and more than half of all users (54%) are under 35.


Therefore, one of our development priorities from the very beginning was optimization, both of the application's speed and of how it works with the network. In short, the application had to work on any quality of connection, and keep working as the audience grew, without the user noticing anything. In its first few months TamTam has shown quite good numbers: installations are approaching 3 million, and there are already more than 50,000 channels.


How we made the app fast


Speed, from the user's point of view, is first of all launch speed; the time that passes before new content is displayed (for example, when opening a chat with a new message from a push notification); and the smoothness of the UI in general, scrolling in particular. The iOS team tries to test and measure performance on an iPhone 5 and an iPhone 4S; the Android team has at its disposal a Galaxy S3 and a 1,000-ruble MegaFon Login. As a result, on more powerful devices the application simply flies.


Any test build can enable an on-screen frames-per-second counter, and the durations of operations at the bottlenecks are recorded in the logs and in our statistics system.
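
Here is a minimal sketch of how such an FPS counter can be built with CADisplayLink (an illustration, not TamTam's actual code): the display link fires once per rendered frame, so counting callbacks per second approximates the frame rate.

```swift
import UIKit

// Counts CADisplayLink callbacks over one-second windows.
// Note: CADisplayLink retains its target, so call stop() to tear down.
final class FPSCounter {
    private var link: CADisplayLink?
    private var frames = 0
    private var windowStart = CFAbsoluteTimeGetCurrent()

    var onUpdate: ((Double) -> Void)?   // receives frames per second

    func start() {
        link = CADisplayLink(target: self, selector: #selector(tick))
        link?.add(to: .main, forMode: .common)
    }

    func stop() {
        link?.invalidate()
        link = nil
    }

    @objc private func tick() {
        frames += 1
        let now = CFAbsoluteTimeGetCurrent()
        if now - windowStart >= 1.0 {
            onUpdate?(Double(frames) / (now - windowStart))
            frames = 0
            windowStart = now
        }
    }
}
```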


[Graph: time from launch via a push notification until the message appears on screen]


For example, this graph shows the time from the moment the application is launched by tapping a push notification to the moment the user sees that particular message on the screen. The two drops on the graph correspond to content pushes being enabled first for half of the users and then for all of them.


Despite the abundance of tools and metrics, subjective impressions remain the main tool for assessing application performance. No one can say exactly how many milliseconds of delay are acceptable when opening a message screen, but almost everyone can tell you whether the application feels sluggish.


How do we optimize? First of all, we move everything we can off the main thread: work with the database (more on that later), networking, serialization and deserialization of data, image processing, and even the computations related to text layout.


When the application starts or a chat screen opens, doing the heavy operations in the background does not save us from a visible delay. So some operations, such as laying out message bubbles, still have to be optimized for raw speed, while others are better performed once, as soon as a message is received, with the result cached in the database.
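
As an illustration of the "compute once and cache" idea, here is a sketch in which message text is measured on a background queue as soon as the message arrives, and the chat screen later reads the cached height. All names are invented, and in TamTam the cache lives in the database rather than in memory.

```swift
import UIKit

// Precomputes bubble heights off the main thread and caches them.
final class MessageLayoutCache {
    private let queue = DispatchQueue(label: "layout", qos: .userInitiated)
    private var heights = [String: CGFloat]()   // messageId -> bubble height
    private let lock = NSLock()

    func precompute(messageId: String, text: String, maxWidth: CGFloat) {
        queue.async {
            // boundingRect is generally safe to call off the main thread.
            let bounds = (text as NSString).boundingRect(
                with: CGSize(width: maxWidth, height: .greatestFiniteMagnitude),
                options: [.usesLineFragmentOrigin],
                attributes: [.font: UIFont.systemFont(ofSize: 16)],
                context: nil)
            self.lock.lock()
            self.heights[messageId] = ceil(bounds.height)
            self.lock.unlock()
        }
    }

    // Called from tableView(_:heightForRowAt:) on the main thread.
    func cachedHeight(messageId: String) -> CGFloat? {
        lock.lock(); defer { lock.unlock() }
        return heights[messageId]
    }
}
```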


When choosing third-party solutions and libraries for the hot paths, we also took speed and compactness into account. In particular, that is why we chose MessagePack (and even benchmarked its various iOS implementations), swapped our data-to-object mapping library from Mantle to YYModel, and settled on lz4 as the traffic compression algorithm.


In addition, to achieve a smooth interface, we apply targeted rendering optimizations. For example, all lists use manual layout, without Auto Layout. We love Auto Layout too, and use declarative layout with Masonry in code, but only where it is appropriate.


Offline mode and working with bad Internet


When working with the network, we try to minimize traffic and latency through a fast, compact protocol and aggressive caching.


For communication with the server we use only TCP sockets and a binary protocol. This lets us both receive updates from the server in real time and work in the more familiar request-response mode.


The API itself, i.e., the set of commands on top of the low-level protocol, could if desired be implemented later over a different transport, for example WebSockets, without our having to touch the application's high-level logic.


[Diagram: packet structure, a fixed-length header followed by the payload]


The packets themselves consist of a fixed-length header with service information (command code, protocol version, payload length) followed by the payload. Responses to requests can arrive out of order, interleaved with server-initiated commands, so the header also carries a sequence number that lets us match a response to its request.
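
Here is a sketch of what such a fixed-length header could look like. The field set (version, command code, sequence number, payload length) follows the description above; the concrete field sizes and the big-endian byte order are my assumptions, not TamTam's actual wire format.

```swift
import Foundation

// Illustrative 11-byte packet header: version, command, sequence, length.
struct PacketHeader {
    static let byteCount = 11

    var version: UInt8
    var command: UInt16       // operation code
    var sequence: UInt32      // links a response to its request
    var payloadLength: UInt32 // number of payload bytes that follow

    init(version: UInt8, command: UInt16, sequence: UInt32, payloadLength: UInt32) {
        self.version = version
        self.command = command
        self.sequence = sequence
        self.payloadLength = payloadLength
    }

    func encoded() -> Data {
        var data = Data(capacity: Self.byteCount)
        data.append(version)
        withUnsafeBytes(of: command.bigEndian) { data.append(contentsOf: $0) }
        withUnsafeBytes(of: sequence.bigEndian) { data.append(contentsOf: $0) }
        withUnsafeBytes(of: payloadLength.bigEndian) { data.append(contentsOf: $0) }
        return data
    }

    init?(parsing data: Data) {
        guard data.count >= Self.byteCount else { return nil }
        version = data[data.startIndex]
        command = UInt16(data[data.startIndex + 1]) << 8
                | UInt16(data[data.startIndex + 2])
        sequence = data.subdata(in: (data.startIndex + 3)..<(data.startIndex + 7))
            .reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
        payloadLength = data.subdata(in: (data.startIndex + 7)..<(data.startIndex + 11))
            .reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    }
}
```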


As the payload format we decided to try MessagePack. It does not require a rigidly defined schema, it is very compact, and it has quite fast serialization libraries for many platforms; in essence it is an efficient binary analog of JSON. To cut traffic further, we compress the payload with the lz4 algorithm, which we also chose for its speed and low load on the CPU and the battery.
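
As an illustration, here is how payload compression could look on iOS using Apple's Compression framework. One caveat: COMPRESSION_LZ4 uses Apple's own block framing, so a cross-platform protocol would more likely ship a standalone lz4 library; this sketch is illustrative only.

```swift
import Compression
import Foundation

// Compresses a payload with LZ4; returns nil if compression fails.
func lz4Compress(_ payload: Data) -> Data? {
    let destCapacity = payload.count + 4096   // headroom for incompressible data
    var dest = Data(count: destCapacity)
    let written = dest.withUnsafeMutableBytes { destPtr -> Int in
        payload.withUnsafeBytes { srcPtr in
            compression_encode_buffer(
                destPtr.bindMemory(to: UInt8.self).baseAddress!, destCapacity,
                srcPtr.bindMemory(to: UInt8.self).baseAddress!, payload.count,
                nil, COMPRESSION_LZ4)
        }
    }
    return written > 0 ? dest.prefix(written) : nil
}
```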


One of the main ways to keep the application usable on a bad network is full support for offline mode. The application should cache as much data as possible, spend as little time and traffic as possible on synchronization, and be able to postpone outgoing commands until a connection appears. Moreover, the connection may only come back on a later launch of the application, so all postponed send tasks must be persistable to the database.


After connecting, the client authenticates and simultaneously requests its critical data: settings, the contact list, and the chat list with the latest messages. We store the timestamp of the last update (in server time) and pass it along in the request so that we get back only what has actually changed. Once the connection is established, we can receive updates in real time: for example, new messages or changes in contact data.
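
The delta-sync idea can be sketched like this (the types and field names are invented for illustration; the real protocol is binary MessagePack, not Codable): the login request carries the timestamp of the last successful sync, and the response contains only what changed after it.

```swift
import Foundation

// Hypothetical request/response shapes for the delta sync described above.
struct Contact: Codable { let id: Int64; let name: String }
struct Chat: Codable { let id: Int64; let lastMessageId: Int64 }

struct LoginRequest: Codable {
    let token: String
    let lastSyncTime: Int64        // server-time timestamp of the last sync
}

struct LoginResponse: Codable {
    let serverTime: Int64          // stored by the client as the next lastSyncTime
    let changedContacts: [Contact] // only contacts modified since lastSyncTime
    let changedChats: [Chat]       // only chats with new activity
}
```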


Message history is a little more complicated. Loading the entire history of all chats in advance would be pointless, but whatever we have received once, we cache and try not to request again. If you look at which parts of a chat's history are cached, you will see that there are "gaps" in it. For example, when refreshing the chat list after login, we may see that the chat's last message has changed, while the database still holds a section (or several sections) of that chat's history cached during a previous session. On top of that, we do not know how many messages sit on the server between that new last message and the most recent cached one, and this adds its own difficulties.


Therefore, in addition to the messages themselves, we store metadata about the contiguous chunks of history that we have cached. When the user scrolls a chat, this information helps us decide whether to load the next page from the database, send a request to the server, or do both. When new sections of history arrive from the server, these chunks grow and merge with one another (when the client sees that a newly received section connects two previously separate chunks in the database).
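
Here is a sketch of such chunk metadata and the merging step. Representing a chunk as a time range of cached messages is my assumption; the merge logic, though, follows the description above.

```swift
// A chunk marks a contiguous range of cached history by message time.
struct HistoryChunk {
    var fromTime: Int64  // oldest cached message in this chunk
    var toTime: Int64    // newest cached message in this chunk

    func overlapsOrTouches(_ other: HistoryChunk) -> Bool {
        return fromTime <= other.toTime && other.fromTime <= toTime
    }
}

/// Inserts a newly cached section and merges every chunk it connects.
func insert(_ new: HistoryChunk, into chunks: [HistoryChunk]) -> [HistoryChunk] {
    var merged = new
    var result = [HistoryChunk]()
    for chunk in chunks {
        if chunk.overlapsOrTouches(merged) {
            // The new section connects to this chunk: absorb it.
            merged.fromTime = min(merged.fromTime, chunk.fromTime)
            merged.toTime = max(merged.toTime, chunk.toTime)
        } else {
            result.append(chunk)
        }
    }
    result.append(merged)
    return result.sorted { $0.fromTime < $1.fromTime }
}
```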


Since many operations can be initiated offline, we developed a mechanism for saving tasks. It can run tasks, wait for their completion, save their state to the database, and load and run them again when the application starts.


Tasks can be saved to the database and encapsulate all of their execution logic. Since dependencies on other tasks and on the application's state can be quite complex, tracking them is also implemented inside the tasks themselves. For example, the task that sends a message with a photo must make sure the photo has been processed and uploaded to the CDN (separate tasks are responsible for that), wait for a network connection if necessary, and then immediately try to send the message itself.
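
A persistable task might be sketched like this. The protocol and the photo-message example are illustrative: the real implementation is Objective-C and its dependency tracking is richer than shown here.

```swift
import Foundation

// A task that can be serialized to the database and resumed later.
protocol PersistableTask: Codable {
    var taskId: String { get }
    /// Identifiers of tasks that must finish before this one can run.
    var dependencies: [String] { get }
    /// Runs the task; calls the completion with `true` on success.
    func run(completion: @escaping (Bool) -> Void)
}

struct SendPhotoMessageTask: PersistableTask {
    let taskId: String
    let chatId: Int64
    let uploadedPhotoURL: URL?   // filled in by the upload task

    // Waits for the (hypothetical) photo-processing and upload tasks first.
    var dependencies: [String] { ["process-\(taskId)", "upload-\(taskId)"] }

    func run(completion: @escaping (Bool) -> Void) {
        guard let url = uploadedPhotoURL else {
            completion(false)    // dependency has not produced a result yet
            return
        }
        // ...send the message referencing `url` over the protocol...
        _ = url
        completion(true)
    }
}
```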


Two tricks for a smooth application


I will describe a couple of techniques we used to get around system limitations that were keeping us from building a friendly, smooth interface, taking the iOS application as the example.


One of the difficulties in development was endless scrolling in a chat, i.e., loading the message history imperceptibly while the user scrolls up. In 99% of cases the user starts at the bottom of the chat and wants to scroll upward to read older messages. Here we had two problems.


First, constantly bumping into the top edge of the message list and waiting for a load every few screens is annoying. This problem was not very hard to solve: instead of waiting for the user to reach the very top and see a spinner there, we request the previous pages of history ahead of time while they scroll, both from the local cache and from the server. If the messages are in the cache, or the connection is fast, the user simply never manages to reach the very top before we can display the next batch of messages.
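
The trigger can be sketched roughly like this (the threshold and the names are illustrative): request the previous page as soon as the user comes within a few screens of the top, rather than when they hit it.

```swift
import UIKit

// Fires the history request early, while the user is still scrolling.
final class ChatScrollPrefetcher: NSObject, UIScrollViewDelegate {
    var loadPreviousPage: (() -> Void)?
    private var isLoading = false

    func scrollViewDidScroll(_ scrollView: UIScrollView) {
        let screensFromTop = scrollView.contentOffset.y / max(scrollView.bounds.height, 1)
        if screensFromTop < 3, !isLoading {   // within ~3 screens of the top
            isLoading = true
            loadPreviousPage?()
        }
    }

    func pageDidLoad() { isLoading = false }
}
```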


The second problem turned out to be much more serious: after such a page is inserted at the top of the message list (built on UITableView), the contentOffset of the already loaded portion shifts and the scroll "jumps". Of course, we can compute the size of the inserted page and set the contentOffset back, but that stops the scroll animation dead, which looks ugly and disorients the user. We attacked this in various ways, including tracking the table's contentSize through KVO, and failed every time: UITableView is simply, chronically, not built for elements being added at the top of the list.


In the end, after a series of attempts, we solved the problem with a sort of "hack": we flip the whole list upside down with a transform, and then flip each cell back the other way. The user notices nothing, but the contentOffset is now measured from the bottom, so loading old messages no longer affects it at all.
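
A minimal sketch of the flip (the original code is Objective-C; this Swift version is for illustration):

```swift
import UIKit

// The table is flipped vertically and every cell is flipped back, so
// row 0 becomes the newest (bottom) message; prepending older history
// appends rows at the "end" and leaves contentOffset untouched.
final class InvertedChatViewController: UITableViewController {
    var messages: [String] = []   // newest first, to match row order

    override func viewDidLoad() {
        super.viewDidLoad()
        tableView.register(UITableViewCell.self, forCellReuseIdentifier: "cell")
        tableView.transform = CGAffineTransform(scaleX: 1, y: -1)
    }

    override func tableView(_ tableView: UITableView,
                            numberOfRowsInSection section: Int) -> Int {
        return messages.count
    }

    override func tableView(_ tableView: UITableView,
                            cellForRowAt indexPath: IndexPath) -> UITableViewCell {
        let cell = tableView.dequeueReusableCell(withIdentifier: "cell", for: indexPath)
        cell.contentView.transform = CGAffineTransform(scaleX: 1, y: -1)  // flip back
        cell.textLabel?.text = messages[indexPath.row]
        return cell
    }
}
```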


This solution has a number of pitfalls, but we managed to work around them too, and they no longer get in our way. First, you have to convert the flipped cell indices into indices in your data model and back. With more than one section the arithmetic becomes very involved, so it is best to limit yourself to a single one. That, of course, rules out floating section headers, which would be useful on a chat screen, for example to show per-day separators in the history; but in the end the floating separators were not that hard to implement by hand.


Second, in rare cases difficulties can arise when computing coordinates inside cells, for example when handling gestures, but these are solvable too. Third, when data is loaded at the bottom, the problem returns; but loading while scrolling down happens very rarely, so it costs us little. In that case we do not preload during the scroll: we wait until the user reaches the bottom of the table, show a loading indicator, update the table, and adjust the contentOffset.


The second difficulty we encountered was animated, asynchronous list updates. If several independent updates happen almost simultaneously (for example, a history page loads at the top of the chat while a new message arrives at the bottom), the data backing the tableView's delegate may change even though the previous update has not finished animating.


This can make UITableView render the wrong cell or even crash; it is all the more likely if you use the previous hack. You could, of course, fall back on the reloadData method, which is synchronous in UITableView, but that leads to flicker, scroll stutter, and other things that irritate the user.


Especially for such cases, we made a separate queue that processes these updates sequentially. Every change to the model, and its reflection in the UI, is made inside a block placed on that queue. A block can lock the queue when it starts an animation or some other asynchronous operation and unlock it on completion. Thus, all work with the table proceeds sequentially, and the data does not change until the previous animation has completed.
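
The queue can be sketched like this (all names are illustrative): each update block receives an unlock callback, and the next block does not start until the previous one calls it, for example from an animation's completion handler.

```swift
import UIKit

// Serializes model + table updates: the next update block does not run
// until the previous one has called its `unlock` callback.
final class TableUpdateQueue {
    typealias Update = (_ unlock: () -> Void) -> Void

    private var pending: [Update] = []
    private var locked = false

    func enqueue(_ update: @escaping Update) {
        assert(Thread.isMainThread)   // the queue itself lives on the main thread
        pending.append(update)
        runNextIfPossible()
    }

    private func runNextIfPossible() {
        guard !locked, let update = pending.first else { return }
        locked = true
        pending.removeFirst()
        update { [weak self] in       // `unlock`, called when the animation completes
            self?.locked = false
            self?.runNextIfPossible()
        }
    }
}

// Usage: the model mutation and the animated table update stay together,
// and the queue remains locked until the animation ends.
//
// updateQueue.enqueue { unlock in
//     self.messages.insert(newMessage, at: 0)   // inverted list: row 0 is newest
//     self.tableView.performBatchUpdates({
//         self.tableView.insertRows(at: [IndexPath(row: 0, section: 0)], with: .fade)
//     }, completion: { _ in unlock() })
// }
```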


Persistence


For caching data in the iOS client, we use the YapDatabase library.


YapDatabase is a key-value store on top of SQLite with a very rich feature set. To me, this library feels much simpler and more flexible than Core Data. You can choose the mechanism used to serialize objects into the database: the default is NSCoding, and we use the same MessagePack.


YapDatabase does not require objects to inherit from a base class or implement a protocol, and it does not bind objects to a context. Reading and writing are done through synchronous or asynchronous transactions.


And through the extension system, all the same features are available as in a "real" database: arbitrary SQL queries and indexes over several fields, full-text search, change subscriptions (as with NSFetchedResultsController), encryption, CloudKit integration, and so on. I will not give hello-world examples of working with the database here; they are in the wiki on GitHub.


To my taste, YapDatabase improves the efficiency and clarity of the code, but some of my colleagues dislike it strongly. And they can be understood: after a long time with Core Data, switching to YapDatabase really does require turning your brain around somewhat.


In addition, when working with the database asynchronously through several connections, you need a good understanding of how it handles concurrent read and write requests, whether through one connection or through different ones. You must also remember that objects in the database are updated as a whole. You cannot simply save a copy that you read and modified some time ago: you have to read the object from the database, change what you need, and write it back within a single transaction. Otherwise you may accidentally write stale data to the database.
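
The read-modify-write rule can be sketched like this with YapDatabase's asynchronous read-write transaction (Swift bridging of the Objective-C API, which may vary slightly between library versions; the User type, path, and keys are invented):

```swift
import YapDatabase

// Illustrative model; in practice the object must be serializable by
// the database's configured serializer (NSCoding, MessagePack, etc.).
struct User { var id: String; var displayName: String }

let database = YapDatabase(path: "/path/to/tamtam.sqlite")
let connection = database.newConnection()

func rename(userId: String, to newName: String) {
    connection.asyncReadWrite { transaction in
        // Read the *current* object inside the same transaction...
        guard var user = transaction.object(forKey: userId, inCollection: "users") as? User else {
            return
        }
        // ...modify it...
        user.displayName = newName
        // ...and write it back before the transaction ends, so no
        // concurrent write can slip in between the read and the write.
        transaction.setObject(user, forKey: userId, inCollection: "users")
    }
}
```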


In general, work with the database slots very conveniently into our reactive style of writing code. The typical asynchronous transaction patterns (read / write / modify an individual object) are very easy to wrap, for example, in ReactiveCocoa signals, and to chain database work together with sending and processing network requests.
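
For instance, an asynchronous read might be wrapped like this (the original code uses Objective-C ReactiveCocoa; this sketch uses ReactiveSwift-style API, and the chained network producer is hypothetical):

```swift
import ReactiveSwift
import YapDatabase

// Wraps an async database read in a producer that emits the object.
func readObject(forKey key: String, in collection: String,
                connection: YapDatabaseConnection) -> SignalProducer<Any?, Never> {
    return SignalProducer { observer, _ in
        connection.asyncRead { transaction in
            observer.send(value: transaction.object(forKey: key, inCollection: collection))
            observer.sendCompleted()
        }
    }
}

// Chained with a (hypothetical) network producer:
//
// sendMessage(request)
//     .flatMap(.latest) { response in
//         readObject(forKey: response.chatId, in: "chats", connection: connection)
//     }
//     .startWithValues { chat in /* update the UI */ }
```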


Application architecture


I will not say much about the architecture, but the laws of the genre do not allow me to leave it unmentioned. We use MVVM, about which there are already plenty of talks and articles (for example, the classic tutorial in its Objective-C and RAC version: part 1, part 2, or an article on implementing this pattern in Swift).


Beneath the ViewModel layer sits a set of services that implements (and, where possible, encapsulates) the business logic, the protocol handling, and the caching. Navigation in the application is handled by a so-called router, i.e., an object encapsulating the code needed to open a particular screen. In fact, there are several routers in the app, because a router tends to grow into a rather fat God Object, so wherever possible we try to decompose it. For example, a separate router is responsible for the entire user registration/authentication flow.
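
Decomposed routers might look roughly like this (all types here are illustrative, not TamTam's actual code):

```swift
import UIKit

// Each router encapsulates how its screens are opened; the app-level
// router delegates whole flows to smaller ones instead of growing fat.
protocol Router {
    func start(from presenter: UIViewController)
}

final class AuthRouter: Router {
    // Owns the whole registration/authentication flow.
    func start(from presenter: UIViewController) {
        let phoneEntry = UIViewController() // the phone-entry screen would go here
        presenter.present(UINavigationController(rootViewController: phoneEntry),
                          animated: true)
    }
}

final class AppRouter {
    private let authRouter = AuthRouter()

    func showAuthFlow(from presenter: UIViewController) {
        authRouter.start(from: presenter)   // delegate instead of growing fatter
    }
}
```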


From the experience of previous projects we knew that Dependency Injection greatly simplifies the structure of an application and makes architectural changes much easier. At the very beginning we used the Typhoon framework for DI, but while optimizing the application's launch time we found that resolving dependencies took unacceptably long at startup (whole seconds on weak devices). So we switched to manual DI through property-based injection. I would not say the code grew: the application's service layer is usually configured in a single class, and the entire service configuration is easy to read. For the share and iMessage extensions, of course, the services are configured separately, since they need a much smaller set of them.
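
Manual property-based injection can be sketched like this: one assembly class wires up the service layer with plain assignments instead of framework-driven resolution (the class names are invented):

```swift
// Illustrative services; the real ones are, of course, not empty.
final class ApiClient {}
final class ChatCache {}
final class ChatService {
    var api: ApiClient!          // injected
    var cache: ChatCache!        // injected
}

// One readable place where the whole service layer is configured.
final class ServiceAssembly {
    let api = ApiClient()
    let cache = ChatCache()
    lazy var chatService: ChatService = {
        let service = ChatService()
        service.api = self.api       // cheap assignments at startup,
        service.cache = self.cache   // instead of seconds of reflection
        return service
    }()
}
```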


Thanks to this, coupling in the code was low from the start, and even quite a long way into development we were able, without much pain, to move some of the services and supporting code into a separate library (more precisely, a set of libraries) that implements most of the messenger's internal logic, including the protocol handling and caching, and that can be embedded into other applications.


Conclusion


For us, the application's speed and its offline operation are not about following a trend; they are a real way to give convenient communication to specific users: those with expensive mobile data or simply bad Internet. And that is quite a serious motivation to do things well. In the end, it is the users who judge the result, so I suggest you install the messenger and share your feedback in the comments. I will be glad to answer questions.



Source: https://habr.com/ru/post/333610/

