
There are many informative articles on the web about high performance on mobile devices, and just as many about general API design. But very little is written about the architectural decisions needed to optimize the performance of backend APIs built for mobile clients.
Of course, mobile applications themselves need to be optimized. But we infrastructure engineers can do a lot to ensure that mobile clients receive data and resources reliably and quickly, which keeps the experience of using a mobile application positive.
Here is what you need to consider when designing mobile software:
- Limited screen size. Little space for data, small images.
- Fewer concurrent connections. This matters because, unlike desktop browsers, which can perform many simultaneous asynchronous requests, mobile browsers allow only a limited number of connections to a single domain.
- Slower networks. Network performance depends heavily on signal strength and on how many subscribers are being served (some users are on Wi-Fi, but mobile networks still become congested and perform additional lookups when a user moves to another base station).
- Lower computing power. Intensive client-side computation, 3D rendering, and heavy use of JavaScript can dramatically reduce performance.
- Smaller caches. Mobile clients generally have limited memory, so it is better not to rely too heavily on cached content for performance.
- "Special" browsers. The mobile browser ecosystem looks a lot like the fragmented desktop browser landscape of several years ago, when vendors kept releasing new versions with fatal flaws and incompatibilities.
There are many ways to solve the described difficulties, but this article is mostly about what can be done with an API or backend to improve the performance (or its perception) of mobile clients. We will look at two main questions:
- Minimizing network connections and the need to transfer data. Efficient media handling, efficient caching, and using longer-running data-oriented operations over fewer connections.
- Sending the "correct" data over the network. Designing APIs that return only the required or requested data, and optimizing for different types of mobile devices.
Although the article focuses on the mobile segment, many lessons and ideas can be applied to API clients in other areas.
Minimizing connectivity and data transfer over the network
One of the most important tasks that needs to be addressed to improve performance on mobile devices is to minimize the number of HTTP requests necessary to render a web page. This can be solved in different ways, and the choice of approach may depend on your data.
Images
Making one request per image on a page lets you take advantage of caching individual images, which can improve speed. A desktop browser can process requests quickly and in parallel, so their number does not hurt performance much (and caching widens the safety margin further). On mobile devices, however, a large number of requests can be deadly.
Minimizing requests for images can reduce the total number of requests, and in some cases - the amount of transmitted data (which also has a positive effect on performance). What strategies can you follow?
Sprites
Image sprites reduce the number of individual images that need to be downloaded from the server. But this solution has a drawback: sprites can be cumbersome to maintain, and in some situations they are not easy to generate (for example, when a product catalog search has to show a large number of thumbnails).
Applying CSS instead of images
If you avoid images wherever possible and use CSS rendering for shadows, gradients and other effects, you can reduce the number of bytes that need to be transferred and downloaded.
Responsive image support
Responsive images are a common way to deliver the right picture to the right device.
Apple does this by loading regular images and then using JavaScript to replace them with higher-resolution versions. There are a number of other approaches, but we are still far from solving this problem.
To use responsive images, make sure they are supported on the server and that the APIs support different versions of the same image. And the specific implementation will depend on client decisions.
Reduce additional requests by using data URIs to inline images
An alternative to sprites is to use data URIs to inline images in HTML. The pictures then become part of the page itself. Although their size in bytes may increase, the response can be compressed with Gzip, which helps offset the increase in the amount of data transferred.
Tip: if you are using data URIs, then:
- Resize the pictures to the required dimensions before embedding them in the URI.
- Make sure responses are compressed with Gzip (to take advantage of the compression).
- Keep in mind that images embedded in a data URI are part of the page or CSS, so it is harder to cache them individually. Avoid inlining when there are good reasons to cache images locally (for example, when they are reused across many pages).
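As a rough illustration of the inlining step, here is a minimal Python sketch that turns image bytes into a data URI and an img tag. The helper names to_data_uri and inline_image_tag are my own, not part of any library, and the image is assumed to be already resized. Base64 encoding makes the payload roughly a third larger than the raw bytes, which is why compressing the response matters.

```python
import base64
from html import escape


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image_tag(image_bytes: bytes, alt: str,
                     mime_type: str = "image/png") -> str:
    """Build an <img> tag whose src carries the image itself (hypothetical helper)."""
    src = to_data_uri(image_bytes, mime_type)
    return f'<img alt="{escape(alt, quote=True)}" src="{src}">'
```

The resulting tag can be written straight into the HTML that the API or page renderer returns, so no separate image request is needed.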
Using local storage and caching
Since mobile networks can run slowly, HTML, CSS, and images can be stored in local storage (localStorage).
Here is an excellent study of how Bing improved performance with local storage, which made it possible to reduce the size of an HTML document from ≈200 KB to ≈30 KB.
A great way to improve perceived performance is to prefetch the data that will be used on mobile devices, so that it reaches clients without additional requests. This includes paged search results, popular searches, and user data. If you plan for this approach in your architecture, you can build APIs that prepare and cache data before the user requests it, which improves perceived performance.
Tip: data that is unlikely to change between application updates (for example, categories or main navigation) should ship inside the application, so that no time or resources are spent transferring it over the network.
Ideally, data should be transferred as needed and loaded in advance only when that is justified. If the user does not see an image or a piece of content, do not send it (this is especially important for responsive websites, since some of them simply "hide" elements). A good use case for prefetching images is a search results gallery: it is better to load the next and previous images right away to make the interface feel faster. But do not get carried away and load too many pictures, because the user may never see them.
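The gallery case can be sketched in a few lines of Python. The function name images_to_prefetch and the window parameter are assumptions made for illustration; the idea is simply to load a small number of neighbors around the image being viewed and nothing more.

```python
from typing import List


def images_to_prefetch(current: int, total: int, window: int = 1) -> List[int]:
    """Return the indices of gallery images worth loading ahead of time.

    Only the neighbors within `window` positions of the current image
    are returned, clipped to the bounds of the gallery.
    """
    start = max(0, current - window)
    end = min(total - 1, current + window)
    return [i for i in range(start, end + 1) if i != current]
```

Keeping the window small follows the advice above: a wider window makes navigation snappier, but wastes bandwidth on pictures the user may never open.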
Retrieving data from local storage has its own performance cost, but it is much smaller than the cost of transferring data over the network. In addition to local storage, some applications use other HTML5 features, such as appCache, to improve performance and startup time.
Tip: if you inline CSS and JavaScript directly into a single response, store those files on the client, and report references to the stored files back to the server via cookies, the client will not have to download these resources again (only new files will be transferred over the network). This saves a lot of time and is a great way to use local caching. How to inline and reference these files is described in more detail here:
http://calendar.perfplanet.com/2011/mobile-ui-performance-considerations/
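On the server side, the decision could look roughly like the Python sketch below. Everything here is an assumption made for illustration: the cookie format (name@version entries separated by "|"), the function names, and the resource names. The point is only that the server compares what the client reports as stored with what is current, and inlines only the difference.

```python
from typing import Dict, List, Set, Tuple


def parse_cached_cookie(cookie_value: str) -> Set[str]:
    """Parse a hypothetical cookie such as 'app.css@v3|app.js@v6'."""
    return {item for item in cookie_value.split("|") if item}


def plan_resources(current: Dict[str, str],
                   cookie_value: str) -> Tuple[List[str], List[str]]:
    """Split current resources into those to inline and those already stored.

    `current` maps a resource name to its current version,
    for example {"app.css": "v3"}.
    """
    cached = parse_cached_cookie(cookie_value)
    to_inline: List[str] = []
    from_storage: List[str] = []
    for name, version in sorted(current.items()):
        key = f"{name}@{version}"
        if key in cached:
            from_storage.append(key)   # client already has this version
        else:
            to_inline.append(key)      # new or outdated: send it inline
    return to_inline, from_storage
```

A first-time visitor sends no cookie, so every resource is inlined; on later visits only files whose version changed are sent again.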
Non-blocking I/O
When it comes to client optimization, the usual advice is to watch out for blocking JavaScript execution, which can severely degrade performance. For an API this is even more important. If you have a lengthy API call, for example one that depends on a third-party resource that may time out, it is important to make the call non-blocking (or even long-running) and to choose either a polling or a triggering model.
- Polling (pull model): in a polling API, the client makes a request and then periodically checks for the result, reducing the frequency of checks if necessary.
- Triggering (push model): in a triggering API, the caller makes a request and then listens for the server's response. It provides a callback that fires an event, so the caller knows when the result of the request is available.
Triggering APIs are usually harder to implement, because mobile clients are not reliable. So most often it is better to use the polling model.
For example, in the Decide mobile app, product pages displayed local prices for countries in which these products were available. Since the results came from a third-party service, we could query it through a polling API and get the results without blocking the application or holding a connection open while waiting.
Make sure your APIs respond quickly and do not block execution, waiting for results, because mobile clients have a limited number of connections.
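A polling client along these lines can be sketched in Python as follows. This is not the Decide implementation; the function name, the delays, and the check callable (which stands in for a quick "is the result ready?" request to the API) are all assumptions. The delay doubles after every unsuccessful check, which is one simple way to reduce the frequency of checks.

```python
import time
from typing import Callable, Optional


def poll_for_result(check: Callable[[], Optional[dict]],
                    initial_delay: float = 0.5,
                    max_delay: float = 8.0,
                    timeout: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Call check() until it returns a result, backing off between attempts.

    Returns None if the next wait would exceed the overall timeout.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        result = check()          # short request, returns immediately
        if result is not None:
            return result
        if waited + delay > timeout:
            return None           # give up; the caller can show a fallback
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

Each check is a short request that returns immediately, so no connection stays open between attempts, which matters when the client has only a few connections available.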
Tip: avoid "chatty" APIs. On a slow network, multiple API calls should be avoided. A good rule of thumb: return all the data needed to render a page in a single API call.
If on the server side some components are much slower than others, it may be advisable to split the API into separate calls, focusing on the characteristic response time. Thus, the client can start page rendering after the first quick calls, while waiting for answers to slower ones. That is, we reduce the time required for the text to appear on the screen.
Avoiding Redirects and Minimizing DNS Requests
As with requests, redirects can degrade performance, especially redirects between domains that require a DNS lookup.
For example, many sites serve their mobile versions through client-side redirects. That is, when a mobile client accesses the URL of the main site (for example, katemats.com), it is redirected to the mobile site m.katemats.com (this is very common where the two sites are built on different technology stacks). Here is an example of such a flow:
- The user googles "yahoo" and clicks the first link in the results.
- Google records the click through its tracking URL and then redirects to www.yahoo.com [redirect]
- The response to the Google redirect travels through the mobile operator's base station and then reaches the phone.
- A DNS lookup is performed for www.yahoo.com.
- The resolved IP address is sent through the base station to the phone.
- When the phone accesses www.yahoo.com, it is recognized as a mobile client and redirected to m.yahoo.com [redirect]
- The phone has to perform another DNS lookup, this time for the m.yahoo.com subdomain.
- The resolved IP address is sent through the base station to the phone.
- Finally, the HTML and the required resources are sent through the base station to the phone.
- Some images on the mobile site are served from a CDN through links to another domain, say, l2.yimg.com.
- The phone performs yet another DNS lookup, this time for the l2.yimg.com subdomain.
- The resolved IP address is sent through the base station to the phone.
- The images are rendered and the page is ready.
As you can see, there is a lot of overhead here that can be avoided by handling redirects on the server side (that is, by routing through the server and minimizing the number of DNS lookups and redirects on the client), or by using responsive techniques.
Tip: if you cannot avoid a DNS lookup, try to save time by using DNS prefetching for known domains.
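One common way to do this in a web page is a link element with rel="dns-prefetch" in the document head. The Python sketch below generates such hints on the server; the helper name dns_prefetch_tags is hypothetical, and the list of domains is assumed to be known in advance (for example, your CDN hosts).

```python
from html import escape
from typing import Iterable


def dns_prefetch_tags(domains: Iterable[str]) -> str:
    """Render one dns-prefetch hint per known third-party domain."""
    return "\n".join(
        f'<link rel="dns-prefetch" href="//{escape(domain, quote=True)}">'
        for domain in domains
    )
```

For the flow described above, calling dns_prefetch_tags(["l2.yimg.com"]) while rendering the page lets the browser resolve the CDN domain before the first image request is made.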
HTTP pipelining and SPDY
Another useful technique is HTTP pipelining. It lets you send multiple requests over a single connection without waiting for each response. That said, I would choose SPDY, which optimizes HTTP requests so that they are much more efficient. This protocol is supported in the Amazon Kindle browser, Twitter and Google.
Sending the "correct" data
Different clients need to send different files, CSS and JavaScript. Even the number of results can vary. Designing APIs that support different combinations and versions of results and files will give you maximum flexibility when creating a wonderful interaction experience.
Use limit and offset for results
As with conventional APIs, retrieving results with limit and offset lets clients request exactly the data they need for a specific use case (mobile clients will ask for fewer results). I prefer the limit and offset notation because it is more common (than, say, start and next), is well understood by many databases, and is therefore easy to use.
/products?limit=25&offset=75
Choose a default value based on the largest or the smallest common denominator, depending on which clients matter most to your business (smaller if your audience is mostly mobile, larger if your users mainly come from desktop computers, which is usually true for B2B sites and services).
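A minimal Python sketch of handling these parameters on the server is shown below. The default of 25, the cap of 100, and the function names are assumptions for illustration; pick values that match your own audience.

```python
from typing import Dict, List, Optional, Tuple

DEFAULT_LIMIT = 25   # assumed default page size
MAX_LIMIT = 100      # assumed upper bound to protect the backend


def _to_int(value: Optional[str], fallback: int) -> int:
    """Convert a query-string value to int, using fallback on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return fallback


def parse_paging(params: Dict[str, str]) -> Tuple[int, int]:
    """Read limit and offset from query parameters and clamp them."""
    limit = _to_int(params.get("limit"), DEFAULT_LIMIT)
    offset = _to_int(params.get("offset"), 0)
    return max(1, min(limit, MAX_LIMIT)), max(0, offset)


def paginate(items: List, params: Dict[str, str]) -> List:
    """Return the slice of items described by limit and offset."""
    limit, offset = parse_paging(params)
    return items[offset:offset + limit]
```

With a database, the same two numbers would usually map directly onto the LIMIT and OFFSET clauses of the query instead of a list slice.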
Support partial response and partial update
Design your APIs so that clients can request only the information they need. This means the API should accept a set of fields rather than return the full representation of the resource every time. When the client does not have to receive and parse unnecessary data, requests become simpler and performance improves.
Partial update lets clients do the same with the data they write to the API (so they do not need to specify every element of the resource).
Google supports partial response through an optional fields parameter that takes a comma-separated list:
http://www.google.com/calendar/feeds/[email protected]/private/full?fields=entry(title,gd:when)
When entry(...) is specified in fields, the caller is requesting only a partial set of fields, here title and gd:when.
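To make both ideas concrete, here is a minimal Python sketch. It is not Google's implementation: it handles only a flat comma-separated list of top-level field names (no nested syntax such as entry(...)), and the function names are my own.

```python
from typing import Any, Dict


def select_fields(resource: Dict[str, Any], fields: str) -> Dict[str, Any]:
    """Partial response: keep only the requested top-level fields.

    An empty `fields` string returns the full representation.
    """
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    if not wanted:
        return dict(resource)
    return {name: resource[name] for name in wanted if name in resource}


def apply_partial_update(resource: Dict[str, Any],
                         changes: Dict[str, Any]) -> Dict[str, Any]:
    """Partial update: overwrite only the fields the client sent."""
    updated = dict(resource)
    updated.update(changes)
    return updated
```

For a product with name, price and description, a mobile list view could request fields=name,price and skip the description entirely, and later send only {"price": ...} when writing a change back.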
Avoid or minimize the use of cookies
Every time a client sends a request to a domain, all cookies for that domain are included, even duplicated or irrelevant values. So another way to reduce the amount of data transferred and improve performance is to keep cookies small (and not to request them when they are not needed). Do not use or require cookies unnecessarily. Serve static content that does not require authorization (for example, images from a static domain or a CDN) from a cookieless domain. Some techniques for working with cookies are described here:
https://developers.google.com/speed/docs/best-practices/request
Creating device profiles for the API
Given the variety of screen sizes and resolutions on desktops, tablets and smartphones, it helps to define the profiles you will support. For each profile, you can provide different images, data and files suited to specific devices. On the client, this is done using media queries.
The more profiles you have, the better the experience on each particular device. But it also becomes harder to maintain all the supported features and scenarios (because devices are constantly changing and evolving). So it is better to support only as many profiles as you really need. For the trade-offs and opportunities involved in creating a good experience, I recommend reading this article:
https://mobiforge.com/design-development/effective-design-multiple-screen-sizes
For most applications, three profiles are enough:
- Mobile phones: fewer images, touch input, low network bandwidth.
- Tablets: more images, but still adapted for low network bandwidth, touch input, more data transferred per request.
- Desktop computers: the images used for tablets plus higher-resolution images for Wi-Fi and desktop browsers.
The desired profile can be selected on the client. The server-side APIs should be designed to accept these profiles and send different information depending on which device made the request. For example, smaller images may be sent, the number of results may be reduced, or CSS and JavaScript may be inlined.
For example, if one of your APIs returns search results, the profiles might look like this:
/products?limit=25&offset=0
The default profile (desktop) is used: a standard page is returned that requests each image separately, so subsequent views can be loaded from the cache.
/products?profile=mobile&limit=10&offset=0
10 results are returned; low-resolution images are sent as data URIs in a single HTTP request.
/products?profile=tablet&limit=25&offset=0
25 results are returned; images that are low-resolution but larger in size are sent as data URIs in a single HTTP request.
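On the server, profile handling can be as simple as a lookup table. The Python sketch below is one possible shape, not a prescribed design: the setting names (inline_images, image_size) and the fallback to desktop for unknown profiles are assumptions.

```python
from typing import Any, Dict

# Assumed per-profile settings; the keys and values are illustrative.
PROFILES: Dict[str, Dict[str, Any]] = {
    "desktop": {"inline_images": False, "image_size": "full"},
    "mobile": {"inline_images": True, "image_size": "small"},
    "tablet": {"inline_images": True, "image_size": "medium"},
}


def resolve_profile(params: Dict[str, str]) -> Dict[str, Any]:
    """Pick response settings from the `profile` query parameter.

    Missing or unknown profiles fall back to the desktop default.
    """
    name = params.get("profile", "desktop")
    if name not in PROFILES:
        name = "desktop"
    settings = dict(PROFILES[name])
    settings["profile"] = name
    return settings
```

The handler for /products would call resolve_profile once and then let the settings decide whether images are inlined as data URIs or referenced by URL.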
You can even create profiles for devices such as feature phones. Unlike smartphones, they can cache files only on a per-page basis. So for such clients it is better to use profiles than to send CSS and JavaScript with every request.
Profiles are recommended over partial responses when server responses differ greatly from one profile to another, for example, when one response contains inlined data URI images and a compact layout and another does not. Profiles can of course also be expressed through partial responses, but partial responses are normally used to select a part of a standard schema (for example, a subset of a larger taxonomy) rather than an entirely different set of data, formats, and so on.
Finally
There are many ways to make the web faster, including on mobile devices. I hope this article serves as a useful guide for API developers who design backends for mobile clients.