Optimizing for the new pricing

App Engine is heading at full speed toward a bright future and a new way of accounting for resources. Panic and chaos have spread through the ranks of App Engine developers: too much is being kept strictly secret (or perhaps the company itself does not really know what to do). But today we will not discuss whether the blobstore will be covered by the free quotas or whether any access to it will have to be paid for. We will not discuss where to move if the new prices turn out to be unaffordable, and we will not talk about the $50 "freebie" with which Google is trying to soften the transition (especially since that good news has already been sent to all application administrators).

Today we will talk about application optimization. Have you already optimized your application for minimal CPU and memory consumption? Forget about it: that is now secondary, and you will be billed by other measures.

Introduction.

As part of the transition to the new pricing, we have updated the set of resources included in the application usage report. We are moving away from CPU time and toward a system that accounts for instance hours (frontend and backend), the number of API calls, the amount of stored data, and traffic. More information is in our FAQ.
Before launching the new pricing model, we released comparative bills so that you can see how it will affect your costs. You can start optimizing your application before the price changes take effect and see how the optimizations affect your bill.

In this article, we show how to interpret the new usage report and describe some strategies for managing resources, along with how they can affect your application's performance.

Reading the old and new reports.

Daily usage reports are available in the control panel on the Billing History page at https://appengine.google.com/billing/history?&app_id=$APP_ID . Clicking the [+] icon next to a report expands the details for that day, where you can see both the new and the old resources.

We will go through the line items in the bill and explain what they mean, review some resource management strategies, and explain how they affect application performance.

Instance management.

The first two lines of the new bill relate to your application's use of instances. You can read about instances in our documentation. You can see the number of instances used by the application in the control panel at https://appengine.google.com/instances?&app_id=$APP_ID or by selecting the Instances graph in the drop-down list at https://appengine.google.com/dashboard?&app_id=$APP_ID .

App Engine Scheduler.

App Engine uses a scheduling algorithm to determine how many instances the application needs to serve its traffic. For each request your application receives, we decide whether to hand it to an available instance (one that is idle or that accepts concurrent requests), put it in a pending queue, or start a new instance for it. We make this decision based on the available instances, how quickly the application responds to requests (its latency), and how long it takes to start and initialize a new instance before it can serve requests. In most cases, when we expect that starting a new instance will serve the request faster, we start a new instance.
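
The decision described above can be sketched as a toy model. This is an illustration only, not App Engine's actual algorithm: the function name, its parameters, and the simple wait-time estimate are all assumptions made for the example.

```python
def route_request(idle_instances, avg_latency_s, queue_length, startup_time_s):
    """Toy model of the scheduler's choice for one incoming request.

    All names and the estimate below are hypothetical; the real scheduler
    is not public and is certainly more sophisticated.
    """
    # An available instance can take the request immediately.
    if idle_instances > 0:
        return "existing instance"

    # Rough estimate of the wait in the pending queue: every request ahead
    # of this one, plus the one being served, takes avg_latency_s.
    expected_wait_s = (queue_length + 1) * avg_latency_s

    # Start a new instance only if that is expected to be faster.
    if startup_time_s < expected_wait_s:
        return "new instance"
    return "pending queue"


print(route_request(idle_instances=1, avg_latency_s=0.2, queue_length=0, startup_time_s=3.0))
print(route_request(idle_instances=0, avg_latency_s=0.2, queue_length=2, startup_time_s=3.0))
print(route_request(idle_instances=0, avg_latency_s=2.0, queue_length=4, startup_time_s=3.0))
```

In this model a fast application (0.2 s per request) waits in the queue, while a slow one (2 s per request) triggers a new instance. This is the intuition behind the advice that lower latency means fewer instances.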

Of course, application traffic is uneven, so the scheduler also keeps track of the number of idle instances of your application. These instances are useful for absorbing traffic spikes without user-visible delays. If the scheduler determines that the application has too many idle instances, it reclaims resources by stopping one of them.

Strategies to reduce the number of instances used.

Reduce latency.

Your application's latency has a big impact on the number of instances needed to serve its traffic, so reducing latency can greatly reduce the number of instances required. Here is a list of actions that reduce latency:

Cache frequently used shared data more. In other words: use memcache. If you also set Cache-Control headers on your application's responses, you can greatly increase how much is cached by proxy servers and browsers. Even caching for a few seconds can noticeably improve the application's efficiency. Python applications should also make use of caching in the runtime.
Use memcache more efficiently: use batch calls instead of a series of single calls.
Use tasks (Task Queue) for work that is not tied to the request. If your application does work that can be done outside of a user-facing request, put it in a task. Sending this work to the Task Queue instead of waiting for it to finish before returning a response can significantly reduce user-visible latency. The Task Queue also gives you much more control over execution rates and helps spread the load more evenly.
Use the datastore more efficiently. We go into the details below.
Issue multiple URL Fetch calls in parallel:
- Use asynchronous API calls (Java, Python).
- Batch together multiple URL Fetch calls (which you might otherwise handle one at a time inside individual user requests) and process them in an offline task, in parallel, using asynchronous URL Fetch.
Write HTTP sessions asynchronously in Java. You can configure your application to write HTTP session data to the datastore asynchronously by adding <async-session-persistence enabled="true"/> to the appengine-web.xml file. Session data is always written synchronously to memcache, and if a request tries to read session data that is not in memcache, it falls back to the datastore, which may not yet have the latest update. This means there is a small risk that the application will see stale session data, but for most applications the latency gain far outweighs this risk.
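
To illustrate the first item in the list above, here is a minimal sketch of short-lived caching using only the Python standard library. It is an in-process stand-in, not the App Engine memcache API; `TTLCache`, `load_front_page`, and the 5-second lifetime are hypothetical names and values chosen for the example.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock      # injectable so the cache can be tested
        self._entries = {}      # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self.ttl_s, value)
        return value


calls = 0


def load_front_page():
    """Hypothetical expensive operation, e.g. several datastore queries."""
    global calls
    calls += 1
    return "<html>front page</html>"


cache = TTLCache(ttl_s=5)
for _ in range(1000):           # 1000 requests arriving within the 5 seconds
    cache.get_or_compute("front_page", load_front_page)
print(calls)                    # number of times the expensive call actually ran
```

As long as the requests arrive within the cache lifetime, the expensive function runs once rather than once per request, which is why even a few seconds of caching can pay off on a busy page.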

Adjust the scheduler manually.

The Application Settings page in the control panel has two sliders that set variables the scheduler uses to manage your application's instances. Here is a brief explanation of how to use them to find a trade-off between performance and resource use:

Reduce the maximum number of idle instances. The Max Idle Instances setting lets you control the maximum number of idle instances of the application. Setting this limit tells App Engine to stop any idle instances beyond it, so that they do not consume additional quota or incur additional costs. However, fewer idle instances also means that the App Engine scheduler will have to start new instances during a traffic spike, which may increase user-visible latency.
Increase the minimum pending latency. Raising Min Pending Latency tells the App Engine scheduler not to start a new instance until a request has been waiting in the pending queue for longer than the specified time. If all instances are busy, user-facing requests may have to wait in the queue until this threshold is reached. A high value for this setting means fewer instances are started, but can result in high user-visible latency under increased load.
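
One simplified reading of these two settings can be written down as code. This is a sketch under assumptions, not the scheduler's real logic: both function names and their parameters are hypothetical.

```python
def instances_to_stop(idle_instances, max_idle_instances):
    """Idle instances beyond the Max Idle Instances limit get stopped."""
    return max(0, idle_instances - max_idle_instances)


def should_start_instance(pending_time_s, min_pending_latency_s, all_busy=True):
    """A new instance is started only once a request has waited longer than
    Min Pending Latency while every existing instance is busy."""
    return all_busy and pending_time_s > min_pending_latency_s


# Lowering Max Idle Instances from 5 to 2 stops more of 6 idle instances.
print(instances_to_stop(idle_instances=6, max_idle_instances=5))
print(instances_to_stop(idle_instances=6, max_idle_instances=2))

# A request that has waited 0.3 s starts a new instance with a 0.1 s
# threshold, but keeps waiting with a 1.0 s threshold.
print(should_start_instance(pending_time_s=0.3, min_pending_latency_s=0.1))
print(should_start_instance(pending_time_s=0.3, min_pending_latency_s=1.0))
```

Both settings trade money for responsiveness: the cheaper choice in each pair is also the one that leaves requests waiting longer during a spike.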

Enable concurrent requests in Java.

In release 1.4.3, we introduced the ability for Java application instances to handle multiple requests concurrently. Enabling this option reduces the number of instances required, but your application must be thread-safe for it to work correctly. You can read more about concurrent requests in the Java documentation.
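
What "thread-safe" means in practice can be sketched as follows. The example uses Python threads purely for consistency with the other sketches in this article (the feature described above is for Java), and `HitCounter` is a hypothetical class, not part of any App Engine API.

```python
import threading


class HitCounter:
    """Shared per-instance state that several request threads update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = 0

    def record_hit(self):
        # The read-modify-write below is not atomic, so it is guarded by a
        # lock; without it, concurrent requests could lose updates.
        with self._lock:
            self._hits += 1

    @property
    def hits(self):
        with self._lock:
            return self._hits


counter = HitCounter()


def handle_requests(n):
    """Simulates one thread serving n requests."""
    for _ in range(n):
        counter.record_hit()


threads = [threading.Thread(target=handle_requests, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.hits)  # 80000
```

Any state shared between requests on one instance (caches, counters, lazily initialized singletons) needs this kind of protection before concurrent requests are enabled.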

Note: Multithreading for Python will not be available until Python 2.7 support launches, which is on our roadmap. With Python 2.7, multithreaded instances will be able to handle more requests and should not consume instance hours idling while they wait for blocking API calls to return. Because Python does not currently support serving more than one request at a time per instance, and to give all developers time to adapt to concurrent requests, we will give a 50% discount on frontend instance hours until November 20, 2011. Python 2.7 is currently in closed testing.

Manage application storage.

App Engine calculates the storage cost based on the size of the entities in the datastore, the size of the indexes needed to serve them, and the amount of data in the blobstore.
Here are some things you can do to make sure you are not storing more index data than necessary:

Use the Get Indexes feature (Java, Python) to check which indexes are defined for your application. Removing any indexes the application does not need saves on data storage and reduces the cost of entity writes. You can also see the indexes currently serving your application in the control panel at https://appengine.google.com/datastore/indexes?&app_id=$APP_ID .
When designing your data models, see whether you can write your queries in a way that reduces the total number of indexes. Read our query and index documentation for more information on how App Engine creates indexes.

Manage datastore usage.

In the new model, we account for the number of operations performed in the datastore (instead of the CPU resources used, as is done now). Here are several strategies that can reduce datastore resource consumption and also lower the latency of datastore requests:

Rework your data model so that queries can be replaced with gets by key, which are cheaper and more efficient.
Use keys-only queries instead of queries for full entities where possible.
To reduce latency, replace multiple single get() calls with one batch get().
For pagination, use datastore cursors instead of offsets.
Parallelize multiple datastore requests using the asynchronous datastore API (Java, Python).
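
Why cursors beat offsets for pagination can be shown with a small in-memory model. This is a sketch under assumptions: `FakeDatastore` is a hypothetical stand-in, the cursor is a plain integer (real datastore cursors are opaque strings), and "entities scanned" is only a rough proxy for the work the real datastore does.

```python
class FakeDatastore:
    """In-memory stand-in that counts how many entities each query touches."""

    def __init__(self, n):
        self.entities = ["entity-%04d" % i for i in range(n)]
        self.entities_scanned = 0

    def query_with_offset(self, offset, limit):
        # Skipped entities still have to be walked before results are returned.
        end = min(offset + limit, len(self.entities))
        self.entities_scanned += end
        return self.entities[offset:end]

    def query_with_cursor(self, cursor, limit):
        # A cursor marks where the previous page ended, so the scan resumes there.
        end = min(cursor + limit, len(self.entities))
        self.entities_scanned += end - cursor
        return self.entities[cursor:end], end   # results and the next cursor


def page_through_with_offset(ds, page_size):
    offset = 0
    while True:
        page = ds.query_with_offset(offset, page_size)
        if not page:
            return ds.entities_scanned
        offset += len(page)


def page_through_with_cursor(ds, page_size):
    cursor = 0
    while True:
        page, cursor = ds.query_with_cursor(cursor, page_size)
        if not page:
            return ds.entities_scanned


print(page_through_with_offset(FakeDatastore(100), page_size=20))  # 400
print(page_through_with_cursor(FakeDatastore(100), page_size=20))  # 100
```

In this model the offset version re-walks everything before each page, so the work grows roughly quadratically with the number of pages, while the cursor version touches each entity once.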

Traffic management.

The main way to reduce outgoing traffic is to set an appropriate Cache-Control header on your responses wherever possible and to set a reasonable expiration time for static files (Java, Python). Using the Cache-Control: public header allows proxy servers and user browsers to cache responses for the designated time.
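
For dynamic responses, the headers described above could be built like this. It is a minimal sketch using only the Python standard library; `caching_headers` is a hypothetical helper, not an App Engine API, and how the pairs are attached to a response depends on the web framework you use.

```python
import time
from email.utils import formatdate


def caching_headers(max_age_s, now=None):
    """Return (name, value) header pairs that let proxies and browsers
    cache a response for max_age_s seconds."""
    if now is None:
        now = time.time()
    return [
        ("Cache-Control", "public, max-age=%d" % max_age_s),
        # Expires is the older HTTP/1.0 equivalent, given as an HTTP date.
        ("Expires", formatdate(now + max_age_s, usegmt=True)),
    ]


# A fixed "now" of 0 (the Unix epoch) keeps the output reproducible.
for name, value in caching_headers(3600, now=0):
    print("%s: %s" % (name, value))
# Cache-Control: public, max-age=3600
# Expires: Thu, 01 Jan 1970 01:00:00 GMT
```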

Incoming traffic is harder to control, since it is the amount of data your users send to the application. However, this is a good opportunity to mention the DoS protection service (Java, Python), which lets you block traffic from unwanted IP addresses.

Manage other resources.

The last items in the report cover usage of the Email, XMPP, and Channel APIs. For these APIs, the best approach is to make sure you use them efficiently. One of the best ways to audit API usage is Appstats (Python, Java), which helps you confirm that the application is not making unnecessary calls. It is also always a good idea to check your error rates and look for any calls that may be failing. In some cases such calls can be caught in advance.

Source: https://habr.com/ru/post/127481/
