
App Engine API under the hood

With this post I want to open a series of translations from Nick Johnson's blog. Nick publishes extremely useful articles on GAE, shares his experience, and runs unusual experiments. I hope you find these materials useful.

If you use App Engine only for simple applications, you can safely skip this article. If you are interested in low-level optimizations, or you want to write a library that works with the innermost components of App Engine, read on!

Common API


Ultimately, every API call passes through one common interface that takes four arguments: the service name (for example, 'datastore_v3' or 'memcache'), the method name (for example, 'Get' or 'RunQuery'), a request, and a response. The request and response are protocol buffers, a binary format that Google uses widely to exchange structured data between processes. The specific protocol buffer types for the request and response depend on the method being called. When an API call is made, the request protocol buffer is filled in with the data for the call, while the response protocol buffer starts out empty and is later filled with the data that the API call returns.

API calls are made by passing the four arguments described above to a 'dispatch' function. In Python, the apiproxy_stub_map module plays this role. It maintains the mapping between a service name, the first of the four arguments, and the stub that handles that service. In the SDK, the mapping is populated with local stubs: modules that mimic the behavior of the APIs. In production, interfaces to the real APIs are registered with this module when the application starts, that is, before the application code is loaded. A program that makes API calls never needs to care how the APIs themselves are implemented; it does not know whether a call is handled locally or serialized and sent to another machine.
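
The dispatch idea can be sketched in a few lines. This is only a toy model under stated assumptions: StubMap, LocalMemcacheStub, and their methods are hypothetical names, not the SDK's classes, and plain dicts stand in for the request and response protocol buffers.

```python
class StubMap(object):
    """Toy registry that maps a service name to the stub handling it."""

    def __init__(self):
        self._stubs = {}

    def register_stub(self, service, stub):
        self._stubs[service] = stub

    def make_sync_call(self, service, method, request, response):
        # The caller only names the service; it never sees how the stub works.
        stub = self._stubs.get(service)
        if stub is None:
            raise KeyError('no stub registered for service %r' % service)
        stub.make_sync_call(service, method, request, response)


class LocalMemcacheStub(object):
    """In-memory stand-in for a cache service, in the spirit of the SDK's local stubs."""

    def __init__(self):
        self._data = {}

    def make_sync_call(self, service, method, request, response):
        if method == 'Set':
            self._data[request['key']] = request['value']
        elif method == 'Get':
            # The response starts empty and is filled in by the stub.
            if request['key'] in self._data:
                response['value'] = self._data[request['key']]
        else:
            raise ValueError('unknown method %r' % method)


stub_map = StubMap()
stub_map.register_stub('memcache', LocalMemcacheStub())

stub_map.make_sync_call('memcache', 'Set', {'key': 'a', 'value': 1}, {})
get_response = {}
stub_map.make_sync_call('memcache', 'Get', {'key': 'a'}, get_response)
print(get_response)
```

Swapping the local stub for one that forwards calls over the network would not change the calling code at all, which is the point of the indirection.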

Once the dispatch function finds the stub for the API being called, it passes the call to it. What happens next depends entirely on the API and the environment, but in production it generally goes as follows: the request protocol buffer is serialized into binary data, which is sent to the server(s) responsible for that API. For example, datastore calls are serialized and sent to the datastore service. The service deserializes the request, executes it, builds a response object, serializes it, and sends it back to the stub that made the call. Finally, the stub deserializes the reply into the response protocol buffer and returns.
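
The round trip described above might look roughly like this. It is a sketch, not the production mechanism: JSON stands in for the protocol buffer encoding, a plain function call stands in for the network, and RemoteStub, datastore_service, and the stored data are all made up for illustration.

```python
import json


def datastore_service(method, request_bytes):
    """Pretend remote service: deserialize, execute, serialize the response."""
    request = json.loads(request_bytes.decode('utf-8'))
    if method != 'Get':
        raise ValueError('unsupported method %r' % method)
    entities = {'test': {'test': 'foo'}}  # made-up stored data
    found = entities.get(request['name'])
    return json.dumps({'entity': found}).encode('utf-8')


class RemoteStub(object):
    """Stub that only serializes the request, forwards it, and decodes the reply."""

    def __init__(self, transport):
        # transport is any callable taking (method, bytes) and returning bytes.
        self._transport = transport

    def make_sync_call(self, service, method, request, response):
        request_bytes = json.dumps(request).encode('utf-8')
        response_bytes = self._transport(method, request_bytes)
        # Fill the caller-supplied response object in place.
        response.update(json.loads(response_bytes.decode('utf-8')))


stub = RemoteStub(datastore_service)
response = {}
stub.make_sync_call('datastore_v3', 'Get',
                    {'kind': 'TestKind', 'name': 'test'}, response)
print(response)
```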

You may be wondering why every API call needs to be handed a response protocol buffer. The reason is that the protocol buffer format is not self-describing: the encoded data does not say what type of message it contains, and you are expected to know the structure of the message you are about to receive. So the caller has to supply a “container” that knows how to deserialize the response.

Let us see how this all works in practice by running a low-level datastore query that fetches an entity by its key name:
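
A small analogy shows why the receiver must already know the message structure. The sketch below uses Python's struct module with fixed binary layouts rather than real protocol buffers, whose encoding differs in detail, but the problem is the same in spirit: the bytes alone do not say which layout is the right one.

```python
import struct

# The sender packs two 32-bit little-endian integers.
payload = struct.pack('<ii', 1, 2)

# A receiver that knows the layout recovers the original values.
as_two_ints = struct.unpack('<ii', payload)

# A receiver that assumes a different layout decodes the same 8 bytes
# without any error, but gets something entirely different.
as_one_long = struct.unpack('<q', payload)

print(as_two_ints)
print(as_one_long)
```

Both decodings succeed, and nothing in the payload says which one was intended. Passing in a response object of the correct type settles that question before the bytes arrive.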
    import os

    from google.appengine.datastore import datastore_pb
    from google.appengine.api import apiproxy_stub_map


    def do_get():
        # Build the request and add one key to fetch.
        request = datastore_pb.GetRequest()
        key = request.add_key()
        key.set_app(os.environ['APPLICATION_ID'])
        # The key path has a single element: kind 'TestKind', key name 'test'.
        pathel = key.mutable_path().add_element()
        pathel.set_type('TestKind')
        pathel.set_name('test')
        # The empty response is filled in by the call.
        response = datastore_pb.GetResponse()
        apiproxy_stub_map.MakeSyncCall('datastore_v3', 'Get', request, response)
        return str(response)

Quite verbose, right? Especially compared with the equivalent high-level call, TestKind.get_by_key_name('test')! Still, the sequence of steps should be clear: create the request and response protocol buffers, fill the request with the relevant information (here, the entity kind and the key name), then call apiproxy_stub_map.MakeSyncCall to perform the remote procedure call (RPC). When the call completes, the response has been filled in, as its string representation shows:
    Entity {
      entity <
        key <
          app: "deferredtest"
          path <
            Element {
              type: "TestKind"
              name: "test"
            }
          >
        >
        entity_group <
          Element {
            type: "TestKind"
            name: "test"
          }
        >
        property <
          name: "test"
          value <
            stringValue: "foo"
          >
          multiple: false
        >
      >
    }

Every remote call to every API follows the same pattern internally; only the contents of the request and response objects differ.

Asynchronous calls


The process described above is a synchronous API call: we wait for the response before doing anything else. But the App Engine platform also supports asynchronous API calls. With an asynchronous call, we send the call to the stub, which returns immediately without waiting for the response. We can then ask for the response later (waiting for it if necessary), or set a callback function that is invoked automatically when the response arrives.
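
The two ways of consuming an asynchronous result, asking for it later or registering a callback, can be illustrated with Python's standard concurrent.futures module. This is a generic sketch, not App Engine code: slow_api_call is a made-up stand-in for a remote call, and a thread pool plays the part of the stub.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_api_call(value):
    """Made-up stand-in for a remote call that takes a while."""
    time.sleep(0.1)
    return value * 2


executor = ThreadPoolExecutor(max_workers=2)
callback_results = []

# Option 1: start the call, do other work, and ask for the result later.
future = executor.submit(slow_api_call, 21)
other_work = sum(range(10))  # runs while the call is in flight
result = future.result()     # blocks only if the call has not finished yet

# Option 2: register a callback that runs when the call completes.
future_with_callback = executor.submit(slow_api_call, 5)
future_with_callback.add_done_callback(
    lambda done: callback_results.append(done.result()))

# Wait for outstanding work so the callback has run before we print.
executor.shutdown(wait=True)
print(result, callback_results)
```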

At the time of writing, only some APIs support asynchronous calls, most notably the URL Fetch API, where they are extremely useful for retrieving several web resources in parallel. Asynchronous APIs work on the same principle as ordinary ones; it is simply a matter of whether the library implements asynchronous calls. APIs like urlfetch have been adapted for asynchronous use, but other, more complex APIs are much harder to make asynchronous.

Consider an example of how to convert a synchronous call into an asynchronous one. The differences from the previous example are the additional datastore import, the function name, and the last three lines:
    import os

    from google.appengine.datastore import datastore_pb
    from google.appengine.api import apiproxy_stub_map
    from google.appengine.api import datastore


    def do_async_get():
        request = datastore_pb.GetRequest()
        key = request.add_key()
        key.set_app(os.environ['APPLICATION_ID'])
        pathel = key.mutable_path().add_element()
        pathel.set_type('TestKind')
        pathel.set_name('test')
        response = datastore_pb.GetResponse()
        # Start the call without waiting for it to finish.
        rpc = datastore.CreateRPC()
        rpc.make_call('Get', request, response)
        return rpc, response


The differences are that we create an RPC object for this particular datastore call and invoke its make_call() method instead of MakeSyncCall(). We then return both the RPC object and the response protocol buffer.

Since the call is asynchronous, it has not necessarily completed by the time we return the RPC object. There are several ways to handle an asynchronous response. For example, you can pass a callback function to CreateRPC(), or call the RPC object's check_success() method to wait for the call to complete. We demonstrate the latter, as it is easier to implement. Here is a simple example that uses our function:
    # Inside a request handler method:
    TestKind(key_name='test', test='foo').put()
    self.response.headers['Content-Type'] = 'text/plain'
    rpc, response = do_async_get()
    self.response.out.write("RPC status is %s\n" % rpc.state)
    # Block until the call has completed.
    rpc.check_success()
    self.response.out.write("RPC status is %s\n" % rpc.state)
    self.response.out.write(str(response))

Output:
    RPC status is 1
    RPC status is 2
    Entity {
      entity <
        key <
          app: "deferredtest"
          path <
            Element {
              type: "TestKind"
              name: "test"
            }
          >
        >
        entity_group <
          Element {
            type: "TestKind"
            name: "test"
          }
        >
        property <
          name: "test"
          value <
            stringValue: "foo"
          >
          multiple: false
        >
      >
    }

The state constants are defined in the google.appengine.api.apiproxy_rpc module. In our case, 1 means “in progress” and 2 means “finished”, which shows that the RPC really was executed asynchronously! The actual result of the query is, of course, the same as for the regular synchronous call.

Now that you know how RPC works at the low level and how to make asynchronous calls, your options as a programmer have expanded considerably. Who will be the first to write a new, Twisted-style asynchronous interface to the App Engine APIs?
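
To make the state transitions concrete, here is a toy RPC object that behaves the same way from the caller's point of view. It is a sketch under assumptions, not the SDK's implementation: ToyRPC and fake_get are hypothetical, the constant names are illustrative, the value 0 for the idle state is an assumption, and a thread stands in for the real transport. Only the meanings of 1 and 2 come from the output above.

```python
import threading


class ToyRPC(object):
    IDLE = 0       # assumed value for "not started yet"
    RUNNING = 1    # "in progress", as seen before check_success()
    FINISHING = 2  # "finished", as seen after check_success()

    def __init__(self):
        self.state = ToyRPC.IDLE
        self._thread = None
        self._error = None

    def make_call(self, func, request, response):
        """Start the call in the background and return immediately."""
        self.state = ToyRPC.RUNNING

        def run():
            try:
                func(request, response)
            except Exception as exc:  # remembered and re-raised in check_success
                self._error = exc

        self._thread = threading.Thread(target=run)
        self._thread.start()

    def check_success(self):
        """Wait for the call to complete, then raise if it failed."""
        self._thread.join()
        self.state = ToyRPC.FINISHING
        if self._error is not None:
            raise self._error


def fake_get(request, response):
    # Made-up handler that fills the caller-supplied response in place.
    response['name'] = request['name']


rpc = ToyRPC()
response = {}
rpc.make_call(fake_get, {'name': 'test'}, response)
state_before = rpc.state
rpc.check_success()
state_after = rpc.state
print(state_before, state_after, response)
```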

Source: https://habr.com/ru/post/110886/

