Disclaimer: This article is not about "how smart I am and how stupid Google is." It is about some non-obvious problems and features of Google AppEngine (GAE) that may be useful to know about for anyone who wants to start working with the "evil empire" :-)

Google has done a lot of great things - search, mail without spam... Google collects a pile of our private data, but we keep using its services because they work so well...
For some time there has been quite a bit of noise in IT circles about AppEngine, so I decided to try it out in my new project.
I chose Python with the Google framework to get the best compatibility and speed. I started with performance tests, and the results were ...
somewhat disappointing.
Test | Requests per second
print 'Hello world' | 260
1 read from the Datastore, 1 write to the Datastore | 38
1 read from the Datastore | 60
10 reads from the Datastore, 1 write | 20
1 read from memcached, 1 write to memcached | 80
1 read from memcached | 120
Regular LAMP application, 6 SQL queries, http://3.14.by/ | 240 on an Atom 330 server, 198 on $7 shared hosting
The test was run with 20 parallel requests from two different servers on the same continent; the figures are averages over 7 seconds of execution (later I also hammered it for 10 minutes at a time). Some may say: "Hey! The results are not that bad, my [insert URL here] can only handle 2 requests per second, so even 20-38 is fine." I would answer that this is the simplest possible application, a couple of lines long; a real application will need 5-25 data requests per page. And by that logic, classic LAMP web applications that do not show 100 requests per second should be sent straight to the dump.
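For reference, the kind of client used for such numbers can be a few lines of Python. This is only a sketch reproducing the setup described above (20 parallel workers, 7 seconds, failed requests simply not counted); the URL is the test application below:

import threading, time, urllib2

URL = 'http://mafiazone-dev.appspot.com/'
WORKERS = 20
DURATION = 7  # seconds

counts = [0] * WORKERS

def worker(i, deadline):
    # Hammer the URL until the deadline, counting successful responses.
    while time.time() < deadline:
        try:
            urllib2.urlopen(URL).read()
            counts[i] += 1
        except urllib2.URLError:
            pass  # failed requests are simply not counted

deadline = time.time() + DURATION
threads = [threading.Thread(target=worker, args=(i, deadline)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print 'Requests per second: %.1f' % (sum(counts) / float(DURATION))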
In addition, the test with 10 reads and 1 write started failing with 'Error: Server Error' when more than 10 parallel requests were used (the internal error was 'too much contention on these datastore entities. Please try again') - one example of an unexpected exception appearing out of nowhere.
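For the record, this contention error is exactly the case Google's documentation addresses with the "sharded counter" pattern: spread the increments across several entities and sum them on read. A rough sketch of the idea (CounterShard, NUM_SHARDS and the helper names are illustrative, not part of the test code):

import random
from google.appengine.ext import db

NUM_SHARDS = 20

class CounterShard(db.Model):
    nick = db.StringProperty()
    count = db.IntegerProperty(default=0)

def increment(nick):
    # Pick a random shard and increment it inside a transaction,
    # so concurrent writes rarely touch the same entity.
    def txn():
        index = random.randint(0, NUM_SHARDS - 1)
        shard_key_name = '%s-%d' % (nick, index)
        shard = CounterShard.get_by_key_name(shard_key_name)
        if shard is None:
            shard = CounterShard(key_name=shard_key_name, nick=nick)
        shard.count += 1
        shard.put()
    db.run_in_transaction(txn)

def get_count(nick):
    # The total is the sum of all shards for this nick.
    return sum(s.count for s in CounterShard.gql("WHERE nick = :1", nick))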
Scalability
I expected that at some point GAE would spread my application across more than one server (at least that is how it should work in theory). However, after 10 minutes of stress testing and spending 10% of the daily CPU quota, the speed remained exactly the same. Apparently GAE does not scale applications that quickly.
Sources
The source is quite simple:
from google.appengine.ext import db

class Counter(db.Model):
    nick = db.StringProperty()
    count = db.IntegerProperty()

res = Counter.gql("WHERE nick = 'test3'")

print 'Content-Type: text/html'
print ''
print '<html><body><h1>This is a datastore performance test</h1>'
print '<h2>It reads a counter and its value in the datastore</h2>'
for v in res:
    v.count = v.count + 1
    print 'New counter value:', v.count
    # v.put()  # enabled for the read+write tests
print '</body></html>'
The application is deployed here:
http://mafiazone-dev.appspot.com/ .
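For reference, the deployment descriptor for such a CGI script looks roughly like this - a sketch only, assuming the code above is saved as main.py and with the application id taken from the URL above:

application: mafiazone-dev
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: main.py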
In general, everything really works as Google promises - the speed does not depend on scale: slow at small scale, slow at large scale :-) Even a single request that touches the Datastore or makes several Memcached calls takes a lot of time. And if you need to make several such requests per page, your chances of seeing a timeout exception increase dramatically.
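For illustration, one way to cut repeated Datastore reads per page is a read-through cache over the counter from the source above. The helper get_counter() and the 60-second TTL are just illustrative choices, not something from the original test:

from google.appengine.api import memcache

def get_counter(nick):
    # Try memcache first; fall back to the Datastore and cache the result.
    value = memcache.get('counter:' + nick)
    if value is None:
        res = Counter.gql("WHERE nick = :1", nick).get()
        value = res.count if res else 0
        memcache.set('counter:' + nick, value, time=60)  # cache for 60 seconds
    return value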
Regular LAMP applications (like my home page) can easily serve 10,000,000 hits per day, and after tighter optimization even 30,000,000 (an average of 500 hits per second over the 8-10 peak hours). Do many projects need even 10% of that load? And what if 0.01% of those requests fail because of some exception that you could not, or did not manage to, handle?
A general list of problems, in my opinion
If you are going to work with Google AppEngine, you need to keep in mind the following:
- Any call to the Datastore may randomly fail with a small probability. Google says the chance has dropped from 0.4% to 0.1%, but it is still there. The Datastore was not designed as a classic database, so a response within the expected time is not guaranteed. And you cannot handle every possible exception, since all of that handling costs CPU time, which is limited (a rough retry sketch follows after this list).
- Memcached is not the memcached you are used to. Here it is quite slow (hundreds of operations per second instead of the usual tens of thousands).
- You will have to find somewhere else to host static content. You cannot serve large files from GAE, and besides, it is not very fast and quite expensive.
- There are reports that URLFetch is not reliable enough. I think this will be fixed over time :-)
- You cannot choose a datacenter. For example, if you live in Europe and GAE launches your application in the US, users will feel that everything is somewhat slow, and you cannot move the application to Europe. It may migrate on its own, but nobody knows when; mine never moved despite all my benchmarks.
- Google can serve an unlimited number of requests, but each request takes at least 100-200 ms, so your application will never "fly". And do not forget that you need to write extra code to handle new, unexpected exceptions.
- Google assures that the free limits are enough for several million requests. However, in my simplest examples I spent 0.65 CPU-hours per 15k requests, which at that rate means the free CPU quota covers only about 150k hits per day. Accordingly, the paid service is somewhat more expensive than you might imagine. Those millions are probably achievable only on 100% static pages.
- Google may talk a lot about this being an open-source solution, but it is not. The key technologies are closed, and the SDK test server seems intentionally made super slow (only 2 requests per second on a 3.6 GHz Core 2 Duo). So there is no competition: either put up with the service, or rewrite everything from scratch.
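As an illustration of the extra boilerplate mentioned in the first item above, here is a rough sketch of retrying a Datastore call on transient errors; the retry count and delay are arbitrary, and with_retries() is a hypothetical helper:

import time
from google.appengine.ext import db

def with_retries(fn, retries=3, delay=0.05):
    # Run fn(), retrying a few times on transient Datastore errors.
    for attempt in range(retries):
        try:
            return fn()
        except (db.Timeout, db.TransactionFailedError, db.InternalError):
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# usage: counter = with_retries(lambda: Counter.gql("WHERE nick = 'test3'").get())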
What I would like to see done differently in GAE:
- Much more deterministic behavior. If I go over the limits, I would rather get a warning by mail than an exception that I potentially will not have time to handle.
- Datastore & Memcached performance is much better. Probably memcached is common here, and overwhelmed with requests, and therefore slowly ...
- Choice of data center
- A cluster-aware API. It would be nice to have a small piece of very fast memory on the local server, with "initialize storage" and "release storage" events (a rough sketch of the idea follows below).
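As a rough illustration of that last wish: module-level globals on a warm GAE instance already survive between requests, so they can act as a tiny per-instance cache - without, however, any "initialize/release storage" events like those asked for above. local_get() is a hypothetical helper:

import time

_local_cache = {}  # lives as long as this instance stays warm

def local_get(key, loader, ttl=30):
    # Return a cached value if it is still fresh, otherwise recompute it.
    entry = _local_cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    value = loader()
    _local_cache[key] = (value, time.time() + ttl)
    return value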
More thoughts
Some time ago I worked with a very cool technology - everything was redundant and super-reliable, "cloudy", with a convenient API, but rendering a forum page with it took 4 seconds on a 4-processor server. It was complete crap. If a technology is slow or uncomfortable for the user, that technology is junk. Using technology just because "it's cool" is simply stupid.
Where Google App Engine can be used and where not
GAE is excellent for simple, read-mostly applications (i.e., without complex database logic and with small amounts of data) WITHOUT peak loads (since it may not be able to scale on the fly). Your homepage will fit perfectly on GAE with zero maintenance costs.
It is not suitable for complex applications with a lot of logic, a lot of data on the screen, and the occasional Slashdot/Habr effect. Google AppEngine may not be able to handle an unexpected peak of several hundred requests per second.
Conclusion
Does all this mean that the programmers and architects at Google are stupid? Not at all; in fact, it would be hard to do better under such constraints. It really is difficult to achieve unlimited scalability for applications that were not designed specifically for a cluster. As a result, most applications will not be able to work properly on GAE.
But if your application gets along well with the features and limitations of GAE, and you are satisfied with the performance and the error rate, I think it will be the best solution for you.
The original post is here. And here it is in English.