
Django-nonrel: a Python website on Google App Engine

In this article I want to talk a bit about the development of my project, the website egaxegax.appspot.com.

Since I am a big fan of the Python language, I decided to build my website on the popular Django framework. To run it on the free appspot.com hosting, I adapted the code to the NoSQL fork Django-nonrel and the Google App Engine platform.


The site has existed since 2012. I use it as a live sandbox for exploring the capabilities of Django and App Engine. It is also interesting to study its statistics in Google Webmasters: search indexes and queries. For example, I discovered that Google indexes the title tags for search, not the contents of the meta tags.

It all started with the Articles section: short notes on programming topics such as scripts, configs, and usage examples. But I quickly ran out of articles, and producing new ones in any volume proved impractical. Something bigger was needed.
Somewhere on the web I downloaded an archive of files with song lyrics and chords. I added a couple dozen of my own chord transcriptions and decided to put everything on the site. In total there were about 25,000 files, far too many to upload by hand. So I wrote a song_conv.bat script that converts the text files into dumps for loading the table data into the GAE DataStore.
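The conversion step can be sketched roughly like this: walk a directory of song files and emit a CSV dump that a bulk loader can import into the DataStore. This is a Python sketch of the idea only; the real song_conv.bat is a batch script whose contents are not shown here, and the (title, body) field layout is an assumption.

```python
import csv
import os

def convert_songs(src_dir, out_csv):
    """Convert a directory of song text files into a CSV dump
    for bulk-loading into the DataStore.

    The (title, body) column layout is an assumption; the real
    song_conv.bat script is not shown in the article.
    """
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "body"])
        for name in sorted(os.listdir(src_dir)):
            if not name.endswith(".txt"):
                continue
            with open(os.path.join(src_dir, name), encoding="utf-8") as song:
                body = song.read()
            # Use the file name (without extension) as the song title.
            writer.writerow([os.path.splitext(name)[0], body])
```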

The data load had to be split into several stages because of the daily limit on write operations in the DataStore: only about 700-800 records (files) could be written per day.

In this way I loaded the first portion of texts, a volume of about 11,000 files (records in the DataStore). After that I wrote a song_upload.py script that loads data via HTTP POST requests. It simulates filling in the fields of the input form, so processing goes through the same controller. Loading became slower, but now I can debug data insertion locally.
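The form-simulation idea can be sketched with the standard library: build a POST request whose body carries the same fields as the site's input form. The URL and the field names "title" and "body" are assumptions; the actual song_upload.py is in the project repository.

```python
import urllib.parse
import urllib.request

def build_upload_request(url, title, body):
    """Build a POST request that fills the same form fields the
    site's input form uses, so the same controller processes it.
    The field names 'title' and 'body' are assumptions."""
    data = urllib.parse.urlencode({"title": title, "body": body}).encode("utf-8")
    # Passing data= makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=data)

def upload_song(url, title, body):
    """Send one song to the server and return the HTTP status code."""
    with urllib.request.urlopen(build_upload_request(url, title, body)) as resp:
        return resp.status
```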

Some time after the data was loaded, opening a page of the site increasingly produced a 503 Server Error: Over Quota. After examining the server logs, I found that the main users of my site were googlebot and yandexbot, which hit the pages every 2-3 minutes. The error occurs when the daily limit on DataStore read operations is exceeded.

Looking through the App Engine documentation and examples, I realized that I was not using the cache module (namely memcache) at all: every page load queried the database through a QuerySet. In the new scheme, I convert the QuerySet results into lists of dictionaries, store them in the cache, and read them from there on repeated requests. This solved the problem of quickly exhausting the read quota.
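The scheme is a classic read-through cache. A minimal sketch: on App Engine, `cache` would be the `google.appengine.api.memcache` module, which exposes this get/set interface; here any object with `get`/`set` works, and the `load()` callback stands in for the DataStore query. Key names and the shape of the cached dicts are assumptions.

```python
def get_cached(cache, key, load):
    """Read-through cache: return the list of plain dicts stored
    under `key`, calling `load()` (the DataStore query) only on a
    cache miss.

    Caching plain dicts rather than model instances keeps the
    cached value small and serializable, e.g.:
        load = lambda: [{"title": s.title} for s in Song.all()]
    """
    items = cache.get(key)
    if items is None:
        items = load()          # the only DataStore read, on a miss
        cache.set(key, items)
    return items
```

Crawler traffic then mostly hits memcache instead of the DataStore, which is what keeps the daily read quota intact.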

Later I added the Photos and News sections. Each section is designed as a separate application (app), with its data stored in DataStore tables. The Photos section also uses the BlobStore file storage. All applications use the cache when retrieving data.

By analogy with the Chords section, I am filling a Books section, where I post the texts of electronic books. I obtain the texts by unpacking *.epub files with the book_conv_up.py script from the /media/scripts directory. Unlike song lyrics, books are much larger and cannot be displayed on a single page. There was also the problem that a whole book could not be added to the cache without exceeding the cache memory limit. So I read, cache, and display the books chapter by chapter.
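Caching per chapter keeps every cached value under the memcache per-item size limit (1 MB on App Engine). A sketch under assumptions: the `book:<id>:<n>` key scheme and the `load(n)` fallback (a chapter-level DataStore read) are mine, not taken from the article, and `cache` stands in for the memcache interface.

```python
def cache_book_by_chapters(cache, book_id, chapters):
    """Store each chapter under its own key so no single cached
    value exceeds the memcache per-item size limit."""
    for n, text in enumerate(chapters):
        cache.set("book:%s:%d" % (book_id, n), text)

def get_chapter(cache, book_id, n, load):
    """Return chapter `n`, falling back to `load(n)` (a chapter-level
    DataStore read) on a cold cache."""
    key = "book:%s:%d" % (book_id, n)
    text = cache.get(key)
    if text is None:
        text = load(n)
        cache.set(key, text)
    return text
```

Since a reader only views one chapter at a time, only that chapter's key is ever fetched, so neither the page nor the cache has to hold the whole book.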

To fill the Photos and Books sections, I wrote the photo_upload.py and book_upload.py scripts, analogous to the lyrics upload script.

The site uses Django's built-in user authorization, with new-user registration verified through a captcha.

For those interested, the project is available on GitHub in the django-egaxegax repository.

Source: https://habr.com/ru/post/274431/
