One of the problems that GAE developers accustomed to relational databases and ORMs face is how to handle references and relationships in App Engine. This tutorial addresses two questions: first, what are relationships in a DBMS in general; second, how are they used in GAE?
Types of relationships
DBMSs work with several types of relationships: one-to-one, one-to-many, and many-to-many. Despite the differences in terminology, all of them work on the same principle as references (links). A link is an entity field that contains the key of another entity. For example, if a pet refers to an owner, the pet entity has a field that contains the key of the owner entity.
All kinds of relationships can be represented as links. A one-to-many relationship, in its simplest form, is just a link: each “pet” has one “owner”, so an “owner” can have several “pets” that refer to it. The “owner” itself does not change; it is the individual “pets” that point to it and name it as their owner.
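As a rough illustration of this idea, here is a sketch that uses plain Python dictionaries as a stand-in for a datastore. It is not App Engine code, and the keys, data, and the helper names `owner_of` and `pets_of` are invented for the example. A link is simply a stored key, and the "many" side is found by searching for that key:

```python
# In-memory stand-in for a datastore: key -> entity (hypothetical data).
owners = {"owner:1": {"name": "Bob"}}
pets = {
    "pet:1": {"name": "Felix", "owner": "owner:1"},
    "pet:2": {"name": "Rex", "owner": "owner:1"},
}

def owner_of(pet_key):
    """Follow the link: read the owner key stored on the pet."""
    return owners[pets[pet_key]["owner"]]

def pets_of(owner_key):
    """Reverse direction: find every pet whose link points at the owner."""
    return sorted(p["name"] for p in pets.values() if p["owner"] == owner_key)

print(owner_of("pet:1")["name"])
print(pets_of("owner:1"))
```

Note that the owner record stores nothing about its pets; the relationship lives entirely on the "many" side.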
A one-to-one relationship is a one-to-many relationship with the additional restriction that only one “pet” refers to each “owner”. This restriction can be reinforced by storing cross-references, that is, a reference field in each entity pointing to the other.
Many-to-many relationships are a bit more complicated. They can be implemented in several ways, but all of them boil down to a list of pairs of links. Consider web pages as an example. Each page has many inbound and outbound links, which can be represented as a list of pairs of the form (from_url, to_url). In a relational DBMS, such pairs are stored in a separate table, which is joined in queries to find related records.
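The list-of-pairs idea can be sketched in a few lines of plain Python. This is only an in-memory illustration with made-up URLs; the function names `outbound` and `inbound` are assumptions for the example, not part of any library:

```python
# Each pair is one directed link (from_url, to_url); the URLs are made up.
links = [
    ("/home", "/about"),
    ("/home", "/blog"),
    ("/blog", "/about"),
]

def outbound(url):
    """Pages that `url` links to."""
    return [dst for src, dst in links if src == url]

def inbound(url):
    """Pages that link to `url`."""
    return [src for src, dst in links if dst == url]

print(outbound("/home"))
print(inbound("/about"))
```

Both directions are answered from the same list, which is what a join table does in a relational database.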
Now let's look at how these types of links work in App Engine. In general, it is often useful to set aside terms such as “one-to-many” and look at entities from an object-oriented point of view. Put the question differently: how should one entity refer to another to fit your data structure?
App Engine Relationships
One-to-many
This type of relationship is easy to implement in any system. In App Engine, the key of the “one” side is stored in the entity on the “many” side. In Python, the ReferenceProperty field is used for this:
from google.appengine.ext import db

class Owner(db.Model):
    name = db.StringProperty()

class Pet(db.Model):
    name = db.StringProperty()
    owner = db.ReferenceProperty(Owner)  # stores the key of an Owner entity
To find the "owner" of a "pet", we read the pet.owner attribute, and App Engine automatically loads the entity it refers to. To find all the "pets" that refer to a specific "owner", it is enough to run the following query:
pets = Pet.all().filter('owner =', owner).fetch(100)
The same result can be obtained more simply: ReferenceProperty automatically adds a property to the Owner class for quick and convenient access to the related entities, so for an Owner instance owner you can fetch the list of "pets" like this:
pets = owner.pet_set.fetch(100)
By default, App Engine names this property after the referencing model, in lowercase, plus "_set" (here, pet_set), but you can set your own name:
class Pet(db.Model):
    name = db.StringProperty()
    owner = db.ReferenceProperty(Owner, collection_name='pets')

pets = owner.pets.fetch(100)
Another way to model a one-to-many relationship is to bind an entity to a parent. At the time the entity is created, it can be assigned a parent. In this case, the key of the parent entity becomes part of the child key and cannot be changed in the future. Here is how it looks in our example:
class Owner(db.Model):
    name = db.StringProperty()

class Pet(db.Model):
    name = db.StringProperty()

bob = Owner(name='Bob')
felix = Pet(name='Felix', parent=bob)
owner_of_felix = felix.parent()  # parent() is a method on db.Model
Note that nowhere do we explicitly declare the relationship between the entities; it follows from specifying the parent at creation time. When is it better to use a parent instead of a reference field (ReferenceProperty)? The choice affects transactions: in App Engine, a single transaction can operate only on entities of one entity group, that is, a set of entities that share the same root ancestor. If related entities do not need to take part in the same transaction, use a reference field. Also remember that an entity can have only one immediate parent, and its key cannot be changed after creation.
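The way a parent becomes part of the child key, and how that determines the entity group, can be pictured with a small plain-Python model. This is a sketch of the idea only: keys are modeled here as tuples of (kind, name) pairs, and `make_key`, `parent_of`, `entity_group`, and `same_group` are hypothetical helpers, not the App Engine `db.Key` API:

```python
# A key is modeled as a path of (kind, name) pairs; the parent's path is a
# prefix of the child's path. This mirrors the concept, not db.Key itself.
def make_key(kind, name, parent=None):
    return (parent or ()) + ((kind, name),)

def parent_of(key):
    """Parent key, or None for a root entity."""
    return key[:-1] or None

def entity_group(key):
    """A group is identified by the root (first element) of the key path."""
    return key[:1]

def same_group(*keys):
    """True if all keys belong to one entity group."""
    return len({entity_group(k) for k in keys}) == 1

bob = make_key("Owner", "bob")
felix = make_key("Pet", "felix", parent=bob)
alice = make_key("Owner", "alice")

print(parent_of(felix) == bob)
print(same_group(bob, felix), same_group(felix, alice))
```

Because the parent path is baked into the child key, re-parenting would mean a different key, which is why the parent cannot be changed after creation.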
One-to-one
One-to-one relationships are a special case of one-to-many relationships. They are implemented by storing, on one side, a reference field that points to the other entity.
Many-to-many
Many-to-many relationships are the most difficult to implement. In App Engine there are several ways to build them. The most obvious approach is a join table, as in a relational database, which contains pairs of keys for the two sides of the relationship. For our pet/owner example, it looks like this:
class Owner(db.Model):
    name = db.StringProperty()

class Pet(db.Model):
    name = db.StringProperty()

class PetOwner(db.Model):
    pet = db.ReferenceProperty(Pet, collection_name='owners')
    owner = db.ReferenceProperty(Owner, collection_name='pets')
The advantage of this method is that you can attach additional properties to the relationship: for example, when modeling links between pages, you can add a link-text field to the relationship. Data access takes two steps: first the linking pairs are found, then the desired entities are retrieved from them. The example below uses the batch retrieval of referenced entities described in this article *:
petowners = felix.owners.fetch(100)
# prefetch_refprops is the helper from the article mentioned above
prefetch_refprops(petowners, 'owner')
owners = [x.owner for x in petowners]
Retrieving entities in the other direction (from “owner” to “pets”) is done in the same way.
Another approach is to store, on one side of the relationship, a list of keys of the entities on the other side. This is useful when the number of stored items is limited (say, a few hundred or fewer). Such a list also makes batch operations convenient. For example:
class Pet(db.Model):
    name = db.StringProperty()

class Owner(db.Model):
    name = db.StringProperty()
    pets = db.ListProperty(db.Key)  # keys of this owner's Pet entities
From each "owner" you can extract a list of his "pets":
pets = db.get(bob.pets)
And to find all the "owners" for a given "pet", run this query:
owners = Owner.all().filter('pets =', felix).fetch(100)
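The key-list approach can also be pictured with ordinary Python data. The sketch below uses dictionaries in place of the datastore, with invented keys and the hypothetical helpers `pets_of` and `owners_of`; it only mirrors the shape of the two lookups above:

```python
# Hypothetical data: each owner stores a list of pet keys.
pets = {"p1": "Felix", "p2": "Rex", "p3": "Tom"}
owners = {
    "bob": {"name": "Bob", "pets": ["p1", "p2"]},
    "alice": {"name": "Alice", "pets": ["p1", "p3"]},
}

def pets_of(owner_key):
    """Forward direction: resolve the stored key list in one pass."""
    return [pets[k] for k in owners[owner_key]["pets"]]

def owners_of(pet_key):
    """Reverse direction: owners whose key list contains the pet key."""
    return sorted(o["name"] for o in owners.values() if pet_key in o["pets"])

print(pets_of("bob"))
print(owners_of("p1"))
```

The forward lookup needs no search at all, while the reverse lookup is a membership test on the list, which is the role the equality filter on the list property plays in the query above.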
Finally, a hybrid approach may prove to be the most efficient and flexible. On this subject, I recommend the excellent talk by Brett Slatkin, "Building Scalable, Complex Apps on App Engine".
* This refers to a pattern developed by the author of the article for retrieving referenced entities without issuing unnecessary requests to the datastore. In short, a reference field does not load the entity immediately; the request is made the first time an attribute or method of the referenced entity is accessed. To minimize the number of requests, the pattern loads all referenced entities in a single batch (translator's note).