The tale of how I wrote my own REST framework with WebSockets

This article is about yet another REST framework (for Python 3), whose distinguishing feature is the use of WebSockets for data exchange between client and server. Below I will tell where the idea came from, what I ran into while writing my first Python library, and what came out of it in the end.


If this sounds interesting to you, read on.

1. The idea of the project


The idea originated around mid-April 2015, when I stayed late at work with a colleague who is assigned to the same project as me. To entertain ourselves a little while we were busy programming, we started talking about various interesting Python projects. The conversation somehow drifted to our own projects and to what would be interesting to use in them (not necessarily work-related). In the course of that discussion the idea came up that it would be great to have a fairly “flexible” framework that exchanges data over WebSockets, so that data can be pushed in both directions without any extra effort. Each request arrives in JSON format and carries headers familiar from REST over HTTP. And as a nice bonus, the framework provides the ability to send notifications from the server to the client out of the box on some event or timeout.
Naturally, after such a lengthy discussion, I decided to bring this idea to life (and why not?). Curiosity, enthusiasm and the desire to do something useful for the Python 3 ecosystem only gave me extra motivation to get down to business quickly.

2. Setting goals


For myself, I highlighted a few additional points to focus my efforts on while writing the library, besides what was mentioned earlier:


Naturally, releasing all of the above in the first version was completely unrealistic for me - I would simply never get out of the development process - so to simplify things a little, I decided to break everything into small “pieces”: implement them, test them, put them into a release, and only then move on to the rest. First we write what is most critical for the library (routing, views, authentication, etc.), and later, as resources allow, add new functionality.

3. Preparation for development: choosing between aiohttp, gevent and Autobahn.ws


Development began around the end of April 2015. To ease the work on the project, I started looking for any ready-made solutions (or existing libraries I had not been aware of before). It turned out there were no libraries with a similar idea, or at least providing out of the box even a minimal part of what I planned to do. This complicated the task, since most of the necessary components would have to be written from scratch, based on my own understanding of all the processes involved.

I decided to start with the libraries that make it possible to use WebSockets. At that time several such packages were found: aiohttp, gevent and Autobahn.ws. Each library has its own advantages and disadvantages, but first of all I judged them by their capabilities and by how much of the code could be reused, so that I would not have to reinvent the wheel where it was not necessary.

Aiohttp is a web development library built on top of the asyncio standard library and developed by svetlov. I cannot say that I had much real-world experience with it, although it is worth noting that a lot of things in it are done very well. However, the proposed approach to WebSockets seemed somewhat low-level to me (although in some cases this can actually be convenient). I wanted a higher level of abstraction (for example, as in gevent-websocket or Autobahn.ws, where the client / server has methods like onMessage and sendMessage, similar to the methods of the event-driven Twisted framework). Apart from that, the library is great.

Gevent was one of the first packages I looked at, and the idea of using it was also quickly rejected: at the time the project started (April 2015), gevent had not been ported to the third branch of Python. If it had been ported, I would have used it together with the gevent-websocket extension, and everything might have worked out quite well. At the time of this writing, the library already supports Python 3, but now I do not see any point in switching to it.

Autobahn.ws is a library I had repeatedly run into before when writing my small pet projects, and with which I already had some minimal experience. It has a pretty good community, plus the author of the library is always ready to help in case of any problems (for example, when I could not combine it with Twisted + wxPython, Tobias explained to me very well how this can be done). The latest versions are compatible with asyncio; it is enough to add decorators in the required places. A nice bonus is compliance with RFC 6455 and validation of incoming / outgoing data (whether it was sent / received in UTF-8 encoding), which I consider quite convenient. Therefore, I decided to use it as the basis for the future library.

4. Problems with development


When writing the first version of the library, I simply did not know how to approach a number of problems. After a brief reflection, I decided to start from how the server should process an incoming request from the client, roughly like this:

1) Receive the request.
2) Check that the necessary data has arrived, from which it becomes clear how to process the request (the type of operation, the resource we are addressing, etc.).
3) Look for a handler that matches the incoming request (a specific entry point and the method that will be called). If nothing is found, return an error. If everything is fine, select the appropriate handler and pass the received arguments to it.
4) Render the generated response into a specific format (JSON, XML, etc.).
5) Return the response to the client.

In theory everything sounds pretty simple; in practice, everything turned out to be exactly the opposite. The only thing that occurred to me was to go from the high level of abstraction down to the lower ones. That is, working with Autobahn.ws and the asyncio loop, I proceeded as follows:

1) Create an instance of a “factory” that uses the asyncio loop, accepts incoming connections and serves them. Once the “handshake” is completed, we are ready to receive requests from the client and process them.
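
Here is a minimal sketch of this step using plain Autobahn + asyncio (an illustration only, not the actual aiorest-ws code; the echo protocol below merely stands in for the real request dispatcher):

# -*- coding: utf-8 -*-
# A minimal sketch (plain Autobahn + asyncio, not the aiorest-ws internals)
# of creating a factory and plugging it into the event loop.
import asyncio
import json

from autobahn.asyncio.websocket import WebSocketServerProtocol, \
    WebSocketServerFactory


class EchoRequestProtocol(WebSocketServerProtocol):

    def onMessage(self, payload, isBinary):
        # After the handshake every frame is treated as a JSON request;
        # here we simply echo it back, while the real framework would
        # dispatch it to the matching view.
        request = json.loads(payload.decode('utf8'))
        response = {'echo': request}
        self.sendMessage(json.dumps(response).encode('utf8'), isBinary)


if __name__ == '__main__':
    factory = WebSocketServerFactory("ws://localhost:8080")
    factory.protocol = EchoRequestProtocol

    loop = asyncio.get_event_loop()
    server = loop.run_until_complete(
        loop.create_server(factory, '127.0.0.1', 8080)
    )
    try:
        loop.run_forever()
    finally:
        server.close()
        loop.close()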

2) Receive a request from the client in a specific format. In our case we will receive it as JSON, like this:

{
    "method": "POST",
    "url": "/users/create",
    "args": {
        "token": "aGFicmFoYWJyX2FkbWlu"
    },
    "data": {
        "username": "habrahabr",
        "password": "mysupersecretpassword"
    }
}

This JSON has a fairly simple structure. It is enough for the client to specify a few parameters that matter to us: the method (an HTTP-like verb such as GET or POST), the url of the resource being addressed, optional args (for example, an access token), and the request body in data.


This is roughly the kind of request our server will expect to receive. If any of the required fields is missing (for example, the client forgot to specify the method), we report it immediately. Otherwise, we move on to the next step.
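
A hypothetical illustration of that validation step (the names REQUIRED_FIELDS and validate_request are mine, not the library's):

# Hypothetical helper, not part of aiorest-ws: check that the mandatory
# fields are present before any routing is attempted.
REQUIRED_FIELDS = ('method', 'url')


def validate_request(request):
    missing = [field for field in REQUIRED_FIELDS if field not in request]
    if missing:
        return {'error': 'Missing required fields: {}'.format(', '.join(missing))}
    return None  # the request is well-formed, continue with routing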

3) So, the request has been delivered to the server, it is in the correct format and it is valid. Now we want to process it and return a response. What do we need for this? From my point of view, at first it is enough to have a routing system that allows you to register a handler for a specific URL; the handler forms the corresponding response, which is then converted to JSON, XML (or any other format) and returned to the client.

Here I want to draw your attention to routing. This is quite an important point. On the one hand, we would like to provide access to some fixed URL in order to receive, for example, a list of current users (like "/users/"). On the other hand, we need access to URLs of the form "/users/{pk}/", which return detailed information about a particular user. Routing of the first type will be considered simple, static; the second type is dynamic, because the path to the resource contains a parameter that varies from request to request.

Regular expressions will help us solve this problem. Every time a path to a resource is declared, for example:

router = SimpleRouter()
router.register('/auth/login', LogIn, 'POST')
router.register('/users/{pk}', UserDetail, ['GET', 'PATCH'])

we analyze the path to the resource and create an endpoint that will process only requests of the specified types and only along the specified path. When a request for this resource arrives, it is enough for us to go through a dictionary where the key is the path and the value is the handler. If a dynamic path is detected at the time the request is received, and we have found the required handler, the extracted dynamic parameter is forwarded to the handler, so that it can fetch an object by key or perform some other operation with this parameter.

And of course we take into account the case when a request comes for a non-existent URL. For it, it is enough to return an error with a specific description.
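
As a rough illustration of the whole idea (hypothetical names, not the library's internals): a dynamic path is turned into a regular expression, resolution becomes a lookup over the registered patterns, and None stands for the “non-existent URL” case:

# Hypothetical sketch of dynamic routing via regular expressions;
# not the actual aiorest-ws implementation.
import re


def compile_path(path):
    # '/users/{pk}' -> '^/users/(?P<pk>[^/]+)$'
    pattern = re.sub(r'{(\w+)}', r'(?P<\1>[^/]+)', path)
    return re.compile('^{}$'.format(pattern))


routes = {
    compile_path('/auth/login'): 'LogIn handler',
    compile_path('/users/{pk}'): 'UserDetail handler',
}


def resolve(url):
    for pattern, handler in routes.items():
        match = pattern.match(url)
        if match:
            # Dynamic parameters (e.g. pk) are forwarded to the handler
            return handler, match.groupdict()
    return None, {}  # no such URL -> the caller returns an error response


print(resolve('/users/42'))     # ('UserDetail handler', {'pk': '42'})
print(resolve('/unknown/url'))  # (None, {})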

4) Wow, now things are getting clearer. We are able to find the required paths and the handlers for them, and with the help of regular expressions we can find and forward parameters (in case a dynamic path was matched). Next, we look at the method field specified in the JSON and try to find the corresponding method of the view class. If it is absent, we report this immediately and do not perform any operations. Otherwise, we call the found method and form the response.
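
A rough sketch of this dispatch step (again with made-up names rather than the real MethodBasedView internals): the request's method field is mapped to a lower-case method of the view class via getattr:

# Hypothetical dispatch sketch, not the real MethodBasedView code.
class SampleView:

    def get(self, request, *args, **kwargs):
        return {'users': []}  # a real view would query the database here


def dispatch(view, request, *args, **kwargs):
    handler = getattr(view, request['method'].lower(), None)
    if handler is None:
        return {'error': "Method '{}' is not supported.".format(request['method'])}
    return handler(request, *args, **kwargs)


print(dispatch(SampleView(), {'method': 'GET', 'url': '/users/'}))
# {'users': []}
print(dispatch(SampleView(), {'method': 'DELETE', 'url': '/users/'}))
# {'error': "Method 'DELETE' is not supported."}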

5) Next, we serialize the data (including error cases) into some format. By default everything is converted to JSON.
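
For illustration only (the render function below is hypothetical, not the library's API), the default rendering step boils down to something like this:

# Hypothetical rendering step: the formed response (or an error) is encoded
# into the chosen format before being written back into the web socket.
import json


def render(response, fmt='json'):
    if fmt == 'json':
        return json.dumps(response).encode('utf-8')
    raise NotImplementedError("Only JSON rendering is sketched here.")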

6) Transfer the generated response back to the client via the web socket.

This is roughly the plan I followed up to release 1.0. It was quite interesting to write my own views, routing system and other features. Along the way, while working on the first release of this pet project, modules for configuration were needed (in our case a module similar to the one in Django). Or, for example, the authentication I needed gradually led to implementing support for middleware and JSON Web Token modules. As mentioned earlier, we write all these modules ourselves and try not to pull in anything extra.

Anyway, writing the “next bicycle” cost me additional effort and time. Although, to be honest, I do not regret at all that I went this way: the time spent on writing, debugging and constant troubleshooting pays off, and now I understand a little better how all of this works.

If during the first version the coding and debugging went quite smoothly, then with version 1.1 I got stuck in debugging for a long time. Writing and porting code did not take as much time as searching for problems and analyzing in detail what was going on, for example:

1) Analyzing the source code of the Django REST framework to see what happens “under the hood”: what we do when we want to write or read a certain object; when and how we find out which fields were obtained (and whether they have any relations with other models) and what is needed to serialize / deserialize them.

2) Serialization of SQLAlchemy models, similar to how it happens between the Django REST framework code and the Django ORM.

3) The ability to work with routing so that you can get the path to an object through the already written API (so that you can read and write data at the received URL).

When developing this part of the functionality, I was greatly helped by the source code of the Django REST framework (which in many ways served as the basis for the next version), as well as the sources of the SQLAlchemy and marshmallow-sqlalchemy libraries, which helped me bring all of my ideas to life.

Although a lot of resources were spent, the end result fully justified the costs: now we can work with SQLAlchemy the way we are used to doing in the Django REST framework. Working with data is practically the same, with no major differences. This is great: there is almost no need to retrain, because the available API is in many respects identical to the one used in the Django REST framework.

5. Current project status


At the moment, the library provides the following features:


6. Example of use


A brief example is the following code, which works with users and their email addresses. Let's start with the tables, described using the SQLAlchemy ORM:

# -*- coding: utf-8 -*-
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, validates

Base = declarative_base()


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)
    fullname = Column(String(50), default='Unknown')
    password = Column(String(512))
    addresses = relationship("Address", back_populates="user")

    @validates('name')
    def validate_name(self, key, name):
        assert '@' not in name
        return name

    def __repr__(self):
        return "<User(name='%s', fullname='%s', password='%s')>" % \
               (self.name, self.fullname, self.password)


class Address(Base):
    __tablename__ = 'addresses'

    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship("User", back_populates="addresses")

    def __repr__(self):
        return "<Address(email_address='%s')>" % self.email_address

Now we describe the corresponding serializers for these two models:

# -*- coding: utf-8 -*-
from app.db import User, Address

from aiorest_ws.db.orm.sqlalchemy import serializers
from sqlalchemy.orm import Query


class AddressSerializer(serializers.ModelSerializer):

    class Meta:
        model = Address
        fields = ('id', 'email_address')


class UserSerializer(serializers.ModelSerializer):
    addresses = serializers.PrimaryKeyRelatedField(
        queryset=Query(Address), many=True, required=False
    )

    class Meta:
        model = User

As many of you may have noticed, in the class that defines user serialization the addresses field is declared with the argument queryset=Query(Address) in the constructor of the PrimaryKeyRelatedField class. This is done so that the serializer for the SQLAlchemy ORM can build the relation between the addresses field and the table, passing primary keys to this class during serialization. To some extent, this is similar to a QuerySet in the Django framework.
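
As a rough usage sketch (the exact output shape is assumed here by analogy with the Django REST framework, not copied from the library's documentation), serializing a user represents its related addresses by their primary keys:

# Hypothetical usage sketch; the exact output keys/format are assumed.
from aiorest_ws.conf import settings

from app.db import User
from app.serializers import UserSerializer

session = settings.SQLALCHEMY_SESSION()
user = session.query(User).first()
print(UserSerializer(user).data)
# roughly: {'id': 1, 'name': 'habrahabr', 'fullname': 'Unknown',
#           'password': '...', 'addresses': [1]}
session.close()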

Now let's implement the views that provide an API for working with the data in these tables:

# -*- coding: utf-8 -*-
from aiorest_ws.conf import settings
from aiorest_ws.db.orm.exceptions import ValidationError
from aiorest_ws.views import MethodBasedView

from app.db import User, Address
from app.serializers import AddressSerializer, UserSerializer


class UserListView(MethodBasedView):

    def get(self, request, *args, **kwargs):
        session = settings.SQLALCHEMY_SESSION()
        users = session.query(User).all()
        data = UserSerializer(users, many=True).data
        session.close()
        return data

    def post(self, request, *args, **kwargs):
        if not request.data:
            raise ValidationError('You must provide arguments for create.')

        if not isinstance(request.data, list):
            raise ValidationError('You must provide a list of objects.')

        serializer = UserSerializer(data=request.data, many=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return serializer.data


class UserView(MethodBasedView):

    def get(self, request, id, *args, **kwargs):
        session = settings.SQLALCHEMY_SESSION()
        instance = session.query(User).filter(User.id == id).first()
        data = UserSerializer(instance).data
        session.close()
        return data

    def put(self, request, id, *args, **kwargs):
        if not request.data:
            raise ValidationError('You must provide an updated instance.')

        session = settings.SQLALCHEMY_SESSION()
        instance = session.query(User).filter(User.id == id).first()
        if not instance:
            raise ValidationError('Object does not exist.')

        serializer = UserSerializer(instance, data=request.data, partial=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        session.close()
        return serializer.data


class CreateUserView(MethodBasedView):

    def post(self, request, *args, **kwargs):
        serializer = UserSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return serializer.data


class AddressView(MethodBasedView):

    def get(self, request, id, *args, **kwargs):
        session = settings.SQLALCHEMY_SESSION()
        instance = session.query(Address).filter(Address.id == id).first()
        session.close()
        return AddressSerializer(instance).data


class CreateAddressView(MethodBasedView):

    def post(self, request, *args, **kwargs):
        serializer = AddressSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return serializer.data

At the moment, we write separate views for working with a single object and with a list of objects. In each of these subclasses of MethodBasedView, the specific handlers to be used are implemented: a handler is written for each type of request (get / post / put / patch, etc.).

The last step is to register this API so that it is accessible from the outside:

# -*- coding: utf-8 -*-
from aiorest_ws.routers import SimpleRouter

from app.views import UserListView, UserView, CreateUserView, AddressView, \
    CreateAddressView

router = SimpleRouter()
router.register('/user/list', UserListView, 'GET')
router.register('/user/{id}', UserView, ['GET', 'PUT'], name='user-detail')
router.register('/user/', CreateUserView, ['POST'])
router.register('/address/{id}', AddressView, ['GET', 'PUT'], name='address-detail')
router.register('/address/', CreateAddressView, ['POST'])

In general, everything is ready; it only remains to start the server and connect with any client (Python + Autobahn.ws, JavaScript, and so on - there are many options). As an example, I will simply show a couple of simple requests using Python + Autobahn.ws (I should say right away that the client example is not perfect; the task here is just to show how it can be done):

# -*- coding: utf-8 -*-
import asyncio
import json
from hashlib import sha256

from autobahn.asyncio.websocket import WebSocketClientProtocol, \
    WebSocketClientFactory


def hash_password(password):
    return sha256(password.encode('utf-8')).hexdigest()


class HelloClientProtocol(WebSocketClientProtocol):

    def onOpen(self):
        # Create a new address
        request = {
            'method': 'POST',
            'url': '/address/',
            'data': {
                'email_address': 'some_address@google.com'
            },
            'event_name': 'create-address'
        }
        self.sendMessage(json.dumps(request).encode('utf8'))

        # Get the list of users
        request = {
            'method': 'GET',
            'url': '/user/list/',
            'event_name': 'get-user-list'
        }
        self.sendMessage(json.dumps(request).encode('utf8'))

        # Create a new user with an address
        request = {
            'method': 'POST',
            'url': '/user/',
            'data': {
                'name': 'Neyton',
                'fullname': 'Neyton Drake',
                'password': hash_password('123456'),
                'addresses': [{'id': 1}, ]
            },
            'event_name': 'create-user'
        }
        self.sendMessage(json.dumps(request).encode('utf8'))

        # Try to create a new user with the same info; this time we get an error
        self.sendMessage(json.dumps(request).encode('utf8'))

        # Update an existing object
        request = {
            'method': 'PUT',
            'url': '/user/6/',
            'data': {
                'fullname': 'Definitely not Neyton Drake',
                'addresses': [{'id': 1}, {'id': 2}]
            },
            'event_name': 'partial-update-user'
        }
        self.sendMessage(json.dumps(request).encode('utf8'))

    def onMessage(self, payload, isBinary):
        print("Result: {0}".format(payload.decode('utf8')))


if __name__ == '__main__':
    factory = WebSocketClientFactory("ws://localhost:8080")
    factory.protocol = HelloClientProtocol

    loop = asyncio.get_event_loop()
    coro = loop.create_connection(factory, '127.0.0.1', 8080)
    loop.run_until_complete(coro)
    loop.run_forever()
    loop.close()

The full source code of this example can be found here.

7. Further development


There are many ideas for extending the library's current functionality. For example, the module can be developed in the following directions:


Again, I remind you that many features are planned for different releases, rather than a single one. This was done on purpose, so as not to rush from one extreme to another while trying to do everything in parallel - nothing good would come of that in the end. And it is easier for you, too.

8. And in conclusion ...


I think it turned out quite well for a first attempt, despite the lack of any experience in writing my own libraries. And I really do want to contribute (even if only a little) to the development of the Python ecosystem. Do not be surprised at how much time was spent on it: everything was done (and continues to be done) in my free time and with periodic breaks (since constantly working on a single project is very tiring, and I want to develop in several directions at the same time).

Anyway, I will be glad to hear your suggestions, ideas and improvements for this library in the comments (or as pull requests on my GitHub). Feel free to ask any questions about the library and any implementation details - I will be glad to receive any feedback.

All of the code above, as well as the sources of the aiorest-ws library, can be viewed on GitHub. The examples are located in the examples directory in the root of the project. The documentation can be found here.

Source: https://habr.com/ru/post/274353/

