
Django Channels - the answer to the modern web

In the Django world, the Django Channels add-on is gaining popularity. This library is meant to bring to Django the asynchronous network programming we have been waiting for. At Moscow Python Conf 2017, Artyom Malyshev explained how the first version of the library does this (its author has since written Channels 2), why it does it, and whether it succeeds at all.

First of all, the Zen of Python says there should be one, and preferably only one, obvious way to do it. That is why in Python there are always at least three. Asynchronous networking frameworks already exist in large numbers.

So why write yet another library, and is it needed at all?

About the speaker: Artyom Malyshev is an independent Python developer. He builds distributed systems and speaks at Python conferences. Artyom can be found under the nickname @PROOFIT404 on GitHub and on social networks.

Django is synchronous by design. Take the ORM: a synchronous database query can be triggered by plain attribute access. Writing post.author.username, for example, is all it takes.

In addition, Django is a WSGI framework.

WSGI


WSGI is a synchronous interface for working with web servers.

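    def app(environ, callback):
        # callback is what the WSGI spec calls start_response
        status, headers = '200 OK', []
        callback(status, headers)
        # On Python 3 the body must be an iterable of bytes
        return [b'Hello world!\n']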

Its main feature is that we have a function that takes its arguments and immediately returns a value. That is all a web server can expect from us. There is not even a hint of asynchrony here.
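To make the contract concrete, here is a minimal sketch of what a server does with such a function, using only the standard library's wsgiref helpers. The names fake_start_response and captured are illustrative and not part of WSGI itself.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Same shape as the example above; the body is bytes, as WSGI requires on Python 3.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!\n']


captured = {}


def fake_start_response(status, headers):
    # A real server would write the status line and headers to the socket here.
    captured['status'] = status
    captured['headers'] = headers


environ = {}
setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ

# One call in, one iterable out: the whole request/response cycle is synchronous.
body = b''.join(app(environ, fake_start_response))
print(captured['status'], body)
```

Nothing in this call sequence leaves room for a connection that stays open and keeps talking after the function has returned.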

This design dates back to 2003, when the web was simple: users read news online and left notes in guest books. It was enough to accept a request, process it, return a response, and forget that the user had ever been there.


But it is not 2003 anymore, and users want much more from us.

They want rich web applications and live content; they want the application to work great on the desktop, on the laptop, on other tops, and on a watch. Most importantly, users do not want to press F5, because, for example, tablets have no such button.



Web browsers, of course, meet us halfway: they add new protocols and new features. If we were developing only the frontend, we would simply treat the browser as a platform and use its core features, since it is ready to provide them.

But for backend programmers, a lot has changed. Web sockets, HTTP/2 and the like are a huge pain architecturally, because they are long-lived connections with their own state that has to be handled.


This is the problem Django Channels is trying to solve. The library is designed to let you handle such connections while leaving the Django core we are used to completely unchanged.

It was created by a wonderful man, Andrew Godwin, who has a strong English accent and speaks very quickly. You probably know him from the long-forgotten South and from Django Migrations, which arrived in version 1.7. Having fixed migrations for Django, he moved on to fixing web sockets and HTTP/2.

How did he do it? Once upon a time an image circulated on the Internet: empty squares, arrows, and the caption “Good architecture”. You write your favorite technologies into the squares and get a site that scales well.



Andrew Godwin filled in these squares. In front stands a server that accepts any kind of request, be it asynchronous, synchronous, e-mail, whatever. Behind it is a pool of synchronous workers. Between the two sits the so-called Channel Layer, which stores received messages in a format the workers can read.

As soon as an asynchronous connection sends us something, the server records it in the Channel Layer; a synchronous worker then picks it up and processes it synchronously, just like any Django view. As soon as the synchronous code puts a response back into the Channel Layer, the asynchronous server sends it, streams it, or does whatever else is needed. That is the abstraction.
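The flow described here can be imitated in a few lines of plain Python. This is a toy model under simplifying assumptions: the layer is an in-process dictionary of queues, and every function name (interface_server_receive, worker_run_once, and so on) is made up for the illustration rather than taken from Channels.

```python
from collections import defaultdict, deque

# Toy channel layer: channel name -> FIFO queue of messages.
# A real layer lives in a shared store such as Redis; this dict is only an illustration.
layer = defaultdict(deque)


def interface_server_receive(connection_id, text):
    """Asynchronous front end: record an incoming frame and return immediately."""
    layer['websocket.receive'].append({
        'reply_channel': 'websocket.send!%s' % connection_id,
        'text': text,
    })


def echo_consumer(message):
    """Synchronous application code, run by a worker."""
    layer[message['reply_channel']].append({'text': message['text'].upper()})


def worker_run_once():
    """Synchronous worker: take one message from the layer and process it."""
    if layer['websocket.receive']:
        echo_consumer(layer['websocket.receive'].popleft())


def interface_server_flush(connection_id):
    """Front end again: collect whatever the workers left for this connection."""
    queue = layer['websocket.send!%s' % connection_id]
    sent = [m['text'] for m in queue]
    queue.clear()
    return sent


interface_server_receive('abc', 'hello')
worker_run_once()
print(interface_server_flush('abc'))
```

The front end and the worker never call each other directly; they only share the layer, which is what lets them run in different processes.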

Several implementations are possible. In production, the suggestion is to use Twisted as the asynchronous server that acts as the frontend for Django, and Redis as the communication channel between synchronous Django and asynchronous Twisted.

The good news is that to use Django Channels you do not need to know either Twisted or Redis at all: they are implementation details. Your DevOps team will know them, or you will get acquainted while fixing production at three in the morning.

ASGI


This abstraction is a protocol called ASGI. It is a standard interface that sits between any network interface or server, whether its protocol is synchronous or asynchronous, and your application. Its central concept is the channel.

Channel


A channel is an ordered first-in-first-out queue of messages that have a lifetime. A message is delivered at most once (zero or one times) and can be received by only one consumer.
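These three properties (ordering, expiry, at-most-once delivery) fit into a small toy class. ToyChannel and its now parameter are assumptions made for the sketch so that time can be controlled explicitly; they are not the ASGI API.

```python
import time
from collections import deque


class ToyChannel:
    """FIFO queue whose messages expire; each message goes to at most one receiver."""

    def __init__(self, expiry=60.0):
        self.expiry = expiry
        self._queue = deque()

    def send(self, message, now=None):
        now = time.monotonic() if now is None else now
        self._queue.append((now + self.expiry, message))

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        while self._queue:
            deadline, message = self._queue.popleft()  # removed: nobody else sees it
            if deadline > now:
                return message
            # Expired messages are dropped silently: zero deliveries.
        return None


channel = ToyChannel(expiry=60)
channel.send('first', now=0)
channel.send('second', now=50)
print(channel.receive(now=70))  # 'first' expired at 60, so 'second' comes out
print(channel.receive(now=70))  # nothing left
```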

Consumers


A consumer is simply where you write your code.

    def ws_message(message):
        # Echo the received text back to the same connection
        message.reply_channel.send({
            'text': message.content['text'],
        })

A consumer is a function that accepts a message and may send several replies, or none at all. It is very similar to a view; the only difference is that there is no return value, so we decide ourselves how many replies the function sends.
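A small sketch of the "several replies or none" idea. FakeMessage and FakeReplyChannel are hypothetical stand-ins that only record what was sent, so the consumer can be run without Channels installed.

```python
class FakeReplyChannel:
    """Stand-in for message.reply_channel that just records what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, content):
        self.sent.append(content)


class FakeMessage:
    """Stand-in for the message object a consumer receives."""

    def __init__(self, content):
        self.content = content
        self.reply_channel = FakeReplyChannel()


def ws_message(message):
    text = message.content['text']
    if not text:
        return  # no reply at all
    # One reply per word: several answers to a single incoming message.
    for word in text.split():
        message.reply_channel.send({'text': word})


msg = FakeMessage({'text': 'hello modern web'})
ws_message(msg)
print(len(msg.reply_channel.sent))
```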

We add this function to routing; for example, we attach it to messages received on a web socket.

    from channels.routing import route
    from myapp.consumers import ws_message

    channel_routing = [
        route('websocket.receive', ws_message),
    ]

We register the layer in the Django settings, just as we register a database.

    CHANNEL_LAYERS = {
        'default': {
            # Dotted path to the layer class
            'BACKEND': 'asgiref.inmemory.ChannelLayer',
            # Dotted path to the routing list
            'ROUTING': 'myproject.routing.channel_routing',
        },
    }

A project can have several Channel Layers, just as it can have several databases. This is very similar to a database router, if you have ever used one.

Next, we define our ASGI application. It is the common entry point that both the Twisted-based server and the synchronous workers are started with: they all need this application.

    import os
    from channels.asgi import get_channel_layer

    os.environ.setdefault(
        'DJANGO_SETTINGS_MODULE',
        'myproject.settings',
    )
    channel_layer = get_channel_layer()

After that the code is deployed. We launch gunicorn to serve ordinary HTTP requests synchronously, through views, as we are used to. We start the asynchronous server that stands in front of our synchronous Django, and the workers that will process the messages.

    $ gunicorn myproject.wsgi
    $ daphne myproject.asgi:channel_layer
    $ django-admin runworker

Reply channel


As we have seen, a message has something called a reply channel. Why is it needed?

A channel is unidirectional. Accordingly, websocket.receive, websocket.connect and websocket.disconnect are system-wide channels for incoming messages, while the reply channel is strictly tied to a single user's connection. A message therefore has an input channel and an output channel, and this pair lets you identify who sent the message.


Groups


A group is a set of channels. If we send a message to a group, it is automatically sent to all channels of this group. This is convenient because nobody likes to write for loops. Plus, the implementation of groups is usually done using the native functions of the Channel layer, so it works faster than just sending messages one by one.

    from channels import Group

    def ws_connect(message):
        Group('chat').add(message.reply_channel)

    def ws_disconnect(message):
        Group('chat').discard(message.reply_channel)

    def ws_message(message):
        # Broadcast the text to everyone in the group
        Group('chat').send({
            'text': message.content['text'],
        })

These consumers are added to routing as well.

    from channels.routing import route
    from myapp.consumers import ws_connect, ws_disconnect, ws_message

    channel_routing = [
        route('websocket.connect', ws_connect),
        route('websocket.disconnect', ws_disconnect),
        route('websocket.receive', ws_message),
    ]

As soon as a reply channel is added to the group, a reply goes to all users connected to our site, not just back to the sender as an echo.
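What a group does can be modelled with a dictionary of sets. This is only a sketch of the semantics: group_add, group_discard, group_send and the outbox dictionary are illustrative names, and a real channel layer performs the fan-out inside the backend instead of in a Python loop.

```python
from collections import defaultdict

# Toy registry: group name -> set of reply channel names (illustration only).
groups = defaultdict(set)
outbox = defaultdict(list)  # reply channel name -> messages delivered to it


def group_add(group, channel):
    groups[group].add(channel)


def group_discard(group, channel):
    groups[group].discard(channel)


def group_send(group, content):
    # The loop nobody wants to write by hand.
    for channel in groups[group]:
        outbox[channel].append(content)


group_add('chat', 'websocket.send!alice')
group_add('chat', 'websocket.send!bob')
group_send('chat', {'text': 'hi all'})
group_discard('chat', 'websocket.send!bob')
group_send('chat', {'text': 'bob left'})
print(outbox['websocket.send!alice'], outbox['websocket.send!bob'])
```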

Generic consumers


What I love Django for is its declarative style. Consumers can be declarative too.

BaseConsumer is the basic one: all it can do is map a channel you specify to a method of your own and call it.

    from channels.generic import BaseConsumer

    class MyConsumer(BaseConsumer):
        # channel name -> name of the method that handles it
        method_mapping = {
            'channel.name.here': 'method_name',
        }

        def method_name(self, message, **kwargs):
            pass

There are many predefined consumers with deliberately extended behavior, such as WebsocketConsumer, which knows in advance that it handles websocket.connect, websocket.receive and websocket.disconnect. You can specify up front which groups the reply channel should be added to, and when you call self.send it works out whether to send to a group or to a single user.

    from channels.generic.websockets import WebsocketConsumer

    class MyConsumer(WebsocketConsumer):

        def connection_groups(self):
            # Reply channels are added to these groups on connect
            return ['chat']

        def connect(self, message):
            pass

        def receive(self, text=None, bytes=None):
            self.send(text=text, bytes=bytes)

There is also a JSON version of the WebSocket consumer: receive gets neither text nor bytes but already parsed JSON, which is convenient.

A declarative consumer is added to routing in the same way, via route_class. route_class takes the mapping defined on the consumer, collects all the channels listed there and routes each of them. There is less to write this way.
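A rough idea of what a declarative consumer does with its mapping can be shown without Channels. ToyBaseConsumer, channel_names and dispatch below are hypothetical names for this sketch; they imitate the idea of method_mapping, not the library's actual implementation.

```python
class ToyBaseConsumer:
    """Maps channel names to method names, in the spirit of method_mapping."""

    method_mapping = {}

    @classmethod
    def channel_names(cls):
        # The set of channels this consumer wants to be routed for.
        return set(cls.method_mapping)

    def dispatch(self, channel, message):
        handler = getattr(self, self.method_mapping[channel])
        return handler(message)


class ChatConsumer(ToyBaseConsumer):
    method_mapping = {
        'websocket.connect': 'on_connect',
        'websocket.receive': 'on_receive',
    }

    def on_connect(self, message):
        return 'connected'

    def on_receive(self, message):
        return 'got %s' % message['text']


consumer = ChatConsumer()
print(consumer.dispatch('websocket.receive', {'text': 'hi'}))
```

A routing helper only needs channel_names() to know which channels to subscribe the class to, which is why a single routing entry per consumer class is enough.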

Routing


Let's talk in detail about routing and what it provides us.

First, it gives us filters.

    // app.js
    const socket = new WebSocket('ws://localhost:8000/chat/');

    # routing.py
    route('websocket.connect', ws_connect, path=r'^/chat/$')

A filter can be the path taken from the URI of the web socket connection, or the HTTP request method. It can be any message field of a channel; for e-mail, for example: text, body, carbon copy, whatever. A route accepts an arbitrary number of keyword arguments.
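The filtering idea, keyword arguments treated as regular expressions over message fields, can be sketched in plain Python. make_route is a hypothetical helper written for this illustration; it is not the route function from Channels.

```python
import re


def make_route(channel, consumer, **filters):
    """Build a toy route: a channel name plus regex filters on message fields."""
    compiled = {field: re.compile(pattern) for field, pattern in filters.items()}

    def match(message_channel, content):
        if message_channel != channel:
            return None
        for field, regex in compiled.items():
            if not regex.match(str(content.get(field, ''))):
                return None
        return consumer

    return match


def ws_connect(content):
    return 'chat connect'


chat_route = make_route('websocket.connect', ws_connect, path=r'^/chat/$')

print(chat_route('websocket.connect', {'path': '/chat/'}) is ws_connect)
print(chat_route('websocket.connect', {'path': '/other/'}))
```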

Routing also supports nested routes. If several consumers share some common characteristic, it is convenient to group them and add them all to routing at once.

    from channels import route, include

    blog_routes = [
        route('websocket.connect', blog, path=r'^/stream/'),
    ]

    routing = [
        include(blog_routes, path=r'^/blog'),
    ]
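For nesting to work with the patterns above, the inner route has to see only the part of the path that follows the outer prefix. The sketch below assumes exactly that prefix-stripping behaviour; toy_route and toy_include are hypothetical stand-ins, not the functions imported from channels.

```python
import re


def toy_route(channel, consumer, path):
    regex = re.compile(path)

    def match(message_channel, message_path):
        if message_channel == channel and regex.match(message_path):
            return consumer
        return None

    return match


def toy_include(routes, path):
    prefix = re.compile(path)

    def match(message_channel, message_path):
        found = prefix.match(message_path)
        if not found:
            return None
        # Strip the matched prefix so inner routes only see the remainder.
        remainder = message_path[found.end():]
        for inner in routes:
            consumer = inner(message_channel, remainder)
            if consumer is not None:
                return consumer
        return None

    return match


def blog(message):
    return 'blog stream'


blog_routes = [toy_route('websocket.connect', blog, path=r'^/stream/')]
routing = toy_include(blog_routes, path=r'^/blog')

print(routing('websocket.connect', '/blog/stream/') is blog)
print(routing('websocket.connect', '/shop/stream/'))
```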

Multiplexing


If we open several web sockets, each with a different URI, we can attach a separate handler to each. But let's be honest: opening several connections just to make the backend look pretty is not an engineering approach.

Therefore it is possible to call several handlers over a single web socket. We define a WebsocketDemultiplexer, which works with the notion of a stream inside a single web socket. Based on the stream, it redirects your message to another channel.

    from channels.generic.websockets import WebsocketDemultiplexer

    class Demultiplexer(WebsocketDemultiplexer):
        # stream name -> channel the payload is forwarded to
        mapping = {
            'intval': 'binding.intval',
        }

The multiplexer is added to routing with route_class, like any other declarative consumer.

    from channels import route_class, route
    from .consumers import Demultiplexer, ws_message

    channel_routing = [
        route_class(Demultiplexer, path=r'^/binding/'),
        route('binding.intval', ws_message),
    ]

A stream key is added to the message so that the multiplexer can work out where to put it. The payload key contains everything that goes into the target channel after the multiplexer has processed the message.

It is important to note that the message passes through the Channel Layer twice: before the multiplexer and after it. So as soon as you start using a multiplexer, you automatically add latency to your requests.

 { "stream" : "intval", "payload" : { … } } 
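The second trip through the layer is easy to see in a toy version. The layer dictionary and the demultiplex function are illustrative, and the payload {"value": 42} is an invented example; only the stream and payload keys and the intval mapping come from the text above.

```python
import json
from collections import defaultdict, deque

layer = defaultdict(deque)  # toy channel layer: channel name -> queue

# Stream name -> target channel, mirroring the mapping in the consumer above.
mapping = {'intval': 'binding.intval'}


def demultiplex(frame_text):
    """Unpack one web socket frame and forward its payload to the mapped channel."""
    frame = json.loads(frame_text)
    target = mapping.get(frame.get('stream'))
    if target is None:
        raise ValueError('unknown stream: %r' % frame.get('stream'))
    # Second trip through the layer: this is where the extra latency comes from.
    layer[target].append(frame['payload'])


demultiplex('{"stream": "intval", "payload": {"value": 42}}')
print(layer['binding.intval'].popleft())
```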

Sessions


Each channel has its own session. It is a very handy thing, for example, for keeping state between handler calls. Sessions are keyed by the reply channel, since that is the identifier that belongs to the user. The session is stored in the same engine that stores the usual HTTP session. For obvious reasons, signed-cookie sessions are not supported: a web socket simply has no cookies.

    from channels import Group
    from channels.sessions import channel_session

    @channel_session
    def ws_connect(message):
        room = message.content['path']
        # Remember the room for later messages on this connection
        message.channel_session['room'] = room
        Group('chat-%s' % room).add(message.reply_channel)

During connection you can also get the HTTP session and use it in your consumer. The user's cookies are sent as part of the handshake that sets up the web socket connection. So you can get the user's session and the user object you are used to in Django, just as if you were working with a view.

    from channels.auth import http_session_user

    @http_session_user
    def ws_connect(message):
        # room is taken from the path, as in the previous example
        room = message.content['path']
        message.http_session['room'] = room
        if message.user.username:
            …

Message order


Channels lets you solve a very important problem. If we open a web socket connection and immediately send something over it, two events, websocket.connect and websocket.receive, occur very close together in time. It is quite likely that the consumers for these events will run in parallel. Debugging that will be great fun.

Django Channels lets you introduce two kinds of lock:

  1. Light lock. Using the session mechanism, we guarantee that no other message from a web socket is processed until the consumer for the connect message has finished. After the connection is established, the order is arbitrary and parallel execution is possible.
  2. Hard lock. Only one consumer for a given user runs at a time. This adds synchronization overhead, because it relies on the slow session engine. Still, the option is there.

    from channels.generic.websockets import WebsocketConsumer

    class MyConsumer(WebsocketConsumer):
        http_user = True
        slight_ordering = True   # light lock
        strict_ordering = False  # hard lock

        def connection_groups(self, **kwargs):
            return ['chat']

In function-based consumers this is written with decorators, like the ones we saw earlier for the HTTP session and the channel session. In a declarative consumer you simply set attributes, and they are automatically applied to every method of that consumer.
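The light lock can be imitated with a session flag and a re-queue. This is a toy sketch of the idea only: the queue, sessions and handle names are made up, and the real mechanism in Channels is more involved.

```python
from collections import deque

queue = deque()   # toy channel layer shared by all workers
sessions = {}     # reply channel -> per-connection session dict
processed = []    # order in which consumers actually ran


def handle(message):
    session = sessions.setdefault(message['reply_channel'], {})
    if message['channel'] == 'websocket.connect':
        processed.append('connect')
        session['connected'] = True  # releases the lock for later messages
    elif not session.get('connected'):
        queue.append(message)        # too early: put it back and retry later
    else:
        processed.append(message['text'])


# A receive arrives before the connect has been handled by any worker.
queue.append({'channel': 'websocket.receive', 'reply_channel': 'c1', 'text': 'hello'})
queue.append({'channel': 'websocket.connect', 'reply_channel': 'c1'})

while queue:
    handle(queue.popleft())

print(processed)
```

The price is visible even here: every early message costs an extra session lookup and another pass through the queue.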

Data binding


In its day, Meteor became famous for data binding.

Open two browsers, go to the same page, and drag a slider in one of them. At the same moment the slider on the same page in the second browser changes its value. That's cool.

    from channels.binding.websockets import WebsocketBinding
    from .models import IntegerValue  # assumed location of the model

    class IntegerValueBinding(WebsocketBinding):
        model = IntegerValue
        stream = 'intval'
        fields = ['name', 'value']

        def group_names(self, instance, action):
            return ['intval-updates']

        def has_permission(self, user, action, pk):
            return True

Django can now do the same.

This is implemented with hooks provided by Django signals. If a binding is defined for a model, every connection in the group for that model instance is notified of each event. A model instance is created, changed or deleted: all of it ends up in the notification. The notification covers the specified fields: when the value of such a field changes, a payload is built and sent over the web socket. It's convenient.

It is important to understand that if, in our example, we keep dragging the slider, messages will keep flowing and the model will be saved every time. This works up to a certain load; after that the database becomes the bottleneck.
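The signal-driven mechanism boils down to "every save triggers a hook that pushes a payload to a group". The sketch below imitates that with a hand-rolled registry; on_save, the IntegerValue class and the payload layout are all assumptions for the illustration, not Django signals or the binding's real wire format.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # toy stand-in for a post-save signal registry
group_outbox = defaultdict(list)  # group name -> payloads sent to it


def on_save(model_name):
    def register(func):
        subscribers[model_name].append(func)
        return func
    return register


class IntegerValue:
    """Hypothetical model with the two fields from the binding above."""

    def __init__(self, pk, name, value):
        self.pk, self.name, self.value = pk, name, value

    def save(self):
        # A real model would write to the database here, on every single call.
        for hook in subscribers['IntegerValue']:
            hook(self)


@on_save('IntegerValue')
def notify_group(instance):
    payload = {'pk': instance.pk, 'data': {'name': instance.name, 'value': instance.value}}
    group_outbox['intval-updates'].append(payload)


slider = IntegerValue(pk=1, name='volume', value=10)
slider.value = 11
slider.save()
print(group_outbox['intval-updates'])
```

Each call to save() stands for one database write plus one broadcast, which is why a continuously changing value eventually hits the database limit first.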

Redis layer


Let's talk a little more about how the most popular Channel Layer for production, Redis, is built.

It is sensibly designed.


A channel is simply a Redis list of ids. Under each id lies the content of a particular message. This is done so that the lifetime of every message and every channel can be controlled separately. In principle, this is logical.

    >> SET "b6dc0dfce" " \x81\xa4text\xachello"
    >> RPUSH "websocket.send!sGOpfny" "b6dc0dfce"
    >> EXPIRE "b6dc0dfce" "60"
    >> EXPIRE "websocket.send!sGOpfny" "61"
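The effect of these four commands can be imitated in memory to show why the message and the channel each get their own expiry. ToyRedis is a made-up class for this sketch; it is not a Redis client, and the explicit now argument exists only to make time controllable.

```python
import uuid


class ToyRedis:
    """Tiny in-memory imitation of the four commands shown above (not a Redis client)."""

    def __init__(self):
        self.data = {}
        self.deadlines = {}

    def _alive(self, key, now):
        return key in self.data and self.deadlines.get(key, float('inf')) > now

    def send(self, channel, body, now, expiry=60):
        message_id = uuid.uuid4().hex[:9]
        self.data[message_id] = body                           # SET id body
        self.data.setdefault(channel, []).append(message_id)   # RPUSH channel id
        self.deadlines[message_id] = now + expiry              # EXPIRE id 60
        self.deadlines[channel] = now + expiry + 1             # EXPIRE channel 61
        return message_id

    def receive(self, channel, now):
        if not self._alive(channel, now):
            return None
        while self.data[channel]:
            message_id = self.data[channel].pop(0)             # take the oldest id
            if self._alive(message_id, now):
                return self.data.pop(message_id)
        return None


r = ToyRedis()
r.send('websocket.send!sGOpfny', b'payload', now=0)
print(r.receive('websocket.send!sGOpfny', now=30))   # delivered within its lifetime
r.send('websocket.send!sGOpfny', b'late', now=100)
print(r.receive('websocket.send!sGOpfny', now=200))  # both keys have expired
```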

Groups are implemented as sorted sets. Delivery to a group is performed inside a Lua script, which is very fast.

    >> type group:chat
    zset
    >> ZRANGE group:chat 0 1 WITHSCORES
    1) "websocket.send!sGOpfny"
    2) "1476199781.8159261"

Problems


Let's see what problems this approach has.

Callback hell


The first problem is a newly reinvented callback hell. It is important to understand that most of the problems you will run into with channels look like this: a consumer received arguments it did not expect. Where they came from and who put them into Redis is a dubious investigation to undertake. Debugging distributed systems is, in general, for the strong in spirit. AsyncIO solves this problem.

Celery


People on the Internet write that Django Channels is a replacement for Celery.

I have bad news for you - no, it is not.

In channels:


I see the future in official support for using Channels and Celery together, with minimal cost and minimal effort. But Django Channels is not a replacement for Celery.

Django for modern web


Django Channels is Django for the modern web. It is the same Django we are all used to: synchronous, declarative, with lots of batteries included. Django Channels is just one more battery. You should always understand where to use it and whether to use it at all. If a project does not need Django, it does not need Channels either. They are only useful in projects where Django itself is justified.


Source: https://habr.com/ru/post/418445/

