
Impressions of visiting EuroPython 2014

One of the distinguishing features of the Python world is the conferences dedicated to the language, the so-called PyCons. Not long ago I was able to attend one of them: EuroPython 2014. EuroPython is one of the largest annual European conferences on Python; for the three previous years it was held in Florence, and in 2014, for the first time, in Berlin. While the memories are fresh, I decided to write a short report on what it was like.

Instead of an introduction


Let me say right away that this will be exclusively impressions and short takeaways, with no detailed retelling of the talks' content, since if you really want to you can watch all of them on YouTube: the organizers not only made no commercial secret of the talk videos, they even organized a live broadcast of all of them (by the way, videos from previous years' conferences can also be found in the public domain on the same YouTube).

One more thing: far from all the talks were directly about Python. That is, a talk would often review some useful technology, with only a little said about how that technology could be used in the Python world. So if, while reading some paragraph of this opus, the thought "so where is the Python here? o_O" crosses your mind, I advise you to go straight to the video: everything will be there.

To begin with: practically every day of the conference was organized to the same schedule: keynotes in the morning, then talks of 20-45 minutes each (with a lunch break and coffee breaks), and towards evening, Lightning Talks. It is worth explaining in a bit more detail what keynotes and Lightning Talks are.
Keynotes are talks that are not very technical, with a great deal of philosophy. In my opinion there is little practical use in them, so I will skip them in my narration.

As for Lightning Talks, these are long sessions, about 1.5 hours each, during which anyone could come up and speak. Each speech was given about 10 minutes. Among these mini-talks there was quite a lot of flame (advertisements for people's own products, announcements of all sorts of events like a PyCon in Brazil, some general philosophical musings, etc.), so in my story I will try to cover only those presentations that seemed the most useful and interesting to me.

Day one (Python vs Haskell)


Since the conference opening took place on the first day, there were few talks, and not much that was really useful. The most important talk of the day: what Python can learn from Haskell . In fact, the talk dealt not only with Haskell but also a little with Erlang, but that is not the point. Its main idea was that static code analyzers never catch errors like 1 + "1", and that the culprit is Python's dynamic, strict, implicit typing, which leads to refactoring problems, etc. The proposed solutions are to use annotations (hello, Python 3) and to use mypy , an experimental static type checker for Python that lets you specify the types of function arguments at the syntax level of the language. That is, you can write like this:
def fib(n: int) -> None:
    a, b = 0, 1
    while a < n:
        print(a)
        a, b = b, a + b

and this is valid Python 3 syntax that the interpreter handles just fine, while mypy checks the types. The thing is certainly interesting, but again, it only works for Python 3 code. I tried to find mypy in the standard Debian repositories and failed, and I am too lazy to build it by hand. It would probably be more useful if it were a bit more widespread and had IDE-level support, etc. (by the way, the speaker actively called for contributions to the project). It was also said that mutability is evil and that Python's support for algebraic data types is weak. All of that, in my opinion, is very, very debatable. Nevertheless, I recommend watching the video of the talk, at least to have an idea of what is happening in other languages (and, of course, to be prepared for a well-argued dispute in holy wars à la "which language is better").
Video of the report 'What can python learn from Haskell?'
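To make the 1 + "1" point concrete, here is my own minimal sketch (not from the talk) of the kind of bug a checker like mypy catches statically, while plain CPython silently computes '11':

def double(x: int) -> int:
    # The annotations let a static checker verify every call site.
    return x + x

double("1")  # CPython happily returns '11'; mypy rejects the call
             # with something like: incompatible type "str"; expected "int"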

I also remember one talk from the Lightning Talks: a guy (from Russia, by the way) was promoting his library called Architect , whose main selling point is automatic partitioning of database tables through an ORM (Django, SQLAlchemy and Pony models are supported). As for databases: MySQL and PostgreSQL. For people who work with these databases, this may well come in handy.
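As far as I can tell from the library's documentation, using it boils down to decorating a model; a rough sketch for Django (the decorator arguments here are my assumptions, check the docs before relying on them):

import architect
from django.db import models

# Ask Architect to range-partition the table by month on 'created'.
@architect.install('partition', type='range', subtype='date',
                   constraint='month', column='created')
class Event(models.Model):
    created = models.DateTimeField()

After that, if I read the docs correctly, a console command (architect partition --module path.to.models) creates the actual partitions in the database.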

Day two (nix, Kafka, Storm, Marconi, Logstash)


There was quite an interesting talk about the nix package manager. There is, in fact, a whole distribution built on top of this package manager, called NixOS . Its usefulness, to be honest, seems somewhat doubtful to me, but the nix package manager itself can be quite useful in some cases (especially considering that it does not forbid using the system's main package manager, i.e. yum or apt, alongside it). The main feature of this package manager is that all the operations it performs are non-destructive. That is, roughly speaking, when a new package is installed, the previous version is not overwritten; instead a new user environment is created with a new set of symlinks. This allows several versions of the same package to coexist, lets you roll back an installation atomically, and gives each user an independent environment.
Among the minuses: if you keep all versions of all packages with all their dependencies, you will of course need more disk space, and on top of that you get some package redundancy. In my opinion these are drawbacks one can live with. The talk also briefly covered how to build your own packages for nix, Python packages in particular. In general, if dependency hell is your problem, nix lets you solve it quite elegantly.
Video of the report 'Rethinking packaging, development and deployment'

On the same day there was a talk about stream processing of large volumes of data using Kafka and Storm . The only useful thing I took away from it: Storm is great for processing continuous streaming data (as opposed to static data, unlike Hadoop), while Kafka leaves RabbitMQ far behind in sheer message throughput (100k+ messages per second per node versus 20k/sec in RabbitMQ), but loses to it in the flexibility of routing messages among consumers. In the context of the talk the two technologies were considered together, with Kafka acting as the transport that delivers messages to Storm .
Video of the report 'Designing NRT (NearRealTime) stream processing systems'
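For a sense of the Kafka side, a minimal producer sketch of my own (assuming a recent version of the third-party kafka-python package and a broker on localhost; this is not code from the talk):

from kafka import KafkaProducer

# Publish a message to a topic; Storm (or any other consumer)
# can then read from that same topic.
producer = KafkaProducer(bootstrap_servers='localhost:9092')
producer.send('events', b'hello from python')
producer.flush()  # block until the broker has the message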

There was a good introductory talk about Marconi , the messaging system within OpenStack (for those who do not know: OpenStack is written entirely in Python). Marconi is used as the glue between the components of an OpenStack cloud, and also as a standalone notification service; it is, in effect, a direct analogue of Amazon's SNS and SQS. It provides a RESTful API and can use MongoDB, Redis, or SQLAlchemy as the message store (though SQLAlchemy was not recommended for production for performance reasons); there is no AMQP support, but they plan to add it in the future.
Video of the report 'Marconi - OpenStack Queuing and Notification Service'
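From what I remember of Marconi's v1 HTTP API, posting a message looked roughly like this (the port, paths and header names below are my recollection of the docs, so treat them as assumptions):

import json
import uuid
import requests

# Marconi expects every client to identify itself with a Client-ID.
headers = {'Client-ID': str(uuid.uuid4()),
           'Content-Type': 'application/json'}

# Post one message with a 5-minute TTL to the "demo" queue.
payload = [{'ttl': 300, 'body': {'event': 'something happened'}}]
resp = requests.post('http://localhost:8888/v1/queues/demo/messages',
                     headers=headers, data=json.dumps(payload))
print(resp.status_code)  # 201 on success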

There was also a talk about Logstash / Elasticsearch / Kibana , a set of mega-useful tools for collecting, filtering, storing, aggregating and displaying logs. The usefulness of Logstash, by the way, was mentioned several times by different people in different talks. Personally, I heard nothing particularly new in this one. One of the ideas presented was how to use Logstash to pull together all the log records belonging to a single request, and more generally to correlate, by a common key, the logs from all components of a distributed system. An interesting logging library called Logbook was also mentioned during the talk; judging by its description, it is a worthy alternative to Python's standard logging library.
Video of the report 'Log with logstash and elasticsearch'
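A tiny taste of Logbook, sketched by me from its documentation (not from the talk): handlers are pushed onto a stack or used as context managers rather than configured globally, and messages use new-style formatting.

import sys
from logbook import Logger, StreamHandler

# Route all records to stdout for this application.
StreamHandler(sys.stdout).push_application()

log = Logger('my-component')
log.info('request {} handled in {:.1f} ms', 'abc123', 14.2)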

Day three (Sphinx, gevent, DevOps risk mitigation)


The third day began with a talk on writing multilingual Sphinx documentation . This one was very useful for me personally, because in the project I work on there was a task to support two language versions of the API documentation, English and Russian, while keeping the process as simple and transparent as possible. In fact, everything is quite simple. There is the wonderful GNU utility gettext , which is actively used for the internationalization of various open-source projects (I assume everyone knows gettext without further explanation), and there is the wonderful sphinx-intl package. Starting from rst-based Sphinx documentation, simple commands prepare *.po files, which are then translated in a gettext editor, and from which the Sphinx documentation for the chosen language is built. The talk also mentioned Transifex , a SaaS service that makes translators' work easier. As I understand it, the general principle of the service is this: with simple console utilities you can upload translation files to the service and download them back, while the service gives translators a convenient web interface for translating the texts; the console utilities work on a git push/pull-like principle. The service is not free. I think those interested (who have run into the internationalization problem) do not even need to watch the video; just leafing through the slides is enough to understand everything.
Video of the report 'Writing multi-language documentation using Sphinx'

Among the interesting talks of that day: a talk about gevent (I considered it important to attend, because the WebDAV service project I work on is built a little more than entirely on gevent). In fact, nothing fundamentally new was said: they started with an introduction to the ways asynchrony is implemented in Python and finished with gevent itself. If you do not know what gevent is, this talk may be interesting; for those already familiar with the technology, hardly. Interesting things I did pick up: 1. a web microframework built entirely on gevent, with PostgreSQL support; 2. an AMQP library , also built entirely on gevent.
Video of the report 'Gevent: asynchronous I / O made easy'
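For those who have never seen gevent, a minimal sketch of mine (not from the talk) of the style it enables: blocking stdlib calls are monkey-patched into cooperative ones, so sequential-looking code runs concurrently.

import gevent
from gevent import monkey

# Patch blocking stdlib I/O so it yields to other greenlets;
# this must happen before the networking modules are imported.
monkey.patch_all()

import urllib2  # Python 2, to match the era of the talk

def fetch(url):
    body = urllib2.urlopen(url).read()
    print('%s: %d bytes' % (url, len(body)))

# Both downloads run concurrently inside one OS thread.
jobs = [gevent.spawn(fetch, u)
        for u in ('http://example.com', 'http://example.org')]
gevent.joinall(jobs)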

There was also a very interesting talk, "DevOps Risk Mitigation: Test Driven Infrastructure" , about infrastructure testing as part of the deployment process. There is, in fact, no magic: an RPM is built and rolled out to test machines, and then, automatically, via rsh, we log into those machines and test everything we can, from the HTTP proxy to the log collection system. The speaker, a very colorful old-school administrator, as I understood it, does not recognize any of these puppet/chef/salt tools, but he is firmly committed to the idea that to maintain product quality, tests should cover more than just the code. In my opinion the idea is right and really is something to strive for, perhaps not by exactly the means proposed in the talk, but nonetheless. A must-see for everyone doing DevOps.
Video of the report 'DevOps Risk Mitigation: Test Driven Infrastructure'

Day four (source protection, SOA from Disqus, abstract debugger architecture, dh-virtualenv)


The day began with the most remarkable talk, "Multiplatform binary packaging and distribution of your client apps" . I think many programmers who write commercial applications have at least once in their life thought about the problem: "they can copy and read our code!". In other words, the task arises of delivering the product in encrypted form, or at least in the form of binaries from which it is quite problematic to extract and modify the source code. By the way, Dropbox, whose desktop client is written in Python, solves this problem in a rather painful way: they ship a patched build of the Python interpreter in the installer, one that can read encrypted *.pyc files. The solution proposed in the talk consists of four stages.

For more details I recommend watching the video and the slides ; each of the four stages is analyzed there in more depth, with code samples of how and what to do. And of course it is worth making the reservation that obfuscation of this kind does not save you from reverse engineering: a person can simply import the obfuscated package and walk through the names of its methods and variables. On this topic, by the way, I also recommend reading this article (it is mentioned in the talk).
Video of the report 'Multiplatform binary packaging and distribution of your client apps'
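What "walking through the names" means in practice, as a trivial sketch of my own: introspection works on any importable module, no matter how it was built, so the public surface is always visible.

import json  # stand-in here for any importable "protected" package

# dir() lists every attribute name the module exposes:
print(dir(json))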

Next came a very good talk from one of the developers of Disqus . It discussed the benefits of SOA architecture using the Disqus service as the example. Disqus is built a little more than entirely on Django; more precisely, it is split into a whole heap of small microservices (REST API, workers, cron jobs, etc.), each of which is built on Django. The speaker, by the way, explained why Django and not something else: a large community, lots of ready-made solutions, plus it is much easier to find specialists. As for the technology stack, Disqus uses uwsgi, Django, Celery, PostgreSQL as the database and Redis for the cache. To share common code between the microservices, as I understand it, they build separate Python packages. The pros and cons of the SOA approach were also listed; for the details, see the video.

Video of the report 'How Disqus is using Django as the basis of our Service Oriented Architecture'

Python Debugger Uncovered was a very cool talk from a PyCharm developer. I advise all backend developers to watch it for general erudition: how an abstract debugger in a vacuum works. There is no high magic there; all debuggers are built on the same principle, using the native facilities of the Python language itself. For reference, by the way, the PyCharm and PyDev debuggers have been merged.
Video of the report 'Python Debugger Uncovered'
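The native facility in question is CPython's tracing hook; here is my own minimal sketch (not from the talk) of the mechanism on which stepping and breakpoints are built:

import sys

def tracer(frame, event, arg):
    # CPython calls this for 'call', 'line', 'return' and
    # 'exception' events in every traced frame.
    if event == 'line':
        print('executing %s:%d' % (frame.f_code.co_filename,
                                   frame.f_lineno))
    return tracer  # returning the tracer keeps line events coming

def demo():
    x = 1
    y = x + 1
    return y

sys.settrace(tracer)
demo()
sys.settrace(None)  # switch tracing back off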

There was also a very worthwhile talk that day about the dh-virtualenv tool from Spotify. Spotify uses Debian as the basis of its production OS, and the goal of this utility is to combine deploying a project as a deb package with an encapsulated virtualenv. The general idea: on the one hand, Debian is hellishly stable, and deb packages are convenient in that they let you declare all the non-Python dependencies (like libxml); on the other hand, virtualenv is convenient in that it isolates the Python dependencies, all of which will be the freshest versions, since they are taken from PyPI. The dh-virtualenv tool combines the one with the other and, roughly speaking, automatically builds a deb package out of the current deployed virtualenv. It is installed, by the way, via the usual apt-get. Inside the project, next to setup.py and requirements.txt, a debian directory is created that describes the metadata and dependencies of the deb package (rules, control, etc.), and the package itself is built with the console command dpkg-buildpackage -us -uc. A virtualenv on the target qa/prod machine is not needed, because it is downloaded and packaged automatically while the package is being built.
Video of the report 'Packaging in packaging: dh-virtualenv'

Personally, the Lightning Talks of that day stuck in my memory thanks to one very interesting mini-talk about why you should not abuse getattr().
Code example:
import random

class A(object):
    def get_prop(self):
        return getattr(self, 'prop', None)

class B(A):
    @property
    def prop(self):
        # NB: 'chioce' is the example's deliberate typo
        return random.chioce(['test prop1', 'test prop2', 'test prop3'])

print(B().get_prop())
This code will always print None: the AttributeError raised inside the property (because of the misspelled name random.chioce) is swallowed by getattr(), which silently returns the default instead.

Day five (peculiarities of working with memory, DB API, making Go out of Python)


The talk "Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask" — for me, a person quite far from C/C++ and used to thinking about more mundane matters, it was very interesting to listen to. Some things I already knew, some I refreshed in my memory. I will not dwell on the details; I will just say that it was especially interesting to hear about the existing tools with real practical use ( objgraph , the guppy memory profiler, etc.), and about the fact that you can plug different low-level malloc() implementations into Python and what profit those replacements give. In general, I personally recommend this talk to everyone. On the same day there was another cool talk on a similar topic, "Fun with cPython memory allocator" . Unfortunately I did not go to it, but judging by my colleagues' feedback, it is very worthwhile. Many have probably run into the problem: you create a list of a huge number of small strings in Python, then delete it, but the memory usage does not go down. That talk tells how this happens, why, and how to deal with it.
Video of the report "Everything You Always Wanted"

Video of the report 'Fun with cPython memory allocator'

Next came the highly controversial talk "Advanced Database Programming with Python" . For those who have not worked much with databases in practice, I recommend listening: you will learn about things like transaction isolation levels and how they differ from each other, about Python-specific aspects of working with databases (per PEP 249, autocommit is effectively off, and you must not forget to write commits manually), and about some basics of query optimization. The talk is ambiguous because the author focuses on a set of very rare optimizations, such as generating the ID of an inserted record in Python rather than relying on the database's auto_increment/sequences. That is all well and good, but experience shows that, having heard such talks, some programmers start prematurely optimizing everything and everyone, and in 99% of cases that leads to very sad consequences.
Video of the report 'Advanced Database Programming with Python'
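The PEP 249 point in a nutshell, as a minimal sketch of mine with psycopg2 (any DB-API 2.0 driver behaves the same way; the connection parameters are placeholders):

import psycopg2

conn = psycopg2.connect('dbname=test user=test')
cur = conn.cursor()
cur.execute("INSERT INTO events (name) VALUES (%s)", ('signup',))

# DB-API connections open a transaction implicitly and do NOT
# autocommit: without this line the insert is rolled back on close.
conn.commit()
conn.close()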

And the last was a talk by Benoit Chesneau, the creator of the gunicorn web server. It reviewed the 100500 existing approaches to concurrency in Python, plus a new, 100501st one: the offset library, which brings the goroutine functionality of the Go language into Python. During the talk I dug a little into the guts of this library: apparently it rests on a lower-level coroutine implementation from the fibers library, with offset itself adding higher-level wrappers on top. Roughly speaking, it makes writing Python programs look much like what they would look like in Go. In the talk the author gives side-by-side examples of the same abstract task implemented in Go and in Python with offset . In general, for anyone who finds the existing functionality of threads, tornado/twisted, asyncio, gevent and the multiprocessing module insufficient, this library may be very interesting. There is not much point in listening to the talk itself; better to go straight to the code on GitHub and experiment.
Video of the report 'Concurrent programming with Python and my little experiment'
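Roughly what Go-style code looks like with offset, reconstructed from my memory of the project's README (treat the exact API as an assumption and check the repository):

from offset import go, chan, maintask, run

def produce(c):
    # send() blocks until someone receives, just like in Go
    for i in range(3):
        c.send(i)

@maintask
def main():
    c = chan()          # an unbuffered channel
    go(produce, c)      # spawn a "goroutine"
    for _ in range(3):
        print(c.recv())

run()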

The final Lightning Talks of that day stuck in my memory thanks to a talk about HSTS, a very useful thing that very few people know about. In essence it is an HTTP response header that tells the browser to always force an HTTPS connection for the given hostname. That is, if the user later types some-url.com into the browser, the browser substitutes https automatically. It is useful both for security reasons and for reducing the number of HTTP-to-HTTPS redirects the server has to return.
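Emitting the header from Python takes one line; a minimal WSGI sketch of mine (the max-age of one year in seconds is just a common example value):

def app(environ, start_response):
    headers = [
        ('Content-Type', 'text/plain'),
        # Tell the browser to use HTTPS for this host for the
        # next year, subdomains included.
        ('Strict-Transport-Security',
         'max-age=31536000; includeSubDomains'),
    ]
    start_response('200 OK', headers)
    return [b'hello over https\n']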

Conversations on the sidelines


There was a huge number of booths at the conference from companies related to Python in one way or another (Google, Amazon, DjangoCMS, JetBrains, Atlassian, etc.). You could walk up to any of them and chat about all sorts of interesting questions. We talked a lot with the guys from Google (though not at the conference itself but at the Google after-party). Among the interesting bits: they use Python mainly in internal products, well, except perhaps YouTube. They also told us "in secret" that Google developers are not very fond of BigTable, and that Google's labs are preparing to release a new revolutionary database (codename Spanner ) that allows distributed transactions across a cluster while keeping all the advantages of NoSQL. According to rumors it may even be open source (about which, of course, there are big, big doubts).

We also talked with representatives of DjangoCMS (nothing interesting there: a plain, uncomplicated CMS on Django; you can install it on your own server or use the SaaS offering) and with representatives of Amazon. Regarding the latter, I asked them a question raised at the Highload conference in 2012, about the fact that the network bandwidth of instances differs considerably and disproportionately to the instance type (see the presentation, slide 25), but the answer I received was "well, that is the specificity of virtualization, we cannot say why; contact support". By the way, many will find this curious: the guys from Amazon were handing out questionnaires with Python-related questions. I no longer remember whether prizes were awarded or whether they were headhunting this way. In general, the questions were very specific, from the category of "those questions that never come up in practice but that they love to ask at interviews in big companies":

1. Which is called first when creating an object:
a. __create__
b. __new__
c. __init__
d. __del__

2. What is printed by the last statement in:
def foo(x, l=[]):
    l += 2 * x
    print l

foo('a')
foo('bc')
a. ['a', 'b', 'c']
b. ['a', 'bc']
c. ['a', 'a', 'b', 'c', 'b', 'c']
d. ['a', 'a', 'bc', 'bc']

3. What does the last statement print?
class A(str):
    pass

a = A('a')
d = {'a': 42}
d[a] = 42
print type(d.keys()[0])
a. str
b. A
c. dict
d. int

4. What will these two statements return on Python version 2?
5 * 10 is 50
100 * 100 is 10000
a. True, True
b. True, False
c. False, True
d. False, False

Overall, my impressions of the conference are very positive. The organizers' main bet was on communication in the hallways. In fact, this is the first conference in my memory where the talks were posted to YouTube right away. And although I would rate the level of most talks as average, quite a lot of interesting things were said that can, one way or another, be applied in real projects.

Source: https://habr.com/ru/post/236119/

