
Rails 4 - Thread-Safety



In Rails 4.0, the config.threadsafe! option will be enabled by default. In this lesson you will learn what it actually does, how it affects production, and how to work with threads in general.



This article is part of the series "The Subtleties of Rails 4".



Some time ago, Aaron Patterson (tenderlove) published a post on his blog in which he discussed the threadsafe option in Rails and mentioned that it will most likely be enabled by default in the Rails 4 release. That is what this episode is about.

First, create a simple Rails 3 application called thready:
terminal
$ rails new thready
$ cd thready

Opening the application's production configuration file shows that the option is commented out:

/config/environments/production.rb
# Enable threaded mode
# config.threadsafe!

It is important to understand that enabling it does not magically turn your application into a multi-threaded one. What does it do, then? Looking at the source code of the threadsafe! method, you can see that it simply sets a handful of options:

def threadsafe!
  @preload_frameworks = true
  @cache_classes = true
  @dependency_loading = false
  @allow_concurrency = true
  self
end


The first three options are responsible for so-called eager loading of the application. When the application starts, it is loaded all at once, instead of being loaded piece by piece on demand as happens when threadsafe is disabled.

The fourth option, allow_concurrency, removes the Rack::Lock middleware. Running the rake middleware command in development mode shows that Rack::Lock is one of the first middleware in the stack:

terminal
$ rake middleware
use ActionDispatch::Static
use Rack::Lock
# ...

If you enable the threadsafe! option in production and run:

 $ rake middleware RAILS_ENV=production 

the middleware will no longer be listed. To answer the question "What does Rack::Lock actually do?", look at its source code (note: the file has changed since this episode was recorded):

/lib/rack/lock.rb
require 'thread'
require 'rack/body_proxy'

module Rack
  class Lock
    FLAG = 'rack.multithread'.freeze

    def initialize(app, mutex = Mutex.new)
      @app, @mutex = app, mutex
    end

    def call(env)
      old, env[FLAG] = env[FLAG], false
      @mutex.lock
      response = @app.call(env)
      response[2] = BodyProxy.new(response[2]) { @mutex.unlock }
      response
    rescue Exception
      @mutex.unlock
      raise
    ensure
      env[FLAG] = old
    end
  end
end

When a request arrives at the call method, the mutex is locked, and only then is the request processed; the lock is released when the response body has been served. This approach guarantees that only one request is processed at a time. To see the process in action, create a new controller with one action:

terminal
 $ rails g controller foo bar 

In the generated action, the current thread sleeps for one second, after which some text is rendered:

/app/controllers/foo_controller.rb
class FooController < ApplicationController
  def bar
    sleep 1
    render text: "foobar\n"
  end
end

Now start the server in development mode. Note that WEBrick is used (it is the default server). Then, in a separate terminal tab, make a request to the site with curl:

terminal
$ curl http://localhost:3000/foo/bar
foobar

There was a one-second delay before the message appeared, as expected. Now make 5 simultaneous requests. The & symbol runs each curl in the background, so the requests are sent in parallel:

terminal
% repeat 5 (curl http://localhost:3000/foo/bar &)
%
foobar
foobar
foobar
foobar
foobar

You cannot see it here, but these 5 responses arrived one after another, taking about 5 seconds in total. That is the sequential request processing described above. Now start the server in production mode and again make 5 parallel requests:

terminal
% rails s -e production
% repeat 5 (curl http://localhost:3000/foo/bar &)

All 5 responses arrive at practically the same time, because with threadsafe! enabled the Rack::Lock middleware is no longer used. Requests are now processed concurrently, hooray!

Does all this mean that with multi-threading enabled you now need to start writing thread-safe code? In fact, it depends on your production setup. Most popular Rails servers, such as Unicorn and Phusion Passenger, send only one request at a time to each worker. In other words, even with the option enabled, requests are still processed one by one.

You can see how Unicorn will behave. To do this, you need to uncomment the following line:

/Gemfile
# Use unicorn as the app server
gem 'unicorn'

Then run bundle install. Now you can start the Rails server in production mode with the unicorn command:

terminal
 $ unicorn -E production -p 3000 

Running curl again shows that there is no multi-threading: this is how Unicorn works with a single worker. Unicorn needs no extra mutex or Rack::Lock middleware, and this is exactly why the threadsafe! option will be enabled by default in production: the logic of handling requests and threads is left to the production environment. But do not forget that enabling this option also turns on eager loading of the entire application, which can cause side effects. So be sure to thoroughly test your application on a staging server before deploying it to production!

One more small note: the threadsafe! option may be renamed in the Rails 4 release to better reflect what it actually does.

So, we already know that Unicorn and Passenger do not process requests in multiple threads, but what if you want a server that does? Puma comes to the rescue. This server is based on Mongrel, so it can run any Rack application using multiple threads. Puma supports JRuby, Rubinius, and even MRI. Let's try it:

/Gemfile
# Use unicorn as the app server
# gem 'unicorn'
gem 'puma'

Now start the server in production mode:

terminal
 $ rails s puma -e production 

Running curl now shows that the responses from the server arrive almost instantly.

Puma will not work as well under MRI because of the Global Interpreter Lock built into that interpreter. JRuby and Rubinius have better thread support, so Puma performs better on them.

Of course, when using multithreading, you need to carefully monitor the code in your application and its security. Here is a small example of unsafe code:

/app/controllers/foo_controller.rb
class FooController < ApplicationController
  @@counter = 0

  def bar
    counter = @@counter
    sleep 1
    counter += 1
    @@counter = counter
    render text: "#{@@counter}\n"
  end
end


The controller has a class variable @@counter, which is incremented on each call to the bar action. The action reads the value of the class variable, sleeps for a second, increments the saved value, writes it back, and renders it. Let's see how this works in a single-threaded environment:

terminal
 $ rails s 

Run curl 4 times; 4 numbers appear, each after a one-second delay:

terminal
% repeat 4 (curl http://localhost:3000/foo/bar &)
%
1
2
3
4

Now stop the server and start Puma in production mode:

terminal
 $ rails s puma -e production 

This time the output is different:

terminal
% repeat 4 (curl http://localhost:3000/foo/bar &)
%
1
1
1
1

Now the requests are processed simultaneously in multiple threads: the last request started before the first one had finished its work, so every request read @@counter before anyone wrote it back. Therefore you should be very careful with data (such as @@counter) that several threads can access. To fix the problem, use a mutex:

/app/controllers/foo_controller.rb
class FooController < ApplicationController
  @@counter = 0
  @@mutex = Mutex.new

  def bar
    @@mutex.synchronize do
      counter = @@counter
      sleep 1
      counter += 1
      @@counter = counter
    end
    render text: "#{@@counter}"
  end
end

The code wrapped in the mutex is now thread-safe. Restart the application server and run curl again: the counter works correctly, because the code inside the mutex is executed by one thread at a time.

Things like class variables, shared object state, global variables, and constants must only be touched inside a mutex. Despite interpreter warnings, constants in Ruby can be modified, and so can strings; for strings, use the freeze method to prevent changes. And remember that a class itself is shared memory, so do not define methods on a class dynamically after the application has loaded.

Fortunately, making your application thread-safe is not as difficult as it may seem at first. As a rule, you rarely need to share mutable state between requests; when you do, look at the established ways of doing it, as there may well be more elegant approaches.

It can be harder to make sure that all the gems your application uses are thread-safe. Read the README of each library; if the information is not there, it is worth opening a ticket in its issue tracker.

There is one more problem you can run into in a multi-threaded application, and it concerns the database connection pool. The pool limits the number of simultaneous connections to the database and defaults to 5. For this demonstration, the mutex is commented out:

/app/controllers/foo_controller.rb
def bar
  # @@mutex.synchronize do
    counter = @@counter
    sleep 1
    counter += 1
    @@counter = counter
  # end
  render text: "#{@@counter}\n"
end


If you try to make 12 simultaneous requests to the application, you will see that only 4 of them succeed, because of the limited number of connections. Even though this action makes no database queries, Rails still reserves a connection for the duration of each request. If there are not enough connections, a request waits until it gets one; and if requests take too long, timeout errors start appearing and other requests also fail for lack of available connections in the pool.
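The pool size is configured in the database configuration file; a sketch (the adapter and values here are illustrative):

/config/database.yml

```yaml
# The pool option caps simultaneous database connections per process.
# It should be at least as large as the number of threads that may
# need a connection at the same time.
production:
  adapter: sqlite3
  database: db/production.sqlite3
  pool: 15
  timeout: 5000
```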

After increasing the pool value to 15, repeat the previous curl command: all requests now succeed.

Thanks for your attention!

Permission for publication was received from the author (since this is a translation of a paid screencast). Please report any inaccuracies or errors via PM.






Subscribe to my blog!

Source: https://habr.com/ru/post/172109/

