
Merging Rails and Merb: Performance (Part 2 of 6)

Six consecutive articles on the merger of Rails and Merb were published on www.engineyard.com from December 2009 to April 2010. This is the second article. The first is here.

The next big improvement we hoped to bring from Merb to Rails was better performance. Because Merb came after Rails, Merb's developers had the chance to see which parts of Rails were used most often and to optimize those.

We wanted to take the performance-enhancing changes from Merb and carry them over to Rails 3. In this post, I’ll cover a couple of the optimizations added to Rails 3: reduced controller overhead and (greatly) faster rendering of a collection of partials.

To begin with, we focused on the performance of several specific, but widely used parts of Rails:

This was admittedly a rough cut, but it covered most cases where performance could be significantly better yet the Rails developer could not improve it on their own.

Controller overhead


The first step was to reduce the overhead of the Rails controller itself. In Rails 2.3 there is no way to measure it in isolation, because you have to use render :string to send text back to the client, which means going through the entire render process. Even so, we wanted to reduce it as much as possible.

To do this, we used Stefan Kaes' fork of ruby-prof, which comes with CallStackPrinter (the best way to visualize profile data from a Ruby application that I've seen). We also wrote several benchmarks that could double as profiler runs, so that we could focus on a particular area and get more accurate data.

When we looked at where the controller spent its time, it turned out that most of it went into building the response. Digging deeper, we saw that ActionController set the headers directly and then parsed them again before returning the response to extract additional information. A good example is the Content-Type header, which has two components (the content type itself and the optional charset). Both components were available on the Response object via a getter and a setter:

    def content_type=(mime_type)
      self.headers["Content-Type"] =
        if mime_type =~ /charset/ || (c = charset).nil?
          mime_type.to_s
        else
          "#{mime_type}; charset=#{c}"
        end
    end

    # Returns the response's content MIME type, or nil if content type has been set.
    def content_type
      content_type = String(headers["Content-Type"] || headers["type"]).split(";")[0]
      content_type.blank? ? nil : content_type
    end

    # Set the charset of the Content-Type header. Set to nil to remove it.
    # If no content type is set, it defaults to HTML.
    def charset=(charset)
      headers["Content-Type"] =
        if charset
          "#{content_type || Mime::HTML}; charset=#{charset}"
        else
          content_type || Mime::HTML.to_s
        end
    end

    def charset
      charset = String(headers["Content-Type"] || headers["type"]).split(";")[1]
      charset.blank? ? nil : charset.strip.split("=")[1]
    end

As you can see, the Response object worked directly with the Content-Type header, parsing out part of it whenever needed. This was particularly problematic because Response did extra work on the headers while preparing the response to be sent to the client:

    def assign_default_content_type_and_charset!
      self.content_type ||= Mime::HTML
      self.charset ||= default_charset unless sending_file?
    end

That is, before sending the response, Rails split the Content-Type header on the semicolon again and then did more string work to join the pieces back together. And of course, Response#content_type= was also used in other parts of Rails to set the header according to the template type or inside respond_to blocks.

This did not add hundreds of milliseconds to a request, but in heavily cached applications the overhead could exceed the cost of fetching objects from the cache and returning them to the client.

The solution in this case was to store the content type and charset as fields on the response object and combine them in one quick operation when preparing the response:

    attr_accessor :charset, :content_type

    def assign_default_content_type_and_charset!
      return if headers[CONTENT_TYPE].present?

      @content_type ||= Mime::HTML
      @charset ||= self.class.default_charset

      type = @content_type.to_s.dup
      type << "; charset=#{@charset}" unless @sending_file

      headers[CONTENT_TYPE] = type
    end

Now we just read the instance variables and build a single String. Several changes along these lines reduced the overhead from approximately 400 microseconds to 100 microseconds. That is not a lot of time, of course, but it could really hurt performance-sensitive applications.
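To make the difference concrete, here is a minimal plain-Ruby sketch of the two strategies (no Rails required). The class names HeaderBackedResponse and FieldBackedResponse and the helper final_header are hypothetical, "text/html" and "utf-8" stand in for Mime::HTML and the default charset, and the file-sending case is left out. It is not the Rails code itself, and the timings it prints will depend on your machine and Ruby version.

```ruby
require "benchmark"

# Old style (simplified): the header string is the only storage,
# so every read has to split and parse it again.
class HeaderBackedResponse
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def content_type
    value = headers["Content-Type"].to_s.split(";")[0]
    value.nil? || value.empty? ? nil : value
  end

  def content_type=(mime_type)
    c = charset
    headers["Content-Type"] =
      if mime_type.to_s.include?("charset") || c.nil?
        mime_type.to_s
      else
        "#{mime_type}; charset=#{c}"
      end
  end

  def charset
    value = headers["Content-Type"].to_s.split(";")[1]
    value.nil? || value.strip.empty? ? nil : value.strip.split("=")[1]
  end

  def charset=(charset)
    type = content_type || "text/html" # assumed default, stands in for Mime::HTML
    headers["Content-Type"] = charset ? "#{type}; charset=#{charset}" : type
  end

  def assign_defaults!
    self.content_type ||= "text/html"
    self.charset ||= "utf-8"
  end
end

# New style (simplified): plain fields, joined once when the response is prepared.
class FieldBackedResponse
  attr_accessor :charset, :content_type
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def assign_defaults!
    return if headers.key?("Content-Type")

    @content_type ||= "text/html"
    @charset ||= "utf-8"
    headers["Content-Type"] = "#{@content_type}; charset=#{@charset}"
  end
end

# Hypothetical helper: set a content type, apply defaults, return the header.
def final_header(response_class)
  response = response_class.new
  response.content_type = "application/json"
  response.assign_defaults!
  response.headers["Content-Type"]
end

n = 50_000
Benchmark.bm(14) do |bm|
  bm.report("header-backed") { n.times { final_header(HeaderBackedResponse) } }
  bm.report("field-backed")  { n.times { final_header(FieldBackedResponse) } }
end
```

Both classes produce the same final header; the only difference is how many times the string is split and rebuilt along the way.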

Render Partial Collection


Rendering a collection of partials was another good opportunity for optimization. And in this case, the improvement was milliseconds, not microseconds!

First, the implementation in Rails 2.3:

    def render_partial_collection(options = {}) #:nodoc:
      return nil if options[:collection].blank?

      partial = options[:partial]
      spacer = options[:spacer_template] ? render(:partial => options[:spacer_template]) : ''
      local_assigns = options[:locals] ? options[:locals].clone : {}
      as = options[:as]

      index = 0
      options[:collection].map do |object|
        _partial_path ||= partial ||
          ActionController::RecordIdentifier.partial_path(object, controller.class.controller_path)
        template = _pick_partial_template(_partial_path)
        local_assigns[template.counter_name] = index
        result = template.render_partial(self, object, local_assigns.dup, as)
        index += 1
        result
      end.join(spacer).html_safe!
    end

The important part is what happens inside the loop, which can run hundreds of times for a large collection of partials. Here the Merb implementation performed better, and we were able to carry it over to Rails. Here is the Merb implementation:

    with = [opts.delete(:with)].flatten
    as = (opts.delete(:as) || template.match(%r[(?:.*/)?_([^\./]*)])[1]).to_sym

    # Ensure that as is in the locals hash even if it isn't passed in here
    # so that it's included in the preamble.
    locals = opts.merge(:collection_index => -1, :collection_size => with.size, as => opts[as])
    template_method, template_location = _template_for(
      template,
      opts.delete(:format) || content_type,
      kontroller,
      template_path,
      locals.keys)

    # this handles an edge-case where the name of the partial is _foo.* and your opts
    # have :foo as a key.
    named_local = opts.key?(as)

    sent_template = with.map do |temp|
      locals[as] = temp unless named_local

      if template_method && self.respond_to?(template_method)
        locals[:collection_index] += 1
        send(template_method, locals)
      else
        raise TemplateNotFound, "Could not find template at #{template_location}.*"
      end
    end.join

    sent_template

Admittedly, this is far from ideal. A lot happens here (and I personally would like to see this method refactored). But the interesting part is what happens inside the loop (starting with sent_template = with.map). Unlike ActionView, which worked out the template name, fetched the template object, looked up the counter name and so on for every item, Merb limited the work inside the loop to setting a few Hash values and calling a method.

For a collection of 100 partials, this could be the difference between roughly 10 ms and 3 ms of overhead. For a collection of small partials that was noticeable (and it was a reason to inline partials that would otherwise have been appropriate to keep as partials).

In Rails 3, we improved performance by reducing what happens inside the loop. Unfortunately, one Rails feature made the optimization a little harder. Specifically, you could render a partial with a heterogeneous collection (a collection containing Post, Article and Page objects, for example) and Rails would render the correct template for each object (Article objects are rendered with _article.html.erb, and so on). This means it is not always possible to know in advance exactly which template needs to be rendered.

Faced with this problem, we were not able to fully optimize the heterogeneous case, but we did make render :partial => "name", :collection => @array faster. To achieve this, we split the logic into two paths: a faster one for when we know the template, and a slower one for when it has to be determined from each object.

Here is what rendering a collection now looks like when we know the template:

    def collection_with_template(template = @template)
      segments, locals, as = [], @locals, @options[:as] || template.variable_name

      counter_name = template.counter_name
      locals[counter_name] = -1

      @collection.each do |object|
        locals[counter_name] += 1
        locals[as] = object

        segments << template.render(@view, locals)
      end

      @template = template
      segments
    end

What matters is that the loop itself is now small (even simpler than the body of the loop in Merb). It is also worth noting that, while improving the performance of this code, we created a PartialRenderer object to track state. You might think that creating a new object would be expensive, but creating objects in Ruby turns out to be relatively cheap, and objects open up caching options that are much harder to achieve in procedural code.
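The two-path idea can be sketched in plain Ruby as follows. Everything here is an illustrative assumption rather than the actual Rails 3 code: the Template struct, the SketchPartialRenderer class, the TEMPLATES table, and in particular the per-class cache on the slow path, which is just one example of the kind of caching a renderer object makes easy.

```ruby
# Hypothetical stand-in for a compiled template: a name plus a render lambda.
Template = Struct.new(:name, :body) do
  def variable_name
    name.to_sym
  end

  def counter_name
    :"#{name}_counter"
  end

  def render(locals)
    body.call(locals)
  end
end

Post    = Struct.new(:title)
Article = Struct.new(:title)

TEMPLATES = {
  "post"    => Template.new("post",    ->(l) { "<li>#{l[:post_counter]}: #{l[:post].title}</li>" }),
  "article" => Template.new("article", ->(l) { "<p>#{l[:article].title}</p>" })
}.freeze

class SketchPartialRenderer
  attr_reader :lookups

  def initialize(templates)
    @templates = templates # partial name => Template
    @cache = {}            # object class => Template (slow path only)
    @lookups = 0           # counts template lookups, for illustration
  end

  # Fast path: the partial name is known, so the template, variable name
  # and counter name are resolved once, outside the loop.
  def render_with_template(name, collection)
    template = find_template(name)
    as = template.variable_name
    counter = template.counter_name
    locals = { counter => -1 }

    collection.map do |object|
      locals[counter] += 1
      locals[as] = object
      template.render(locals)
    end
  end

  # Slow path: the template depends on each object's class.
  # Caching per class is an assumption of this sketch.
  def render_heterogeneous(collection)
    collection.each_with_index.map do |object, index|
      template = (@cache[object.class] ||= find_template(object.class.name.downcase))
      template.render({ template.variable_name => object, template.counter_name => index })
    end
  end

  private

  def find_template(name)
    @lookups += 1
    @templates.fetch(name)
  end
end

fast = SketchPartialRenderer.new(TEMPLATES)
fast.render_with_template("post", Array.new(100) { |i| Post.new("Post #{i}") })
puts "fast path lookups: #{fast.lookups}"

slow = SketchPartialRenderer.new(TEMPLATES)
puts slow.render_heterogeneous([Post.new("a"), Article.new("b"), Post.new("c")])
puts "slow path lookups: #{slow.lookups}"
```

On the fast path the loop body only updates two Hash entries and calls render, which mirrors the shape of collection_with_template; the slow path still has to decide on a template for every object.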

For those who want to see the improvements in pictures, here are a few charts. First, the improvement between Rails 2.3 and Rails 3 on Ruby 1.9 (a shorter bar means faster):
image

And for the more expensive operations:
image

Finally, a comparison of Rails 3 on four platforms (Ruby 1.8, Ruby 1.9, Rubinius, and JRuby):
image

As you can see, Rails 3 is noticeably faster than Rails 2.3, and all the other platforms (including Rubinius!) are noticeably faster than Ruby 1.8. All in all, a wonderful year for Ruby!

In the next post, I’ll cover the improvements to the Rails 3 API for plugin authors. Stay tuned and, as always, leave comments!

Source: https://habr.com/ru/post/90160/

