
I want to talk about a sore point: working with AR in general and with Relation in particular, and to warn against the homegrown contraptions that can easily ruin your life and make the code slow and memory-hungry. The discussion is based on Rails 3.2 and the ActiveRecord of the same vintage. Rails 4 of course brings a lot that is new and useful, but you still have to migrate to it, and the foundation is the same anyway.
This material is mostly aimed at beginners, because it pains the author to watch entire tables being pulled into memory as ActiveRecord objects, along with other ways of shooting oneself in the foot with AR. Developers who have attained Zen are unlikely to learn anything here; they can only help by adding their own examples and advice.
How many times has the world been told...
If you have started working with Relation (or with any ActiveRecord object in general), you need to be clear about one thing: at what point the selection is "materialized", that is, at what point we stop building the SQL query. In other words: when the data is fetched and we move on to in-memory processing. Why does it matter? Because this is clumsy:
Product.all.find{|p| p.id == 42}
It can hang the server, eat up all the RAM and cause plenty of other mischief. The same thing, written differently:
Product.find(42)
will run quickly and without consequences. So find and find are not the same thing at all! Why? Because in the first case we said Product.all and shot ourselves in the foot: it means fetching the entire contents of the products table, building an AR object for each row, collecting them into an array, and then searching it with find, which is an instance method of Array (strictly speaking, find comes from Enumerable, but that is a detail). In the second case everything is much better: find is an AR method designed for lookup by primary key (pk). That is, we generate the query
SELECT * FROM products WHERE products.id = 42;
We execute it, get back a single row, and that is all.
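To make the difference tangible, here is a minimal plain-Ruby sketch with no ActiveRecord and no database. FakeProduct, ROWS and PK_INDEX are hypothetical stand-ins invented for this illustration; the counter only shows how many objects each approach has to build.

```ruby
# Hypothetical stand-in for the products table: 10_000 rows as plain hashes.
ROWS = (1..10_000).map { |id| { id: id, name: "product #{id}" } }

# Plays the role of the primary-key index: id => position in ROWS.
PK_INDEX = ROWS.each_with_index.each_with_object({}) do |(row, pos), idx|
  idx[row[:id]] = pos
end

class FakeProduct
  @built = 0

  class << self
    attr_accessor :built

    # Like Product.all in Rails 3.2: builds an object for every row.
    def all
      ROWS.map { |row| new(row) }
    end

    # Like Product.find(id): goes straight to one row via the index.
    # (Returns nil when the id is missing; the real find raises instead.)
    def find(id)
      pos = PK_INDEX[id]
      pos && new(ROWS[pos])
    end
  end

  attr_reader :id, :name

  def initialize(row)
    self.class.built += 1
    @id = row[:id]
    @name = row[:name]
  end
end

FakeProduct.built = 0
slow = FakeProduct.all.find { |p| p.id == 42 } # Enumerable#find over the array
built_by_all = FakeProduct.built

FakeProduct.built = 0
fast = FakeProduct.find(42)
built_by_find = FakeProduct.built

puts "all.find built #{built_by_all} objects, find built #{built_by_find}"
```

Both calls return the same product, but the first one pays for every row in the table to get it.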
What is good and what is bad
Now that we understand why working with AR is a big responsibility, let us figure out how not to shoot ourselves in the foot. It is quite simple: use the methods that AR gives us. Here they are: where, select, pluck, includes, joins, scoped, unscoped, find_each and a few more, which can be found in the documentation or in a neighboring Habr post. The list of what is better not to use is very hard and, at the same time, very easy to compile: everything else is undesirable, since almost all the remaining methods turn a Relation into an Array, with all the ensuing consequences.
Simple recipes
Now I will give a few constructs, some standard and some less so, that make life easier but are very often forgotten. But first a question for the reader: recall the has_many method. Which of its options do you know, and which do you actively use? List them in your head, count them... and now the question: do you know how many there really are?
Answer: 24 in Rails 3 and 12 in Rails 4. The difference of 12 consists of options like where, group, etc., plus the options for working with raw SQL, which in Rails 4 are passed in a block rather than in the hash.
Why did I ask? To assess your level very roughly: if you know most of the options, what follows is unlikely to teach you anything new. The assessment is very approximate, so, dear reader, do not be too angry if it struck you as ridiculous / untenable / strange / etc. (underline as appropriate).
Recipe number one
So, let us go in order. Everyone knows about update_attributes and update_attribute (or not everyone?). The first mass-updates fields, running validations and callbacks. Nothing interesting. The second skips all validations and runs callbacks, but can update only one chosen field (some prefer save(validate: false)). But update_column and update_all are, for some reason, often forgotten. These methods skip both validations and callbacks and write directly to the database without any preliminaries.
Recipe number two
Commenters reminded me of the wonderful touch method. It is also often forgotten, and people write something like
@product.updated_at = DateTime.now
@product.save
or
@product.update_attribute(:updated_at, DateTime.now)
although for this purpose it is simpler to write:
@product.touch(:updated_at)
In addition, touch has its own after_touch callback, and the belongs_to method has a :touch option.
How to iterate correctly
Habr has already covered find_each, but I cannot help mentioning it again, because
product.documents.map{…}
and constructs isomorphic to it are found a little more than everywhere. Ordinary iterators applied to a Relation have just one problem: they pull everything out of the database at once. And that is terrible. In contrast, find_each fetches 1000 records at a time by default, and that is just great!
UPD: as noted in the comments, every method that does not map unambiguously onto raw SQL is delegated to to_a, so the whole result set is loaded into memory and further work happens on the Ruby side rather than in the database.
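The batching idea behind find_each can be sketched in plain Ruby. This is an illustration of the principle, not ActiveRecord's implementation: fetch_batch and each_in_batches are hypothetical names, and the array of hashes stands in for a table sorted by primary key.

```ruby
# Hypothetical stand-in for
# "SELECT * FROM documents WHERE id > ? ORDER BY id LIMIT ?".
def fetch_batch(rows, after_id, limit)
  rows.select { |row| row[:id] > after_id }.first(limit)
end

# Yields every row, fetching at most batch_size rows per "query".
# Returns the number of "queries" issued.
def each_in_batches(rows, batch_size: 1000)
  last_id = 0
  queries = 0
  loop do
    batch = fetch_batch(rows, last_id, batch_size)
    queries += 1
    batch.each { |row| yield row }
    break if batch.size < batch_size # a short batch means we reached the end
    last_id = batch.last[:id]
  end
  queries
end

rows = (1..2500).map { |id| { id: id } } # assumed to be sorted by id
seen = 0
queries = each_in_batches(rows) { |_row| seen += 1 }
puts "#{seen} rows in #{queries} batches"
```

The caller still sees every row, but only one batch has to be alive at any moment, which is what keeps memory use flat on large tables.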
Tip about default_scope
Wrap the contents of default_scope in a block. Compare:
default_scope where(nullified: false)
and
default_scope { where(nullified: false) }
What is the difference? The first variant is evaluated right at server start, and if the nullified column is not in the database yet, the server will not boot. The same goes for migrations: they fail because of the missing column, which most likely is exactly the one we are about to add. In the second case the block is evaluated lazily, only when the model is actually accessed, so the migration runs normally.
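The eager-versus-lazy difference is ordinary Ruby evaluation order and can be reproduced without Rails. In the sketch below MiniModel, where_column and COLUMNS are hypothetical stand-ins; the only thing it demonstrates is when the scope expression gets evaluated.

```ruby
# Hypothetical schema: the :nullified column has not been migrated in yet.
COLUMNS = [:id, :name]

# Stand-in for building a condition; fails if the column is unknown.
def where_column(name)
  raise ArgumentError, "unknown column #{name}" unless COLUMNS.include?(name)
  "WHERE #{name} = false"
end

class MiniModel
  # Accepts either a ready-made scope or a block that builds one later.
  def self.default_scope(scope = nil, &block)
    @default_scope = block || -> { scope }
  end

  def self.scope_sql
    @default_scope.call
  end
end

# Eager form: the argument is evaluated while the class body is loading.
eager_failed =
  begin
    Class.new(MiniModel) { default_scope where_column(:nullified) }
    false
  rescue ArgumentError
    true
  end

# Block form: nothing is evaluated at load time.
lazy_model = Class.new(MiniModel) { default_scope { where_column(:nullified) } }

COLUMNS << :nullified           # the "migration" adds the column
lazy_sql = lazy_model.scope_sql # the block runs only now

puts "eager failed at load: #{eager_failed}; lazy scope: #{lazy_sql}"
```

The eager class cannot even be defined before the column exists, while the block form loads fine and only touches the schema when the scope is first used.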
Has_many through
Another common offender is
product.documents.collect(&:lines).flatten
Here a product has many documents, each of which has many lines. Often you want all the lines of all the documents belonging to a product, and that is when the construction above gets written. Instead, recall the through option for associations and declare this on the product:
has_many :lines, through: :documents
and then execute
product.lines
The result is both clearer and more efficient.
A little about JOIN
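Continuing with joins, let us recall includes. What is special about it? It can give you a LEFT JOIN: when the query conditions reference the included table, includes loads the association through a LEFT OUTER JOIN (otherwise it uses separate queries). Yet quite often I see the left / right join written out explicitly: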
joins("LEFT OUTER JOIN wikis ON wiki_pages.wiki_id=wikis.id")
This works too, of course, but raw SQL has never been held in high esteem in RoR.
While we are at it, a reminder about the difference between what joins and where expect when they are used together. Suppose we have a users table, and various entities, for example products, have an author_id column and an author association that points at the users table.
belongs_to :author, class_name: 'User', foreign_key: 'author_id'
The following code will not work in this case:
products.joins(:author).where(author: {id: 42})
Why? Because joins takes the name of the association being joined, while where puts the condition on the table, so you need to say
where(users: {id: 42})
You can get around this by explicitly writing 'AS author' in the join, but that is raw SQL again.
Now let us look at joins from another angle. Whatever we join, in the end we get objects of the class we started with:
Product.joins(:documents, :files, :etc).first
In this case we get a product regardless of the number of joins. Some people are saddened by this behavior, since they would like to get columns from the joined tables. So they start building the same query from the other end: take documents, join them with products, write raw SQL to connect the other entities, and generally reinvent the wheel when the correct and logical code was written at the very beginning. So let me recall the basics:
Product.joins(:documents, :files, :etc).where(...).pluck('documents.type')
Here we get an array of the desired column straight from the database. Pros: a minimum of queries, no AR objects are created. Cons: in Rails 3 pluck takes only one argument, and this
pluck('documents.type', 'files.filename', 'files.path')
can only be done in Rails 4.
Building associations
Now let us look at working with build on an association. In general everything is quite simple:
product.documents.build(type: 'article', etc: 'etc').lines.build(content: '...')
After calling product.save, all the associations are saved too, with validations and all the trimmings. There is one catch in all this joy: it only works when the product is not readonly and there are no other restrictions on saving it. In such cases many people build a contraption similar to the one with joins in the example above: they create a document, bind it to the product, and build the lines for the document. As a result the default behavior, which error handling usually relies on, does not work. So the whole thing is immediately propped up with hacks that forward errors, and the result is pretty ugly. What to do in this case? Remember autosave and understand how it works. Without going into detail: it works through callbacks. So there is a way to save the association for the product described above:
product.autosave_associated_records_for_documents
In this case the document is saved, its callbacks are invoked to save the lines, and so on.
A few words about indexes
Finally, I have to say something about indexes, because many people have banged their heads against hard objects over index-related problems. I apologize in advance for mixing ActiveRecord with database capabilities, but it is my personal conviction that you cannot work well with AR without understanding what is happening on the database side at that moment.
Problem one
For some reason many people believe that the cost of order on a Relation does not depend on which column we sort by. A variation of this misconception is not understanding the difference between order on a Relation and sorting an Array. Because of this you can come across a default_scope with an order on a VARCHAR column and questions like: "Why does your page load so slowly? Only a couple of records are fetched from the database!" The problem is that sorting by default is damn expensive if there is no index on the column. By default AR sorts by pk. This happens when we call
Product.first
But pk almost always has an index, so there are no problems. When we declare order(:name) for every access to the model, the problems begin.
For reference: explained in the simplest terms, when sorting by an indexed column no real sorting takes place; the index already keeps the values in order, so the database can return the rows in the correct order right away.
Problem two
Composite indexes. Not everyone knows about them, and even fewer people know why they are needed. In short, a composite index is an index built on two or more columns. Where does it come in handy? Two common places:
- polymorphic associations
- the join table of a many-to-many association.
Polymorphic associations have already been covered elsewhere. For them it is very often convenient to create a composite index. Here is a slightly modified example from the official guide:
class CreatePictures < ActiveRecord::Migration
  def change
    create_table :pictures do |t|
      t.string :name
      t.integer :imageable_id
      t.string :imageable_type
      t.timestamps
    end
    add_index :pictures, [:imageable_id, :imageable_type]
  end
end
The difference between an ordinary and a composite index has been described elsewhere in a few words. I will not go into details here, because the topic deserves a post of its own. Besides, it has all been laid out before me.
Now about the join table: the well-known HABTM (has_and_belongs_to_many). Here it is sometimes appropriate to put a composite index on assemblies_parts (see the HABTM example). But remember that the order of columns in a composite index matters. The details are covered elsewhere.
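Why the column order matters can be shown with a toy model. The sketch below is plain Ruby and only an analogy: it treats a composite index on (assembly_id, part_id) as a sorted array of pairs, and by_assembly and by_part are hypothetical helper names. Real database indexes are more elaborate, but the ordering argument is the same.

```ruby
# A composite index on (assembly_id, part_id), modeled as a sorted array of pairs.
COMPOSITE_INDEX = [[3, 7], [1, 9], [2, 3], [1, 7], [3, 1], [2, 7]].sort

# Leading column: binary search to the first match, then read its neighbors.
def by_assembly(index, assembly_id)
  start = index.bsearch_index { |assembly, _part| assembly >= assembly_id }
  return [] if start.nil?
  index[start..-1].take_while { |assembly, _part| assembly == assembly_id }
end

# Trailing column only: matches are scattered, so every entry is checked.
def by_part(index, part_id)
  index.select { |_assembly, part| part == part_id }
end

for_assembly = by_assembly(COMPOSITE_INDEX, 2)
for_part = by_part(COMPOSITE_INDEX, 7)
positions = for_part.map { |pair| COMPOSITE_INDEX.index(pair) }

puts "assembly 2: #{for_assembly.inspect}"
puts "part 7: #{for_part.inspect} at positions #{positions.inspect}"
```

Entries for one assembly_id sit next to each other, so a lookup by the leading column can jump straight to them. Entries for one part_id are spread across the whole structure, so an index on (assembly_id, part_id) helps little when the query filters by part_id alone.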
Problem three
"Indexes are needed everywhere!" This comes up less often, but it slows absolutely everything down terribly. Every extra index has to be updated on each INSERT, UPDATE and DELETE. Remember that an index is not a panacea and not a guaranteed x10-x100 speedup, but a tool to be used in the right places, not something to wave over your head and stick into every hole. There are separate write-ups on the types of indexes and on why they are needed at all.
That is all
Thank you for reading to the end. Please report typos and inaccuracies by private message and I will gladly fix them. I would also be glad if you shared your experience: what is worth remembering and what is better to use in different situations during development.