First of all, this article is not about how much I love Rails, and second, it is not about how much I hate it either. You can feel about Rails however you like: it can only get better if you change things, and it can only get worse if you start changing things. Anyway, I have warned you, and we understand each other.
One of the core ideas behind ActiveRecord is that the database is a utilitarian detail and can be swapped out. You sit there writing your models on top of MySQL, and one day you read somewhere that you can just take MySQL out and put MongoDB in. Fine, maybe not that radical, but you might well have reasons to replace MySQL with PostgreSQL, or vice versa; I have nothing against MySQL. ActiveRecord claims to make this easy for you: before/after callbacks and associations are supposedly abstract enough that you never have to worry about database queries and can care only about application logic. Instead of `WHERE is_archived = TRUE` you happily write `where(is_archived: true)` and ActiveRecord does everything for you. All the examples below are for PostgreSQL, not MySQL, so MySQL users will have to reinvent their own wheel.
But no such luck! In practice this abstraction layer turns out to be as full of holes as the old trough from the tale of the Golden Fish. Many basic features simply cannot be expressed, such as comparing dates or working with arrays, and you end up with scopes full of raw SQL like `where("#{quoted_table_name}.finished_at >= ?", Date.current)` or `where("#{quoted_table_name}.other_ids <@ ARRAY[?]", ids)`. To which ActiveRecord gives a perfectly deliberate and logical answer: don't do that. Instead of arrays, use a habtm association, and if you need to compare dates, live with it. And God forbid you omit `quoted_table_name` in such a scope: the very first `includes` or `joins` will put everything in its place. It is easier to write it always and everywhere, so you never lose the habit.
And of course, the moment you decide to go behind ActiveRecord's back, there is no way back. Not just the odds, even the faintest hope of a painless migration to another database will be gone; it would be easier to print the source code out and burn it. Which also means there is no reason left not to use the advanced capabilities of your database. Use them in good health and make others do the same!
And once it turns out that more than half of the scopes in your models folder rely on these advanced capabilities, it becomes quite obvious that ActiveRecord is just a convenient wrapper for gluing one piece of code to another piece of code via tables. Scopes like `where(is_archived: true).joins(:sprint).merge(Sprint.archived)` will keep working fine, and combining them will be as easy as pie, right?
The next stage is denormalization. Denormalization never really went anywhere, of course, but the care for it was placed on the mighty shoulders of Rails and ActiveRecord, and those two guys have never been known for speed or for modest resource requirements. Say, `counter_cache: true` is a first step toward denormalization, because you are not going to let ActiveRecord compute `COUNT(*) AS sprints_count` on its own (you don't want to rewrite `select()`, right?). And `counter_cache` is far from perfect: in some cases the real count can drift out of sync with the cached one. Not critical, but unpleasant. So it is the first candidate to move into the database and stop loading the already overloaded head of the Ruby machine. Just a couple of triggers and we are done. First, whenever a record is inserted into or deleted from table B, we recalculate the count for the corresponding row of table A, and that's it, right? Well, on update too, of course, but only if the `foreign_key` changed, because a query like `UPDATE B SET a_id = $1 WHERE id = $2` breaks the counter cache for both the old and the new row of A.
```sql
CREATE OR REPLACE FUNCTION update_#{parent_table}_#{child_table}_counter_on_insert() RETURNS TRIGGER AS $$
  BEGIN
    UPDATE #{parent_table}
       SET #{counter_column} = COALESCE((
             SELECT COUNT(id) FROM #{child_table}
             GROUP BY #{foreign_column}
             HAVING #{foreign_column} = NEW.#{foreign_column}
           ), 0)
     WHERE (#{parent_table}.id = NEW.#{foreign_column});
    RETURN NULL;
  END;
$$ LANGUAGE plpgsql;
```
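The function alone does nothing until a trigger fires it. A minimal sketch of the accompanying DDL, assuming the same interpolated names as above (analogous functions and triggers would be needed for DELETE, reading `OLD` instead of `NEW`, and for UPDATE of the foreign key column):

```sql
CREATE TRIGGER #{child_table}_counter_on_insert
  AFTER INSERT ON #{child_table}
  FOR EACH ROW
  EXECUTE PROCEDURE update_#{parent_table}_#{child_table}_counter_on_insert();
```

An AFTER trigger fits here because the function returns NULL and only touches the parent table, not the row being inserted.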
The next piece of database work concerns dates and times. Let's start with the `created_at` and `updated_at` fields: the database can maintain them itself, and it will do so much more easily. First we set the defaults:
```ruby
change_column_default :table_name, :created_at, -> { 'CURRENT_TIMESTAMP' }
change_column_default :table_name, :updated_at, -> { 'CURRENT_TIMESTAMP' }
```
And to do this everywhere at once, you can loop over all the tables that have these fields. Except `schema_migrations` and `ar_internal_metadata`, of course:

```ruby
(tables - %w(schema_migrations ar_internal_metadata)).each { ... }
```
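The loop body is elided in the original. As a rough sketch, the table-selection part can be expressed in plain Ruby; in a real migration, `tables` would come from `connection.tables` and each result would be fed to `change_column_default` as shown earlier. The helper name below is hypothetical:

```ruby
# Hypothetical helper: decide which tables should receive the new
# timestamp defaults. Plain Ruby on purpose, so the selection logic
# can be exercised without Rails or a database connection.
RAILS_SERVICE_TABLES = %w(schema_migrations ar_internal_metadata).freeze

def tables_needing_timestamp_defaults(tables)
  tables - RAILS_SERVICE_TABLES
end

tables_needing_timestamp_defaults(%w(users sprints schema_migrations))
# => ["users", "sprints"]
```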
That's it: the default value for these columns is now exactly what we need. Next we make sure Rails keeps its hands off these fields. That takes a couple of tweaks in the right places, and yes, there is a framework configuration option responsible for exactly this:

```ruby
Rails.application.config.active_record.record_timestamps = false
```
The next step is to update the `updated_at` field whenever a record is updated. That's simple:
```sql
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_update() RETURNS TRIGGER AS $$
  BEGIN
    SELECT CURRENT_TIMESTAMP INTO NEW.updated_at;
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;
```
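As with the counter function, this one needs a trigger to fire it. A minimal sketch, assuming the same interpolated `table_name`; it must be a BEFORE trigger, since the function rewrites `NEW` and that only takes effect before the row is written:

```sql
CREATE TRIGGER touch_for_#{table_name}_on_update
  BEFORE UPDATE ON #{table_name}
  FOR EACH ROW
  EXECUTE PROCEDURE touch_for_#{table_name}_on_update();
```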
Now we can get rid of `touch: true` in the models entirely. That thing is a lot like the abstraction layer above: also completely full of holes, and I won't even explain why, because you already know all these cases. Replacing it is not much more complicated; all you need is to update `updated_at` not only for the record itself:
```sql
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_update() RETURNS TRIGGER AS $$
  BEGIN
    UPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMP
      WHERE id = NEW.foreign_column_name;
    SELECT CURRENT_TIMESTAMP INTO NEW.updated_at;
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;
```
Of course, a chain of such triggers performs some extra work, but Postgres has no sane mechanism for firing triggers without actually changing the record. You could try `SET title = title`, but that is no better than `SET updated_at = CURRENT_TIMESTAMP`.
Exactly the same trigger goes on insert, except there is no need to touch `updated_at` on the record itself:
```sql
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_insert() RETURNS TRIGGER AS $$
  BEGIN
    UPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMP
      WHERE id = NEW.foreign_column_name;
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;
```
Of course, you could write this as a single function with a check on the current event right inside the trigger, something like `IF TG_OP = 'UPDATE' THEN`, but it is preferable to keep every trigger as simple as possible to reduce the likelihood of errors.
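For completeness, a hypothetical sketch of what that single combined function might look like (the advice above is to avoid it, and the names mirror the placeholders used earlier):

```sql
CREATE OR REPLACE FUNCTION touch_for_#{table_name}() RETURNS TRIGGER AS $$
  BEGIN
    UPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMP
      WHERE id = NEW.foreign_column_name;
    -- only rewrite NEW.updated_at when this is an UPDATE, not an INSERT
    IF TG_OP = 'UPDATE' THEN
      SELECT CURRENT_TIMESTAMP INTO NEW.updated_at;
    END IF;
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;
```

A single `BEFORE INSERT OR UPDATE` trigger would then fire it for both events.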
You may want to automate the generation of such triggers, and for that you will need to find all the foreign keys between the current table and the rest. This query does it easily:
```sql
SELECT ccu.table_name AS foreign_table_name,
       kcu.column_name AS column_name
  FROM information_schema.table_constraints AS tc
  JOIN information_schema.key_column_usage AS kcu
    ON tc.constraint_name = kcu.constraint_name
  JOIN information_schema.constraint_column_usage AS ccu
    ON ccu.constraint_name = tc.constraint_name
 WHERE constraint_type = 'FOREIGN KEY'
   AND tc.table_name = '#{table_name}'
 ORDER BY ccu.table_name;
```
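Given the `(foreign_table_name, column_name)` pairs that query returns, the trigger functions can be generated with plain string templating. A hypothetical sketch (the helper name is made up; plain Ruby, no Rails or database required):

```ruby
# Hypothetical generator for the touch-on-insert function shown earlier.
# Interpolates the table and column names returned by the
# information_schema query into the plpgsql template.
def touch_on_insert_sql(table_name, foreign_table, foreign_column)
  <<~SQL
    CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_insert() RETURNS TRIGGER AS $$
      BEGIN
        UPDATE #{foreign_table} SET updated_at = CURRENT_TIMESTAMP
          WHERE id = NEW.#{foreign_column};
        RETURN NEW;
      END;
    $$ LANGUAGE plpgsql;
  SQL
end

puts touch_on_insert_sql('tasks', 'projects', 'project_id')
```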
Another very helpful tip: name your triggers uniformly, so that a single query can check whether a given trigger exists or not. For example, this query finds all touch-insert triggers:

```sql
SELECT routine_name AS name
  FROM information_schema.routines
 WHERE routine_name LIKE 'touch_for_%_on_insert'
   AND routine_type = 'FUNCTION'
   AND specific_schema = 'public';
```
And the last remaining bit is the worst one. The thing is, Rails was never designed for even a slightly clever database; it simply does not care whether the database might change anything besides the `id` field, and even that only on insert. So there is no sane mechanism for appending `RETURNING updated_at` to update queries, and you will have to dive into the guts of Rails up to your ears. The monkey patch did not come out particularly neat, but the primary goal was to disturb the framework's normal operation as little as possible.
```ruby
module ActiveRecord
  module Persistence
    # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L729-L741
    def _create_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attributes_values = attributes_with_values_for_create(attribute_names)

      an_id, *affected_rows = self.class._insert_record(attributes_values).dup
      self.id ||= an_id if self.class.primary_key
      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :create)
             .take(affected_rows.size)
             .zip(affected_rows)].each do |column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value)) if value
      end
      @new_record = false

      yield(self) if block_given?

      id
    end
    private :_create_record

    # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L710-L725
    def _update_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attribute_names = attributes_for_update(attribute_names)

      if attribute_names.empty?
        affected_rows = []
        @_trigger_update_callback = true
      else
        affected_rows = _update_row(attribute_names)
        @_trigger_update_callback = affected_rows.any?
      end

      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :update)
             .take(affected_rows.size)
             .zip(affected_rows)].each do |column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value))
      end

      yield(self) if block_given?

      affected_rows.none? ? 0 : 1
    end
    private :_update_record
  end

  module ConnectionAdapters
    module PostgreSQL
      module DatabaseStatements
        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L93-L96
        def exec_update(sql, name = nil, binds = [])
          execute_and_clear(sql_with_returning(sql), name, binds) { |result| Array.wrap(result.values.first) }
        end

        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb#L147-L152
        def insert(arel, name = nil, pk = nil, _id_value = nil, sequence_name = nil, binds = [])
          sql, binds = to_sql_and_binds(arel, binds)
          exec_insert(sql, name, binds, pk, sequence_name).rows.first
        end
        alias create insert

        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L98-L111
        def sql_for_insert(sql, pk, id_value, sequence_name, binds) # :nodoc:
          table_ref = extract_table_ref_from_insert_sql(sql)

          if pk.nil?
            # Extract the table from the insert sql. Yuck.
            pk = primary_key(table_ref) if table_ref
          end

          returning_columns = quote_returning_column_names(table_ref, pk, :create)

          if returning_columns.any?
            sql = "#{sql} RETURNING #{returning_columns.join(', ')}"
          end

          super
        end

        # No source in original repo
        def quote_returning_column_names(table_ref, pk, action)
          returning_columns = []
          returning_columns << pk if suppress_composite_primary_key(pk)
          returning_columns += ApplicationRecord.custom_returning_columns(table_ref, action)
          returning_columns.map { |column| quote_column_name(column) }
        end

        # No source in original repo
        def sql_with_returning(sql)
          table_ref = extract_table_ref_from_update_sql(sql)
          returning_columns = quote_returning_column_names(table_ref, nil, :update)

          return sql if returning_columns.blank?

          "#{sql} RETURNING #{returning_columns.join(', ')}"
        end

        # No source in original repo
        def extract_table_ref_from_update_sql(sql)
          sql[/update\s("[A-Za-z0-9_."\[\]\s]+"|[A-Za-z0-9_."\[\]]+)\s*set/im]
          Regexp.last_match(1)&.strip
        end
      end
    end
  end
end
```
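The regex in `extract_table_ref_from_update_sql` is easy to sanity-check in isolation, since it is plain Ruby with no Rails dependencies. A small standalone demonstration:

```ruby
# Standalone copy of the table-name extraction used by the patch above,
# so the regex can be exercised without loading ActiveRecord.
def extract_table_ref_from_update_sql(sql)
  sql[/update\s("[A-Za-z0-9_."\[\]\s]+"|[A-Za-z0-9_."\[\]]+)\s*set/im]
  Regexp.last_match(1)&.strip
end

extract_table_ref_from_update_sql(%(UPDATE "users" SET updated_at = now()))
# => "\"users\""
```

Note that it returns the table reference still quoted, which is exactly what `quote_returning_column_names` expects as `table_ref`.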
The most important piece is the call to `ApplicationRecord.custom_returning_columns`, which decides which columns, besides `id`, we are interested in. That method looks something like this:
```ruby
class << self
  def custom_returning_columns(table_ref, action)
    return [] if ['"schema_migrations"', '"ar_internal_metadata"'].include?(table_ref)

    res = []
    res << :created_at if action == :create
    res << :updated_at

    res += case table_ref
           when '"user_applications"'
             [:api_token]
           when '"users"'
             [:session_salt, :password_changed_at]
           # ...
           else
             []
           end

    res
  end
end
```
Instead of conclusions, let's just say that the aching head of Rails has become a little less achy. Mechanisms like `counter_cache` and `touch` will sink into oblivion, and in the next article we can think about something more global, such as stripping trailing whitespace, validating data, cascading deletion, or paranoid (soft) deletion. If you liked this article, of course.
Source: https://habr.com/ru/post/421939/