How to quickly and easily write a DSL in Ruby

This text is a translation of an article from the official ZenPayroll company blog. Although I disagree with the author on some points, the general approach and techniques shown here may be useful to a wide range of people writing Ruby. I apologize in advance if some bureaucratic terms are translated incorrectly. From here on, my notes and comments are marked as translator's notes.

At ZenPayroll we try to hide the complexity of the problem we solve as much as possible. Payroll has traditionally been a bureaucratic hornet's nest, and building a modern, convenient solution in such an unfriendly environment is an attractive technical challenge, one that is very hard to solve without automation.

ZenPayroll is currently building a nationwide service (already available in 24 states), which means we have to meet many requirements unique to each state. Early on we noticed that we were spending a lot of time writing boilerplate code instead of concentrating on what makes each state unique. We soon realized that we could solve this problem by creating a DSL to speed up and simplify development.

In this article, we will build a DSL as close as possible to the one we use ourselves.

When do we need a DSL?


Writing a DSL is a lot of work, and it cannot always help you solve a problem. In our case, however, the advantages outweighed the disadvantages:

  1. All state-specific code is collected in one place.
    Several models in our Rails application need state-specific code. We need to generate forms and tables and manage mandatory information about employees, companies, filing schedules, and tax rates. We make payments to government agencies, submit the generated forms, calculate income tax, and much more. The DSL lets us collect all the code specific to a state in one place.
  2. Standardization of states.
    Instead of building every new state from scratch, the DSL lets us automate the parts that states have in common while still letting us configure each state flexibly.
  3. Fewer places where you can make a mistake.
    With a DSL that creates classes and methods for us, we cut down on boilerplate and leave fewer places where developers have to intervene. By testing the DSL thoroughly and protecting it against invalid input, we greatly reduce the likelihood of errors.
  4. The possibility of rapid expansion.
    We are creating a framework that makes it easier to implement the unique requirements of new states. The DSL is a toolkit that saves us time and lets development keep moving.

Writing the DSL


In this article, we will focus on creating a DSL that lets us store company identification numbers and payroll parameters (used to calculate taxes). Although this is only a quick look at what a DSL can do for us, it is still a complete introduction to the topic. Our final code, written using the DSL we build, will look something like this:

    StateBuilder.build('CA') do
      company do
        edd { format '\d{3}-\d{4}-\d' }
        sos { format '[A-Z]\d{7}' }
      end

      employee do
        filing_status { options ['Single', 'Married', 'Head of Household'] }
        withholding_allowance { max 99 }
        additional_withholding { max 10000 }
      end
    end

Great! This is clean, clear, and expressive code that uses an interface designed to solve our problem. Let's get started.

Parameter Definition


First of all, let's define what we want to get in the end. First question: what information do we want to store?

Each state requires companies to register with local authorities. When registering in most states, companies are given identification numbers that are required to pay taxes and submit documents. At the company level, we must be able to store different identification numbers for different states.

Withholding taxes are calculated based on the number of allowances an employee claims. These values are defined on each state's version of the W-4 form. Each state asks many questions to determine tax rates: your filing status, related allowances, disability benefits, and more. For employees, we need a flexible way to define different attributes for each state in order to calculate tax rates correctly.

The DSL we write will handle company identification numbers and basic payroll information for employees. Next we use this tool to describe California. Since California has some additional conditions that need to be taken into account when calculating salaries, we will focus on them in order to show how to develop DSL.

I provide a link to a simple Rails application so that you can follow the steps to be taken in this article.

The application uses the following models:


We create subclasses of the CompanyStateField and EmployeeStateField models that share their parents' tables through single table inheritance. This lets us define state-specific subclasses while using only one table per base model to store them all. To make this work, both tables contain a serialized hash in which we store the state-specific data. Although this data cannot be queried, the approach keeps us from bloating the database with unused columns.
Translator's note: with Postgres, this data can be stored in the natively supported JSON type.

Our application is prepared for working with the states, and now our DSL must create specific classes that implement the required functionality for California.

What will help us?


Metaprogramming is where Ruby really shines. We can create methods and classes at runtime and draw on a large number of metaprogramming techniques, which makes creating a DSL in Ruby a pleasure. Rails itself is a DSL for building web applications, and much of its "magic" is based on Ruby's metaprogramming capabilities. Below is a short list of methods and objects that are useful for metaprogramming.

Blocks


Blocks let us group code and pass it to a method as an argument. They can be written with do ... end or with curly braces. Both forms behave the same way in the examples in this article.
Translator's note: by convention, do ... end is used for multi-line blocks and curly braces for single-line ones. The two forms also differ in precedence (thanks to mudasobwa for pointing this out); the difference is irrelevant here, but it can cost you quite a few entertaining minutes of debugging.

In fact, there is a difference, and it can cause a bug that is extremely hard to track down if you do not know about it. Compare:

    require 'benchmark'

    puts Benchmark.measure { "a" * 1_000_000 }
    # => 0.000000 0.000000 0.000000 ( 0.000427)

    puts Benchmark.measure do
      "a" * 1_000_000
    end
    # => LocalJumpError: no block given (yield)
    # => from IRRELEVANT_PATH_TO_RVM/lib/ruby/2.0.0/benchmark.rb:281:in `measure'
    # => from (irb):9

Because curly braces bind more tightly than do ... end, the second example is actually parsed as follows:

    (puts Benchmark.measure) do
      # irrelevant code
    end

The block is attached to puts rather than to Benchmark.measure, so Benchmark.measure is called without a block.

You have almost certainly used blocks if you have ever called a method like each:

    [1, 2, 3].each { |number| puts number * 2 }

Blocks are great for building DSLs because they let us write code in one context and execute it in another. This lets us build a readable DSL by moving method definitions into other classes. We will see many examples of this below.
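As a minimal sketch of "written in one context, executed in another" (the names `greeting`, `say`, and `run_later` are made up for this illustration), a block can capture a local variable where it is defined and then be run inside a method that knows nothing about that variable:

```ruby
# The block is written here, where `greeting` is a local variable...
greeting = 'Hello'
say = proc { |name| "#{greeting}, #{name}!" }

# ...and executed later, inside a method that has no access to `greeting`.
def run_later(callable)
  callable.call('Ruby')
end

result = run_later(say)
puts result
```

The DSL in this article goes one step further and also changes what self is while the block runs, which is what instance_eval is for.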

send


The send method allows us to call object methods (even private ones), passing it the name of the method as a symbol. This is useful for calling methods that are usually called inside a class definition or for interpolating variables for dynamic method calls.
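A small sketch of both uses, with a hypothetical Account class invented for this example: the method name comes from a variable, and send reaches a private method that public_send refuses to call.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  def balance_in(currency)
    "#{@balance} #{currency}"
  end

  private

  def secret_token
    'private-value'
  end
end

account = Account.new(100)

# The method name can come from a variable and is passed as a symbol.
method_name = :balance_in
puts account.send(method_name, 'USD') # same as account.balance_in('USD')

# send ignores visibility, so it reaches private methods too.
puts account.send(:secret_token)

# public_send respects visibility and raises NoMethodError for private methods.
begin
  account.public_send(:secret_token)
rescue NoMethodError => e
  puts "public_send refused: #{e.class}"
end
```

This is why the code later in the article can call class-level macros such as store_accessor and validates from outside the class body via klass.send.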

define_method


In Ruby, define_method lets us create methods without the usual def syntax in a class body. It takes the name of the method (a symbol or a string) and a block that is executed when the method is called.
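For illustration, here is a sketch that generates a reader and a writer for each name in a list. The Settings class and its hash-backed storage are assumptions made for this example, not part of the article's application.

```ruby
class Settings
  # Generate a reader and a writer for each name while the class is being defined.
  %w[edd sos].each do |name|
    define_method(name) { @values[name] }
    define_method("#{name}=") { |value| @values[name] = value }
  end

  def initialize
    @values = {}
  end
end

settings = Settings.new
settings.edd = '123-4567-8'
puts settings.edd
puts Settings.instance_methods(false).sort.inspect
```

The block passed to define_method is a closure, so each generated method remembers its own value of name.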

instance_eval


This is almost as essential for creating a DSL as blocks are. It takes a block and executes it in the context of the receiver object. For example:

    class MyClass
      def say_hello
        puts 'Hello!'
      end
    end

    MyClass.new.instance_eval { say_hello } # prints: Hello!

In this example, the block contains a call to the say_hello method, even though no such method exists in the context where the block is written. The instance returned by MyClass.new is the receiver of instance_eval, and the call to say_hello happens in its context.

    class MyOtherClass
      def initialize(&block)
        instance_eval &block
      end

      def say_goodbye
        puts 'Goodbye'
      end
    end

    MyOtherClass.new { say_goodbye } # prints: Goodbye

Again we write a block that calls a method that is undefined in its own context. This time we pass the block to the constructor of MyOtherClass and execute it in the context of the receiver self, which is an instance of MyOtherClass. Great!

method_missing


This is the magic behind Rails' find_by_* methods. Any call to an undefined method ends up in method_missing, which receives the name of the called method and all the arguments passed to it. This is another great tool for DSLs, because it lets us handle methods dynamically when we do not know in advance which ones will be called. That gives us a very flexible syntax.
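The following sketch imitates the find_by_* idea on a plain in-memory array. It is not how Rails implements its finders; the Finder class and its RECORDS data are invented for this example. Names that do not match the expected pattern are handed to super, and respond_to_missing? keeps respond_to? consistent.

```ruby
class Finder
  RECORDS = [
    { name: 'Alice', state: 'CA' },
    { name: 'Bob',   state: 'NY' }
  ].freeze

  def method_missing(name, *args, &block)
    # Handle only names that look like find_by_<key>; defer everything else.
    if name.to_s.start_with?('find_by_')
      key = name.to_s.sub(/\Afind_by_/, '').to_sym
      RECORDS.find { |record| record[key] == args.first }
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

finder = Finder.new
puts finder.find_by_state('NY')[:name]
puts finder.respond_to?(:find_by_name)

begin
  finder.delete_everything
rescue NoMethodError => e
  puts "still an error: #{e.class}"
end
```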

Design and Implementation of DSL


Now that we have some knowledge of our toolkit, it's time to think about how we want to see our DSL and how we will continue to work with it. In this case, we will work “backwards”: instead of starting with the creation of classes and methods, we will develop an ideal syntax and we will build everything else around it. We will consider this syntax as a sketch of what we want to receive. Let's look again at how everything should look as a result:

    StateBuilder.build('CA') do
      company do
        edd { format '\d{3}-\d{4}-\d' }
        sos { format '[A-Z]\d{7}' }
      end

      employee do
        filing_status { options ['Single', 'Married', 'Head of Household'] }
        withholding_allowance { max 99 }
        additional_withholding { max 10000 }
      end
    end

Let's break it apart and gradually write code that wraps our DSL into the classes and methods we need to describe California.


If you want to follow me using the provided code, you can do git checkout step-0 and add the code with me during the reading process.


Our DSL, which we have named StateBuilder, is a class. We start building each state by calling the build class method with the state's abbreviation and a block that describes the state. Inside this block we can call the company and employee methods and pass each of them its own configuration block, which customizes our specialized models (CompanyStateField::CA and EmployeeStateField::CA).

    # app/states/ca.rb
    StateBuilder.build('CA') do
      company do
        # configure CompanyStateField::CA
      end

      employee do
        # configure EmployeeStateField::CA
      end
    end

As mentioned earlier, our logic is encapsulated in the StateBuilder class. We evaluate the block passed to self.build in the context of a new StateBuilder instance, so the company and employee methods must be defined, and each of them must take a block as an argument. Let's start by creating a skeleton class that meets these conditions.

    # app/models/state_builder.rb
    class StateBuilder
      def self.build(state, &block)
        # Raise an error if no block was passed in
        raise "You need a block to build!" unless block_given?

        StateBuilder.new(state, &block)
      end

      def initialize(state, &block)
        @state = state

        # Evaluate the block in the context of this StateBuilder instance
        instance_eval &block
      end

      def company(&block)
        # configure CompanyStateField::CA
      end

      def employee(&block)
        # configure EmployeeStateField::CA
      end
    end

Now we have a base for our StateBuilder. Since the company and employee methods will define the CompanyStateField::CA and EmployeeStateField::CA classes, let's decide what the blocks we pass to these methods should look like. We must define each attribute our models will have, along with some information about those attributes. What is especially nice about creating your own DSL is that we do not have to use standard Rails syntax for getters, setters, and validations. Instead, let's implement the syntax we described earlier.
Translator's note: a debatable idea. I would still try to keep the zoo of syntaxes within an application to a minimum, even at the cost of some redundant code.


It's time to do git checkout step-1 .


For California companies, we need to store two identification numbers: one issued by the California Employment Development Department (EDD) and one issued by the Secretary of State (SoS).

The EDD number has the format "###-####-#" and the SoS number has the format "@#######", where @ means "any letter" and # means "any digit".
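As a quick sanity check, the two formats can be written as regular expressions and tried against sample values. This sketch assumes the SoS pattern is meant to be `[A-Z]\d{7}` (one uppercase letter followed by seven digits, as described above); the sample numbers are made up.

```ruby
edd_format = Regexp.new('\d{3}-\d{4}-\d')
sos_format = Regexp.new('[A-Z]\d{7}')

puts edd_format.match?('123-4567-8') # true
puts sos_format.match?('C1234567')   # true
puts edd_format.match?('12-4567-8')  # false

# Without anchors, a pattern also matches inside a longer string.
puts edd_format.match?('x123-4567-89') # true

# Anchoring with \A and \z requires the whole string to match.
anchored_edd = Regexp.new('\A\d{3}-\d{4}-\d\z')
puts anchored_edd.match?('x123-4567-89') # false
puts anchored_edd.match?('123-4567-8')   # true
```

Whether to anchor the patterns is a design choice the article does not discuss; the listings here keep them unanchored.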

Ideally, we would use the attribute name as the name of a method that takes a block defining the field's format (it looks like the time has come for method_missing!).
Translator's note: maybe it's just me, but syntax of the form
 field name, params
seems clearer and more logical to me than what the author proposes (compare with standard migrations). With the author's syntax it is not at all obvious at first glance that any name may be written inside the blocks describing a company or an employee, and you also get an excellent way to shoot yourself in the foot (see below).
Let's write out what the calls to these methods look like for the EDD and SoS numbers.

    # app/states/ca.rb
    StateBuilder.build('CA') do
      company do
        edd { format '\d{3}-\d{4}-\d' }
        sos { format '[A-Z]\d{7}' }
      end

      employee do
        # configure EmployeeStateField::CA
      end
    end

Notice that when describing these blocks we switched from do ... end to curly braces, but the result is the same: we are still passing an executable block of code to a method. Now let's do the same for employees.

According to California's Employee's Withholding Allowance Certificate, employees are asked for their filing status, the number of allowances they claim, and any additional withholding amounts. The filing status can be Single, Married, or Head of Household; withholding allowances cannot exceed 99; and for additional withholding let's set a maximum of $10,000. Now let's describe these in the same way as we did for the company fields.

    # app/states/ca.rb
    StateBuilder.build('CA') do
      company do
        edd { format '\d{3}-\d{4}-\d' }
        sos { format '[A-Z]\d{7}' }
      end

      employee do
        filing_status { options ['Single', 'Married', 'Head of Household'] }
        withholding_allowance { max 99 }
        additional_withholding { max 10000 }
      end
    end

Now we have the final implementation for California. Our DSL describes the attributes and validations for CompanyStateField::CA and EmployeeStateField::CA using our own syntax.

All that remains is to translate our syntax into classes, getters and setters, and validations. Let's implement the company and employee methods in the StateBuilder class and get working code.


Act three of the Ballet de la Merlaison: git checkout step-2


We implement our methods and validations by defining what to do with each of the blocks in StateBuilder#company and StateBuilder#employee. Let's use an approach similar to the one we used for StateBuilder itself: create a "container" that holds these methods and run the passed block in its context with instance_eval.

Let's call our containers StateBuilder::CompanyScope and StateBuilder::EmployeeScope and add methods to StateBuilder that create instances of these classes.

    # app/models/state_builder.rb
    class StateBuilder
      def self.build(state, &block)
        # Raise an error if no block was passed in
        raise "You need a block to build!" unless block_given?

        StateBuilder.new(state, &block)
      end

      def initialize(state, &block)
        @state = state

        # Evaluate the block in the context of this StateBuilder instance
        instance_eval &block
      end

      def company(&block)
        StateBuilder::CompanyScope.new(@state, &block)
      end

      def employee(&block)
        StateBuilder::EmployeeScope.new(@state, &block)
      end
    end


    # app/models/state_builder/company_scope.rb
    class StateBuilder
      class CompanyScope
        def initialize(state, &block)
          @klass = CompanyStateField.const_set state, Class.new(CompanyStateField)

          instance_eval &block
        end
      end
    end


    # app/models/state_builder/employee_scope.rb
    class StateBuilder
      class EmployeeScope
        def initialize(state, &block)
          @klass = EmployeeStateField.const_set state, Class.new(EmployeeStateField)

          instance_eval &block
        end
      end
    end

We use const_set to define subclasses of CompanyStateField and EmployeeStateField named after our state. This gives us the CompanyStateField::CA and EmployeeStateField::CA classes, each inheriting from the corresponding parent.
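The same technique can be tried outside Rails. In this sketch, BaseField is a stand-in for the article's ActiveRecord models and is purely hypothetical; the point is only to show what const_set combined with Class.new produces.

```ruby
# BaseField stands in for CompanyStateField; it is not an ActiveRecord model.
class BaseField
  def kind
    self.class.name
  end
end

state = 'CA'

# Class.new(BaseField) creates an anonymous subclass; const_set gives it a name
# by assigning it to the constant BaseField::CA.
klass = BaseField.const_set(state, Class.new(BaseField))

puts klass.name
puts klass.superclass
puts BaseField::CA.new.kind
```

Note that const_set requires a valid constant name, so the state abbreviation must start with an uppercase letter.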

Now we can focus on the last stage: the blocks passed to each of the attributes we create (sos, edd, additional_withholding, and so on). They will be executed in the context of CompanyScope and EmployeeScope, but if we try to run our code now, we will get errors about calls to unknown methods.

We use method_missing to handle these cases. At this point we can assume that any method called is an attribute name and that the block passed to it describes how we want to configure it. This gives us the "magic" ability to define the attributes we need and save them to the database.

Warning! Using method_missing without leaving any path that calls super can lead to unexpected behavior. Typos will be hard to track down, because they all end up in method_missing. When you build something on these principles, make sure there are cases in which method_missing calls super.
Translator's note: in general, it is better to keep the use of method_missing to a minimum, because it noticeably slows a program down. Here that is not critical, since all of this code runs only once, at application startup.
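One possible way to follow this advice, sketched here with a hypothetical StrictScope class rather than the article's own scopes, is to treat a call as an attribute definition only when it has the expected shape (no arguments and a block) and to pass everything else to super:

```ruby
class StrictScope
  attr_reader :attributes

  def initialize(&block)
    @attributes = {}
    instance_eval(&block)
  end

  def method_missing(name, *args, &block)
    # An attribute definition has no arguments and a block; anything else is
    # treated as a mistake and handed to the default implementation.
    return super unless args.empty? && block

    @attributes[name] = block
  end
end

scope = StrictScope.new do
  edd { :configured }
end
puts scope.attributes.keys.inspect

begin
  StrictScope.new { ed } # a bare name without a block
rescue NameError => e
  puts "caught #{e.class}"
end
```

This only narrows the problem: a misspelled attribute name that is still followed by a block (for example `eed { ... }`) would be accepted as a new attribute, so a list of allowed names would be needed to catch that case.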


We define the method method_missing and pass these arguments to the last container we create, AttributesScope . This container will call store_accessor and create validations based on the blocks we pass to it.

    # app/models/state_builder/company_scope.rb
    class StateBuilder
      class CompanyScope
        def initialize(state, &block)
          @klass = CompanyStateField.const_set state, Class.new(CompanyStateField)

          instance_eval &block
        end

        def method_missing(attribute, &block)
          AttributesScope.new(@klass, attribute, &block)
        end
      end
    end


    # app/models/state_builder/employee_scope.rb
    class StateBuilder
      class EmployeeScope
        def initialize(state, &block)
          @klass = EmployeeStateField.const_set state, Class.new(EmployeeStateField)

          instance_eval &block
        end

        def method_missing(attribute, &block)
          AttributesScope.new(@klass, attribute, &block)
        end
      end
    end

Now, every time we call a method inside the company block in app/states/ca.rb, the call ends up in the method_missing method we defined. Its first argument is the name of the called method, which is the name of the attribute being defined. We create a new AttributesScope instance, passing it the class to modify, the name of the attribute being defined, and the block that configures the attribute. In AttributesScope we call store_accessor, which defines the getter and setter for the attribute and uses the serialized hash to store the data.

    class StateBuilder
      class AttributesScope
        def initialize(klass, attribute, &block)
          klass.send(:store_accessor, :data, attribute)

          instance_eval &block
        end
      end
    end

We also need to define the methods that we call inside the blocks that configure the attributes ( format , max , options ) and turn them into validators. We do this by converting the calls to these methods to the validation calls that Rails expects.

    class StateBuilder
      class AttributesScope
        def initialize(klass, attribute, &block)
          @validation_options = []

          klass.send(:store_accessor, :data, attribute)

          instance_eval &block

          klass.send(:validates, attribute, *@validation_options)
        end

        private

        def format(regex)
          @validation_options << { format: { with: Regexp.new(regex) } }
        end

        def max(value)
          @validation_options << {
            numericality: { greater_than_or_equal_to: 0, less_than_or_equal_to: value }
          }
        end

        def options(values)
          @validation_options << { inclusion: { in: values } }
        end
      end
    end

Our DSL is ready for battle. We have successfully defined the CompanyStateField::CA model, which stores and validates the EDD and SoS numbers, and the EmployeeStateField::CA model, which stores and validates each employee's withholding allowances, filing status, and additional withholding. Although our DSL was created to automate fairly simple things, each of its components can easily be extended. We can add new hooks to the DSL, define more methods in the models, and keep building on the functionality we have implemented so far.
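To see the whole flow in one place without a Rails application, here is a Rails-free miniature of the same nesting. It is only a sketch: MiniStateBuilder, Scope, and AttributeRules are hypothetical names, and the rules are recorded in plain hashes instead of being passed to store_accessor and validates.

```ruby
class MiniStateBuilder
  attr_reader :state, :rules

  def self.build(state, &block)
    raise ArgumentError, 'You need a block to build!' unless block

    new(state, &block)
  end

  def initialize(state, &block)
    @state = state
    @rules = { company: {}, employee: {} }
    instance_eval(&block)
  end

  def company(&block)
    @rules[:company] = Scope.new(&block).attributes
  end

  def employee(&block)
    @rules[:employee] = Scope.new(&block).attributes
  end

  # Collects attribute names: every block-taking call becomes an attribute.
  class Scope
    attr_reader :attributes

    def initialize(&block)
      @attributes = {}
      instance_eval(&block)
    end

    def method_missing(name, *args, &block)
      return super unless args.empty? && block

      @attributes[name] = AttributeRules.new(&block).to_h
    end
  end

  # Collects the rules declared for a single attribute.
  class AttributeRules
    def initialize(&block)
      @rules = {}
      instance_eval(&block)
    end

    def to_h
      @rules
    end

    private

    def format(pattern)
      @rules[:format] = Regexp.new(pattern)
    end

    def max(value)
      @rules[:max] = value
    end

    def options(values)
      @rules[:options] = values
    end
  end
end

ca = MiniStateBuilder.build('CA') do
  company do
    edd { format '\d{3}-\d{4}-\d' }
  end

  employee do
    filing_status { options ['Single', 'Married', 'Head of Household'] }
    withholding_allowance { max 99 }
  end
end

puts ca.rules[:company][:edd][:format].source
puts ca.rules[:employee][:filing_status][:options].inspect
```

Each level of nesting gets its own small object whose only job is to be the self for one block, which is the same design the article uses with StateBuilder, CompanyScope, and AttributesScope.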

Our implementation noticeably reduces repetitions and template code in the backend, but still requires each state to have its own client-side views. We have expanded our internal development so that it covers the client side for new states, and if there is interest in the comments, I will write another post telling how this works for us.

This article shows only part of how we use our own DSL as a tool for expanding to new states. Such tools have proven extremely useful in extending our payroll service to the rest of the United States, and if problems like these interest you, we could work together!

Happy metaprogramming!

Source: https://habr.com/ru/post/241265/

