
A DSL in JavaScript for C++, or: a code generator is easy!

Happy Monday, Habr readers!

I have just been poking around in a universal, and therefore indecently powerful, data access interface written in Python. Its indecent power takes the form of a set of parameters for every occasion, many of them extremely exotic and needed in perhaps 5% of cases. As a result, you have to repeat the whole pack of parameters and details even in the simplest queries, which breeds pessimism and a desire to do something else. That reminded me of a similar story from my distant past, which I will share here.


Problem


This was long ago, so long ago that it safely belongs in the category of memoirs. We were sent to help a department that did outsourcing work. The project was already in the active coding stage, and it was too late to discuss or change anything (or perhaps that had never been possible). We had to build a business-level interface to data stored in 20 tables. In C++. On Windows and Linux. With the 20 tables living in either Postgres or Oracle. "Business-level interface" was a euphemism for the dullest possible set of operations on the entities in those tables: selecting by key or by the values of other fields, creating, modifying, and selecting related entities.
By then I had already begun to suspect that our work is sometimes not very interesting, but not to this extent! The task broke all records for dreariness. Overnight I found myself in the shoes of people who have no interest in their work, and for some reason that did not inspire me at all. My colleagues, by comparison, had it better: they simply went through a pile of C++ (or C) libraries that simplify database access and confirmed that none of them, despite their claims, worked properly with both Oracle and Postgres.

Idea


The conviction that this tedious work should be done by anyone but me slowly pushed me toward the idea of dumping it on someone else. The only one around who agreed to accept such a fate was the computer, so it got the honor of generating the necessary code. The one thing that bothered me was that at the time I had only a vague idea of code generation and imagined it as something very complex and non-trivial. While the idea was incubating, a chance memory surfaced: debugging a web application with JSP, beans, and other horrors flying on the wings of the night. The debugger had shown me the code that is generated from JSP (JavaServer Pages) pages. For example, from a page like this:
    <ul>
    <% for (int i = 0; i < 10; i++) { %>
        <li> <%= i %>
    <% } %>
    </ul>

the following code is produced:

    out.println("<ul>");
    for (int i = 0; i < 10; i++) {
        out.print("<li>");
        out.print(i);
        out.println();
    }
    out.println("</ul>");

That is, the algorithm that turns a JSP page into code, which in turn generates HTML, is very simple. Roughly: everything inside <% %> becomes code, and everything outside is wrapped in out.print("<text>"), plus a little syntactic sugar. In our case we only need to replace HTML with C++ and Java with something else, and after the simplest of transformations we get code that generates the desired C++ code. I chose the "something else" by the path of least resistance: Windows Script Host was already on our build machine, so we use JavaScript (or ECMAScript, whatever it is called there).
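To make the transformation concrete, here is a hand-translated sketch of what such a generator program could look like for a tiny C++ template. The template text, the generate function, and the out variable are illustrative assumptions, not code from the original project; console.log assumes Node.js (under Rhino, print would play the same role).

```javascript
// Template (illustrative):
//   struct <%= name %> {
//   <% for (var i = 0; i < fields.length; i++) { %>  int <%= fields[i] %>;
//   <% } %>};
//
// Hand-translated generator: text outside <% %> becomes string appends,
// <%= expr %> appends the value of expr, and <% code %> is copied as is.
function generate(name, fields) {
  var out = '';
  out += 'struct ';          // literal text
  out += name;               // <%= name %>
  out += ' {\n';             // literal text
  for (var i = 0; i < fields.length; i++) {   // <% for ... { %>
    out += '  int ';         // literal text
    out += fields[i];        // <%= fields[i] %>
    out += ';\n';            // literal text
  }                          // <% } %>
  out += '};\n';             // literal text
  return out;
}

var generated = generate('Point', ['x', 'y']);
console.log(generated);
```

The point of the sketch is that nothing here requires a parser for C++: the C++ is just opaque text being appended around ordinary JavaScript control flow.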

Prototype


In this article we will look at a solution that stays as close to the original as I can manage. The original itself, unfortunately, cannot be shown: too many important details have faded from memory. We will assume that we have only a Postgres database; so as not to leave cozy Linux, we will not use Windows Script Host either; and for data we will use some simple schema with Employee, Department, and so on.
For a standalone JS implementation, let's take the first one that came to mind: Rhino (for some reason V8 came second). So we install Rhino, create a file codegen.js, put print(2 * 2); in it, run rhino codegen.js, get 4, and voilà!
codegen.js
    if (arguments.length < 1) {
        print("Usage: codegen.js <template>");
        quit();
    }

    // Wrap literal template text in a statement that appends it to the output.
    function produce_text(text) {
        return "__codegen_output += '" +
            text.replace(/\\/g, '\\\\')   // escape backslashes first
                .replace(/'/g, "\\'")
                .replace(/\n/g, '\\n')
                .replace(/\r/g, '\\r')
                .replace(/\t/g, '\\t') + "';\n";
    }

    // <%= expr %> appends the value of expr; <% code %> is copied as is.
    function produce_code(code) {
        if (code.charAt(0) == '=') {
            return '__codegen_output += ' + code.substr(1) + ';\n';
        }
        return code + '\n';
    }

    var remainder = readFile(arguments[0]);
    var code = 'var __codegen_output = "";\n';
    while (remainder.length) {
        var begin = remainder.indexOf('<%');
        if (begin >= 0) {
            code += produce_text(remainder.substr(0, begin));
            remainder = remainder.substr(begin + 2);
            var end = remainder.indexOf('%>');
            if (end >= 0) {
                code += produce_code(remainder.substr(0, end));
                remainder = remainder.substr(end + 2);
            } else {
                code += produce_code(remainder);
                remainder = '';
            }
        } else {
            code += produce_text(remainder);
            remainder = '';
        }
    }
    code += 'print(__codegen_output);';
    eval(code);

This is the dumbest, most straightforward implementation of the algorithm described above. About 50 lines, and that is the entire code generator. Let's test it.
template file:
    <%
    var className = 'MyClass';
    var fields = ['Name', 'Description', 'AnotherOne', 'LastOne'];
    %>
    class <%= className %> {
    private:
    <% for(var i = 0; i < fields.length; i++) { %>
        int <%= fields[i] %>;
    <% } %>
    };


  rhino codegen.js template: 

    class MyClass {
    private:
        int Name;
        int Description;
        int AnotherOne;
        int LastOne;
    };

Inside <% %> we can put any JS code, and <%= expr %> is replaced by the result of evaluating expr. In principle this is all we need; it is enough to generate anything at all. Unfortunately, this simplicity has a downside: the code in a template is very dense and hard to read without suitable syntax highlighting. JS itself plays no small part in this, since it is not known for being concise or expressive.
Now it is time to try generating something more useful.
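For readers without Rhino at hand, the same algorithm can be restated as pure functions that run in any modern JavaScript engine. This is only a sketch under stated assumptions: the names produceText, produceCode, compileTemplate, and render are hypothetical, JSON.stringify is swapped in for the manual escaping, new Function replaces eval, and console.log assumes Node.js.

```javascript
// Turn literal template text into a statement that appends it to the output.
// JSON.stringify yields a quoted string literal with quotes, backslashes,
// and newlines escaped.
function produceText(text) {
  return text.length ? '__out += ' + JSON.stringify(text) + ';\n' : '';
}

// Turn the contents of a <% ... %> tag into code:
// "<%= expr %>" appends the value of expr, anything else is copied as is.
function produceCode(code) {
  if (code.charAt(0) === '=') {
    return '__out += (' + code.substring(1) + ');\n';
  }
  return code + '\n';
}

// Translate a template into the source text of a generator function body.
function compileTemplate(template) {
  var code = 'var __out = "";\n';
  var rest = template;
  while (rest.length > 0) {
    var begin = rest.indexOf('<%');
    if (begin < 0) {
      code += produceText(rest);
      break;
    }
    code += produceText(rest.substring(0, begin));
    rest = rest.substring(begin + 2);
    var end = rest.indexOf('%>');
    if (end < 0) {
      code += produceCode(rest);
      break;
    }
    code += produceCode(rest.substring(0, end));
    rest = rest.substring(end + 2);
  }
  return code + 'return __out;\n';
}

// Run the generator and return the generated text.
function render(template) {
  return new Function(compileTemplate(template))();
}

var demo = render(
  "<% var fields = ['Name', 'Description']; %>" +
  'class MyClass {\n' +
  '<% for (var i = 0; i < fields.length; i++) { %>' +
  '  int <%= fields[i] %>;\n' +
  '<% } %>' +
  '};\n'
);
console.log(demo);
```

Returning the text from render, rather than printing it inside the generated program, makes the generator easy to call from a build script or a test.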
template file:
    <%
    var model = [
        {
            name: 'Employee',
            fields: {
                Id: { type: 'int' },
                Name: { type: 'string' }
            }
        }
    ];
    var cppTypeMap = { 'int': 'int', 'string': 'std::string' };
    %>
    <% for (var i = 0; i < model.length; i++) { var entity = model[i]; %>
    struct <%= entity.name %> {
    <% for (var field in entity.fields) { %>
        <%= cppTypeMap[entity.fields[field].type] %> <%= field %>;
    <% } %>
    };
    <% } %>


  rhino codegen.js template: 

    struct Employee {
        int Id;
        std::string Name;
    };

The content of the model variable is, in fact, a so-called DSL (domain-specific language). In our case it is a language for describing the entities of the problem domain. The current version is too primitive to be of any use, so let's extend it with everything we need.
    var model = [
        {
            name: 'Department',
            fields: {
                Id: { type: 'int' },
                Name: { type: 'string' }
            },
            primaryKey: 'Id'
        },
        {
            name: 'Employee',
            fields: {
                Id: { type: 'int' },
                Name: { type: 'string' },
                DepartmentId: { type: 'int', references: 'Department' }
            },
            primaryKey: 'Id'
        }
    ];
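Because the DSL is ordinary JavaScript data, it can be checked with ordinary JavaScript before any code is generated. The following is a minimal sketch of such a check; validateModel and has are hypothetical helpers that are not part of the original generator, and console.log assumes Node.js.

```javascript
// Own-property check that is not fooled by inherited names such as "toString".
function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Return a list of human-readable problems found in a model (empty if none).
function validateModel(model) {
  var names = {};
  var problems = [];
  var i;
  for (i = 0; i < model.length; i++) {
    names[model[i].name] = true;
  }
  for (i = 0; i < model.length; i++) {
    var entity = model[i];
    // The primary key must name one of the entity's own fields.
    if (!has(entity.fields, entity.primaryKey)) {
      problems.push(entity.name + ': primary key ' + entity.primaryKey + ' is not a field');
    }
    // Every reference must point at an entity declared in the model.
    for (var field in entity.fields) {
      var ref = entity.fields[field].references;
      if (ref && !has(names, ref)) {
        problems.push(entity.name + '.' + field + ': unknown entity ' + ref);
      }
    }
  }
  return problems;
}

var sampleModel = [
  { name: 'Department',
    fields: { Id: { type: 'int' }, Name: { type: 'string' } },
    primaryKey: 'Id' },
  { name: 'Employee',
    fields: { Id: { type: 'int' }, Name: { type: 'string' },
              DepartmentId: { type: 'int', references: 'Department' } },
    primaryKey: 'Id' }
];

var brokenModel = [
  { name: 'Employee',
    fields: { Id: { type: 'int' },
              OfficeId: { type: 'int', references: 'Office' } },
    primaryKey: 'Id' }
];

console.log(validateModel(sampleModel));
console.log(validateModel(brokenModel));
```

A check like this catches a mistyped entity name in the model itself, instead of leaving it to surface later as a C++ compile error or a failed SQL script.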


Solution


Now we are in a position to generate code for fetching entities from the database by key, by related entity, and so on. We can also generate an SQL script that creates the database. To generate different outputs from one model, let's split the code up a bit:
model
    var model = [
        {
            name: 'Department',
            fields: {
                Id: { type: 'int' },
                Name: { type: 'string' }
            },
            primaryKey: 'Id'
        },
        {
            name: 'Employee',
            fields: {
                Id: { type: 'int' },
                Name: { type: 'string' },
                DepartmentId: { type: 'int', references: 'Department' }
            },
            primaryKey: 'Id'
        }
    ];

This file contains only our DSL.
cpp.template
    <%
    load('model');
    var cppTypeMap = { 'int': 'int', 'string': 'std::string' };
    function fieldType(entity, field) {
        return cppTypeMap[entity.fields[field].type];
    }
    %>
    <% for (var i = 0; i < model.length; i++) { var entity = model[i]; %>
    struct <%= entity.name %> {
    <% for (var field in entity.fields) { %>
        <%= fieldType(entity, field) %> <%= field %>;
    <% } %>

    <%
    var fieldList = [];
    for (var field in entity.fields)
        fieldList.push(field);
    %>
        static <%= entity.name %> ByKey(<%= fieldType(entity, entity.primaryKey) %> key, pqxx::work& tr) {
            if (!tr.prepared("<%= entity.name %>ByKey").exists()) {
                tr.conn().prepare("<%= entity.name %>ByKey",
                    "select <%= fieldList.join() %> from <%= entity.name %> where <%= entity.primaryKey %> = $1");
            }
            pqxx::result rows = tr.prepared("<%= entity.name %>ByKey")(key).exec(query);
            <%= entity.name %> result;
    <% for (var j = 0; j < fieldList.length; j++) { %>
            result.<%= fieldList[j] %> = rows[0][<%= j %>].as<<%= fieldType(entity, fieldList[j]) %>>();
    <% } %>
            return result;
        }

    <% for (var field in entity.fields) if (entity.fields[field].references) { var ref = entity.fields[field].references; %>
        <%= ref %> Get<%= ref %>() {
            return <%= ref %>::ByKey(<%= field %>);
        }

        static std::vector<<%= entity.name %>> By<%= ref %>(<%= fieldType(entity, field) %> key) {
            if (!tr.prepared("<%= entity.name %>By<%= ref %>").exists()) {
                tr.conn().prepare("<%= entity.name %>By<%= ref %>",
                    "select <%= fieldList.join() %> from <%= entity.name %> where <%= field %> = $1");
            }
            pqxx::result rows = tr.prepared("<%= entity.name %>By<%= ref %>")(key).exec(query);
            std::vector<<%= entity.name %>> result;
            for (pqxx::result::size_type i = 0; i < rows.size(); i++) {
                <%= entity.name %> row;
    <% for (var j = 0; j < fieldList.length; j++) { %>
                row.<%= fieldList[j] %> = rows[i][<%= j %>].as<<%= fieldType(entity, fieldList[j]) %>>();
    <% } %>
                result.push_back(row);
            }
            return result;
        }
    <% } %>
    };
    <% } %>

This is the template for generating the C++ code.
sql.template
    <%
    load('model');
    var sqlTypeMap = { 'int': 'integer', 'string': 'text' };
    function fieldType(entity, field) {
        return sqlTypeMap[entity.fields[field].type];
    }
    %>
    <% for (var i = 0; i < model.length; i++) { var entity = model[i]; %>
    CREATE TABLE <%= entity.name %> (
    <% for (var field in entity.fields) { var ref = entity.fields[field].references; %>
        <%= field %> <%= fieldType(entity, field) %><% if (ref) { %> REFERENCES <%= ref %><% } %>,
    <% } %>
        PRIMARY KEY (<%= entity.primaryKey %>)
    );
    <% } %>

This is the template for generating the database creation script.

  rhino codegen.js cpp.template: 

    struct Department {
        int Id;
        std::string Name;

        static Department ByKey(int key, pqxx::work& tr) {
            if (!tr.prepared("DepartmentByKey").exists()) {
                tr.conn().prepare("DepartmentByKey",
                    "select Id,Name from Department where Id = $1");
            }
            pqxx::result rows = tr.prepared("DepartmentByKey")(key).exec(query);
            Department result;
            result.Id = rows[0][0].as<int>();
            result.Name = rows[0][1].as<std::string>();
            return result;
        }
    };

    struct Employee {
        int Id;
        std::string Name;
        int DepartmentId;

        static Employee ByKey(int key, pqxx::work& tr) {
            if (!tr.prepared("EmployeeByKey").exists()) {
                tr.conn().prepare("EmployeeByKey",
                    "select Id,Name,DepartmentId from Employee where Id = $1");
            }
            pqxx::result rows = tr.prepared("EmployeeByKey")(key).exec(query);
            Employee result;
            result.Id = rows[0][0].as<int>();
            result.Name = rows[0][1].as<std::string>();
            result.DepartmentId = rows[0][2].as<int>();
            return result;
        }

        Department GetDepartment() {
            return Department::ByKey(DepartmentId);
        }

        static std::vector<Employee> ByDepartment(int key) {
            if (!tr.prepared("EmployeeByDepartment").exists()) {
                tr.conn().prepare("EmployeeByDepartment",
                    "select Id,Name,DepartmentId from Employee where DepartmentId = $1");
            }
            pqxx::result rows = tr.prepared("EmployeeByDepartment")(key).exec(query);
            std::vector<Employee> result;
            for (pqxx::result::size_type i = 0; i < rows.size(); i++) {
                Employee row;
                row.Id = rows[i][0].as<int>();
                row.Name = rows[i][1].as<std::string>();
                row.DepartmentId = rows[i][2].as<int>();
                result.push_back(row);
            }
            return result;
        }
    };

I admit that the generated code was never run, or even compiled, but I promise it is very close to real code that works with Postgres. :)

  rhino codegen.js sql.template: 

    CREATE TABLE Department (
        Id integer,
        Name text,
        PRIMARY KEY (Id)
    );

    CREATE TABLE Employee (
        Id integer,
        Name text,
        DepartmentId integer REFERENCES Department,
        PRIMARY KEY (Id)
    );

I think it is obvious to many that I have not invented anything new. Many ORMs can generate similar code. The main purpose of this article was to show that creating your own language, even if only a DSL, is not merely easy but very easy. It is not nearly as scary as it seems, because you can cut corners at many stages. For example, here we saved on the parser, since most of the work is done by the JS engine, and on the compiler, since generating C++ code is much easier than generating machine code, and the C++ compiler does the heavy lifting.

Source: https://habr.com/ru/post/199012/

