📜 ⬆️ ⬇️

SQL engine for generating reports. Idea

Introduction

The article is written from the point of view of a programmer from the ERP system technical support.

Carrying out the next test task, for the next vacancy, I got another brilliant idea.
Why not give the user the editing of formulas for calculating report columns? The result is the same Excel that everyone is used to, only the syntax is SQL.

And of course, in the 30 years of the development of ERP systems, this has already been realized where it is. But their algorithms are their secrets, but I do not mind. I will share an idea.

Test

There is a relational database with data that comes in daily. Each month based on these data reports are built. It is necessary to think over and propose the architecture of the system for building periodic reports.
')
System requirements:

1) Reports can be viewed in the web interface, downloaded to excel, retrieved from the web service as JSON (for transmission to an external system).
2) Reports are built for different clients. Each client wants a different report from the standard one: its own header and footer, displayed columns, column order, additional calculated columns.
3) Ability to save reports. The original data is changed, and the saved report remains unchanged.
4) Edit the report (source data) on the fly in the web interface.

The result should be an ER (Entity-Relationship) chart. Text description of the solution.

Idea

What is a regular report? This is a printed source data table to which several columns have been added with the calculated values, or a summary of the source data has been summed up.

What do users want from programmers? change the logic of calculations, if it concerns the logic of selecting source data, then only the programmer can deal with it, because it is necessary to change or the presentation on which the report is based, or the logic of the stored procedure, and if these are only formulas for which calculations are made, then this work can be done at the mercy of users. What is offered?

Teach the user to write SQL formulas, users are smart enough to write Excel formulas and, if desired, they will master SQL as well, and the standard set of SQL should of course be expanded with domain-specific functions.

If functions are written in SQL, then we can substitute them in a SELECT, which should then be executed as a dynamic query.
This feature is in T-SQL and PL / SQL.

Sketch implementation project

Main entities


Formulas

Formulas for calculating fields will be stored in the formulas table, DDL:

CREATE TABLE formulas ( id INT NOT NULL ,code NCHAR(50) NOT NULL ,formula NVARCHAR(MAX) NOT NULL ,CONSTRAINT PK_formulas PRIMARY KEY CLUSTERED (id) ,CONSTRAINT UK_formulas_code UNIQUE (code) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

Each formula has its own unique code designation - the code field; formula - SQL expression to calculate the value.

The text of the formula will be inserted into the phrase “SELECT”, which will be executed dynamically, so you can use parameters (": PARAM1") in the formula and substitute the values ​​of these parameters when dynamically executing the SQL code. Similarly, you can form the phrase “FROM”, “WHERE” and other parts of the SQL statement.

When calculating a column using a formula, additional parameters may be required which are not in the source data; in order to substitute such parameters, the “placeholder” should be indicated in the formula - the text to be replaced with the parameter value.

To do this, you can “attach” an entry from the formula_parameters , DDL table to the entry in the formula table:

 CREATE TABLE formula_parameters ( id INT NOT NULL ,code NCHAR(50) NOT NULL ,formula_id INT NOT NULL ,placeholder NVARCHAR(4000) NOT NULL ,CONSTRAINT PK_formula_parameters PRIMARY KEY CLUSTERED (id) ,CONSTRAINT UK_formula_parameters_code UNIQUE (formula_id, code) ,CONSTRAINT UK_formula_parameters_placeholder UNIQUE (formula_id, placeholder) ,CONSTRAINT FK_formula_parameters_formulas_id FOREIGN KEY (formula_id) REFERENCES dbo.formulas (id) ) ON [PRIMARY] 

code - unique parameter code;
placeholder - the text to replace with the parameter value, in the formula, the value must be substituted through the appropriate CAST or another way of converting the type from string to required (taking into account the specifics of the DBMS), the text is substituted into the formula, therefore CAST must be from the text to the required type, this must be taken into account formatting the parameter value.

Columns and Sections

Formulas can be substituted in the columns and in the report section. The value of the column is calculated by the same formula. Report sections are a cap and a basement. You can add other types of sections. A section can contain several calculated values, so several formulas can be associated with one section.

Column formulas are stored in the columns , DDL table:

 CREATE TABLE columns ( id INT NOT NULL ,formula_id INT NOT NULL ,name NVARCHAR(MAX) NOT NULL ,description NVARCHAR(MAX) NULL ,CONSTRAINT PK_columns PRIMARY KEY CLUSTERED (id) ,CONSTRAINT FK_columns_formulas_id FOREIGN KEY (formula_id) REFERENCES dbo.formulas (id) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

name - each column has its own name, different columns can have the same name.
description - the description may be empty.

Section formulas - region_formulas , DDL:

 CREATE TABLE region_formulas ( id INT NOT NULL ,formula_id INT NOT NULL ,region_id INT NOT NULL ,placeholder NVARCHAR(4000) NOT NULL ,CONSTRAINT PK_region_formulas PRIMARY KEY CLUSTERED (id) ,CONSTRAINT UK_region_formulas_formula_id_region_id UNIQUE (formula_id, region_id) ,CONSTRAINT FK_region_formulas_formulas_formula_id FOREIGN KEY (formula_id) REFERENCES dbo.formulas (id) ,CONSTRAINT FK_region_formulas_regions_id FOREIGN KEY (region_id) REFERENCES dbo.regions (id) ) ON [PRIMARY] 

placeholder - the text to be replaced with the value calculated using the formula.

In one section ( region_id ) one formula ( formula_id ) can be calculated only once, for all places where it is necessary to insert its result must be the same placeholder .

You can of course give free rein to users, this restriction is the “autopilot” optimization.

The calculated values ​​of the parameters of the partition template must be inserted into the partition template, the partition templates are the table regions :

 CREATE TABLE regions ( id INT NOT NULL ,pattern NVARCHAR(MAX) NOT NULL ,name NVARCHAR(4000) NOT NULL ,description NVARCHAR(MAX) NULL ,CONSTRAINT PK_regions PRIMARY KEY CLUSTERED (id) ,CONSTRAINT UK_regions_name UNIQUE (name) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

pattern - the actual section template
name - section name
description - the description may be empty.

When calculating a specific section text, it is necessary to replace the pattern in the text of the pattern (table regions) with the text of the placeholder (table region_formulas) with the calculated values.

The layout of the sections in the report instance is defined by the consumers_report_regions , DDL table:

 CREATE TABLE consumers_report_regions ( consumer_id INT NOT NULL ,region_id INT NOT NULL ,region_order INT NOT NULL ,type_id INT NULL ,CONSTRAINT PK_consumers_report_base PRIMARY KEY CLUSTERED (consumer_id, region_id) ,CONSTRAINT UK_consumers_report_regions_region_order UNIQUE (consumer_id, region_order) ,CONSTRAINT FK_consumers_report_regions_consumer_reference_id FOREIGN KEY (consumer_id) REFERENCES dbo.consumer_reference (id) ,CONSTRAINT FK_consumers_report_regions_regions_id FOREIGN KEY (region_id) REFERENCES dbo.regions (id) ,CONSTRAINT FK_consumers_report_regions_report_region_types_id FOREIGN KEY (type_id) REFERENCES dbo.report_region_types (id) ) ON [PRIMARY] 

type_id - link to the directory of types of sections (header / basement / other);
region_order - the order of the section in the report, for one client, the value of the order can not be repeated.

The column order for each customer is unique and is stored in the consumers_report_columns , DDL table:

 CREATE TABLE consumers_report_columns ( column_id INT NOT NULL ,consumer_id INT NOT NULL ,column_order INT NOT NULL ,CONSTRAINT PK_consumers_report_columns PRIMARY KEY CLUSTERED (consumer_id, column_id) ,CONSTRAINT UK_consumers_report_columns_column_order UNIQUE (consumer_id, column_order, column_id) ,CONSTRAINT FK_consumers_report_columns_columns_id FOREIGN KEY (column_id) REFERENCES dbo.columns (id) ,CONSTRAINT FK_consumers_report_columns_consumer_reference_id FOREIGN KEY (consumer_id) REFERENCES dbo.consumer_reference (id) ) ON [PRIMARY] 

column_order - the ordinal number of the column, for one customer each column has its own order value;

Thus, using the consumers_report_columns and consumers_report_regions tables, the report structure is set individually for each client.

The tables columns and region_formulas define calculation rules for columns and sections. The formulas table sets the rules for calculating values.

Instance

The calculated report (instance) is stored in the structure of the tables report_instances, report_region_instances, report_cell_instances. The report_instances table stores information about the generated report copies, DDL:

 CREATE TABLE report_instances ( id INT NOT NULL ,name NVARCHAR(4000) NOT NULL ,description NVARCHAR(MAX) NULL ,state_id INT NOT NULL ,CONSTRAINT PK_report_instances PRIMARY KEY NONCLUSTERED (id) ,CONSTRAINT UK_report_instances_name UNIQUE (name) ,CONSTRAINT FK_report_instances_report_instace_states_reference_id FOREIGN KEY (state_id) REFERENCES dbo.report_instace_states_reference (id) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

name - instance name unique in the table;
description - the description may be empty;
state_id - reference to the instance status directory (“formed”, “ready”, “saved”, “deleted”).

Formed report sections - report_region_instances , DDL:

 CREATE TABLE report_region_instances ( instace_id INT NOT NULL ,consumer_id INT NOT NULL ,region_id INT NOT NULL ,value NVARCHAR(MAX) NULL ,CONSTRAINT PK_report_region_instances PRIMARY KEY CLUSTERED (instace_id, consumer_id, region_id) ,CONSTRAINT FK_report_region_instances_consumers_report_regions FOREIGN KEY (consumer_id, region_id) REFERENCES dbo.consumers_report_regions (consumer_id, region_id) ,CONSTRAINT FK_report_region_instances_report_instances_instace_id FOREIGN KEY (instace_id) REFERENCES dbo.report_instances (id) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

value - section text, the text is saved “already glued”, individual values ​​calculated by the formulas are not saved.

The generated cell values ​​of the table part are report_cell_instances , DDL:

 CREATE TABLE report_cell_instances ( id INT NOT NULL ,instance_id INT NOT NULL ,consumer_id INT NOT NULL ,column_id INT NOT NULL ,row_order INT NOT NULL ,value NVARCHAR(MAX) NULL ,CONSTRAINT PK_report_cell_instances PRIMARY KEY CLUSTERED (id) ,CONSTRAINT UK_report_cell_instances_column_id_row_order UNIQUE (instance_id, consumer_id, column_id, row_order) ,CONSTRAINT FK_report_cell_instances FOREIGN KEY (consumer_id, column_id) REFERENCES dbo.consumers_report_columns (consumer_id, column_id) ,CONSTRAINT FK_report_cell_instances_report_instances_id FOREIGN KEY (instance_id) REFERENCES dbo.report_instances (id) ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] 

row_order - row order, row number of the table part;
value is the cell text.

One instance of the report of one client has one calculated column that cannot have several identical row numbers, that is, the cell coordinate is unique - a pair (column, row).

According to the client's report structure (consumers_report_columns, consumers_report_regions) and calculated values ​​(report_region_instances, report_cell_instances), you can “glue” a specific instance of the report.

Thus, the mechanism of unique formulas and the composition of columns for each client is implemented, the mechanism of saving the calculated report is implemented.

It remains only to develop a stored procedure for calculating the values ​​using the formulas and the corresponding storage.

I would be glad to constructive criticism or references to the classical implementations of such logic.

Links


Source: https://habr.com/ru/post/278595/


All Articles