Introduction
At a recent interview I was asked about the shortcomings of the EAV (Entity–Attribute–Value) data model. I could not come up with much to say: in my opinion it is an ideal way to store arbitrary data. After a moment's reflection I answered that the only problem is the impossibility of building indexes for queries.
The question nagged me for a few days after the interview. I reached some conclusions and, to clear my conscience, did a little googling. Google confirmed my thoughts, but that was not enough for me: I wanted to back the confirmation with numbers.
If you are interested in the conclusions I reached and the gains that optimization can yield, welcome under the cut.
Table of contents for the series of articles "Ideal catalog"
- Ideal catalog, architecture sketch
- Ideal catalog, an implementation option
- Ideal catalog, data sample optimization
Brief conclusions
For those who are tired and do not want to wade through a wall of text, the conclusion is this: to read the data, use materialized views with indexes built on all columns.
A query against the materialized view runs 5–15% faster than the same query against a similar table with similar indexes.
That concludes the conclusions; a systematic account follows.
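For reference, building the per-column indexes is plain `CREATE INDEX` work; the view and index names below (`tea_mv` and friends) are illustrative, not the article's actual generated names:

```sql
-- Illustrative sketch: one index per attribute column of the materialized view.
CREATE INDEX tea_mv_tiger_idx ON tea_mv (tiger);
CREATE INDEX tea_mv_bowl_idx  ON tea_mv (bowl);
CREATE INDEX tea_mv_clock_idx ON tea_mv (clock);
-- ...and so on for every column that appears in WHERE clauses.
```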
Methods of working with the EAV model
Data can be fetched with direct queries; you can create a view for each Entity type (catalog rubric); you can create a similar materialized view. As another option, you can create a plain table, but then every Attribute change requires an ALTER TABLE with DROP COLUMN or ADD COLUMN, which is exactly what people adopt EAV to avoid. Still, if the data lives in EAV and the table is used only for reading, it is a workable option.
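A rough sketch of the schema churn a plain table implies (the table and column names here are made up for illustration):

```sql
-- Removing an attribute forces DDL on the flat read-only table:
ALTER TABLE tea_flat DROP COLUMN pudding;
-- Adding a new attribute likewise:
ALTER TABLE tea_flat ADD COLUMN kettle numeric;
```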
You can google other ways to optimize EAV, but I researched only these.
Testing method
PHP scripts were written to generate the contents of the "rubrics", along with scripts that generate the DDL for creating the table, the view, and the materialized view (PostgreSQL).
With these scripts the corresponding data sources were created.
Now we pick an entity (catalog rubric) that has many items and few characteristics (attributes):
```sql
SELECT r.code,
       (SELECT COUNT(*) FROM rubric_property rp WHERE rp.rubric_id = r.id) property_count,
       (SELECT COUNT(*) FROM rubric_item ri WHERE ri.rubric_id = r.id) item_count,
       (SELECT COUNT(*) FROM rubric_property rp WHERE rp.rubric_id = r.id)
     + (SELECT COUNT(*) FROM rubric_item ri WHERE ri.rubric_id = r.id) summary
FROM rubric r
GROUP BY r.code, r.id
ORDER BY summary DESC, property_count DESC, item_count DESC;
```
The number of attributes, in my opinion, does not matter much, but if someone wants to evaluate its impact, all the scripts are in the repository.
We run two kinds of queries: the first returns one row, the second returns several rows.
Of course, thorough testing would require more query types.
For each query type we hit each data source in turn: the direct query, the view, the table, the materialized view, then the direct query again, and so on in a circle. In theory this rotation reduces the impact of query caching; running the same query 1000 times in a row strikes me as a poor way to test.
Thus we take five time measurements, discard the extremes (the longest and the shortest run), and tabulate the rest.
Testing was conducted on the following data set:
- rubrics (entity): 323
- product characteristics (attribute): 47,229
- product items: 6,989
- values: 1,102,279
The chosen rubric itself contained 41 items and 22 characteristics.
Test results
The query returns one row (many filter conditions)
```sql
WHERE mv.tiger IN ('poor', 'white', 'orange', 'red')
  AND mv.bowl BETWEEN 1000 AND 4000
  AND mv.clock > 3000
  AND mv.legs < 2000
  AND mv.snake = 'crazy'
```
| iteration | SELECT | VIEW | TABLE | MAT VIEW |
|---|---|---|---|---|
| 1 | 61 | 53 | 22 | 20 |
| 2 | 49 | 50 | 20 | 15 |
| 3 | 50 | 49 | 20 | 14 |
| mean time, ms | 53 | 50 | 20 | 16 |
The query returns several rows (fewer filter conditions)
```sql
WHERE mv.tiger IN ('poor', 'white', 'orange', 'red')
  AND mv.bowl BETWEEN 1000 AND 4000
```
| iteration | SELECT | VIEW | TABLE | MAT VIEW |
|---|---|---|---|---|
| 1 | 92 | 70 | 31 | 19 |
| 2 | 63 | 56 | 19 | 14 |
| 3 | 54 | 54 | 22 | 18 |
| mean time, ms | 69 | 60 | 24 | 17 |
Even the direct queries show decent results here. During testing I changed the data set several times, and the direct query time could exceed one second (1000 ms), while the query against the materialized view took no more than 30 ms.
As you can see, even when the data is stored in a table, selecting from the materialized view is faster.
Possible optimizations
Materialized view
A materialized view works many times faster than direct queries.
When the data changes, the view must be refreshed (REFRESH MATERIALIZED VIEW).
When an attribute (characteristic) is added, the view must be re-created (CREATE MATERIALIZED VIEW).
These operations take a matter of seconds and can be automated and performed without involving qualified personnel.
But if we have a million users constantly adding new catalog items, and the administrators also change the list of characteristics several times an hour, this can become a problem.
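For the record, the maintenance boils down to two statements (`tea_mv` is an assumed name here). Note that since PostgreSQL 9.4, a unique index on the view additionally allows refreshing without locking out readers:

```sql
-- Full refresh; takes an exclusive lock on the view while it runs:
REFRESH MATERIALIZED VIEW tea_mv;

-- With a unique index, the view can be refreshed without blocking readers:
CREATE UNIQUE INDEX tea_mv_item_uq ON tea_mv (item_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY tea_mv;
```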
Table
Queries against the table, like queries against the materialized view, run many times faster than direct queries, although the table is slightly slower than the materialized view.
When the attribute list changes, a table has the same problems as a materialized view; but when the value of a characteristic changes, there is no need to rebuild everything: it is enough to update only the changed rows and columns.
While a foreign key cannot be created on a view, on a table it can and should be created to link the catalog item to the row in our characteristics table, improving data consistency. A table also gives you everything tables offer: triggers, logging, backups.
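A minimal sketch of both points, with illustrative names (`tea_flat` stands in for the flat characteristics table; the foreign key assumes `rubric_item.item_id` is unique):

```sql
-- A point update instead of a full rebuild when one value changes:
UPDATE tea_flat SET bowl = 1500 WHERE item_id = 42;

-- Unlike a view, the table can carry a foreign key back to the catalog:
ALTER TABLE tea_flat
    ADD CONSTRAINT tea_flat_item_fk
    FOREIGN KEY (item_id) REFERENCES rubric_item (item_id);
```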
View
Data Definition Language:
```sql
CREATE VIEW tea_v AS
SELECT ri.item_id item_id,
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'snake') "snake",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'tiger') "tiger",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'beans') "beans",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'sweater') "sweater",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'pudding') "pudding",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'bumper') "bumper",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'train') "train",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'alligator') "alligator",
  (SELECT sm.string FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN string_matter sm ON sm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'trousers') "trousers",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'bowl') "bowl",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'cabbage') "cabbage",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'ship') "ship",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'undershirt') "undershirt",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'simmer') "simmer",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'squirrel') "squirrel",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'ceiling') "ceiling",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'legs') "legs",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'jacket') "jacket",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'book') "book",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'stomach') "stomach",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'clock') "clock",
  (SELECT dm.digital FROM item_content ic JOIN content c ON ic.content_id = c.id JOIN property p ON c.property_id = p.id JOIN digital_matter dm ON dm.content_id = c.id WHERE ic.item_id = ri.item_id AND p.code = 'furniture') "furniture"
FROM rubric_item ri
WHERE ri.rubric_id = (SELECT id FROM rubric WHERE code = 'tea');
```
A view works slightly faster than a direct query, and using a view reduces the amount of code: where the raw query can easily balloon to 1000 lines or more, querying the view takes a single line, so less data travels between the DBMS server and the application server, and the DBMS parser has less work to do.
Data changes are reflected in the view immediately, with no extra actions.
When the list of characteristics (attributes) changes, the view must be re-created.
Using EAV Model
For a typical online store that sells both refrigerators and auto parts, using EAV is more than justified: you will not create a new table, or a new column in a characteristics table, for every product. Moreover, each rubric usually holds no more than a hundred items, so a separate materialized view per rubric is a good idea: searching among hundreds of rows is faster than searching among millions of attribute rows.
Characteristic values change once in a lifetime, at the moment the product item is added. The list of characteristics changes with each turn of fashion, no more than once every six months. Product items are added with each delivery, no more than once a week.
In other words, the data is read often and almost never changes.
The EAV model is a great fit for this use case, and optimization through materialized views gives read speeds no lower than reads from ordinary tables.
Bonus
In the previous article I promised to share the classes for generating EAV queries. I still have not gotten around to polishing them, but I can share a
repository that contains all of it; the only thing missing is documentation.
If someone undertakes to sort out this graveyard of good intentions, get in touch and I will help (my contacts are listed in my profile; VK is best).
If you want to repeat the experiment
For those who find my home-grown testing of these optimizations dubious, I suggest testing them yourself. All the scripts for this
are laid out in the repository.
The sequence of actions is described in
use_case.sql.
In addition, using the following bundle:
- noun.txt
- adjective.txt
- get_pdo.php
- words_input.php
- data_generation.php
- ddl_generation.php
You can create your own data set and conduct your experiments.
There is not much code in the scripts (about 1,000 lines across three PHP files); tweak it as you like.
The code runs on PHP 7.1 and PostgreSQL 10.1; with some editing of the files it will work on earlier versions.
Conclusion
If you like, subscribe to our channel, put likes, write comments.