📜 ⬆️ ⬇️

Expanding a wide table in a column (EAV pattern)

Task


There is an entity that is characterized by a huge and often variable number of parameters. The task of storing these entities and even so that the search can also be conducted preferably also with the construction of the index.


I’ll say right away that I don’t know how this problem is solved by reference, but it was necessary to solve it quickly and therefore I thought and made the decision about which I will tell.

The first.
The number of parameters is variable => we store different properties of the entity in a table of this type:
prop_description
| (int) id | (enum: bool, int, float, string) prop_type | (varchar) prop_name |
')
Now consider the entity and try to select the primary (most used in the search) and the secondary properties of the entity, create the table on the primary properties:
objects
| id | meta data - all primary entities |

And how to store all those 10, 20, 300 secondary characteristics?


Initially, I defined the division by type because I thought that storing a number in a row and searching for them was blasphemous, so we create 3 tables.
int_properties, float_properties, string_properties,
(int) object_id - external to the table of primary key objects
(int) prop_id - external to the table of properties primary key
value - the type corresponds to the table.
I assume that you understand what I mean?
So when I save an object, I get one record in the objects table and as many records in int_properties and float_properties and string_properties, as I have properties of such an object, but no more than the number of lines in prop_description.
Such data storage is called 3.5 normal form

I think the question of how to write your essence in such a presentation you decide for yourself and there is no big problem here: the cycle by data, the choice of the table, the record.

The search problem is much more interesting, it is clear that in order to find something in such a structure, you need to multiply the tables somehow, the whole question is how? Not really that hard.
Further examples I write in PHP because its syntax is the most obvious.
We write a simple SQL query designer:
/**
* $prop_descr[$properties_name] = array('id'=>$prop_id, 'type'=>$prop_type)
* , ,
*/
$selectToken=array();
$selectQuery= ' FROM objects' ;
foreach ($data as $properties_name=>$ value ){
$prop_id=$prop_descr[$properties_name][ 'id' ];
$prop_type=$prop_descr[$properties_name][ 'type' ];
$tableName= 't_' .$properties_name;
$selectToken[]=$tableName . '.`value` as ' .$properties_name;
// LEFT
$selectQuery .= 'INNER JOIN `' . $prop_type . '_properties` AS `' . $tableName . '`
ON (`objects`.`id` = `'
. $tableName . '`.`object_id`
AND `'
.$tableName. '`.`prop_id ` = ' . $prop_id . ')' ;
}
$selectQuery = 'SELECT ' .implode( ', ' , $selectToken). ' ' .$selectQuery;



PS In the code there are intentionally no checks for logic, they have no relation!

So we get the result, expanded into rows, of course, to search for such a result table, you also need to write $ tableName.`value in WHERE, but this is not difficult to achieve by setting up an array of matches: array ($ properties_name => $ tableName. ' .`value`);

Pros :
- it’s really easy to add and delete features of objects (that is, it's easier to maintain).
- the search is relatively fast (certainly faster than bitwise masks) because indexes are used (primary key)
- the tables are amazingly standardized and just draw, explain, present logic.
- # more from kmmbvnr

Cons :
- this is a little slower than just one mega wide table on a previously known query.
- SQL is not readable (but the optimizer hacks its current path).

UPD thank Joshua for the title
en.wikipedia.org/wiki/Entity-Attribute-Value_model

Source: https://habr.com/ru/post/45935/


All Articles