When the project is closer to the end of the development, it becomes clear that you cannot do without the versioning of the data, because any user can go in and delete what has been created for a long time by dozens of other people, you have to find a solution that requires minimal effort. Without going into the details of a specific project on which such a need arose, imagine a google docs spreadsheets document editable to any site visitor.
Suppose Google engineers are using a MySQL database with this structure:
')
documents
--id
--name
--creator_id
sheets
--id
--document_id
--name
rows
--number
--sheet_id
--height
columns
--number
--sheet_id
--width
cells
--id
--sheet_id
--color
--content
--row_number
--col_number
--creator_id
As it was already written above, anyone has access to the document, and, for example, changing the value of a cell to a swear word will permanently destroy the previous value.
Structure requirements
Required
- Ability to save previous cell values
- Ability to view previous revisions of the online document
- The ability to roll back to the previous version
- Some parameters may not have different versions (id, sheet_id)
- Implementing the used ORM
Desirable
- Minimum possible labor for data transfer to a new structure, selection, modification and addition of data
- Saving disk space
- Highlighting changed cells when viewing old revisions
Implementation
After much thought, the "formula" was derived. What if you make an additional table and zadzhoyny the main table and the new table containing different versions of values.
First of all, we need a table with revisions:
revisions
--id
--parent_id
--document_id
--created_date
Spreading the cells table by 2, so that the cells remain information that does not require versioning, and the cells_data - user-modified information. In addition, we add the created_in_revision_id, deleted_in_revision_id fields to the cells_data table, and the user ID that made the changes.
cells
--id
--sheet_id
--row_number
--col_number
--creator_id
cells_data
--cell_id
--color
--content
--data_creator_id
--created_in_revision_id
--deleted_in_revision_id
(primary key on cell_id + created_in_revision_id)
In the code for the Document object (if we program objectally) we add the getRevisionCondition method ($ revisionId = false), which should return a prefix to the SQL Quarae like "created_in_revision_id in (0,100,300,301) and deleted_in_revision_id not in (0,100,300,301)". Those. contain the current revision and all its parentes in the “in (...)” and “not in (...)” construction
Sample
Next kvar, which previously looked like this:
select * from cells where row_number=3 and col_number=2
turns into:
select c.*,cd.* from cells c,cells_data cd where row_number=3 and col_number=2 and id=cell_id and $revisionCondition
Of course, the field names of these tables should not be duplicated.
Insert a new entry
Here everything is as simple as when sampling. Check if the revision has expired and take the last one. For example:
$revision=$document->updateRevisionIfExpired()
We first insert into the main table (cells), then into the table with versioned data (cells_data). In the created_in_revision_id field, we write the ID of the last revision.
Delete record
Here we will try to save disk space. If, for example, we set 30 minutes as the revision lifetime, then we compare the revision of the deleted record with the current revision and:
- if they are the same, just delete from the database
- if the revision has already been changed, then we just write a fresh revision id in the table field cells_data - deleted_in_revision_id
For node.js, it would look like this:
if(cellRevisionId==currentRevision.getId()){ db.online.query("delete from cells_data where cell_id="+cellId+" and created_in_revision_id="+cellRevisionId) }else{ db.online.update('cells_data',{'deleted_in_revision_id':currentRevision.getId()},{'cell_id': cellId, 'created_in_revision_id' : cellRevisionId}) }
Record Update
Changing data is something like deletion. If the revision has not changed, the data is simply updated; if it has changed, the deleted_in_revision_id field is updated and a new record is inserted into the cells_data table with the ID of the new revision created_in_revision_id.
Instead of conclusion
Because Most of the tables that need versioning have now been transferred to a new structure - I can highlight advantages and disadvantages:
Virtues
- All selections and insertions (in other words, selects and inserts) were redone as quickly as possible and without any complications.
- Does everything that was required and almost everything that was desired.
disadvantages
- Deleting, updating the record still happens a little too tightly wound and a more automated script would not hurt
- Perhaps you need to do something with requests like in (...) and not in (...), since they list all revisions for the document