I think many people needed to store files associated with a record in a table. It can be a picture for the news, an avatar, a file uploaded by the user - yes, whatever. Usually, in this case, they arrive simply - the file is placed in the file system, and the link to it is written to the database record.
But this classic campaign has many drawbacks:
- files are not deleted when deleting the corresponding database record
- problems while trying to update a file
- synchronization violation between the database and the file system during a transaction rollback
- when backing up and restoring information in the database, out-of-sync with the file system may occur
- files are not subject to access restrictions imposed by the database
You can read more about the problems that arise when files are stored separately from the database in the
SQL Antipatterns presentation, section Phantom Files, page 60. By the way, the presentation author suggests a solution - to store files directly in the database, in the BLOB type field. The truth is that there should be a remark that this must be a weighted decision in each specific case. After all, with this method of storing files, the web server should, at each request, invoke a script that will extract the file from the database and give it to the user, which will inevitably adversely affect performance.
To find a solution to this problem, a brainstorm was conducted and several solutions were invented:
- Before deleting a record, do a SELECT with the same condition and get the names of the files to be deleted. The problem is that if there are a lot of deleted files, this operation may take some time and for good time you need to block the table for reading and writing, and in many cases it is unacceptable.
- Before deleting, set the “to be deleted” label on deleted records, get all records with this label and delete files associated with these records, and finally delete all records with this label. Queries working with this table should be refined so that they do not select records with the flag set. Disadvantages - the need to edit multiple requests, moreover, in our project, records for deletion are selected by quite complex SELECTs that cannot be redone into one UPDATE.
- The first two methods try to solve the problem of “lost” files when deleting records in the database, which occurs when the “classic” method of storing files, but they do not solve the remaining problems of this approach, so we tried to come up with solutions that use the positive aspects of storing files directly in the database and get rid of the shortcomings inherent in this approach.
- Use triggers . Unfortunately, MySQL does not have support for working with files in its language, such commands would have to be implemented independently, picking at the source code of MySQL. Of the minuses - the files should be stored on the same host as the database, the need for refinement of MySQL, we did not find such ready-made solutions.
- Store files in the database, but send them directly to the web server, without PHP. You can accomplish this by writing a module to a web server (nginx for example) that would allow you to give files directly from MySQL or by using the MySQLfs file system driver . This approach solves all the problems listed above, but its disadvantage is the additional overhead of storing files in MySQL.
- A specialized storage engine for MySQL that stores records as files.
Let us dwell in more detail in the last paragraph. After all, what the file system is is a specialized database that allows you to get a record — its contents — using the key “file name”. That is, you can implement your own data storage engine for MySQL, in which each record will have three fields:
CREATE TABLE `data_storage`.`files` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`path` VARCHAR( 255 ) ,
`data` BLOB
) ENGINE = FILES
You can insert data into such a table only in the `data` field, while they are simply saved to a file, the unique name is automatically generated (using the ʻid` field as a prefix) - for example, 764533, and the` path` field is automatically substituted The correct way in which MySQL put our data - for example, '/mnt/storage/mysqldata/76/45/33/764533_myfile.jpg'. Thus, the data stored in such a table can be accessed as simple files, while MySQL will maintain data integrity. Thus, this way of storing files is devoid of almost all the drawbacks of the classical approach (except for access restriction, but it can be done using a simple script and the
X-Accel-Redirect nginx header) and at the same time does not reduce the performance when uploading files to clients.
The problem is small - it was not possible to find a ready-made implementation of such a data storage engine for MySQL, although the idea is generally simple. Perhaps one of the habroops will prompt a link to the ready implementation of such a storage engine, because the idea floats on the surface, and someone could have realized it for sure.
Article written in collaboration with
viperetPS transferred to MySQL blog