I had to observe disputes on this topic in several forums, sometimes even several times. Basically, these topics concerned the storage of images, less often text files. I myself belong to the opponents of this method and in the article I will try to present reasonable evidence that storing files in the database is inconvenient and negatively affects the speed of the system as a whole. Since I mainly work with MySQL, I will consider this issue from the point of view of storing files in its databases when developing for WEB.
1.
RAM . The most important argument against, in my opinion, is the fact that even when you just need to give the file to the user (for example, display an image), you still have to load it into RAM, because All data selected by the query from the database is loaded into RAM. At the time of this writing, the most expensive, after the processor, on dedicated servers is RAM. At the same time, I’m not talking about regular hosting, where in the overwhelming majority of cases a certain amount of RAM is allocated for each account, exceeding which results in “Fatal error: Allowed memory size of X bytes exhausted (tried to allocate Y bytes)” or angry letters from support.
2.
Returning files . If the files are stored in the database, then in order to return them, in any case, you will have to use a script written in one or another language, which should do the following:
2.1 Open a new connection to the database, the number of which is far from infinite, or to occupy an existing but free connection, which is bad for the same reason;
2.2 Request to select the contents of the file from the table. This is where the 2nd problem arises:
2.2.1 Downloading the contents of a file into RAM (see clause 1);
2.2.2 What parameter to search for the file in the database. It is logical to assume that the fastest will be the search by integer ID, but in this case, in the link to the file issue script, you will also need to use this ID (for example: <img src = '. / Getimage.php? Id = 1111'>), in the case of the formation of such links manually complicates the work.
2.3 Give the necessary header and file contents.
And all these steps will be done for each file, and if there are 20 of them on the page, and the download goes simultaneously, then there is a risk of not seeing one.
Also, this way of storing files almost deprives you of the following possibilities:
- distribute files across multiple servers and link directly to these servers for downloading;
- set a
reverse proxy server to speed up "slow clients"
- use
CDN .
')
3.
Dumps . One of the arguments “for” often cites the following: “If I need to transfer the site to another hosting, I only need to make a dump and copy the code”. Personally, I do not consider this argument weighty for the following reasons:
3.1 Let's finally start writing sites so that they work quickly, and not just that they are easily transferred;
3.2 Storage of any, and especially binary files in the database leads to the following:
3.2.1 because the database increases, the time for creating its dump and its size increases accordingly;
3.2.2 the presence in the dump of the contents of a binary file greatly impairs its readability, as well, which is important, not all console editors can open a large dame for editing, for example MCEDIT cannot;
3.2.3 with a high probability there will be problems with uploading such dumps:
- first, it will be quite a long process;
- secondly, it can be quite difficult, especially if the site is located on a hosting that does not allow connecting to a database from the outside and does not allow access to the server via ssh so that you can use the mysql tool. In this case, you will have to use scripts like phpMyAdmin (although in my experience with it I can say with almost 100% certainty that using it to dump a large size 50 100 MB is almost impossible task) or
Sypex Dumper .
4.
Differentiation of access to files . Another argument “for” in many discussions is the fact that when storing data in a database, it is easier to organize the delimitation of access to files for different users of the site. I indirectly agree with this argument, since it is really simpler, but firstly, it does not outweigh the first three points, and secondly, you can use the methods I described in this
article to delimit access.
5.
Centralized storage . This item is also trying to write to the file storage asset in the database, arguing that if there are a lot of servers that work with the same files, it is much more convenient to store them in one place. In my opinion, this is also not an argument, since even if the files are physically stored on different servers, then nothing prevents them from attacking all the servers where they should be used, while creating the same directory structure on all servers.
That seems to be all, but I want to remind once again that I considered this issue from the programming side of the web, namely for MySQL. At the same time, I realize that for other areas of programming and other DBMS, such as MSSQL and Oracle, all or part of my arguments may turn out to be incorrect.