Every so often, GNU Tar on our OmniOS machines would go wild while backing up the /var/mail directory, producing a huge amount of output. Usually the process ran on indefinitely and we had to kill the dump; in other cases it did finish, with a terabyte or more of data that seemed to compress extremely well. When I once again got such a giant tar file, I examined it and found that part of it consisted of zero bytes, which tar -t does not like very much; after that stretch, everything returned to normal.

We then moved /var/mail
to new Linux file servers running Ubuntu 18.04, and so switched to a later and more standard version of GNU Tar than the one on the OmniOS machines. We hoped this would solve our problems, but almost immediately the same incident occurred. This time GNU Tar was running on an Ubuntu machine, where I know all the available debugging tools well, so I examined the running tar process. A trace of its system calls showed that tar was issuing an endless stream of read() calls that returned 0 bytes:

    read(6, "", 512) = 0
    read(6, "", 512) = 0
    [...]
    read(6, "", 512) = 0
    write(1, "\0\0\0\0\0"..., 10240) = 10240
    read(6, "", 512) = 0
    [...]
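For readers less familiar with this pattern: a read() that returns 0 bytes signals end of file, so a loop that keeps asking for a fixed number of bytes has to check for it. The following is only a minimal Python sketch of such a check, not GNU Tar's code; the names read_region and demo, and the sizes used, are made up for illustration.

```python
import os
import tempfile

BLOCK = 512  # same request size as in the trace above


def read_region(fd, expected):
    """Read up to `expected` bytes from fd, stopping at end of file.

    Returns (data, missing), where `missing` is how many bytes short
    of `expected` the file turned out to be.
    """
    chunks = []
    remaining = expected
    while remaining > 0:
        chunk = os.read(fd, min(BLOCK, remaining))
        if not chunk:
            # read() returned 0 bytes: end of file. Without this check
            # the loop would keep issuing read() calls forever.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), remaining


def demo():
    """Ask for more bytes than the file holds (hypothetical sizes)."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 1024)  # the file really holds 1024 bytes
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)
        # The caller believes the file is 4096 bytes long.
        data, missing = read_region(fd, 4096)
        return len(data), missing
```

Calling demo() returns the number of bytes actually read and the number the caller was short by, so the caller can decide what to do about the shortfall instead of looping.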
lsof said that file descriptor 6 was someone's mailbox.

With apt-get source tar
I downloaded the source code and started looking for read() system calls that do not check for end of file. After digging through several levels of indirection, I found an obvious place where such a check appears to be missing: the function sparse_dump_region in the file sparse.c. And then I remembered something.

ftruncate()
is used to resize mailboxes; sometimes this extends them, temporarily creating a sparse section of the file until it is filled in, and possibly it sometimes shrinks them. This seemed to match the situation at hand: sparse regions are involved, and shrinking a file with ftruncate() creates a situation where tar unexpectedly hits end of file.

sparse_dump_region
does not dump the sparse regions of a file; it dumps the non-sparse ones (well, of course), and it is used for all files, sparse or not, whenever tar is run with the --sparse argument. So the actual bug is that if you run GNU Tar with --sparse and a file shrinks while tar is reading it, tar does not correctly handle reaching end of file earlier than expected. If the file grows again, tar recovers.

lsof
was available on OmniOS too, and I could have found and read the source code of my version of GNU Tar and run it under the OmniOS debugger (although GDB is not installed there), and so on. But I did not. Instead, we shrugged and moved on. It took moving the file system to Ubuntu for me to lift a finger and figure out the problem.

For us, the simple fix is to stop using --sparse
when backing up. Mailboxes should not be sparse in the first place, and if one ever is, we compress the file system backups anyway, so all those zero bytes will compress well.

Source: https://habr.com/ru/post/434624/