Teaching FreeBSD to look for files by extended attributes.
Annotation: FreeBSD provides a mechanism for managing extended file attributes (extattr) at the file system level. With the standard commands setextattr, getextattr and rmextattr you can create, read and delete comments, keywords and other file metadata. However, the stock tools cannot yet search for files by extended attributes, and the attributes are lost when a file is copied. In two articles I will offer my patches for the find and cp commands that eliminate these shortcomings. I am not a professional programmer and wrote the patches for my own needs, so the proposed solution should be treated purely as a proof of concept that still needs polishing.
From ancient times we have inherited monstrous and awkward hierarchical file systems. As the amount of information grows rapidly, their main drawback shows more and more clearly: the limited means of classification. There are really only two: giving files meaningful names and sorting them into directories with meaningful names. This worked well when the most common photographic film held only 36 frames. They were used up slowly, sometimes over several months, with the subjects chosen carefully. Those 36 photos could then be scanned and placed in the folders “Dacha”, “Vacation” and “Cats”. But now that a smartphone camera can take hundreds of pictures in a day, and a couple of hundred PDF manuals can be downloaded from the Internet in an hour, this approach becomes too time-consuming. Such a volume of information can no longer be classified hierarchically, because the human brain simply gets lost in the hierarchy. So, over time, directories named “New Folder (2)”, “All Pictures” and “To Sort” appear on the computer.
In response to these limitations, the concept of a semantic file system has appeared. In such a file system, metadata is assigned to files and then used for classification and search. The directory hierarchy no longer matters here, and neither does the file name. A striking example is cloud object storage, in which all objects live in a flat address space and carry metadata describing them. However, object stores are designed to be used at the API level rather than by end users, and they have not reached desktop systems yet.
But I wanted to assign keywords to all my files and keep them in one flat space on my local machine right now. I have many interests, and since I cannot remember everything, I constantly have to save notes, recipes, techniques, instructions and photos. I no longer have the energy or the time to sort all of it into folders.
Fortunately, desktop file systems are catching up with the trend. In many of them it has long been possible to set keywords for some file types (in Windows, as far as I know, for Word documents and images). In most cases, however, this metadata is stored inside the file itself rather than at the file system level, hence the restriction on file type: not every format supports EXIF and IPTC. My favorite FreeBSD, on the other hand, provides a mechanism for managing extended attributes of a file of any type at the file system level. As I understand it, somewhere in the depths of the file system a meta-file is created that is tightly bound to the main file, and all the metadata is written to it. To work with extended attributes, FreeBSD has a family of commands: setextattr, getextattr and rmextattr.
A simple example:
    $ setextattr user comment cats cat.jpg
    $ getextattr user comment cat.jpg
    cat.jpg cats
In principle, this is already something you can work with, if not for one “but”: FreeBSD has no mechanism for searching files by extended attributes. What is the use of tagging files if I still cannot find them by their tags?
But if the mountain will not come to Mohammed ... In short, I decided to teach FreeBSD to search for my documents by keywords myself. I am a chemical analyst, not a programmer, and I learned to code on my own with the help of Google. So the first thing I did was turn to Google. What if everything had been invented long before me, and I would not have to painfully recall the peculiarities of working with pointers in C? It turned out that yes, it had been. I found one patch for the find command. The only catch: that patch taught find to check whether a specific attribute is set on a file, and nothing more. Searching by the contents of the attribute was out of the question.
So I had to suffer with pointers and memory in C after all. A couple of sleepless nights, three liters of green tea and a black belt in googling later, I had this patch (for FreeBSD 11.2.0-RELEASE).
You can apply it as follows (the source code must be installed on FreeBSD):
    cd /usr/src/usr.bin/find
    patch < /patch-find.diff
    make
    make install clean
And use it like this:
    find . -userattr comment=cats
    ./cat.jpg
This command will find all files whose comment attribute contains the substring cats.
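The matching behind such a test can be sketched in C roughly as follows. This is not the code from the patch, only a minimal illustration under stated assumptions: the helper names parse_attr_arg and buf_contains are hypothetical, and the attribute value is assumed to arrive as a byte buffer with an explicit length rather than as a NUL-terminated string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split an argument such as "comment=cats" at the
 * first '='. On success the attribute name is copied into name (NUL
 * terminated, at most name_size - 1 characters) and *value points at the
 * text after '='. Returns false if there is no '=', if the name or the
 * value is empty, or if the name does not fit into the buffer.
 */
bool
parse_attr_arg(const char *arg, char *name, size_t name_size,
    const char **value)
{
	const char *eq = strchr(arg, '=');
	size_t name_len;

	if (eq == NULL || eq == arg || eq[1] == '\0')
		return false;
	name_len = (size_t)(eq - arg);
	if (name_len >= name_size)
		return false;
	memcpy(name, arg, name_len);
	name[name_len] = '\0';
	*value = eq + 1;
	return true;
}

/*
 * Hypothetical helper: substring search in a buffer that is NOT NUL
 * terminated, so strstr() cannot be used on it directly.
 */
bool
buf_contains(const char *buf, size_t buf_len, const char *needle)
{
	size_t needle_len = strlen(needle);
	size_t i;

	if (needle_len == 0)
		return true;
	if (needle_len > buf_len)
		return false;
	for (i = 0; i + needle_len <= buf_len; i++) {
		if (memcmp(buf + i, needle, needle_len) == 0)
			return true;
	}
	return false;
}
```

With these helpers, the argument comment=cats splits into the name comment and the value cats, and a file matches when buf_contains() finds cats anywhere in the bytes of its comment attribute. Rejecting malformed arguments up front is one way to add the input checks that the current code lacks.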
Of course, it is too early to call this a working solution: my code performs practically no checks on the input parameter, and the buffer sizes are pulled out of thin air (I read somewhere that the size limit of an extended attribute is 1024 bytes, but later I could not find that information again). I invite everyone to help bring the project to readiness, and I am posting the complete find utility code, honestly copied from the official FreeBSD repository, with my changes.
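One way to avoid guessing a buffer size is to ask the system how large the value is before reading it. The sketch below assumes the extattr_get_file(2) call from FreeBSD's <sys/extattr.h>, which returns the size of the attribute data when it is passed a NULL data pointer. The function name read_user_attr is hypothetical, and this is not how the patch does it. On other systems the stub branch only reports ENOTSUP so that the sketch still compiles.

```c
#include <sys/types.h>
#include <errno.h>
#include <stdlib.h>

#if defined(__FreeBSD__)
#include <sys/extattr.h>
#endif

/*
 * Hypothetical helper: read a user-namespace extended attribute into a
 * freshly allocated buffer whose size comes from the kernel's own answer.
 * Returns NULL on error with errno set; the caller frees the result.
 * The value is raw bytes: *len receives its length, no NUL is appended.
 */
char *
read_user_attr(const char *path, const char *attr_name, size_t *len)
{
#if defined(__FreeBSD__)
	ssize_t size, got;
	char *buf;
	int saved_errno;

	/* With a NULL data pointer the call only reports the value size. */
	size = extattr_get_file(path, EXTATTR_NAMESPACE_USER, attr_name,
	    NULL, 0);
	if (size < 0)
		return NULL;

	/* Allocate at least one byte so an empty value is not an error. */
	buf = malloc(size > 0 ? (size_t)size : 1);
	if (buf == NULL)
		return NULL;

	/* Second call actually copies the data; it may be shorter if the
	 * attribute changed in between, so trust the returned count. */
	got = extattr_get_file(path, EXTATTR_NAMESPACE_USER, attr_name,
	    buf, (size_t)size);
	if (got < 0) {
		saved_errno = errno;
		free(buf);
		errno = saved_errno;
		return NULL;
	}
	*len = (size_t)got;
	return buf;
#else
	/* extattr(2) is FreeBSD-specific; this branch is only a stub. */
	(void)path;
	(void)attr_name;
	(void)len;
	errno = ENOTSUP;
	return NULL;
#endif
}
```

Sizing the buffer this way removes the dependence on any particular limit such as the 1024 bytes mentioned above, at the cost of one extra system call per file.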
... So now I could, after a fashion, search for files by keywords. But a new misfortune arose: when a file was copied, all of its extended attributes were lost ... The problem was in the cp command, but more about that next time.