
In this post I look at four types of metadata that NTFS can attach to a file or directory. For each type I describe what it is used for and give an example of its use in a Microsoft technology or in third-party software.
The post covers reparse points, object IDs, and the other kinds of data a file can carry besides its main content.
Object id
An object identifier is a 64-byte block that can be attached to a file or directory. The first 16 bytes uniquely identify the file within a volume, so the file can be opened by identifier rather than by name. The remaining 48 bytes can hold arbitrary data.
Object identifiers have existed in NTFS since Windows 2000. Windows itself uses them to track the target of a shortcut (.lnk). Suppose the file a shortcut points to has been moved within its volume. The shortcut will still open it: when the file is not found by name, a dedicated Windows service tries to open it by the identifier that was created and saved earlier. If the file has not been deleted and has not left the volume, it opens, and the shortcut is updated to point to the file again.
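The 16 + 48 byte layout described above can be illustrated with a small sketch. It only splits a raw 64-byte buffer into its two parts; the names `ObjectIdBuffer` and `split_object_id_buffer` are made up for this example, and the sample bytes are invented rather than read from a real volume.

```python
from typing import NamedTuple

OBJECT_ID_BUFFER_SIZE = 64   # whole buffer attached to the file or directory
OBJECT_ID_SIZE = 16          # leading part that identifies the file in the volume


class ObjectIdBuffer(NamedTuple):
    object_id: bytes       # 16 bytes, unique within the volume
    extended_info: bytes   # 48 bytes of arbitrary data


def split_object_id_buffer(raw: bytes) -> ObjectIdBuffer:
    """Split a raw 64-byte object identifier buffer into its two parts."""
    if len(raw) != OBJECT_ID_BUFFER_SIZE:
        raise ValueError(
            f"expected {OBJECT_ID_BUFFER_SIZE} bytes, got {len(raw)}"
        )
    return ObjectIdBuffer(raw[:OBJECT_ID_SIZE], raw[OBJECT_ID_SIZE:])


# Invented sample buffer (an assumption for illustration only).
sample = bytes(range(16)) + b"payload".ljust(48, b"\x00")
parsed = split_object_id_buffer(sample)
print(parsed.object_id.hex())
print(len(parsed.extended_info))
```

On a real system the raw buffer would have to come from the file system, for example from the output of fsutil or from the FSCTL_GET_OBJECT_ID control code; that part is deliberately left out of the sketch.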
Object identifiers were used in the iSwift technology of Kaspersky Anti-Virus version 7. Here is how the technology was described: "The technology was developed for the NTFS file system. In this system, each object is assigned an NTFS identifier. This identifier is compared with the values in a special iSwift database. If the database values do not match the NTFS identifier, the object is scanned, or rescanned if it has changed."
However, the excessive number of identifiers created this way caused problems with the standard chkdsk utility: checking a disk took too long. Later versions of Kaspersky Anti-Virus stopped using NTFS object IDs.
Reparse point
In NTFS, a file or directory may contain a reparse point. Special data is added to the file or directory, after which it stops being a regular one and can only be processed by a dedicated file system filter driver.

Windows has some types of reparse points that the system itself can process. For example, symbolic links (symlinks), junctions (junction points), and volume mount points (a volume mounted into a directory) are all implemented through reparse points.
The reparse buffer attached to a file has a maximum size of 16 kilobytes. It carries a tag that tells the system which type of reparse point it is. A reparse buffer of a third-party type must additionally carry a GUID in a dedicated field, while Microsoft reparse buffers may omit it.
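To make the tag and GUID rule more concrete, here is a minimal sketch that reads the header of a raw reparse buffer. It assumes the layout of the documented Windows structures REPARSE_DATA_BUFFER and REPARSE_GUID_DATA_BUFFER (a 32-bit tag, a 16-bit data length, a 16-bit reserved field, then a 16-byte GUID for non-Microsoft tags). The function and class names are invented for the example, only two well-known Microsoft tags are listed, and the third-party tag and GUID in the sample are hypothetical.

```python
import struct
import uuid
from typing import NamedTuple, Optional

MAX_REPARSE_BUFFER_SIZE = 16 * 1024   # the 16 KB limit mentioned above
MICROSOFT_TAG_BIT = 0x80000000        # high bit set: tag is owned by Microsoft
KNOWN_TAGS = {
    0xA0000003: "mount point / junction",
    0xA000000C: "symbolic link",
}


class ReparseHeader(NamedTuple):
    tag: int
    data_length: int
    is_microsoft: bool
    guid: Optional[uuid.UUID]   # only present for non-Microsoft tags
    description: str


def parse_reparse_header(raw: bytes) -> ReparseHeader:
    """Read the tag, data length and (if present) GUID of a reparse buffer."""
    if len(raw) > MAX_REPARSE_BUFFER_SIZE:
        raise ValueError("reparse buffer exceeds 16 KB")
    if len(raw) < 8:
        raise ValueError("buffer too short for a reparse header")
    tag, data_length, _reserved = struct.unpack_from("<IHH", raw, 0)
    is_microsoft = bool(tag & MICROSOFT_TAG_BIT)
    guid = None
    if not is_microsoft:
        # Third-party buffers carry a GUID right after the 8-byte header.
        if len(raw) < 24:
            raise ValueError("third-party buffer is missing its GUID")
        guid = uuid.UUID(bytes_le=raw[8:24])
    return ReparseHeader(tag, data_length, is_microsoft, guid,
                         KNOWN_TAGS.get(tag, "unknown"))


# Invented sample buffers (not read from a real file).
junction = struct.pack("<IHH", 0xA0000003, 4, 0) + b"\x00" * 4
vendor_guid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # hypothetical
vendor = struct.pack("<IHH", 0x00001234, 4, 0) + vendor_guid.bytes_le + b"\x00" * 4

print(parse_reparse_header(junction))
print(parse_reparse_header(vendor))
```

The sketch stops at the header on purpose: the data that follows it is specific to each tag and is interpreted by whichever driver owns that tag.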
What types of reparse points are there? Here are the technologies that use them: Single Instance Storage (SIS) and Cluster Shared Volumes in Windows Storage Server 2008 R2, Hierarchical Storage Management, Distributed File System (DFS), and Windows Home Server Drive Extender. All of these are Microsoft technologies; third-party technologies that use reparse points exist too, but none are listed here.
Extended attributes
Extended file attributes were the subject of my previous post. Here it is only worth mentioning that this technology is hardly used under Windows. Of the software I know, only Cygwin uses extended attributes, to store POSIX permissions. A single file on NTFS can have either extended attributes or a reparse point buffer; both cannot be set at the same time. The maximum total size of all extended attributes on one file is 64 KB.
Alternate data streams
Alternate data streams are additional streams attached to a file. Probably everyone already knows about them, so I will only list the main features of this type of metadata: streams are named (a file can have several streams, each with its own name); they are directly accessible through the file system (a stream is opened using the format "file name, colon, stream name"); their size is unlimited; and a process can be started directly from a stream (which makes it possible to implement a fileless process).
Streams were used in the iStream technology of Kaspersky Anti-Virus. Windows itself uses them too: for example, when a file is downloaded from the Internet, a Zone.Identifier stream is attached to it, containing information about the location the file came from. When the user then runs the executable file, they may see the message “Unable to verify the publisher. Do you really want to run this program?”.
This gives the user extra protection against carelessly launching programs obtained from the Internet. It is just one application of streams; all sorts of data can be stored in them. The Kaspersky Anti-Virus mentioned above stored the checksum of each file there, but later, for some reason, this technology was abandoned as well.
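As a rough illustration of the Zone.Identifier stream, the sketch below builds the "file name, colon, stream name" path and parses the stream's text. It assumes the stream holds INI-style text with a [ZoneTransfer] section and a ZoneId key, which is what Windows normally writes; the helper names and the sample file name are invented for the example, and the sample text stands in for a real stream so the code also runs outside Windows.

```python
import configparser
from typing import Tuple

# Security zone numbers as used by Windows (URLZONE values).
ZONE_NAMES = {
    0: "Local machine",
    1: "Local intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Restricted sites",
}


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the 'file name, colon, stream name' form of a stream path."""
    return f"{file_path}:{stream_name}"


def parse_zone_identifier(text: str) -> Tuple[int, str]:
    """Return (zone id, zone name) from the text of a Zone.Identifier stream."""
    # Interpolation is disabled so that '%' in URL values is left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    zone_id = parser.getint("ZoneTransfer", "ZoneId")
    return zone_id, ZONE_NAMES.get(zone_id, "unknown")


# Invented sample content; on Windows with NTFS the text could instead be read
# with open(stream_path("setup.exe", "Zone.Identifier")).read().
sample_text = "[ZoneTransfer]\nZoneId=3\n"
print(stream_path("setup.exe", "Zone.Identifier"))
print(parse_zone_identifier(sample_text))
```

Opening the stream path with an ordinary file API only works on Windows with an NTFS volume, which is why the example parses a string rather than a real stream.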
Anything else?
There is also the security descriptor, plus the standard file attributes, which cannot be accessed directly even though they too are implemented as file streams. These, along with extended attributes, reparse points, and object IDs, are all file streams from the system's point of view. There is no point in directly changing the security descriptor, shown in the following picture as ::$SECURITY_DESCRIPTOR; let the system make the change. The system does not give direct access to the other types of streams either. So that's it.
The contents of object IDs and reparse points can be viewed, and extended attributes and alternate data streams can be worked with, using the NTFS Stream Explorer program or the fsutil system console utility.