
Tracking directory changes: how it is done in different operating systems

This article reviews the APIs that different operating systems provide for monitoring changes in a directory. It grew out of my work on the change-tracking daemons for the dklab_realsync utility (article on Habr, GitHub repository) and for a project of my own that I am not ready to announce.

Windows, ReadDirectoryChangesW


Windows has an excellent function, ReadDirectoryChangesW, which returns a set of changes for a directory and takes a flag (bWatchSubtree) that makes it work recursively. Tracking changes in a directory is therefore straightforward: the implementation in dklab_realsync takes 80 lines of code, or 3.5 KB. Interestingly, on Windows these events are delivered even over SMB!
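To make "a set of changes" concrete, here is a minimal sketch of decoding the buffer that ReadDirectoryChangesW fills in. It assumes the FILE_NOTIFY_INFORMATION layout from the Windows documentation (three DWORDs followed by a UTF-16 file name whose length is given in bytes). It does not call the Windows API at all. The helper make_record is hypothetical and exists only to build sample data, so this is an illustration rather than the dklab_realsync code.

```python
import struct

# Action codes documented for FILE_NOTIFY_INFORMATION.Action.
ACTIONS = {
    1: "added",
    2: "removed",
    3: "modified",
    4: "renamed_old_name",
    5: "renamed_new_name",
}


def parse_notify_buffer(buf):
    """Decode FILE_NOTIFY_INFORMATION records into (action, name) pairs."""
    events = []
    offset = 0
    while offset + 12 <= len(buf):
        # Three little-endian DWORDs:
        # NextEntryOffset, Action, FileNameLength (length is in bytes).
        next_offset, action, name_len = struct.unpack_from("<III", buf, offset)
        raw_name = buf[offset + 12 : offset + 12 + name_len]
        events.append((ACTIONS.get(action, "unknown"), raw_name.decode("utf-16-le")))
        if next_offset == 0:  # zero marks the last record in the buffer
            break
        offset += next_offset
    return events


def make_record(action, name, last):
    """Hypothetical helper: build one record for the demonstration below."""
    raw = name.encode("utf-16-le")
    size = 12 + len(raw)
    padded = (size + 3) & ~3  # records are DWORD-aligned
    next_offset = 0 if last else padded
    header = struct.pack("<III", next_offset, action, len(raw))
    return header + raw + b"\0" * (padded - size)


# Names are relative to the watched directory, including subdirectories
# when the watch is recursive.
sample = make_record(1, "new.txt", last=False) + make_record(3, "sub\\old.txt", last=True)
events = parse_notify_buffer(sample)
print(events)
```

In a real daemon the buffer would come from the API call itself. The sketch covers only the decoding step, which is the same for a local disk and for a network share.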

However, there are certain pitfalls:


Conclusion: ReadDirectoryChangesW makes it easy to learn about every event affecting the files, but the event queue can overflow, and then a full scan of the file system is required. Events can also be delivered before the changes they describe have taken effect.
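The full scan mentioned in the conclusion can be sketched as a snapshot comparison. The idea is to remember the modification time and size of every file, rescan after an overflow, and compare the two snapshots. The names snapshot and diff_snapshots are my own, and comparing (mtime, size) is an assumption about what counts as "changed"; a real synchronizer might use checksums instead.

```python
import os
import tempfile


def snapshot(root):
    """Record (mtime_ns, size) for every file under root, keyed by relative path."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # the file vanished between listing and stat
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (added, removed, modified) sets of relative paths."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    modified = {p for p in set(old) & set(new) if old[p] != new[p]}
    return added, removed, modified


# Small demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "a.txt"), "w") as f:
        f.write("one")
    with open(os.path.join(root, "b.txt"), "w") as f:
        f.write("two")
    before = snapshot(root)

    # Pretend the event queue overflowed while these changes happened.
    os.remove(os.path.join(root, "a.txt"))
    with open(os.path.join(root, "b.txt"), "w") as f:
        f.write("two, but longer")
    with open(os.path.join(root, "c.txt"), "w") as f:
        f.write("three")
    after = snapshot(root)

    added, removed, modified = diff_snapshots(before, after)
    print(added, removed, modified)
```

The cost is proportional to the size of the whole tree, which is why an overflow is expensive compared with handling individual events.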

Mac OS X, FSEvents


Mac OS X also has a simple, convenient API for tracking file system changes, called FSEvents. With this API the simplest daemon implementation is 50 lines of code, or 1.8 KB. The queue cannot overflow (!), but a full scan may still be required if the fseventsd daemon crashes. Note that before version 10.7 this API does not report changes per file; it only reports the directories in which something changed. Since events are not discarded but written to a log (the FSEvents service stores events in a persistent, per-volume database), recording them only at directory granularity saves disk space.

Conclusion: FSEvents on Mac OS X is the most unusual of these APIs. The queue does not overflow, and it is even possible to receive events from the past. However, before version 10.7 events are detailed only down to the directory, which makes the daemon less efficient for file synchronization.

Linux, inotify


The vanilla Linux kernel offers one way to monitor changes in a directory: inotify. The API has good, detailed documentation, but it has no support for recursive change tracking! inotify also limits the maximum number of objects that can be watched. The simplest daemon implementation takes 250 lines of code, or 8 KB; a static build with dietlibc is about 14 KB. Another unpleasant point is that the application itself must maintain the mapping between each watch descriptor (in our case always a directory) and its name. There is a function inotify_add_watch, which takes the path of the directory to monitor, but there is no inverse, say inotify_get_path, that would return the path for a given descriptor. Events, meanwhile, contain only a watch descriptor and the path of the changed file relative to that directory.
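The bookkeeping described above can be sketched as follows. The code decodes raw struct inotify_event records (a 16-byte header of wd, mask, cookie and len, followed by a NUL-padded name) and resolves each one to a full path through an application-side table. WatchTable, parse_events and fake_event are hypothetical names of my own. The sketch makes no inotify system calls, so in a real daemon the table would be filled with the values returned by inotify_add_watch.

```python
import posixpath
import struct

# Header of struct inotify_event: int wd; uint32_t mask, cookie, len;
HEADER = struct.Struct("iIII")

# Mask bits from <sys/inotify.h>.
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
IN_Q_OVERFLOW = 0x00004000
IN_ISDIR = 0x40000000


class WatchTable:
    """Application-side map from watch descriptor to directory path."""

    def __init__(self):
        self._paths = {}

    def add(self, wd, path):
        self._paths[wd] = path

    def remove(self, wd):
        self._paths.pop(wd, None)

    def resolve(self, wd, name):
        base = self._paths[wd]
        return posixpath.join(base, name) if name else base


def parse_events(buf, table):
    """Turn a raw inotify read buffer into (mask, full_path) pairs."""
    events = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        name = buf[start : start + name_len].split(b"\0", 1)[0].decode()
        if mask & IN_Q_OVERFLOW:
            # Overflow events carry wd == -1 and refer to no path at all.
            events.append((mask, None))
        else:
            events.append((mask, table.resolve(wd, name)))
        offset = start + name_len
    return events


def fake_event(wd, mask, name):
    """Hypothetical helper: build one record for the demonstration below."""
    if not name:
        return HEADER.pack(wd, mask, 0, 0)
    raw = name.encode() + b"\0"
    padded = raw + b"\0" * (-len(raw) % 16)
    return HEADER.pack(wd, mask, 0, len(padded)) + padded


table = WatchTable()
table.add(1, "/srv/data")
table.add(2, "/srv/data/sub")

buf = (
    fake_event(2, IN_CREATE, "file.txt")
    + fake_event(1, IN_DELETE | IN_ISDIR, "old")
    + fake_event(-1, IN_Q_OVERFLOW, "")
)
resolved = parse_events(buf, table)
print(resolved)
```

A plain dictionary is enough here because watch descriptors are small integers unique within one inotify instance. The table has to be updated whenever a watched directory is renamed or removed.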

Pitfalls of recursive directory tracking via inotify:



FreeBSD, Mac OS X, kqueue


FreeBSD and Mac OS X let you track changes with kqueue, which is similar to inotify in its characteristics and likewise cannot track directories recursively. In addition, kqueue takes open file (directory) descriptors as arguments, so with this API the limit on the number of tracked directories is even stricter.
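Because every watched directory costs one open descriptor, it can be worth checking in advance whether a tree fits under the process limit. The sketch below does not use kqueue itself. It only counts directories and compares the count with the soft RLIMIT_NOFILE. The function names and the reserve margin are assumptions made for illustration.

```python
import os
import resource
import tempfile


def count_directories(root):
    """Count root plus every directory below it (one descriptor each)."""
    return sum(1 for _ in os.walk(root))


def fits_in_fd_limit(root, reserve=64):
    """Check whether one descriptor per directory fits under the soft limit.

    `reserve` is a hypothetical margin for the process's other descriptors.
    Returns (directories_needed, soft_limit, fits).
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = count_directories(root)
    if soft == resource.RLIM_INFINITY:
        return needed, soft, True
    return needed, soft, needed + reserve <= soft


# Demonstration: a tree with four directories (root, a, a/b, c).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    os.mkdir(os.path.join(root, "c"))
    needed, soft, ok = fits_in_fd_limit(root)
    print(needed, soft, ok)
```

If the check fails, the options are to raise the limit, to watch fewer directories, or to fall back to periodic scanning.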

Summary:


Mechanism              Queue overflow  Recursive?  Max. objects  Granularity
ReadDirectoryChangesW  Yes             Yes         -             file
FSEvents               No              Yes         -             file (10.7+)
inotify                Yes             No          8192          file
kqueue                 Yes             No          1024          file

As you can see, every API has its advantages and disadvantages. The least convenient mechanisms are kqueue and inotify, but they are also the most efficient and reliable. Commercial operating systems provide more convenient mechanisms for tracking changes, though these have quirks of their own. I hope you now have a better idea of how hard life is for Dropbox and similar programs, which have to cope with all of this and still implement reliable, efficient data synchronization :).

* Picture taken from www.alexblogger.com/2008_01_01_archive.html

Source: https://habr.com/ru/post/164775/

