Looking for a ready-made solution, rather than reinventing the wheel, for monitoring file system changes with Linux and FreeBSD support, I came across the nice Python package watchdog (github, packages.python.org). Besides the OSes I care about, it also supports Mac OS X (with its own specifics) and Windows.
If this topic interests you, read on.
Installation
You can install the released version with pip:
$ pip install watchdog
pip itself is installed as the python-pip package, the devel/py-pip port, and so on. Alternatively, build from source with setup.py.
Everything is described in sufficient detail in the original manual. Admittedly, it covers version 0.5.4, while 0.6.0 is now current. However, the only differences are copyright edits and a switch from 4-space to 2-space indentation. "Google code style" :)
In general, the build has quite a few peculiarities that depend on the Python version and the target platform. They are all described at the link above, but if needed I will add a brief summary to the article.
In addition, the module can be built on an unsupported OS, in which case a fallback implementation is used that takes "snapshots" of the file system structure and compares them afterwards. Perhaps some of you did exactly that to solve a similar problem :)
I built it myself on Ubuntu 11.04 and FreeBSD 8.2-RELEASE; there were no problems with either the build or operation.
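For the curious, here is roughly what such a snapshot-and-compare approach might look like. This is only a sketch under my own assumptions, not watchdog's actual fallback code: take_snapshot and diff_snapshots are made-up names, and only the standard library is used.

```python
import os
import stat


def take_snapshot(root):
    """Map every path below root to (is_dir, mtime, size)."""
    snapshot = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # The entry vanished between listing and stat; skip it.
                continue
            snapshot[path] = (stat.S_ISDIR(st.st_mode), st.st_mtime, st.st_size)
    return snapshot


def diff_snapshots(old, new):
    """Return sorted lists of (created, deleted, modified) paths."""
    created = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(
        path for path in set(old) & set(new)
        if not new[path][0] and old[path] != new[path]  # regular entries only
    )
    return created, deleted, modified
```

Calling take_snapshot periodically and feeding two consecutive results to diff_snapshots gives a crude change feed. Note that a rename shows up here as a delete plus a create, which is one reason a native notification API is preferable.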
Basic example
Suppose we are interested in changes under some path /path/to/smth that involve creating, deleting and renaming files and directories.
Import what we need:
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
The Observer class is chosen in observers/__init__.py based on what your OS supports, so you do not have to pick an implementation yourself.
The FileSystemEventHandler class is the base class for change event handlers. It does little by itself, but we will teach its subclass:
class Handler(FileSystemEventHandler):
    def on_created(self, event):
        print(event)

    def on_deleted(self, event):
        print(event)

    def on_moved(self, event):
        print(event)
The full list of methods can be seen in FileSystemEventHandler.dispatch: on_modified, on_moved, on_created, on_deleted.
Now run it all:
observer = Observer()
observer.schedule(Handler(), path='/path/to/smth', recursive=True)
observer.start()
Observer is a relatively distant descendant of threading.Thread, so after the start() call we get a background thread that tracks changes. If the script exits right away, we will not see anything useful. How to wait properly depends mainly on how the module is used in a real project; for now a quick hack will do:
import time

try:
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
We wait for file system events until Ctrl+C (SIGINT) arrives, then tell our thread to stop and wait for it to finish.
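The start/stop/join lifecycle can be tried out without watchdog at all. In the sketch below, FakeObserver is a hypothetical stand-in that I made up (it is not a watchdog class); it only mimics the "background thread with a stop() method" shape, and run_for is an equally made-up helper.

```python
import threading
import time


class FakeObserver(threading.Thread):
    """Hypothetical stand-in: a background thread that can be asked to stop."""

    def __init__(self):
        super(FakeObserver, self).__init__()
        self._stop_requested = threading.Event()
        self.ticks = 0

    def run(self):
        # Pretend to poll for changes until stop() is called.
        while not self._stop_requested.wait(0.01):
            self.ticks += 1

    def stop(self):
        self._stop_requested.set()


def run_for(observer, seconds):
    """Start the thread, keep the main thread alive, then shut down cleanly."""
    observer.start()
    try:
        time.sleep(seconds)  # a real script would loop until Ctrl+C instead
    finally:
        observer.stop()  # ask the thread to finish
        observer.join()  # wait until it actually has
```

Putting stop() and join() into a finally block means the thread is also shut down when something other than Ctrl+C interrupts the wait.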
Run the script, go to the watched directory, and run:
# mkdir foo
# touch bar
# mv bar baz
# cd foo/
# mkdir foz
# mv ../baz ./quz
# cp ./quz ../hw
# cd ..
# rm -r ./foo
# rm -f ./*
The script prints:
<DirCreatedEvent: src_path=/path/to/smth/foo>
<FileCreatedEvent: src_path=/path/to/smth/bar>
<FileMovedEvent: src_path=/path/to/smth/bar, dest_path=/path/to/smth/baz>
<DirCreatedEvent: src_path=/path/to/smth/foo/foz>
<FileMovedEvent: src_path=/path/to/smth/baz, dest_path=/path/to/smth/foo/quz>
<FileCreatedEvent: src_path=/path/to/smth/hw>
<FileDeletedEvent: src_path=/path/to/smth/foo/quz>
<DirDeletedEvent: src_path=/path/to/smth/foo/foz>
<DirDeletedEvent: src_path=/path/to/smth/foo>
<FileDeletedEvent: src_path=/path/to/smth/hw>
The methods of our Handler class receive, as the event argument, subclasses of FileSystemEvent, which are listed in watchdog/events.py.
Each of them has the properties src_path, is_directory and event_type ("created", "deleted", etc.). The moved event additionally has a dest_path property.
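As a small illustration of these properties, here is a sketch of a formatter that relies on nothing but the attribute names listed above. The describe function is my own invention, and the SimpleNamespace objects are hypothetical stand-ins for real event objects, so the snippet runs without watchdog installed.

```python
from types import SimpleNamespace


def describe(event):
    """Build a one-line description from the event's attributes."""
    kind = "directory" if event.is_directory else "file"
    text = "%s %s: %s" % (kind, event.event_type, event.src_path)
    # Only "moved" events carry a destination path.
    dest = getattr(event, "dest_path", None)
    if dest:
        text += " -> %s" % dest
    return text


# Hypothetical stand-ins with the same attribute names as the real events.
created = SimpleNamespace(event_type="created", is_directory=True,
                          src_path="/path/to/smth/foo")
moved = SimpleNamespace(event_type="moved", is_directory=False,
                        src_path="/path/to/smth/bar",
                        dest_path="/path/to/smth/baz")

print(describe(created))  # directory created: /path/to/smth/foo
print(describe(moved))    # file moved: /path/to/smth/bar -> /path/to/smth/baz
```

In a real handler the same function could be called from on_created, on_moved and so on instead of printing the raw event.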
And if you want something more... Is there anything else?
For dessert, we have the subclasses of FileSystemEventHandler:
PatternMatchingEventHandler can be used to receive events only for those file system nodes whose names match a pattern built with the following rules:
- * any sequence of characters
- ? any single character
- [seq] any single character in seq
- [!seq] any single character NOT in seq
The patterns are specified when the handler is created:
from watchdog.events import PatternMatchingEventHandler

class Handler(PatternMatchingEventHandler):
    pass

event_handler = Handler(
    patterns=['*.py*'],
    ignore_patterns=['cache/*'],
    ignore_directories=True,
    case_sensitive=False
)
observer = Observer()
observer.schedule(event_handler, path='/home/LOGS/', recursive=True)
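The wildcard rules listed above are the same ones that Python's standard fnmatch module implements, so they are easy to experiment with. The accepts function below is a hypothetical helper of mine that sketches the "match a pattern, but not an ignore pattern" logic; it is an illustration and I do not claim it is how the library filters paths internally.

```python
from fnmatch import fnmatchcase


def accepts(path, patterns, ignore_patterns=(), case_sensitive=True):
    """Return True if path matches some pattern and no ignore pattern."""
    if not case_sensitive:
        path = path.lower()
        patterns = [p.lower() for p in patterns]
        ignore_patterns = [p.lower() for p in ignore_patterns]
    if any(fnmatchcase(path, p) for p in ignore_patterns):
        return False
    return any(fnmatchcase(path, p) for p in patterns)


print(accepts("/home/LOGS/run.pyc", ["*.py*"]))                      # True
print(accepts("/home/LOGS/RUN.PY", ["*.py*"]))                       # False
print(accepts("/home/LOGS/RUN.PY", ["*.py*"], case_sensitive=False))  # True
print(accepts("/home/LOGS/cache/run.py", ["*.py*"], ["*/cache/*"]))   # False
```

Keep in mind that with fnmatch a pattern has to match the whole string, and * also matches path separators, which is why the ignore pattern in this sketch is written as */cache/*.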
RegexMatchingEventHandler does the same, but takes explicit regular expressions in the constructor:
from watchdog.events import RegexMatchingEventHandler

class Handler(RegexMatchingEventHandler):
    pass

event_handler = Handler(
    regexes=[r'\.py.?'],
    ignore_regexes=[r'cache/.*'],
    ignore_directories=True,
    case_sensitive=False
)
Internally, PatternMatchingEventHandler ends up translating its patterns into regular expressions, so it should be somewhat slower because of that overhead.
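The general idea of turning a wildcard pattern into a regular expression can be shown with the standard library's fnmatch.translate. Whether watchdog goes through exactly this function is an assumption on my part; the sketch only shows what such a translation produces.

```python
import fnmatch
import re

pattern = "*.py*"
regex_source = fnmatch.translate(pattern)  # wildcard pattern -> regex source
compiled = re.compile(regex_source)

print(regex_source)  # the exact text differs between Python versions
print(bool(compiled.match("/home/LOGS/run.py")))   # True
print(bool(compiled.match("/home/LOGS/run.txt")))  # False
```

If the translated expression is compiled once and reused, the extra cost is paid up front rather than on every event.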
Finally, LoggingEventHandler writes every event to the log via logging.info().
That's all. Maybe someone will find it useful.
PS
When watching a directory that contains (directly or in its subdirectories) folders or files with non-ASCII names, an exceptions.UnicodeEncodeError is raised deep inside watchdog. On Linux (inotify) it is raised in watchdog.observers.inotify.Inotify._add_watch.
The cause is that the path is converted using the ASCII encoding.
To fix this, you can patch the method:
from watchdog.observers.inotify import Inotify

_save = Inotify._add_watch
Inotify._add_watch = lambda self, path, mask: _save(self, path.encode('utf-8'), mask)
Here is an example of the original string, followed by its repr() before and after encode():
/home/atercattus/.wine/drive_c/users/Public/Рабочий стол
u'/home/atercattus/.wine/drive_c/users/Public/\u0420\u0430\u0431\u043e\u0447\u0438\u0439 \u0441\u0442\u043e\u043b'
'/home/atercattus/.wine/drive_c/users/Public/\xd0\xa0\xd0\xb0\xd0\xb1\xd0\xbe\xd1\x87\xd0\xb8\xd0\xb9 \xd1\x81\xd1\x82\xd0\xbe\xd0\xbb'
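Both the failure and the effect of the fix can be reproduced without watchdog. The sketch below uses the same path as above, written with escapes so that the source file itself stays ASCII; the variable names are mine.

```python
path = (u"/home/atercattus/.wine/drive_c/users/Public/"
        u"\u0420\u0430\u0431\u043e\u0447\u0438\u0439 \u0441\u0442\u043e\u043b")

# Encoding with ASCII fails as soon as a non-ASCII character is met.
try:
    path.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with UTF-8 succeeds and yields the byte string shown above.
utf8_bytes = path.encode("utf-8")

print(ascii_ok)          # False
print(repr(utf8_bytes))
```

Decoding utf8_bytes with UTF-8 gives back the original path, so nothing is lost in the conversion.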