rawdog: an RSS aggregator without excessive demands

Lyrical introduction


With a new resource recently budding off from Habrahabr, I needed a convenient way to read both sites. The first thought, of course, was RSS, since the engine behind both sites supports it. Only a trifle remained: finding a good RSS aggregator that could be installed on a weak VPS (the fate of Google Reader had somewhat dampened my desire to rely on third-party services).

At first, a tip from Tsyganov_Ivan led me to the Tiny Tiny RSS aggregator, which looked like a real “silver bullet”. However, a closer look at its system requirements cooled my ardor somewhat: pile a full-fledged LAMP stack onto a tiny machine with at best 256 MB of free memory, and all that for a resource serving literally one person? On top of that, the FAQ linked to frankly mocking answers on the project's forum, which finally killed any desire to deal with tt-rss.

The first round of the search ended in failure, because the alternatives (such as FeedHQ) required roughly the same thing. In desperation I was about to write the tool myself and had started looking for suitable Python libraries (I have a weakness for Python) when I came across almost exactly what I needed.

The name RAWDOG (RSS Aggregator Without Delusions Of Grandeur) itself hints that the author was overcome by similar feelings when writing it. The utility is meant to be run by hand or from cron and does exactly one thing: it parses the configured RSS feeds and writes new items to an output file using the given template.

Installation and Setup


Since rawdog is in the Ubuntu repository, getting the package is easy. The configuration, however, has its quirks.
First, you have to add the rawdog call to the crontab or to cron.* yourself. It will look something like this:

  rawdog --dir WORKDIR --log /var/log/rawdog/rawdog.log --no-lock-wait --update --write 

where the --no-lock-wait option prevents a second copy of rawdog from running at the same time, and WORKDIR is the utility's working directory.
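If the call goes into one of the cron.* directories, it can be wrapped in a small script. The following is only a sketch: the function name update_feeds and the suggested file name /etc/cron.hourly/rawdog are my assumptions, while the rawdog options are exactly the ones from the command above.

```shell
#!/bin/sh
# Sketch of a cron wrapper; the install path /etc/cron.hourly/rawdog is an assumption.

# update_feeds WORKDIR LOGFILE
# Runs one rawdog update/write cycle (update_feeds is a hypothetical helper name).
update_feeds() {
    workdir=$1
    logfile=$2

    # Stay silent if rawdog is not installed, so cron does not send mail on every run.
    command -v rawdog >/dev/null 2>&1 || return 0

    # Make sure the log directory exists before rawdog tries to write there.
    mkdir -p "$(dirname "$logfile")" || return 1

    rawdog --dir "$workdir" --log "$logfile" --no-lock-wait --update --write
}

# In the actual cron script the function would be called like this:
# update_feeds /var/cache/rawdog /var/log/rawdog/rawdog.log
```

Keeping the paths as arguments makes it possible to try the script against a scratch directory before pointing it at the real one.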

The point is that rawdog looks for its configuration file and keeps all its temporary files in a single working directory, by default ~/.rawdog. That may be convenient on a workstation, but it goes against usual practice. If, like me, you like order and uniformity, you can specify a different working directory with the --dir option; I used it to move the working directory to /var/cache/rawdog (since its main content, apparently, is the cache of downloaded feeds). Because the configuration file is also looked up there (the --config option lets you specify an additional config but does not cancel the search for the main one), I replaced it with a symbolic link and moved the file itself, along with the templates, to /etc.
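The relocation described above can be sketched roughly as follows. The helper name setup_rawdog_dirs and the /etc/rawdog location are my assumptions (the text only says "in /etc"); the file name config inside the working directory is what rawdog looks for.

```shell
#!/bin/sh
# Sketch: keep rawdog's cache in one place and the real config file in another.

# setup_rawdog_dirs WORKDIR CONFDIR   (hypothetical helper)
setup_rawdog_dirs() {
    workdir=$1   # e.g. /var/cache/rawdog
    confdir=$2   # e.g. /etc/rawdog (assumed location)

    mkdir -p "$workdir" "$confdir" || return 1

    # If a regular config file already sits in the working directory, move it once.
    if [ -f "$workdir/config" ] && [ ! -L "$workdir/config" ]; then
        mv "$workdir/config" "$confdir/config" || return 1
    fi

    # rawdog always reads "config" from the directory given by --dir,
    # so a symbolic link is left behind in its place.
    ln -s -f "$confdir/config" "$workdir/config"
}

# Example call (needs root for these paths):
# setup_rawdog_dirs /var/cache/rawdog /etc/rawdog
```

Running it a second time is harmless: the existing link is detected and simply recreated.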

A well-documented sample configuration file can be found on the web, so I will only briefly list the main directives:
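As a rough illustration of what a minimal configuration might contain, here is a sketch that writes a starter file. The selection of directives, all values, the output path, and the example.org URLs are my assumptions and placeholders, not the author's actual settings; write_sample_config is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: write a minimal starter config for rawdog (all values are placeholders).

# write_sample_config DEST   (hypothetical helper)
write_sample_config() {
    cat > "$1" <<'EOF'
# Keep at most this many articles on the generated page.
maxarticles 200
# Give up on a feed that does not answer within this time.
timeout 30s
# Where the generated HTML page is written (placeholder path).
outputfile /var/www/rawdog/index.html
# One "feed" line per subscription: polling period, then URL.
# Indented lines carry per-feed arguments such as the feed id.
feed 1h http://example.org/first.rss
    id=first
feed 3h http://example.org/second.rss
    id=second
EOF
}

# Example call:
# write_sample_config /etc/rawdog/config
```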


Configuring logrotate


Since rawdog is usually run several times a day and writes about a kilobyte of log each time, it makes sense either to disable logging entirely (by dropping the --log option) or to configure logrotate. For the latter, it is enough to put a file with roughly the following content into /etc/logrotate.d/ (assuming you chose the same log path as I did):

  /var/log/rawdog/rawdog.log {
      weekly
      missingok
      rotate 5
      compress
      delaycompress
      notifempty
  }


Adding some polish


The built-in rawdog template is minimalist, to put it mildly, so it makes sense to supply your own template files. The most important one is pagetemplate, since that is where you can define styles and include any scripts you need. To see the default page template, use the following command (be sure to add --dir WORKDIR if, like me, you moved the working directory):

  rawdog -s pagetemplate > template.html


Any built-in template can be viewed with a similar command by replacing pagetemplate with the template's name. Templating is implemented as simple search-and-replace, although there is a conditional operator that lets you insert a placeholder when a value is missing. Incidentally, you can define your own variables with the define VARNAME VALUE directive (globally) or the define_VARNAME=VALUE parameter (for an individual RSS feed).

Note that by default each entry is marked with the CSS class feed-FEEDID, where FEEDID is the feed id given in the feed's parameters. This lets you style entries from different sources differently (for example, show the site's icon next to the title).

Grouping feeds into separate collections


Offhand, I can think of one relatively easy way to maintain several coexisting feed collections, each with its own set of subscriptions, output file, and design.

To do this, place a script along the following lines in cron.* instead of the call above:

  #!/bin/sh
  WORKDIRS=/var/cache/rawdog
  CONFIGS=/etc/rawdog
  PLUGINS=/usr/share/rawdog/plugins
  LOGS=/var/log/rawdog
  for CFG in "$CONFIGS/"*.conf
  do
      WORKDIR="$WORKDIRS/$(basename "$CFG" .conf)"
      [ -d "$WORKDIR" ] || mkdir -p "$WORKDIR"
      [ -f "$WORKDIR/config" ] || ln -s -f "$CFG" "$WORKDIR/config"
      if [ -d "$PLUGINS" ]; then
          [ -d "$WORKDIR/plugins" ] || ln -s -f "$PLUGINS" "$WORKDIR/plugins"
      fi
      rawdog --dir "$WORKDIR" --log "$LOGS/rawdog" --no-lock-wait --update --write
  done

The principle is simple: for each *.conf file in /etc/rawdog, a corresponding working subdirectory is created in /var/cache/rawdog if needed, and a link to the configuration file itself is placed in it. A link to the directory with shared plugins is placed there as well (if absent).
For extra convenience, you can move the common settings into a separate file (/etc/rawdog/config or /etc/default/rawdog) and pull it into the *.conf files with the include directive.
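
A per-collection file built around include might be generated like this. It is only a sketch: the helper name new_collection, the collection name, the feed URL, and the output path are placeholders of mine, and the shared file is assumed to be the config file in the same directory.

```shell
#!/bin/sh
# Sketch: create NAME.conf that includes the shared settings and adds its own feed.

# new_collection CONFDIR NAME FEED_URL OUTFILE   (hypothetical helper)
new_collection() {
    confdir=$1    # e.g. /etc/rawdog
    name=$2       # collection name, becomes NAME.conf
    feed_url=$3   # first subscription of the collection
    outfile=$4    # HTML page generated for this collection

    cat > "$confdir/$name.conf" <<EOF
# Shared settings (limits, formats, templates) live in one common file.
include $confdir/config

# Collection-specific part: where to write and what to fetch.
outputfile $outfile
feed 1h $feed_url
EOF
}

# Example call:
# new_collection /etc/rawdog habr http://example.org/rss /var/www/rawdog/habr.html
```

Because the shared file is named config rather than something ending in .conf, the loop over *.conf files does not mistake it for a collection of its own.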

Plugin extensions


rawdog loads Python scripts located in the plugins subdirectory of its working directory. A number of ready-made plugins (in particular, multi-page output and output in RSS format) can be found on the author's website.

Source: https://habr.com/ru/post/240545/

