
Six weeks before Google Reader closes: saving everything we can


Google Reader appeared in 2005, and within a year or two it became my main source of information. Then, out of the blue: not profitable, not in line with strategy, we're shutting it down... As a result, Google first alienated an advanced (geeky) and loyal audience, and second, those same geeks immediately started writing or improving alternatives. Fragmentation intensified, the problem of choice appeared, and quite a few people were simply furious...

Over all this time I have accumulated about 30 subscriptions, which I read regularly and plan to keep reading. The official announcement on the Google blog recommends using Google Takeout to export your subscriptions and bookmarks to a file.

So I went and exported. I looked for alternatives (one, two, three, four), found some, and imported my data. Problems showed up immediately.
For the sake of completeness, and to rescue posts from dead blogs, I had to make the effort and write a tool that downloads full-text articles from Google Reader (everything that ever appeared in the RSS feed). There are articles I occasionally return to, but I don't have the habit of saving them to Instapaper / Scrapbook / Evernote. Besides, I often used services that build full-text RSS from bare-bones feeds (such as Hacker News), so my subscriptions are quite readable right in the reader.

To work with the Reader API there is documentation and a couple of Python modules (sorry, I didn't look at other languages). Of these, just take libgreader and ignore the rest. The result is the fetch-google-reader project on GitHub.
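For context, the Reader API addresses everything as "streams": a feed is identified as feed/&lt;url&gt;, and starred items live under the state stream user/-/state/com.google/starred (which is what a starred-only download requests). A minimal sketch of that addressing scheme, with a hypothetical helper name (this is not code from the tool itself):

```python
# stream_id is a hypothetical helper name; the stream ID formats themselves
# come from the (unofficial) Google Reader API documentation.
def stream_id(feed_url=None, starred=False):
    """Build a Reader stream ID for a feed or for the starred-items state."""
    if starred:
        return "user/-/state/com.google/starred"
    if feed_url is None:
        raise ValueError("need a feed URL when not requesting starred items")
    return "feed/" + feed_url

print(stream_id("http://news.ycombinator.com/rss"))
# feed/http://news.ycombinator.com/rss
```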

1. Install (preferably in a virtualenv; for Python < 2.7 the argparse module is also needed):

pip install git+git://github.com/max-arnold/fetch-google-reader.git
curl -s -k https://raw.github.com/max-arnold/fetch-google-reader/master/requirements.txt | xargs -n 1 pip install


2. Create a directory where the articles will be saved:

 mkdir rss-backup
 cd rss-backup


3. Get the list of RSS-subscriptions:

 fetch-greader.py -u YOUR-USERNAME@gmail.com -p YOUR-PASSWORD
 * Please specify feed number (-f, --feed) to fetch:
   [0] Atomized
   [1] Both Sides of the Table
   [2] Hacker News
   [3] Signal vs. Noise
   [4] : /


Select the desired and run the download:

 fetch-greader.py -u YOUR-USERNAME@gmail.com -p YOUR-PASSWORD -f 0
 * Output directory: atomized
 ---> atomized/2011-05-24-i-hate-google-everything/index.html
 ---> atomized/2011-01-19-toggle-between-root-non-root-in-emacs-with-tramp/index.html
 ---> atomized/2010-10-19-ipad/index.html
 ---> atomized/2010-09-01-im-not-going-back/index.html
 ---> atomized/2010-08-31-they-cant-go-back/index.html
 ---> atomized/2010-08-28-a-hudson-github-build-process-that-works/index.html
 ---> atomized/2010-08-18-frame-tiling-and-centering-in-emacs/index.html
 ---> atomized/2010-08-17-scratch-buffers-for-emacs/index.html
 ---> atomized/2010-07-01-reading-apress-pdf-ebooks-on-an-ipad/index.html


By default, all articles from the selected feed are downloaded, but you can limit it to starred items by adding the --starred flag. The --dir flag lets you choose the directory where the files will be saved.

The feed is saved into a directory named after the feed title (transliterated to Latin characters). Each article goes into its own directory, named after the article's date and title, so that additional metadata or images can be stored alongside it. At the moment images are not saved. The utility is aimed at blogs that are no longer online, but nothing prevents extending it.
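The naming scheme above can be sketched roughly like this (a simplified illustration with hypothetical helper names, not the tool's actual code; this slugify only handles ASCII, whereas the real utility also transliterates non-Latin titles):

```python
import re

def slugify(text):
    """Lowercase the text, keep alphanumeric runs, join them with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

def article_dir(feed_title, date, title):
    """Build '<feed-slug>/<YYYY-MM-DD>-<title-slug>', as described above."""
    return "%s/%s-%s" % (slugify(feed_title), date, slugify(title))

print(article_dir("Atomized", "2010-10-19", "iPad"))
# atomized/2010-10-19-ipad
```

This matches the paths in the sample output, e.g. atomized/2010-08-17-scratch-buffers-for-emacs/index.html.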

Question to the audience


Which of the existing alternatives does not limit the depth of feed history (i.e., stores everything that was ever fetched) and caches full-text content forever? And, just in case, which one has an API or export so that all of it can be pulled to your own computer when things go sideways again?

PS Picture © mashable.com

Source: https://habr.com/ru/post/180111/

