Do-it-yourself Python Podcast-downloader

It just so happened that a slogan caught my eye: "Listen to podcasts the right way, by subscription." I am not especially eager to listen to other people's dia(mono)logues, but I very much wanted to dig deeper into how RSS and XML are handled in Python.
It is probably worth noting right away that the attached code is in the alpha stage, but it is already usable (I downloaded 17 episodes of habrakast with it), and thanks to Python's readable syntax the source is easy to follow without additional documentation. This article is aimed primarily at people learning Python, and only secondarily at those who need the program's functionality (fortunately, as far as I know, there is plenty of such software already).
Initially, the program was conceived as a daemon that downloads fresh podcasts before you want to listen to them. Accordingly, it did not need a GUI. All it had to do was work quietly, with a certain flexibility and ease of setup.
I pictured podcasts being downloaded over breakfast and listened to on the way to work, and I started programming.

Theory

Links to the mp3 files (podcasts) are stored in the subscription XML file, in an attribute of one of two tags, media:content or enclosure, for each item (episode). Extracting them in Python is easy:

<code>
item_node = file_xml.getElementsByTagName("item")
for item in item_node:
    title = self.get_tag_content(item, "title")
    description = self.get_tag_content(item, "itunes:summary")
    media_content = self.get_tag_content(item, "media:content", "url")
    enclosure = self.get_tag_content(item, "enclosure", "url")
# Here self.get_tag_content() is a small hand-written helper that does not raise exceptions on error.
</code>
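The helper itself is not shown in the article, so here is a minimal standalone sketch of what it might look like, assuming the feed is parsed with the standard xml.dom.minidom module and written for Python 3. The function signature, the list_episodes wrapper, and the sample feed with its example.com URLs are all illustrative assumptions, not the code from the attached archive.

```python
from xml.dom import minidom


def get_tag_content(node, tag_name, attribute=None):
    """Return the text (or one attribute) of the first matching tag, or None.

    Hypothetical stand-in for the article's self.get_tag_content().
    """
    elements = node.getElementsByTagName(tag_name)
    if not elements:
        return None  # tag is absent: report None instead of raising
    element = elements[0]
    if attribute is not None:
        # getAttribute() returns "" when the attribute is missing
        return element.getAttribute(attribute) or None
    # Concatenate the text and CDATA children of the element
    text = "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )
    return text.strip() or None


# Made-up feed used only to exercise the sketch
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp3" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <media:content url="http://example.com/ep2.mp3"/>
    </item>
  </channel>
</rss>"""


def list_episodes(xml_text):
    """Return (title, url) pairs, preferring enclosure over media:content."""
    document = minidom.parseString(xml_text)
    episodes = []
    for item in document.getElementsByTagName("item"):
        title = get_tag_content(item, "title")
        url = (get_tag_content(item, "enclosure", "url")
               or get_tag_content(item, "media:content", "url"))
        episodes.append((title, url))
    return episodes


episodes = list_episodes(SAMPLE_FEED)
```

Returning None for a missing tag or attribute keeps the calling loop free of try/except blocks, which seems to be the intent behind a helper that "does not raise exceptions".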

All that remains is to save the podcasts neatly to the hard disk. Influenced by FlashGet, I had hoped to implement downloading in several threads; however, as it turned out later, the podcast server does not fully support the HTTP Range header that this requires.
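For readers who want to try segmented downloading against a server that does support it, the sketch below shows the two pure-logic pieces involved: checking whether a response advertises byte-range support, and splitting a file of known size into Range header values. Both function names are hypothetical and not taken from dnld.py. Note also that the Accept-Ranges header is only a hint; the dependable test is whether a ranged request actually comes back with status 206 (Partial Content).

```python
def supports_ranges(headers):
    """Return True if the response headers advertise byte-range support.

    `headers` is any mapping with a .get() method, for example the headers
    of an HTTP response. A plain dict is case-sensitive, so this sketch
    assumes the key is spelled exactly "Accept-Ranges".
    """
    value = headers.get("Accept-Ranges", "")
    return value.strip().lower() == "bytes"


def split_ranges(total_size, parts):
    """Split a file of total_size bytes into at most `parts` Range values."""
    if total_size <= 0 or parts <= 0:
        return []
    chunk = -(-total_size // parts)  # ceiling division
    ranges = []
    for start in range(0, total_size, chunk):
        end = min(start + chunk, total_size) - 1  # Range ends are inclusive
        ranges.append("bytes=%d-%d" % (start, end))
    return ranges


# Each value would be sent as the "Range" header of one partial request
example_ranges = split_ranges(10, 3)
```

If supports_ranges() returns False, or a ranged request is answered with a plain 200, the safe fallback is a single ordinary download of the whole file.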

Practice

I divided the task into 4 parts:
1. Fetch and parse information from the RSS feed.
2. Sync podcasts with the server.
3. Download the required file.
4. Coordinate the previous three steps.
In the attached archive you will find their implementations in the files rss.py, keeper.py, dnld.py and main.pyw, respectively. Each class has a verbose property that controls whether debug information is printed to the console. The last file has the .pyw extension to emphasize the program's purpose, which is to do its work quietly; it is too early for that yet, though, so verbose is set to True everywhere. You will also find a go.bat file: it deletes the compiled files left by the previous run and keeps the console window open after the program finishes, so that you can see the output.
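The archive's contents are not reproduced here, so as an illustration of the syncing step, here is a rough sketch of how one could decide which episodes still need downloading. It assumes, purely for illustration, that each file is saved under the last path segment of its URL; the function names and the verbose parameter are hypothetical and only mirror the idea of the verbose property described above, not the actual keeper.py.

```python
import os
from urllib.parse import urlsplit


def local_name(url):
    """File name a podcast URL would be saved under (last path segment)."""
    return os.path.basename(urlsplit(url).path)


def missing_episodes(urls, directory, verbose=False):
    """Return the URLs whose files are not yet present in `directory`."""
    present = set(os.listdir(directory)) if os.path.isdir(directory) else set()
    missing = []
    for url in urls:
        name = local_name(url)
        if not name:
            continue  # URL has no file name to compare against
        if name in present:
            if verbose:
                print("already downloaded:", name)
        else:
            missing.append(url)
    return missing
```

A coordinator in the spirit of main.pyw would then pass the URLs extracted from the feed to missing_episodes() and hand each returned URL to the downloader.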
This is my first article on Habr, and I am not sure it fits the site's subject matter... I hope my three days of work will help someone on the way to learning this most beautiful cross-platform language. Download project

Source: https://habr.com/ru/post/28314/
