
Bash RSS parser for LostFilm.TV: Transmission + SQLite + mkvtools

Good morning, Habr!

In response to the previous post about an RSS parser for LostFilm.TV, I want to share my own version, which has been running for about 4 months without any errors.
The idea is that the server downloads the episodes itself and sorts them into folders, giving each file a cover and a proper title.
The system is not one script but a whole set of scripts, divided into pre-processing and post-processing.
And, of course, for the aesthetes: the files of each series should look good and be arranged in folders.


Pre-processor


Parser algorithm

• Read the RSS feed
• Split it into individual items and parse each one
• Weed out old items
• Add new items to the database and to the download queue
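
The "weed out old items" step boils down to a watermark: remember the timestamp of the newest item already handled and let through only what is newer. Below is a minimal sketch of that idea, assuming items arrive as `timestamp|title|link` lines. The function name `filter_new`, the state file and the sample items are made up for illustration and are not part of my scripts.

```bash
#!/bin/bash
# Hypothetical helper: print only the items newer than the timestamp stored
# in a state file, and advance the stored timestamp as items pass through.
# Input on stdin: lines of "unix_timestamp|title|link".
# The body runs in a subshell, so its variables do not leak to the caller.
filter_new() (
    state_file=$1
    last=0
    # Use the stored watermark only if the state file exists and is not empty.
    [ -s "$state_file" ] && last=$(cat "$state_file")
    # Sort by timestamp so the watermark only ever moves forward.
    sort -t'|' -k1,1n | while IFS='|' read -r ts rest; do
        if [ "$ts" -gt "$last" ]; then
            printf '%s|%s\n' "$ts" "$rest"
            last=$ts
            printf '%s' "$ts" > "$state_file"
        fi
    done
)

# Demonstration with made-up items: the first run prints both items in
# chronological order, the second run prints nothing.
state=$(mktemp)
printf '%s\n' '200|B|http://example.invalid/b' '100|A|http://example.invalid/a' | filter_new "$state"
printf '%s\n' '200|B|http://example.invalid/b' '100|A|http://example.invalid/a' | filter_new "$state"
rm -f "$state"
```

One consequence of the strict "newer than" comparison: if two items carry exactly the same timestamp, only the first gets through.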
Parser code

    #!/bin/bash
    export PRFX="/var/lib/transmission-daemon"
    export SELF=$(basename "$(readlink -f "$0")")

    # Load the configuration
    . "$PRFX/.$SELF/config"

    # Load the helper functions
    . "$PRFX/.funcs/sqlite"
    . "$PRFX/.funcs/transmission"

    # Succeed (and remember the timestamp) if the item is newer than the last one seen
    check_last() {
        if [ ! -f "$LFW_RSS_LAST" ] || [ "$1" -gt "$(cat "$LFW_RSS_LAST")" ]; then
            printf '%s' "$1" > "$LFW_RSS_LAST"
            return 0
        else
            return 1
        fi
    }

    # Split an item title into pipe-separated fields
    parse_title() {
        echo "$1" | sed -r 's/^([^(]+)[. ]+\((.+)\)[. ]+([^(]+)[. ]+\((.+)\)[. ]*(\[720p\]){0,1}[. ]*\(S0*([0-9]+)E0*([0-9]+).*\)$/\1|\2|\3|\4|\5|\6|\7/'
    }

    logger -t "$SELF" -- '  RSS/Atom'

    # Read the RSS feed
    rsstail -1NHlp -n 30 -u "$LFW_RSS_URL" |
    # Convert from cp1251, drop blank lines, trim whitespace, join every three lines into one record
    iconv -f cp1251 | sed -r '/^\s*$/d; s/^\s+//; s/\s+$//' | sed -r '$!N; s/\n/|/; $!N; s/\n/|/' |
    # Put the date first as a UNIX timestamp and sort chronologically
    (IFS='|'; while read -r item_title item_link item_date; do
        printf '%s|%s|%s\n' "$(date -d "$item_date" +%s)" "$item_title" "$item_link"
    done) | sort |
    # Drop the items that have already been processed
    (IFS='|'; while read -r item_date item_title item_link; do
        if check_last "$item_date"; then
            printf '%s|%s|%s\n' "$item_date" "$(parse_title "$item_title")" "$item_link"
        fi
    done) |
    # Process the new items
    (IFS='|'; while read -r date name_ru name_en title_ru title_en hd season episode link; do
        # Look up the series by its English title
        id=$(printf 'SELECT id FROM series WHERE title_en = "%s";' "$name_en" | db_query "$LFW_DB")
        # Continue only if the series is in the database
        if [ -n "$id" ]; then
            # Download the torrent file to a temporary location
            tr_file="/tmp/lostfilm_$(uuidgen).torrent"
            if wget -nv -q --header "$LFW_WGET_AUTH" "$link" -O "$tr_file"; then
                # Add the episode to the database unless it is already there
                if [ -z "$(printf 'SELECT id FROM episodes WHERE series = %d AND season = %d AND episode = %d;' "$id" "$season" "$episode" | db_query "$LFW_DB")" ]; then
                    printf 'INSERT INTO episodes (series, season, episode, title_en, title_ru) VALUES (%d, %d, %d, "%s", "%s");' \
                        "$id" "$season" "$episode" "$title_en" "$title_ru" | db_query "$LFW_DB"
                fi
                # Record the file as HD or SD and queue it if that quality is tracked
                if [ -z "$hd" ]; then
                    file=$(transmission-show "$tr_file" | sed -r '/^Name:/!d; s/^Name:\s*(.+)\s*$/\1/')
                    printf 'INSERT INTO files (id, date, filename) VALUES ((SELECT id FROM episodes WHERE series = %d AND season = %d AND episode = %d), %d, "%s");' \
                        "$id" "$season" "$episode" "$date" "$file" | db_query "$LFW_DB"
                    if [ "$(printf 'SELECT tracked FROM series WHERE id = %d;' "$id" | db_query "$LFW_DB")" -ne 0 ]; then
                        transmission --add "$tr_file" --start > /dev/null
                    fi
                    logger -t "$SELF" -- "$(printf '  SD  «%s» ( %s,  %s) — «%s»' "$name_ru" "$season" "$episode" "$title_ru")"
                else
                    file=$(transmission-show "$tr_file" | sed -r '/^Name:/!d; s/^Name:\s*(.+)\s*$/\1/')
                    printf 'INSERT INTO files_hd (id, date, filename) VALUES ((SELECT id FROM episodes WHERE series = %d AND season = %d AND episode = %d), %d, "%s");' \
                        "$id" "$season" "$episode" "$date" "$file" | db_query "$LFW_DB"
                    if [ "$(printf 'SELECT tracked_hd FROM series WHERE id = %d;' "$id" | db_query "$LFW_DB")" -ne 0 ]; then
                        transmission --add "$tr_file" --start > /dev/null
                    fi
                    logger -t "$SELF" -- "$(printf '  HD  «%s» ( %s,  %s) — «%s»' "$name_ru" "$season" "$episode" "$title_ru")"
                fi
            fi
            # Remove the temporary torrent file
            [ -f "$tr_file" ] && rm -f "$tr_file"
        fi
    done)

    logger -t "$SELF" -- ''
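
To show what `parse_title` produces, here is a small stand-alone run of the same expression. The sample title is made up to fit the pattern and was not copied from the real feed. `sed -E` is used instead of `-r` only because it is the more portable spelling.

```bash
#!/bin/bash
# Same expression as in the parser above, written with the portable -E flag.
parse_title() {
    printf '%s\n' "$1" | sed -E 's/^([^(]+)[. ]+\((.+)\)[. ]+([^(]+)[. ]+\((.+)\)[. ]*(\[720p\]){0,1}[. ]*\(S0*([0-9]+)E0*([0-9]+).*\)$/\1|\2|\3|\4|\5|\6|\7/'
}

# Hypothetical feed title, invented to match the expression.
sample='Evrika (Eureka). Odin den (One Day). [720p] (S04E12)'

# Split the result into named fields; an empty "hd" field means an SD release.
IFS='|' read -r name_ru name_en title_ru title_en hd season episode <<EOF
$(parse_title "$sample")
EOF

printf 'series=%s season=%s episode=%s hd=%s\n' "$name_en" "$season" "$episode" "${hd:-no}"
```

For this sample the last line prints `series=Eureka season=4 episode=12 hd=[720p]`. Without the `[720p]` marker the fifth field comes out empty, which is how the parser tells SD from HD. A title that does not match the pattern passes through `sed` unchanged.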


Post-processor


Next, once a file has been downloaded, it needs to be made presentable (my wife prefers to see every series in its own folder named after the series, with file names that contain the season/episode number and the episode title, plus a nice cover icon for the series).
When a download finishes, a chain of scripts is launched to process various kinds of files; one of them handles lostfilm.tv.

Post-processing algorithm

• Check that the file name received from the daemon is in the database; otherwise exit
• Extract the cover from the database
• Apply mkvtools (filter tracks, insert the cover, write the title)
• Move the file to the folder named after the series (creating it if needed)
• Notify subscribers by email and SMS (SMS to my phone, email to my wife)
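
The mkvtools step is hidden inside a helper that is not listed in this post. Below is a rough sketch of what such a helper could look like with `mkvmerge`, covering only the title and the cover. The function name `to_mkv_sketch`, the `DRY_RUN` switch and the file names are assumptions for illustration. Track filtering is left out because it depends on which tracks you want to keep.

```bash
#!/bin/bash
# Hypothetical stand-in for the to_mkv helper: remux INPUT into OUTPUT,
# set the container title and attach a JPEG cover.
# Usage: to_mkv_sketch INPUT OUTPUT TITLE POSTER
# With DRY_RUN set, the command is printed (one argument per line) instead
# of being executed, so the sketch can be tried without mkvmerge installed.
to_mkv_sketch() {
    input=$1 output=$2 title=$3 poster=$4
    # Build the command in the positional parameters so that arguments
    # containing spaces stay intact. The two --attachment-* options apply
    # to the --attach-file that follows them.
    set -- mkvmerge -o "$output" \
        --title "$title" \
        --attachment-mime-type image/jpeg \
        --attachment-name cover.jpg \
        --attach-file "$poster" \
        "$input"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$@"
    else
        "$@"
    fi
}

# Dry run with made-up file names.
DRY_RUN=1 to_mkv_sketch episode.avi episode.mkv 'Eureka 4.12' poster.jpg
```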

Post-processing script

    #!/bin/bash
    export SELF="lostfilm-rss"

    # Load the configuration (PRFX is not set here, so it is presumably
    # exported by the script that starts the chain)
    . "$PRFX/.$SELF/config"

    # Load the helper functions
    . "$PRFX/.funcs/sqlite"
    . "$PRFX/.funcs/transmission"
    . "$PRFX/.funcs/mkv_tools"
    . "$PRFX/.funcs/mail_notify"
    . "$PRFX/.funcs/utils"
    . "$PRFX/.funcs/sms_notify"

    MAIL_LIST_HD="me@a***n.ru"

    # Look up the episode by the file name reported by Transmission
    data="$(printf 'SELECT s.title_ru, e.season, e.episode, e.title_ru, s.id FROM episodes e, series s, files_hd f WHERE e.series = s.id AND e.id = f.id AND f.filename = "%s";' "$TR_TORRENT_NAME" | db_query "$LFW_DB")"

    # Continue only if the file is known to the database
    if [ -n "$data" ]; then
        # Split the row into fields
        name=$(echo "$data" | cut -d'|' -f1)
        s=$(echo "$data" | cut -d'|' -f2)
        e=$(echo "$data" | cut -d'|' -f3)
        part=$(echo "$data" | cut -d'|' -f4)
        id=$(echo "$data" | cut -d'|' -f5)

        # Target file name
        mkv_file=$(printf '/mnt/videos/Series.HD/%s [%s.%s] — %s.mkv' "$name" "$s" "$e" "$part")
        # Title written into the MKV container
        mkv_title=$(printf '«%s» •  %s,  %s • «%s»' "$name" "$s" "$e" "$part")
        # Extract the most recent cover from the database
        mkv_poster="/tmp/mkv_poster_$(uuidgen).jpg"
        printf 'SELECT data FROM posters WHERE series = %d ORDER BY date DESC LIMIT 1;' "$id" | db_query "$LFW_DB" | base64 -d - > "$mkv_poster"

        logger -t "$SELF" -- "$(printf ' HD  «%s» ( %s,  %s) — «%s»' "$name" "$s" "$e" "$part")"

        # Convert to MKV with the title and the cover
        if to_mkv "/mnt/torrent/$TR_TORRENT_NAME" "$mkv_file" "$mkv_title" "$mkv_poster"; then
            logger -t "$SELF" -- "$(printf ' HD  «%s» ( %s,  %s) — «%s»' "$name" "$s" "$e" "$part")"
            # Remove the torrent from Transmission together with the downloaded data
            transmission -t "$TR_TORRENT_ID" --remove-and-delete > /dev/null
            logger -t "$SELF" -- "$(printf '   HD  «%s» ( %s,  %s) — «%s»' "$name" "$s" "$e" "$part")"
            # Notify by email
            mail_notify "$MAIL_LIST_HD" "$(printf '   «%s» ( %s,  %s) — «%s»' "$name" "$s" "$e" "$part")" \
                "$(printf '<h3>«%s» ( %s,  %s) — «%s»</h3><br/><i>: HD</i>' "$name" "$s" "$e" "$part")"
            # Notify by SMS
            sms_notify a***n "$(printf '  «%s» ( %s,  %s) — «%s» \n[%s]' "$name" "$s" "$e" "$part" "$(disk_info)")"
        else
            logger -t "$SELF" -- "$(printf '    HD  «%s» ( %s,  %s) — «%s»' "$name" "$s" "$e" "$part")"
        fi

        rm -f "$mkv_poster"
    fi
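
The five `cut` calls that split the row can also be replaced by a single `read`. This is a sketch, assuming the query returns one row with pipe-separated columns, which is what the `cut -d'|'` calls imply. The function name `split_row` and the sample row are made up.

```bash
#!/bin/bash
# Split one pipe-separated result row into named fields with a single read.
# Column order follows the SELECT in the script above:
# series title, season, episode, episode title, series id.
split_row() {
    IFS='|' read -r name s e part id <<EOF
$1
EOF
}

# Hypothetical row for demonstration.
split_row 'Eureka|4|12|One Day|7'
printf '%s [%s.%s] %s (series id %s)\n' "$name" "$s" "$e" "$part" "$id"
```

This prints `Eureka [4.12] One Day (series id 7)`. Like the `cut` version, it breaks if a title itself contains a `|`.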


Database


As the scripts show, everything goes through a database; in my setup it is SQLite3.

Database structure

    CREATE TABLE episodes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        series INTEGER REFERENCES series(id) ON UPDATE CASCADE ON DELETE CASCADE,
        season INTEGER NOT NULL,
        episode INTEGER NOT NULL,
        title_en TEXT NOT NULL,
        title_ru TEXT NOT NULL,
        UNIQUE(series, season, episode)
    );

    CREATE TABLE files (
        id INTEGER PRIMARY KEY REFERENCES episodes(id) ON UPDATE CASCADE ON DELETE CASCADE,
        date INTEGER NOT NULL,
        filename TEXT NOT NULL
    );

    CREATE TABLE files_hd (
        id INTEGER PRIMARY KEY REFERENCES episodes(id) ON UPDATE CASCADE ON DELETE CASCADE,
        date INTEGER NOT NULL,
        filename TEXT NOT NULL
    );

    CREATE TABLE posters (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        series INTEGER REFERENCES series(id) ON UPDATE CASCADE ON DELETE CASCADE,
        date INTEGER NOT NULL,
        data TEXT NOT NULL
    );

    CREATE TABLE series (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title_en TEXT NOT NULL,
        title_ru TEXT NOT NULL,
        tracked INTEGER DEFAULT 0,
        tracked_hd INTEGER DEFAULT 1
    );
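
The scripts above paste titles straight into SQL text with `printf`, so a title that contains a quote character can break the statement. One way to guard against that is to build proper single-quoted SQL literals. This is a sketch only; the helper name `sql_quote` is hypothetical and is not among my functions.

```bash
#!/bin/bash
# Hypothetical helper: turn a string into a single-quoted SQL literal,
# doubling any embedded single quotes as SQL requires.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Example: a title with an apostrophe stays one valid literal.
printf 'SELECT id FROM series WHERE title_en = %s;\n' "$(sql_quote "Grey's Anatomy")"
```

This prints `SELECT id FROM series WHERE title_en = 'Grey''s Anatomy';`.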


Covers are stored in the database as base64. Obviously this bloats the database, but I still prefer to keep everything about a series in one place.
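
The base64 round trip itself is simple. Here is a minimal sketch, assuming the cover is stored as a single line of text in `posters.data`. The function names and the stand-in file are made up for illustration.

```bash
#!/bin/bash
# Encode a cover image as one line of base64 text, suitable for the
# posters.data column.
encode_poster() {
    base64 < "$1" | tr -d '\n'
}

# Decode base64 text from stdin into an image file, the same way the
# post-processing script does with "base64 -d".
decode_poster() {
    base64 -d > "$1"
}

# Round trip with a stand-in file (any bytes will do for the demonstration).
src=$(mktemp)
dst=$(mktemp)
printf 'not really a JPEG' > "$src"
encode_poster "$src" | decode_poster "$dst"
cmp -s "$src" "$dst" && echo 'round trip OK'
rm -f "$src" "$dst"
```

Base64 text is roughly a third larger than the original bytes, which is where the extra weight in the database comes from.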

TODO and PS


What remains are things that are not that important to me and may get done someday.
1. Move the database to MySQL / PostgreSQL
2. Make a web interface to the database
3. Add new series to the database automatically (with the cover)
n. Future fixes to the RSS feed parser

The rest should be clear to the audience. There are comments in the code.

The script has been running for 4 months without failures or errors (although today Eureka turned up with season number 0, that was a glitch on the site itself). I am a serviceman and get home about once every 1-2 weeks, so everything runs automatically with only occasional supervision: I just drop in and copy the new episodes to a hard drive so there is something to watch at the unit.

Thanks for your attention!

Source: https://habr.com/ru/post/127888/

