
Writing a Linux script that reads the latest Habr articles into an mp3 file

Many of us would like to save a little more time. One technology that helps is TTS (text to speech), where the computer reads text aloud. You will agree that it would be nice to listen to the latest articles from Habr while cooking, cleaning the apartment, getting dressed or tying your shoelaces, i.e. at those moments when your eyes and hands are busy but your brain and ears are almost free.
I recently wrote a script that converts all the latest articles from Habr to mp3 so that you can listen to them. With small changes it can fetch articles from other sites, or read them aloud on your computer right away instead of writing mp3 files. The script also shows a fairly simple way to work with RSS from the Linux console.

I should warn you from the start that, since TTS on Linux is difficult, we will run a Windows version through Wine.
For the script you need to install wine and then the speech API that TTS depends on; the file should be called something like Spchapi.exe (you do have a license to use Windows and its components, right? ^_^). After that, find a voice engine and install it too; for example, you can try the trial version of Digalo.
To check that everything is set up correctly, run some reader program. If you go to the Govorilka website for one, do not forget to download Govorilka CP as well; it is what the script will launch.
So, you have installed wine, Spchapi.exe and a voice engine, and downloaded Govorilka CP.
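Before going further it may help to confirm that the command-line tools are actually on the PATH. The following is only a minimal sketch: `missing_tools` is a hypothetical helper name, and it cannot check the Windows-side pieces (Spchapi.exe, the voice engine, Govorilka CP), only commands visible to the shell.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools wget wine lame html2text iconv comm` prints nothing when everything the script calls is installed.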
Now we can move on to the script itself.
First, create a working directory. I will use ~/rss2mp3/, but you are free to choose your own.
To get started, enter the directory:
$ cd ~/rss2mp3/
As we know, Habr's RSS feed is located at habrahabr.ru/rss/main
That is where we will download it from:
$ wget habrahabr.ru/rss/main -O rssindex.tmp

Create the file in which the already-read links will be stored:
$ touch rssold.tmp

Extract all post links from the RSS feed and sort them (comm, used in the next steps, expects sorted input):
$ cat rssindex.tmp | grep '<link>' | sed 's/<link>//g;s/<\/link>//g;s/ *//g' | sort -u > rss.tmp
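The extraction step can be wrapped in a small function so it is easy to try on a saved feed. This is a sketch, not a real RSS parser: `extract_links` is a hypothetical name, and it assumes every `<link>` element sits on its own line, as the grep-based approach requires.

```shell
# Hypothetical helper: read an RSS document on stdin and print each
# <link> URL on its own line, sorted and de-duplicated.
# Assumes one <link>...</link> element per input line.
extract_links() {
    grep '<link>' \
        | sed 's/<link>//g; s/<\/link>//g; s/[[:space:]]//g' \
        | sort -u
}
```

It would be used as `extract_links < rssindex.tmp > rss.tmp`. Stripping `[[:space:]]` rather than only spaces also removes any carriage returns left over from the feed.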

Keep only the links that are not yet in rssold.tmp:
$ comm -13 rssold.tmp rss.tmp | sed 's/\t*//;s/ *//' > rsslinks.tmp

That is, rsslinks.tmp now contains only the new links.
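Since comm only gives correct results on sorted input, a defensive variant can sort both lists itself. The sketch below uses a hypothetical `new_links` helper and assumes `mktemp` is available.

```shell
# Hypothetical helper: print the lines that occur in file $2 but not in
# file $1. Both files are sorted into temporary copies first, because
# comm expects sorted input.
new_links() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -1 hides lines unique to the old list, -3 hides lines common to both
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Usage would be `new_links rssold.tmp rss.tmp > rsslinks.tmp`. With columns 1 and 3 suppressed, comm prints the remaining column without leading tabs, so no extra sed pass is needed here.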

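Update the list of already-read links (the result goes to a temporary file first, because redirecting straight into rssold.tmp would truncate it while comm is still reading it):
$ comm rssold.tmp rss.tmp | sed 's/\t*//;s/ *//' > rssold.new && mv rssold.new rssold.tmp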
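The same update can be done without comm. The sketch below swaps in `sort -u` to merge the two lists, which is a different technique from the one in the article; `update_seen` is a hypothetical name.

```shell
# Hypothetical helper: merge the links in file $2 into the list in file $1.
update_seen() {
    tmp=$(mktemp)
    # sort -u merges both lists and drops duplicates. Writing to a
    # temporary file first avoids truncating "$1" while it is being read.
    sort -u "$1" "$2" > "$tmp" && mv "$tmp" "$1"
}
```

Called as `update_seen rssold.tmp rss.tmp`, it leaves rssold.tmp sorted, which is also what comm needs on the next run.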

Loop over every link in rsslinks.tmp:
$ for a in $( cat rsslinks.tmp | tr "\r\n" " ");
do


Download the article:
$ wget "$a" -O rsshtm.tmp;

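Extract the article text from the html:
$ cat rsshtm.tmp | sed -n -e '//, // p' | html2text -nobs > rsstext.tmp
Here sed prints only the lines between the two patterns, i.e. the part of the page that holds the article, and html2text converts that fragment to plain text.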

Now run Govorilka CP through wine. On my machine it is at /home/psysonic/gvrlcp.exe, so substitute your own path!
Read the text into a wav file:
$ wine /home/psysonic/gvrlcp.exe -s70 -f rsstext.tmp -TO rsstmp.wav


Convert the wav to mp3, using the first line of the text as the file name:
$ lame -V0 rsstmp.wav "$(head -1 rsstext.tmp | sed 's/\*//g' | iconv -f cp1251).mp3"
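Article titles can contain characters that are awkward in file names, most notably `/`. A possible way to clean the first line before using it as a name is sketched below; `safe_name` is a hypothetical helper, and it leaves the character-set conversion (the `iconv -f cp1251` step) to the caller.

```shell
# Hypothetical helper: read text on stdin and print its first line
# cleaned up for use as a file name.
safe_name() {
    head -n 1 \
        | tr -d '*\r' \
        | tr '/' '_' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}
```

It deletes asterisks and carriage returns, replaces slashes with underscores and trims surrounding whitespace, so the lame call could become `lame -V0 rsstmp.wav "$(safe_name < rsstext.tmp | iconv -f cp1251).mp3"`.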

End of the loop:
done
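The `for a in $(cat ...)` form relies on word splitting. An alternative that reads the file line by line is sketched here; `for_each_link` is a hypothetical helper that runs whatever command it is given once per link.

```shell
# Hypothetical helper: run the command given as arguments once for each
# link read from stdin, stripping carriage returns and skipping blank lines.
for_each_link() {
    while IFS= read -r link || [ -n "$link" ]; do
        link=$(printf '%s' "$link" | tr -d '\r')
        [ -n "$link" ] || continue
        "$@" "$link"
    done
}
```

For instance, `for_each_link echo < rsslinks.tmp` just prints the links. A command that reads standard input itself would consume the rest of the list, so such a command should be given `< /dev/null`.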

The whole script:
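cd ~/rss2mp3
# Download the feed and make sure the list of read links exists
wget habrahabr.ru/rss/main -O rssindex.tmp
touch rssold.tmp
# Extract the post links; comm needs sorted input
cat rssindex.tmp | grep '<link>' | sed 's/<link>//g;s/<\/link>//g;s/ *//g' | sort -u > rss.tmp
# Keep the new links only, then update the list of read links
comm -13 rssold.tmp rss.tmp | sed 's/\t*//;s/ *//' > rsslinks.tmp
comm rssold.tmp rss.tmp | sed 's/\t*//;s/ *//' > rssold.new && mv rssold.new rssold.tmp
for a in $( cat rsslinks.tmp | tr "\r\n" " ");
do
# Download the article, extract its text, read it to wav, encode to mp3
wget "$a" -O rsshtm.tmp;
cat rsshtm.tmp | sed -n -e '//, // p' | html2text -nobs > rsstext.tmp
wine /home/psysonic/gvrlcp.exe -s70 -f rsstext.tmp -TO rsstmp.wav
lame -V0 rsstmp.wav "$(head -1 rsstext.tmp | sed 's/\*//g' | iconv -f cp1251).mp3"
done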



Source: https://habr.com/ru/post/19816/

