Reading Aldebaran in Comfort

Surely some Habr readers read books on http://lib.aldebaran.ru. They know that for some time now the text on the site has been "protected" from copying. Most books are still available for download in convenient formats, but many popular ones can only be read online. How inconvenient.
The following simple Python script downloads an entire book, stripping out along the way all the junk inserted to make copying difficult.

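 # Python 2 script: it relies on urllib.urlopen and the print statement.
 import urllib, re, sys

 # URL of the getpage/1 request embedded in a book page, up to the closing quote
 p_key = re.compile(r'http://lib\.aldebaran\.ru/getpage/1/.*?"')
 # junk spans (class=h) mixed into the text
 p_span = re.compile(r'<span class=h>(.*?)</span>')
 # seams left by JavaScript string concatenation: ';s+='
 p_s = re.compile(r"';s\+='")
 # paragraphs of book text
 p_p = re.compile(r'(<p>.*?</p>)')
 # book URL up to and including the "__" that precedes the page number
 p_url_name = re.compile(r'^(http://lib\.aldebaran\.ru/author/.*?__)')

 def getpage(url):
     try:
         # the key is the 32 characters just before the closing quote
         key = p_key.findall(urllib.urlopen(url).read())[0][-33:-1]
     except Exception:
         # no key on the page or the request failed: treat it as the end of the book
         return None
     # getpage/1 is requested first and its response is discarded
     urllib.urlopen("http://lib.aldebaran.ru/getpage/1/" + key).read()
     page = urllib.urlopen("http://lib.aldebaran.ru/getpage/2/" + key).read()
     page = p_span.sub('', page)            # drop the junk spans
     page = ''.join(p_p.findall(page))      # keep only the <p>...</p> blocks
     page = p_s.sub('', page)               # glue the concatenated pieces together
     return page

 url = sys.argv[1]
 url = p_url_name.findall(url)[0]
 i = 1
 while True:
     page = getpage(url + str(i) + '.html')
     i += 1
     if page is not None:
         print page
     else:
         sys.exit(0)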
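
To make the key extraction and the cleanup steps easier to follow, here is a small offline sketch that applies the same regular expressions to made-up data. The sample HTML, the sample payload and the helper names `extract_key` and `clean_page` are assumptions for illustration only; the real pages may well look different. It uses nothing but `re`, so it needs no network access.

```python
import re

# The same patterns as in the script above.
P_KEY = re.compile(r'http://lib\.aldebaran\.ru/getpage/1/.*?"')
P_SPAN = re.compile(r'<span class=h>(.*?)</span>')
P_S = re.compile(r"';s\+='")
P_P = re.compile(r'(<p>.*?</p>)')


def extract_key(html):
    """Return the 32 characters that precede the closing quote of the getpage/1 URL."""
    return P_KEY.findall(html)[0][-33:-1]


def clean_page(payload):
    """Apply the three cleanup steps in the same order as getpage()."""
    payload = P_SPAN.sub('', payload)         # 1. remove the junk spans
    payload = ''.join(P_P.findall(payload))   # 2. keep only the <p>...</p> blocks
    return P_S.sub('', payload)               # 3. remove the ';s+=' seams


# Hypothetical inputs, invented for this example.
SAMPLE_HTML = ('<script src="http://lib.aldebaran.ru/getpage/1/'
               '0123456789abcdef0123456789abcdef"></script>')
SAMPLE_PAYLOAD = ("s+='<p>Call me <span class=h>xq7</span>Ish';"
                  "s+='mael.</p>';"
                  "s+='<p>Some years <span class=h>zz</span>ago.</p>';")

if __name__ == '__main__':
    print(extract_key(SAMPLE_HTML))
    print(clean_page(SAMPLE_PAYLOAD))
```

The order of the steps matters in this sketch: the paragraphs are collected before the `';s+='` seams are removed, so a paragraph split across two concatenated strings is still matched as one `<p>...</p>` block and only then glued together.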

The script takes a link to any page of the book as a command-line argument and prints the book's text to standard output.
Running it is very simple:
python lit.py <link to any page of the book>
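
Why any page of the book works as the argument can be shown with a short sketch: the script keeps the URL only up to the "__" before the page number and then counts pages from 1. The example URL and the helper `page_urls` are hypothetical, and the `count` parameter exists only for illustration, since the script itself keeps going until a page yields no key.

```python
import re

# The same pattern as p_url_name in the script.
P_URL_NAME = re.compile(r'^(http://lib\.aldebaran\.ru/author/.*?__)')


def page_urls(any_page_url, count):
    """Build the URLs of the first `count` pages from a link to any page of the book."""
    base = P_URL_NAME.findall(any_page_url)[0]
    return [base + str(i) + '.html' for i in range(1, count + 1)]


# Hypothetical link to page 7 of some book.
EXAMPLE = ('http://lib.aldebaran.ru/author/some_author/'
           'some_author_some_book/some_author_some_book__7.html')

if __name__ == '__main__':
    for page_url in page_urls(EXAMPLE, 3):
        print(page_url)
```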

P.S. Please, let's leave the ethics debate aside. Personally, I pay the authors of my favorite books directly.
UPD:
It turns out I'm not the only clever one. The comments suggested:
eBookDownloader, a complete application that supports FictionBook, Aldebaran and LitPortal (requires .NET),
as well as a small plug-in for Firefox.

Source: https://habr.com/ru/post/21082/
