
Getting your favorite audio from pandora.com

For those who don't know, pandora.com is an Internet radio service that picks songs according to the listener's preferences. Recently a friend of mine wanted to download the tracks from their list of favorites, but Pandora itself does not offer this. So I had to dig into its guts...


The plan: get the list of song titles and artists from Pandora, then download the tracks using the VK (VKontakte) API.

Step 1. Go to Pandora and watch what happens when the page requests the list of favorite songs. We see the following request:
    Request URL: http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190
    Request Method: GET
    Status Code: 200 OK

    Request Headers
    Accept: */*
    Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.3
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: ru,en-US;q=0.8,en;q=0.6
    Cookie: at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858
    Host: www.pandora.com
    Proxy-Connection: keep-alive
    Referer: http://www.pandora.com/profile/likes/evgeny.vyalyy
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
    X-Requested-With: XMLHttpRequest

    Query String Parameters
    likeStartIndex: 0
    thumbStartIndex: 5
    webname: evgeny.vyalyy
    cachebuster: 1367100054190


Let's try to reproduce this request. We use the combination of Python's requests + BeautifulSoup:
    import requests
    import BeautifulSoup  # the BeautifulSoup 3 package

    # Replay the request seen in the browser, sending the same cookies.
    resp = requests.get(
        "http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190",
        headers={"Cookie": "at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858"})
    soup = BeautifulSoup.BeautifulSoup(resp.text)
    print(soup)


We get back a lot of not very informative HTML.

But our request carries a suspiciously large number of parameters. Let's try trimming it down a little:

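    import requests
    import BeautifulSoup  # the BeautifulSoup 3 package

    # Same request without the cachebuster parameter and with only the "at" cookie.
    resp = requests.get(
        "http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy",
        headers={"Cookie": "at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D;"})
    soup = BeautifulSoup.BeautifulSoup(resp.text)
    print(soup)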


Hooray, the response has not changed!
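Since only three query parameters and one cookie turned out to matter, the request can also be assembled from its parts instead of a hand-written URL. The following is a minimal Python 3 sketch using only the standard library; the helper names build_likes_url and build_headers are made up for illustration, and the meaning of likeStartIndex and thumbStartIndex is not documented here, so the defaults simply repeat the values observed above.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.pandora.com/content/tracklikes"


def build_likes_url(webname, like_start=0, thumb_start=5):
    """Assemble the trimmed-down tracklikes URL (no cachebuster)."""
    params = [
        ("likeStartIndex", like_start),
        ("thumbStartIndex", thumb_start),
        ("webname", webname),
    ]
    return BASE_URL + "?" + urlencode(params)


def build_headers(at_cookie):
    """Only the "at" cookie was needed in the experiment above."""
    return {"Cookie": "at=" + at_cookie}


url = build_likes_url("evgeny.vyalyy")
print(url)
```

The resulting URL and headers can be passed to requests.get exactly as in the snippets above.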
Now, after digging through the response, we find that all the information we need is stored in div elements with the infobox-body class. Here is what such a div looks like:

    <div class="infobox-body">
      <h3 class="s-0 line-h-1_4 normal">
        <a href="/lynyrd-skynyrd/live-from-freedom-hall/sweet-home-alabama-live-from-freedom-hall" class="first">Sweet Home Alabama (Live From Freedom Hall)</a>
      </h3>
      <p class="s-0 line-h-1_4">
        by <a href="/lynyrd-skynyrd">Lynyrd Skynyrd</a>
      </p>
      <p class="s-0 line-h-1_4">
        <span class="profile_user_name">You</span> liked this on
        <a href="#" data-viewer-is-owner="true" data-station-id="1380018751859442317" class="like_context_stationname">The Offspring Radio</a>.
      </p>
    </div>


So, now we can pull out all the information we are interested in:

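    import re

    # Captures the text between the first ">" and the next "<", i.e. the link text.
    PATT = re.compile(">(.*?)<")

    for x in soup.findAll(attrs={"class": "infobox-body"}):
        # x.a is the song link, x.p.a is the artist link.
        print([PATT.findall(str(x.a))[0], PATT.findall(str(x.p.a))[0]])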
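As an alternative to the regular expression, the same two values can be pulled out with the standard library's html.parser alone. This is only a sketch in Python 3, not the code from the script: the LikesParser class is a made-up name, and it assumes, as in the sample div above, that the first link inside each infobox-body is the song and the second is the artist.

```python
from html.parser import HTMLParser


class LikesParser(HTMLParser):
    """Collect (title, artist) pairs from div.infobox-body blocks."""

    def __init__(self):
        super().__init__()
        self.tracks = []       # list of (title, artist) tuples
        self._depth = 0        # <div> nesting inside the current infobox-body
        self._links = None     # link texts seen in the current infobox-body
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "infobox-body" in (attrs.get("class") or "").split():
                self._depth = 1
                self._links = []
        elif tag == "a" and self._depth:
            self._in_link = True
            self._links.append("")

    def handle_data(self, data):
        if self._in_link:
            self._links[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0 and len(self._links) >= 2:
                # Assumption: first link is the song, second is the artist.
                self.tracks.append((self._links[0].strip(), self._links[1].strip()))


SAMPLE = (
    '<div class="infobox-body"> <h3 class="s-0 line-h-1_4 normal"> '
    '<a href="/lynyrd-skynyrd/live-from-freedom-hall/sweet-home-alabama-live-from-freedom-hall" '
    'class="first">Sweet Home Alabama (Live From Freedom Hall)</a> </h3> '
    '<p class="s-0 line-h-1_4"> by <a href="/lynyrd-skynyrd">Lynyrd Skynyrd</a> </p> '
    '<p class="s-0 line-h-1_4"> <span class="profile_user_name">You</span> liked this on '
    '<a href="#" class="like_context_stationname">The Offspring Radio</a>. </p> </div>'
)

parser = LikesParser()
parser.feed(SAMPLE)
print(parser.tracks)
```

With a live response, resp.text would be fed to the parser in place of SAMPLE.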


Step one is done! =)

Step 2. Search for the tracks on vk.com and download them

Go to vk.com/editapp?act=create and create a new application. Now we need an access_token. To keep things simple, I decided to obtain the access_token manually and just paste it into the script body. So we open
oauth.vk.com/authorize?client_id=3608669&scope=audio&redirect_uri=https://oauth.vk.com/blank&display=wap&response_type=token

We are redirected to a new page:
oauth.vk.com/blank.html#access_token=***&expires_in=86400&user_id=17738938


We take the access_token we need from the URL fragment (the part after #). We will use it for requests to the VK API.
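Copying the token by hand works fine, but the fragment can also be parsed programmatically. A small Python 3 sketch with the standard library follows; token_from_redirect is a hypothetical helper name, and the token value in the example is a made-up placeholder, not a real one.

```python
from urllib.parse import urlsplit, parse_qs


def token_from_redirect(url):
    """Return (access_token, expires_in) parsed from the redirect URL fragment."""
    fragment = urlsplit(url).fragment      # everything after "#"
    fields = parse_qs(fragment)            # the fragment is formatted like a query string
    return fields["access_token"][0], int(fields["expires_in"][0])


# "abc123" is a placeholder token used only for this example.
redirect = "https://oauth.vk.com/blank.html#access_token=abc123&expires_in=86400&user_id=17738938"
token, lifetime = token_from_redirect(redirect)
print(token, lifetime)
```

Note the expires_in value: judging by the redirect above, the token is valid for 86400 seconds, that is, one day.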

We write a small audio search function:

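    import requests

    ACCESS_TOKEN = "***"  # paste the token obtained above

    def audio_search(string):
        # Passing the query via params lets requests URL-encode it properly.
        resp = requests.get(
            "https://api.vk.com/method/audio.search",
            params={"q": string, "sort": 2, "access_token": ACCESS_TOKEN})
        return resp.json()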


It returns the most popular audio match for the search string.
The function response is:

    >>> audio_search("My little horse")
    {u'response': [1,
                   {u'album': u'27504721',
                    u'artist': u'\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d',
                    u'url': u'http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3',
                    u'title': u'my little horse',
                    u'duration': 208,
                    u'aid': 159749117,
                    u'owner_id': 3391535}]}
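Judging by this sample, the response list starts with a number (apparently the count of matches) followed by dictionaries describing the tracks. Under that assumption, picking the first track could look like the Python 3 sketch below; first_track is a made-up helper name, and the sample data is copied from the output above.

```python
def first_track(result):
    """Return the first track dict from an audio.search result, or None."""
    items = result.get("response", [])
    # Assumption: non-dict items (such as the leading number) are not tracks.
    tracks = [item for item in items if isinstance(item, dict)]
    return tracks[0] if tracks else None


sample = {
    "response": [
        1,
        {
            "album": "27504721",
            "artist": "\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d",
            "url": "http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3",
            "title": "my little horse",
            "duration": 208,
            "aid": 159749117,
            "owner_id": 3391535,
        },
    ]
}

track = first_track(sample)
print(track["url"])
```

Returning None when nothing was found lets the calling code skip songs that have no match instead of failing on an index error.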


Now we know the download URL. The file can be fetched with the standard function urllib.urlretrieve.
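For reference, in Python 3 the same function lives at urllib.request.urlretrieve. A minimal download helper might look like this; download_track and its arguments are illustrative names rather than the ones used in the script, and the artist and title are assumed to contain no characters that are invalid in file names.

```python
import os
from urllib.request import urlretrieve


def download_track(url, artist, title, folder="audio"):
    """Save the file at `url` as "<artist> - <title>.mp3" inside `folder`."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "%s - %s.mp3" % (artist, title))
    urlretrieve(url, path)
    return path


# Example call (needs network access and a valid track URL):
# download_track(track_url, "Lynyrd Skynyrd", "Sweet Home Alabama")
```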

The resulting script:

yadi.sk/d/7bP26GIQ4POa6

How to use it:

1) The script requires the requests and BeautifulSoup packages (sudo pip install requests BeautifulSoup).
2) You need the value of the at=... cookie from pandora.com (see above).
3) You need an ACCESS_TOKEN, obtained as described above.
4) Set the COUNT_OF_SONGS parameter: the number of songs you want to download (None to download all of them).
5) DOWNLOAD_FOLDER_NAME = "audio" is the directory where the downloaded music will be saved.
6) LOGIN is your login on pandora.com.
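The script itself is not reproduced here, so the following is only a guess at how such a settings block might look in Python. COUNT_OF_SONGS, DOWNLOAD_FOLDER_NAME, LOGIN and ACCESS_TOKEN are the names mentioned above; PANDORA_AT_COOKIE and limit_songs are hypothetical names introduced for illustration, and all values are placeholders.

```python
# Placeholder values; replace them with your own.
LOGIN = "your.pandora.login"
PANDORA_AT_COOKIE = "***"       # hypothetical name for the at=... cookie value
ACCESS_TOKEN = "***"            # VK token obtained as described above
COUNT_OF_SONGS = None           # None means "download everything"
DOWNLOAD_FOLDER_NAME = "audio"


def limit_songs(songs, count=COUNT_OF_SONGS):
    """Return the first `count` songs, or all of them when count is None."""
    # Slicing with None as the upper bound keeps the whole list.
    return songs[:count]


print(limit_songs(["song 1", "song 2", "song 3"], 2))
```

Using None for "no limit" works naturally here because a slice with a None bound returns every element.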

The corresponding parameters should be written in the script body.
Listen to your favorite music, and remember that piracy is a sin =)

UPD. I accidentally forgot to update the login code. Sorry.
UPD2. At the request of user DenimTornado, here is the same script for Last.fm:

yadi.sk/d/U7kAZFZh4P5Yz

Setting parameters:


UPD3

From user Setti:

Modified version for Last.fm:
yadi.sk/d/tagClpSf4VsqQ

+ BeautifulSoup is now bundled in the folder with the script, so it no longer needs to be installed.
+ The old version searched by track name only; now the artist name is included as well. Otherwise VK returns all sorts of horrible results.
+ Fixed the naming of downloaded files: special characters are removed.
+ File names that are too long are truncated.
+ The Last.fm request parameters limit and page are now exposed as separate settings. You can download tracks page by page in batches of 10, 50, 100, 500, and so on. If you have too many tracks, or want to check the download result on a small slice first, set page and limit accordingly.
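The file naming fixes from this list could be implemented roughly as follows. This is a Python 3 sketch and not Setti's actual code: safe_filename, the set of allowed characters, and the limit of 100 characters are all assumptions chosen for the example.

```python
import re

MAX_NAME_LENGTH = 100  # hypothetical limit, extension included


def safe_filename(artist, title, ext=".mp3", max_length=MAX_NAME_LENGTH):
    """Build "<artist> - <title><ext>" without special characters, clipped to max_length."""
    name = "%s - %s" % (artist, title)
    # Keep word characters, whitespace and a few harmless punctuation marks.
    name = re.sub(r"[^\w\s().,'-]", "", name)
    # Collapse runs of whitespace left behind by the removed characters.
    name = re.sub(r"\s+", " ", name).strip()
    # Clip the base name so that the extension still fits.
    return name[:max_length - len(ext)].rstrip() + ext


print(safe_filename("AC/DC", "Back in Black?"))
```

Searching by both artist and title, as in the second item, only requires joining the two strings into one query before calling the search function.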

Source: https://habr.com/ru/post/178215/

