📜 ⬆️ ⬇️

Testing web applications with mechanize

In the footsteps of Habratopic about Watir , an automated testing tool for web applications in Ruby, I decided to write a short article about a similar tool for the Python language. It's about a wonderful library mechanize . Unlike Watir, mechanize is not sharpened by any OS, and is a superstructure above the Python libraries urllib and urllib2.

The library itself is a browser emulator (without javascript support) and allows you to solve problems of any class (with a debugging for “disabled” javascript) in which the use of a browser is necessary. In particular, I first went to this library, when I had to download a huge number of scientific articles from a single repository that required authorization and stored PDF documents so that without the help of auxiliary tools I had to download only one document, which I did for 2 hours, until I didn’t remember WWW :: Mechanize PERL library (I read about its capabilities some time ago) and didn’t type WWW :: Mechanize python in Google, which led me to sorforge.

But pretty lyrics.

It all starts with the installation of the library. It is best to read about this in detail in the download section . In short, there are 3 main ways to get this library:

')
Personally, I chose a third way for myself, which makes no sense to describe in detail in this article, because I’ll fail to write better and in more detail than it was done on the offsite.

In order to show a couple of possibilities of this library, I wrote a small script that goes to Habr, tries to log in, and if I managed to do this, I search for the main Habratopik with the maximum number of comments and display all this to the user.

  1. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  2. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  3. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  4. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  5. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  6. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  7. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  8. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  9. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  10. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  11. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  12. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  13. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  14. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  15. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  16. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  17. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  18. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  19. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  20. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  21. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  22. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  23. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  24. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  25. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  26. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  27. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  28. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  29. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  30. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  31. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  32. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  33. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  34. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  35. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  36. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  37. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  38. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  39. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  40. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  41. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  42. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  43. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  44. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  45. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  46. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  47. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  48. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  49. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  50. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  51. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  52. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  53. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  54. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
  55. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url

Source

Run the script and get something like:
  $ python habratest.py 
 Welcome to Habrahabr
 Trying to login
 Successfully logged in as krig
 We are starting to search for articles on the main page.
 Habratopikov without habrakatment: 0
 Maximum number of habrakazment: 72
 Most Commented Topic: Free Computers
 And it is located at: http://habrahabr.ru/blogs/startup_ideas/52540/ 


Other, no less interesting features of this wonderful library, you can look at its offsite .

Source: https://habr.com/ru/post/52554/


All Articles