Python crawler basics: grabbing beauty pictures in batch!

Python is actually very good at scraping, and it would be a shame to let that go to waste, hehe. So here is a program, written in Python, that automatically grabs beauty pictures in batch!

It uses the urllib and urllib2 modules (so it targets Python 2), the regular expression module re, and gevent to run the downloads concurrently. The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Use Python (2) to grab beauty pictures in batch

# Patch the standard library as early as possible so that the
# blocking calls in urllib/urllib2 cooperate with gevent
from gevent import monkey
monkey.patch_all()

# urllib and urllib2 download the network content
import urllib
import urllib2
# Regular expression module and time module
import re
import time

import gevent


def geturllist(url):
    """Return the .jpg picture URLs found on one list page."""
    url_list = []
    print url
    text = urllib2.urlopen(url).read()
    # First narrow the text down to the <ol>...</ol> block
    html = re.search(r'<ol.*</ol>', text, re.S)
    if html is None:
        # No <ol> block on this page, so there is nothing to match
        return url_list
    # Then match the pictures inside it
    urls = re.finditer(r'<p><img src="(.+?)jpg" /></p>', html.group(), re.I)
    for i in urls:
        # The group stops just before "jpg", so put the extension back
        url_list.append(i.group(1).strip() + "jpg")
    return url_list


def download(down_url):
    """Save one picture as <timestamp>_<original file name>."""
    # Whole seconds since the epoch, then everything after the last "/"
    name = str(int(time.time())) + "_" + re.sub('.+?/', '', down_url)
    print name
    # The target directory must already exist
    urllib.urlretrieve(down_url, "D:\\TEMP\\" + name)


def getpageurl():
    """Build the URLs of list pages 1 to 699."""
    page_list = []
    # Loop through the list pages
    for page in range(1, 700):
        url = "http://jandan.net/ooxx/page-" + str(page) + "#comments"
        # Add the generated URL to page_list
        page_list.append(url)
    print page_list
    return page_list


if __name__ == '__main__':
    jobs = []
    # Reverse the list so the highest page number is fetched first
    pageurl = getpageurl()[::-1]
    # Download pictures
    for i in pageurl:
        for downurl in geturllist(i):
            jobs.append(gevent.spawn(download, downurl))
    gevent.joinall(jobs)
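
If you want to see what the two-step matching in geturllist does without touching the network, here is a small sketch that runs the same two regular expressions on a made-up page. SAMPLE_HTML, the example.com URLs and the name extract_jpg_urls are invented for illustration, and a guard is added for pages that have no <ol> block; the real page markup may well differ.

```python
import re

# Hypothetical sample page, invented for illustration only.
SAMPLE_HTML = """
<html><body>
<p><img src="http://example.com/banner.jpg" /></p>
<ol class="commentlist">
<li><p><img src="http://example.com/pics/a1.jpg" /></p></li>
<li><p><img src="http://example.com/pics/b2.JPG" /></p></li>
<li><p><img src="http://example.com/pics/c3.gif" /></p></li>
</ol>
</body></html>
"""


def extract_jpg_urls(text):
    """Return the jpg URLs inside the <ol>...</ol> block of text."""
    # Step 1: keep only the <ol>...</ol> block (re.S lets "." cross newlines)
    block = re.search(r'<ol.*</ol>', text, re.S)
    if block is None:
        return []
    # Step 2: capture each src up to "jpg" (re.I also accepts "JPG")
    matches = re.finditer(r'<p><img src="(.+?)jpg" /></p>', block.group(), re.I)
    # The group ends just before the extension, so "jpg" is appended again
    return [m.group(1).strip() + "jpg" for m in matches]


found = extract_jpg_urls(SAMPLE_HTML)
```

Here found holds only the a1 and b2 URLs: the banner sits outside the <ol> block and the .gif does not match. Because "jpg" is appended again in lowercase, the .JPG link comes back ending in .jpg, which could matter on a server that treats file names as case-sensitive.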

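The program has fewer than 45 lines of actual code and is not too difficult, so you can study it yourself. I'm only casting a brick to attract jade here, that is, offering a rough start in the hope of drawing out something better. You can write other scraping programs on the same principle. Ha ha, go and give it a try.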
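
As one possible starting point for such a program, here is a rough Python 3 sketch of the same principle: build the page URLs, fetch the pages concurrently, and pull the links out with a regular expression. It is not the program above. It swaps gevent for the standard library's ThreadPoolExecutor, and page_urls, file_name, crawl, fake_fetch and the example.com URLs are all hypothetical names made up for this sketch. fake_fetch stands in for a real HTTP request so the code runs offline.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def page_urls(template, first, last):
    """Build list-page URLs from a template with one {page} placeholder."""
    return [template.format(page=n) for n in range(first, last + 1)]


def file_name(url):
    """Return the part of url after the last "/"."""
    return re.sub('.+?/', '', url)


def crawl(pages, fetch, pattern, workers=4):
    """Fetch all pages concurrently and collect group 1 of every match."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the same order as pages
        texts = list(pool.map(fetch, pages))
    found = []
    for text in texts:
        found.extend(m.group(1) for m in re.finditer(pattern, text, re.I))
    return found


def fake_fetch(url):
    """Stand-in for an HTTP request: return a tiny page for this URL."""
    page = url.rsplit("-", 1)[1]
    return '<p><img src="http://example.com/p%s/cat.png" /></p>' % page


pages = page_urls("http://example.com/list/page-{page}", 1, 3)
images = crawl(pages, fake_fetch, r'<img src="(.+?)"')
names = [file_name(u) for u in images]
```

All three names come out as cat.png, which shows why download() puts a timestamp in front of the file name. For a real site you would replace fake_fetch with a function that actually requests the page, and adjust the template and the pattern to that site's markup.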

Keywords: Android Python network

Added by packland on Sat, 02 Nov 2019 15:19:34 +0200