requests Library
Although the urllib module in Python's standard library already provides most of the functionality we normally need, its API is awkward to use. Requests advertises itself as "HTTP for Humans", meaning it is more concise and convenient to use.
Installation and documentation
requests can be installed easily with pip:
pip install requests
Chinese documentation: http://docs.python-requests.org/zh_CN/latest/index.html
GitHub repository: https://github.com/requests/requests
Sending a GET request
The simplest way to send a GET request is to call requests.get:
response = requests.get("http://www.baidu.com/")
Adding headers and query parameters:
If you want to add headers, pass the header information via the headers parameter. If you want to pass query parameters in a url, use the params parameter. The relevant example code is as follows:

import requests

kw = {'wd': 'China'}
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}

# params accepts the query parameters as a dictionary or a string. A dictionary is automatically url-encoded, so urlencode() is not needed
response = requests.get("http://www.baidu.com/s", params=kw, headers=headers)

# View the response content; response.text returns the data decoded as Unicode
print(response.text)

# View the response content; response.content returns the raw byte stream
print(response.content)

# View the full url address
print(response.url)

# View the response character encoding
print(response.encoding)

# View the response status code
print(response.status_code)
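You can also see the url encoding requests performs without sending anything over the network, by building a Request object and preparing it (requests.Request and its prepare() method are part of the library's public API; the query below reuses the same parameters as above):

```python
import requests

# Build the same GET request as above, but do not send it
req = requests.Request("GET", "http://www.baidu.com/s", params={"wd": "China"})
prepared = req.prepare()

# The params dict has been encoded into the query string automatically
print(prepared.url)  # → http://www.baidu.com/s?wd=China
```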
Sending a POST request
The most basic POST request uses the post method:
response = requests.post("http://www.baidu.com/", data=data)
Passing in data:
There is no need to urlencode the data yourself; just pass in a dictionary. For example, the code to request job-posting data from Lagou:

import requests

url = "https://www.lagou.com/jobs/positionAjax.json?city=%E6%B7%B1%E5%9C%B3&needAddtionalResult=false&isSchoolJob=0"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36',
    'Referer': 'https://www.lagou.com/jobs/list_python?labelWords=&fromSearch=true&suginput='
}
data = {
    'first': 'true',
    'pn': 1,
    'kd': 'python'
}
resp = requests.post(url, headers=headers, data=data)

# If the response is json data, you can call the json method directly
print(resp.json())
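As a minimal offline sketch of the point above — that requests form-encodes the dictionary for you — you can prepare a POST request and inspect its body (the URL here is a made-up placeholder):

```python
import requests

req = requests.Request("POST", "http://example.com/login",
                       data={"first": "true", "pn": 1, "kd": "python"})
prepared = req.prepare()

# The dict was urlencoded into a form body, and the Content-Type header was set
print(prepared.body)                     # first=true&pn=1&kd=python
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
```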
Using proxies
Adding a proxy with requests is also very simple: just pass the proxies parameter to the request method (such as get or post). The example code is as follows:
import requests

url = "http://httpbin.org/get"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36',
}
proxy = {
    'http': '171.14.209.180:27829'
}
resp = requests.get(url, headers=headers, proxies=proxy)
with open('xx.html', 'w', encoding='utf-8') as fp:
    fp.write(resp.text)
cookie
If a response contains cookies, you can use the cookies attribute to get the returned cookie values:
import requests

resp = requests.get('http://www.baidu.com/')
print(resp.cookies)
print(resp.cookies.get_dict())
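resp.cookies is a RequestsCookieJar object. As a small sketch (with a made-up cookie name and domain), you can also build and inspect one directly:

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set("token", "abc123", domain="example.com", path="/")

# get_dict() flattens the jar into a plain name -> value dictionary
print(jar.get_dict())  # → {'token': 'abc123'}
```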
session
Previously, with the urllib library, you could use an opener to send multiple requests that share cookies. To share cookies with requests, use the session object provided by the library. Note that this session is not the session concept from web development; it is simply a session object. Taking logging in to Renren as an example, the example code is as follows:
import requests

url = "http://www.renren.com/PLogin.do"
data = {"email": "970138074@qq.com", 'password': "pythonspider"}
headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
}

# Sign in
session = requests.session()
session.post(url, data=data, headers=headers)

# Visit Dapeng's personal center
resp = session.get('http://www.renren.com/880151247/profile')
print(resp.text)
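A session also keeps any headers and cookies you set on it and sends them with every request it makes. A minimal offline sketch (the User-Agent value, cookie name, and domain below are made up for illustration):

```python
import requests

s = requests.Session()

# Headers set on the session are merged into every request it sends
s.headers.update({"User-Agent": "my-spider/1.0"})

# Cookies stored on the session persist across requests
s.cookies.set("sessionid", "xyz", domain="example.com")

print(s.headers["User-Agent"])  # my-spider/1.0
print(s.cookies.get_dict())     # {'sessionid': 'xyz'}
```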
Handling untrusted SSL certificates
For websites with trusted SSL certificates, such as https://www.baidu.com/ , requests returns a normal response directly. For a website whose certificate is not trusted (http://www.12306.cn/ was a well-known example), pass verify=False to skip certificate verification. The example code is as follows:

resp = requests.get('http://www.12306.cn/mormhweb/', verify=False)
print(resp.content.decode('utf-8'))
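Note that with verify=False, urllib3 (the library requests is built on) emits an InsecureRequestWarning on every request. If skipping verification is intended, the warning can be silenced explicitly; the sketch below does not hit the network, it just confirms the filter took effect:

```python
import warnings
import urllib3

# Tell urllib3 not to warn about unverified HTTPS requests
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Confirm the warning category is now filtered out: nothing gets recorded
with warnings.catch_warnings(record=True) as caught:
    warnings.warn("unverified request", urllib3.exceptions.InsecureRequestWarning)
print(len(caught))  # → 0
```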