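A Detailed Guide to Using Python's Requests Module

Requests is a module for making HTTP requests. There are many similar modules, such as urllib, urllib2, httplib, and httplib2, and they all provide broadly the same functionality, so why does Requests stand out? Its official website describes it as an HTTP library "for humans". How user-friendly is it really? If you have used urllib or a similar module before, a quick comparison shows that Requests really is easier to work with.

I. Importing

Once Requests has been downloaded and installed, importing it is simple: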

import requests

II. Requesting a URL

Here are the most common ways to send GET and POST requests.

1. Sending a GET request without parameters:

r=requests.get("http://pythontab.com/justTest")

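We now have a response object, r, from which we can read whatever information we need.

The GET request above has no parameters. What if the request needs some?

2. Sending a GET request with parameters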

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://pythontab.com/justTest", params=payload)

As shown above, GET parameters are passed through the params keyword argument.

We can print the actual request URL to check that it is correct:

>>> print(r.url)
http://pythontab.com/justTest?key1=value1&key2=value2

As you can see, the correct URL was requested.

A list can also be passed as the value of a request parameter:

>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get("http://pythontab.com/justTest", params=payload)
>>> print(r.url)
http://pythontab.com/justTest?key1=value1&key2=value2&key2=value3

Those are the basic forms of a GET request.
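To check how params ends up in the query string without sending anything over the network, one option is to build the request and prepare it locally. This is only a sketch: it relies on requests.Request(...).prepare(), and build_url is a hypothetical helper name used for illustration.

```python
import requests


def build_url(url, params):
    """Return the final URL Requests would use, without sending the request."""
    prepared = requests.Request("GET", url, params=params).prepare()
    return prepared.url


# A list value is expanded into one key=value pair per element.
print(build_url("http://pythontab.com/justTest",
                {"key1": "value1", "key2": ["value2", "value3"]}))
```

Because nothing is sent, this is handy for checking the encoding of awkward parameter values before pointing the code at a real server.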

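3. Sending a POST request

r = requests.post("http://pythontab.com/postTest", data={"key": "value"})

As shown above, POST parameters are passed through the data keyword argument.

Here the data argument is a dictionary, which Requests sends form-encoded. We can also pass JSON-formatted data, as follows: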

>>> import json
>>> import requests
>>> payload = {"key":"value"}
>>> r = requests.post("http://pythontab.com/postTest", data = json.dumps(payload))

Because sending JSON is so common, newer versions of Requests added a json keyword argument, so JSON can be sent in a POST request directly, without using the json module:

>>> payload = {"key":"value"}
>>> r = requests.post("http://pythontab.com/postTest", json=payload)
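The three ways of passing a POST body differ in what actually goes on the wire. The sketch below prepares each request locally (nothing is sent) and looks at the body and the Content-Type header; prepare_post is a hypothetical helper name. One difference worth noticing is that the json keyword sets Content-Type to application/json, while a pre-serialized string passed through data leaves that header unset.

```python
import json

import requests

URL = "http://pythontab.com/postTest"
payload = {"key": "value"}


def prepare_post(**kwargs):
    """Build a POST request locally and return (body as text, Content-Type)."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    body = prepared.body
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return body, prepared.headers.get("Content-Type")


form_body, form_type = prepare_post(data=payload)            # dict -> form-encoded
raw_body, raw_type = prepare_post(data=json.dumps(payload))  # string sent as-is
json_body, json_type = prepare_post(json=payload)            # json keyword

print(form_body, form_type)
print(raw_body, raw_type)
print(json_body, json_type)
```

If the server insists on a JSON content type, the json keyword is therefore the simpler choice; with data=json.dumps(...) the header would have to be added by hand.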

What if we want to POST a file? That is what the files argument is for:

>>> url = 'http://pythontab.com/postTest'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text

When posting a file we can also specify extra information, such as the filename, the content type, and additional headers:

>>> url = 'http://pythontab.com/postTest'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)

Tip: opening the file in binary mode is strongly recommended, because opening it in text mode may cause errors related to the Content-Length header.
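The upload examples above leave the file object open. A slightly more careful version opens the file in binary mode inside a with block so it is closed once the request body has been built. The sketch below only prepares the request and does not send it; prepare_upload and the throwaway demo file are assumptions made for illustration.

```python
import os
import tempfile

import requests


def prepare_upload(path, url="http://pythontab.com/postTest"):
    """Prepare (but do not send) a multipart upload of the file at `path`."""
    # Binary mode; the with block closes the file after the body is built.
    with open(path, "rb") as fh:
        files = {"file": (os.path.basename(path), fh, "application/vnd.ms-excel")}
        return requests.Request("POST", url, files=files).prepare()


# Demo with a throwaway file standing in for report.xls.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "report.xls")
    with open(demo_path, "wb") as out:
        out.write(b"demo bytes")
    prepared = prepare_upload(demo_path)

print(prepared.headers["Content-Type"])
print(prepared.headers["Content-Length"])
```

In everyday code the same pattern works with requests.post(url, files=files) placed inside the with block.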

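As you can see, sending requests with Requests is simple.

III. Reading the Response

Now let's look at how to read the response after sending a request. We continue with the first example above: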

>>> import requests
>>> r=requests.get('http://pythontab.com/justTest')
>>> r.text

What encoding does r.text use?

>>> r.encoding
'utf-8'

So the text is decoded as UTF-8. What if we want r.text to use a different encoding?

>>> r.encoding = 'ISO-8859-1'

This switches the encoding to "ISO-8859-1".
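Changing r.encoding does not touch the bytes that were received; it only changes how they are decoded when r.text is read. The stdlib-only sketch below imitates that with plain bytes.decode. The sample string is made up, and no request is involved.

```python
# Bytes as they might arrive from a server that sent UTF-8 text.
raw = "caf\u00e9".encode("utf-8")

# Roughly what r.text yields while r.encoding is 'utf-8'.
as_utf8 = raw.decode("utf-8")

# Roughly what r.text yields after r.encoding = 'ISO-8859-1':
# each byte becomes one character, so the accented letter is garbled.
as_latin1 = raw.decode("ISO-8859-1")

print(as_utf8)
print(as_latin1)
```

This is why setting the wrong encoding shows up as mojibake rather than as an error.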

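There is also r.content. How does it differ from r.text? r.content returns the response body as bytes, which is what we need when, for example, we request an image URL and want to save the image. Here is a code snippet:

def saveImage(imgUrl, imgName="default.jpg"):
    # r.content holds the raw bytes of the response body
    r = requests.get(imgUrl)
    image = r.content
    destDir = "D:\\"  # the backslash must be escaped inside a string literal
    print("Saving image " + destDir + imgName)
    try:
        # the with statement closes the file automatically
        with open(destDir + imgName, "wb") as jpg:
            jpg.write(image)
    except IOError:
        print("IO Error")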

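As described above, r.text returns a string. If the response body is JSON, can we get the parsed data directly? That is what r.json() is for.

We can also get the raw data returned by the server with r.raw.read(). If you really need the raw response, remember to pass stream=True when making the request, for example:

r = requests.get('https://api.github.com/events', stream=True)

We can also read the response status code: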

>>> r = requests.get('http://pythontab.com/justTest')
>>> r.status_code
200

requests.codes.ok can be used in place of the literal 200:

>>> r.status_code == requests.codes.ok
True
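To see r.status_code, r.content, r.text, and r.json() side by side without depending on an external site, here is a sketch that starts a throwaway server on localhost with the standard library's http.server and queries it. DemoHandler and fetch_demo are hypothetical names, and the JSON payload is invented for the demo.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Always answer with a small JSON document.
        body = json.dumps({"key": "value"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with requests.Session() as s:
            s.trust_env = False  # ignore proxy settings from the environment
            r = s.get("http://127.0.0.1:%d/" % server.server_port, timeout=5)
            return r.status_code, r.content, r.text, r.json(), r.headers["content-type"]
    finally:
        server.shutdown()
        server.server_close()


print(fetch_demo())
```

r.content is the bytes object, r.text is the same data decoded to a string, and r.json() is the parsed dictionary.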

IV. Headers

We can print the response headers:

>>> r= requests.get("http://pythontab.com/justTest")
>>> r.headers

r.headers returns a dictionary-like object whose keys are case-insensitive, for example:

{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '147ms',
    'etag': '"e1ca502697e5c9317743dc078f67693a"',
    'content-type': 'application/json'
}

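We can read individual response headers like this, for example to make a decision based on them:

r.headers['Content-Type']

or

r.headers.get('Content-Type')

What if we want the request headers, that is, the headers we sent to the server? They are available directly as r.request.headers.

We can also add custom headers when making a request, through the headers keyword argument: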

>>> headers = {'user-agent': 'myagent'}
>>> r= requests.get("http://pythontab.com/justTest",headers=headers)
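To confirm which headers a request would carry, and to see the case-insensitive lookup in action, the request can be prepared locally without being sent. This is a sketch using requests.Request(...).prepare(); a request prepared this way, outside a session, carries only the headers we pass in.

```python
import requests

headers = {"user-agent": "myagent"}
prepared = requests.Request(
    "GET", "http://pythontab.com/justTest", headers=headers
).prepare()

# Header lookups ignore case, so any spelling of the name works.
print(prepared.headers["User-Agent"])
print(prepared.headers.get("USER-AGENT"))
```

The same case-insensitive behaviour applies to r.headers on a response, which is why r.headers['Content-Type'] works even when the server sent the name in lowercase.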

V. Cookies

If a response contains cookies, we can read them like this:

>>> url = 'http://www.pythontab.com'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'

We can also send our own cookies, through the cookies keyword argument:

>>> url = 'http://pythontab.com/cookies'
>>> cookies={'cookies_are':'working'}
>>> r = requests.get(url, cookies=cookies)
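Under the hood, the cookies dictionary simply becomes a Cookie request header. A small sketch that prepares the request locally (nothing is sent) makes this visible:

```python
import requests

cookies = {"cookies_are": "working"}
prepared = requests.Request(
    "GET", "http://pythontab.com/cookies", cookies=cookies
).prepare()

# The dictionary is serialized into a single Cookie header.
print(prepared.headers["Cookie"])
```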

VI. Redirects

Sometimes the server automatically redirects our request; GitHub, for example, redirects HTTP requests to HTTPS. We can use r.history to inspect redirects:

>>> r = requests.get('http://pythontab.com/')
>>> r.url
'http://pythontab.com/'
>>> r.history
[]

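In the example above r.history is empty, which means this request was not redirected. When a redirect does happen, r.url holds the final URL (for example an https address even though we requested http) and r.history lists the intermediate responses. What if we do not want Requests to follow redirects automatically? Use the allow_redirects argument: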

r = requests.get('http://pythontab.com', allow_redirects=False)
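To watch a redirect actually happen, the sketch below runs a throwaway local server (standard library http.server) whose /old path answers with a 301 pointing at /new, then requests it with and without allow_redirects. RedirectHandler, demo_redirect, and both paths are made up for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/old":
            # Permanent redirect to /new.
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo_redirect():
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        with requests.Session() as s:
            s.trust_env = False  # ignore proxy settings from the environment
            followed = s.get(base + "/old", timeout=5)
            not_followed = s.get(base + "/old", allow_redirects=False, timeout=5)
        return (
            followed.url == base + "/new",                      # final URL after the redirect
            [resp.status_code for resp in followed.history],    # intermediate responses
            followed.status_code,
            not_followed.status_code,                           # the redirect itself
            not_followed.headers["Location"],
        )
    finally:
        server.shutdown()
        server.server_close()


print(demo_redirect())
```

With allow_redirects=False we get the 301 response itself and can read its Location header to decide what to do next.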

VII. Timeouts

We can use the timeout argument to set a timeout for the request, in seconds:

requests.get('http://pythontab.com', timeout=1)
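When the timeout expires, Requests raises an exception rather than returning a response, so in practice the call is usually wrapped in try/except. Below is a minimal sketch; fetch_status is a hypothetical helper, and the strings it returns are just one possible convention. The timeout value may also be a (connect, read) tuple instead of a single number.

```python
import requests


def fetch_status(url, timeout=1):
    """Return the status code, or a short string describing what went wrong."""
    try:
        r = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Raised when connecting or reading takes longer than `timeout`.
        return "timeout"
    except requests.exceptions.RequestException as exc:
        # Base class for the other errors Requests raises (bad URL, refused connection, ...).
        return "error: " + type(exc).__name__
    return r.status_code


# A URL without a scheme is rejected before any network traffic happens.
print(fetch_status("not-a-url"))
```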

VIII. Proxies

We can also tell a program to use proxies for HTTP or HTTPS access, through the proxies keyword argument, as follows:

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}
requests.get("http://pythontab.com", proxies=proxies)

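IX. Sessions

Sometimes we need to log in to a website before we can request certain URLs. This is where a session comes in: we first log in through the site's login API, which establishes the session, and then use that same session object to request other URLs: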

s=requests.Session()
login_data={'form_email':'[email protected]','form_password':'yourpassword'}
s.post("http://pythontab.com/testLogin",login_data)
r = s.get('http://pythontab.com/notification/')
print(r.text)

Here, form_email and form_password are the name values of the corresponding input elements in Douban's login form.
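What makes this work is that the Session object keeps the cookies set by the login response and sends them back on later requests. The sketch below demonstrates this against a throwaway local server built with http.server; LoginHandler, demo_session, the session_id cookie, and the credentials are all invented for the demo and do not reflect any real site's login API.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoginHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Pretend every login succeeds and hand out a session cookie.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Set-Cookie", "session_id=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Only clients that send the cookie back are treated as logged in.
        cookie = self.headers.get("Cookie", "")
        body = b"welcome" if "session_id=abc123" in cookie else b"please log in"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo_session():
    server = HTTPServer(("127.0.0.1", 0), LoginHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    login_data = {"form_email": "user@example.com", "form_password": "secret"}
    try:
        with requests.Session() as s:
            s.trust_env = False  # ignore proxy settings from the environment
            before = s.get(base + "/notification/", timeout=5).text
            s.post(base + "/testLogin", data=login_data, timeout=5)
            after = s.get(base + "/notification/", timeout=5).text
        return before, after
    finally:
        server.shutdown()
        server.server_close()


print(demo_session())
```

The first GET is refused because no cookie has been set yet; after the POST, the same session sends the cookie automatically.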

X. Downloading a Page

Requests can also be used to download a web page:

r = requests.get("http://www.pythontab.com")
# the with statement closes the file automatically
with open("haha.html", "wb") as html:
    html.write(r.content)
