requests: quick reference and best practices for Python's handiest HTTP request library

2023-06-25 09:52:50  Source: cnblogs (博客園)

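Introduction

The requests module is one of the most frequently used modules in Python scripting. For many people it is the first module they ever use, because it makes writing web crawlers easy. Beyond crawlers, everyday development also relies on it for tasks such as calling backend APIs, uploading files, and querying data. This article covers how to use requests in detail. requests is a third-party HTTP library written in Python and released under the Apache 2.0 license. It is built on top of urllib3, which is itself a third-party networking library rather than part of the standard library. requests is more convenient than using urllib3 directly and can save its users a great deal of time.

The requests module is explained below from the following six angles: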


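- Basic usage
- Request methods
- Request parameters
- Response object
- Exception handling
- Improving performance

Quick links by feature

You do not need to read the whole article; jump straight to the feature you are looking for:

- Uploading files: Request parameters -> files usage
- Calling APIs that need authentication: Request parameters -> headers usage
- Calling JSON APIs: Request parameters -> json usage
- Calling form APIs: Request parameters -> data usage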

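Getting to know requests

requests is a third-party library and must be installed before use. The install command (here using the Tsinghua PyPI mirror) is: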

pip3 install requests -i https://pypi.tuna.tsinghua.edu.cn/simple

The simplest request: send a GET request and look at the return value.

    import requests

    res = requests.get("http://www.baidu.com")
    print(res)
    >>>
    <Response [200]>

As the code above shows, sending a request with requests takes a single line of code, which makes it very simple. The design philosophy of requests is: **Requests** is an elegant and simple HTTP library for Python, built for human beings. Because parameters and features differ slightly between versions, note that this article uses requests 2.31.0.

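Request methods

requests supports most HTTP request methods, namely GET, POST, PUT, DELETE, HEAD, PATCH and OPTIONS.

Each method is demonstrated below. The examples run against a backend service started locally; to try them yourself, replace the url.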

GET request: fetch records

    import requests

    url = "http://127.0.0.1:8090/demos"
    res = requests.get(url)
    print(res.json())  # the dict obtained by deserializing the JSON body
    >>>
    {"result": [{"age": 0, "create_at": "Mon, 29 May 2023 22:05:40 GMT", "id": 2, "name": "string", "status": 0, "update_at": "Mon, 29 May 2023 22:05:40 GMT", "user_id": 0}, {"age": 100, "create_at": "Sun, 11 Jun 2023 10:38:28 GMT", "id": 3, "name": "ljk", "status": 0, "update_at": "Sun, 11 Jun 2023 10:38:28 GMT", "user_id": 223}], "total": 2}

POST request: create a record

    import requests

    url = "http://127.0.0.1:8090/demo"
    payload = {
        "age": 18,
        "desc": "post_demo",
        "name": "post_method",
        "user_id": 102,
    }
    # the body is serialized to JSON automatically
    res = requests.post(url, json=payload)
    print(res.json())
    >>>
    {"age": 18, "create_at": "Sun, 11 Jun 2023 16:14:40 GMT", "id": 4, "name": "post_method", "status": 0, "update_at": "Sun, 11 Jun 2023 16:14:40 GMT", "user_id": 102}

PUT request: update a record

    import requests

    url = "http://127.0.0.1:8090/demo/4"
    payload = {
        "age": 20,
        "user_id": 1001,
    }
    res = requests.put(url, json=payload)
    print(res.json())
    >>>
    {"msg": "success"}

DELETE request: delete a record

    import requests

    url = "http://127.0.0.1:8090/demo/4"
    res = requests.delete(url)
    print(res.json())
    >>>
    {"msg": "success"}

HEAD request: fetch the headers

    import requests

    url = "http://127.0.0.1:8090/demos"
    res = requests.head(url)
    print(res.ok)
    print(res.headers)
    >>>
    True
    {"Server": "Werkzeug/2.3.6 Python/3.9.6", "Date": "Sat, 17 Jun 2023 06:34:44 GMT", "Content-Type": "application/json", "Content-Length": "702", "Connection": "close"}

The response headers reveal the type of the returned data: "Content-Type": "application/json". This means the body is JSON-encoded, so it has to be deserialized from JSON before it can be used.

PATCH request: update part of a record

    import requests

    url = "http://127.0.0.1:8090/demo/4"
    payload = {
        "age": 200,
    }
    res = requests.patch(url, json=payload)
    print(res.json())
    >>>
    {"msg": "success"}

OPTIONS request: inspect what the endpoint allows

    import requests

    url = "http://127.0.0.1:8090/demo/4"
    headers = {
        "Access-Control-Request-Method": "GET",
        "Origin": "*",
        "Access-Control-Request-Headers": "Authorization",
    }
    res = requests.options(url, headers=headers)
    print(res.ok)
    print(res.headers)
    >>>
    True
    {"Server": "Werkzeug/2.3.6 Python/3.9.6", "Date": "Sat, 17 Jun 2023 06:38:21 GMT", "Content-Type": "text/html; charset=utf-8", "Allow": "OPTIONS, DELETE, PUT, PATCH, HEAD, GET", "Content-Length": "0", "Connection": "close"}

The returned headers show which methods this endpoint allows: "Allow": "OPTIONS, DELETE, PUT, PATCH, HEAD, GET". The endpoint can be called with any of these methods; a method that is not listed (POST, in this case) cannot be used on it.

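Request parameters

The signature of the request function is shown below; as it makes clear, requests supports a large number of parameters. As of version 2.31.0 there are 16 in total (not counting self).

    def request(
        self,
        method,
        url,
        params=None,
        data=None,
        headers=None,
        cookies=None,
        files=None,
        auth=None,
        timeout=None,
        allow_redirects=True,
        proxies=None,
        hooks=None,
        stream=None,
        verify=None,
        cert=None,
        json=None,
    ):

Parameter notes: the most commonly used parameters are demonstrated one by one below.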

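params usage example

Purpose: builds the query string of the request URL. A GET request often carries query parameters, for example in a paginated query:

http://127.0.0.1:8090/demos?offset=10&limit=10

The query part can be written in two ways. The first is to concatenate it into the URL directly, as above. The second is to use the params parameter: define the query parameters as a dict and pass it to params.

    import requests

    url = "http://127.0.0.1:8090/demos"
    res = requests.get(url, params={"offset": 1, "limit": 10})
    print(res.json())
    print(res.url)  # the URL that was actually requested
    >>>
    {"result": [{"age": 200, "create_at": "Sun, 11 Jun 2023 10:38:28 GMT", "id": 3, "name": "ljk", "status": 0, "update_at": "Sun, 11 Jun 2023 10:38:28 GMT", "user_id": 1002}], "total": 2}
    http://127.0.0.1:8090/demos?offset=1&limit=10

The response object has a url attribute that shows the URL that was actually requested. You can see that params appended the dict to the URL as a query string.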
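To see how params is turned into a query string without needing a running server, the request can be prepared but not sent. The following is a minimal sketch; the extra "name" parameter with a space in it is an assumption added only to show that values get URL-encoded.

```python
import requests

# Build the request object without sending it, so no backend is required.
req = requests.Request(
    "GET",
    "http://127.0.0.1:8090/demos",
    # "name" is a made-up parameter, included to show URL-encoding of the space
    params={"offset": 1, "limit": 10, "name": "a b"},
)
prepared = req.prepare()

# prepared.url holds the final URL with the encoded query string appended
print(prepared.url)
```

Inspecting prepared.url this way is handy when debugging why a backend does not see the parameters you expect.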

data usage example

Purpose: carries the request body; it can also be used to upload files. To send a request whose body is JSON through data, first set the data format in the headers to JSON, then serialize the body with json.dumps.

    import json

    import requests

    url = "http://127.0.0.1:8090/demo"
    payload = {
        "age": 18,
        "desc": "post_demo",
        "name": "post_method",
        "user_id": 102,
    }
    headers = {"Content-Type": "application/json"}
    res = requests.post(url, data=json.dumps(payload), headers=headers)
    print(res.json())

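Background knowledge:

The Content-Type field: the request headers contain a Content-Type field, which the client uses to tell the server what kind of data is actually being sent, for example a file, plain text, a JSON string or a form. Five types are commonly used with requests:

- application/x-www-form-urlencoded: form data encoded as key/value pairs and sent to the server. This is the default when a dict is passed to data.
- multipart/form-data: can carry files as well as ordinary parameters.
- text/xml: XML format. The data sent must be XML.
- application/json: JSON format. The data sent must be JSON.
- text/plain: plain text.

APIs that take form-data

Some APIs require data of type multipart/form-data. There are two ways to send it:

- Build the form-data by hand and modify the headers
- Pass the form-data through the files parameter (recommended)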

Building form-data by hand

    import requests

    url = "http://www.demo.com/"
    boundary = "----WebKitFormBoundary7MA4YWxkTrZu0gW"
    # One part per field: "--" + boundary, a Content-Disposition header, a blank line, then the value.
    # Every line of a multipart body must end with CRLF (\r\n).
    part = '--{b}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'
    fields = {"phone": 12374658756, "idnum": 23, "name": "demo", "products": [201]}
    payload = "".join(part.format(b=boundary, name=k, value=v) for k, v in fields.items())
    payload += "--{}--".format(boundary)  # closing boundary
    headers = {"content-type": "multipart/form-data; boundary={}".format(boundary)}
    resp = requests.post(url, data=payload, verify=False, headers=headers)

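Passing form-data via files

    import requests

    url = "http://127.0.0.1:8090/demo"  # placeholder; the original snippet does not define url
    PATH_DIR = "."  # placeholder; the original snippet does not define PATH_DIR

    # A (None, value) tuple sends a plain form field;
    # a (filename, file object, content type) tuple sends a file.
    with open("%s/resource/upload_images/image.jpg" % PATH_DIR, "rb") as image:
        files = {
            "schoolId": (None, -1),
            "schoolName": (None, ""),
            "reward": (None, 5),
            "publishText": (None, "測試測試"),
            "tags": (None, 1),
            "image": ("image.jpg", image, "application/octet-stream"),
        }
        response = requests.post(url, files=files)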
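To check what the files parameter actually produces, the request can again be prepared without sending it. This is a sketch under assumptions: the /upload path, the field values and the fake image bytes are all made up for illustration.

```python
import requests

# Hypothetical fields: two plain form fields and one in-memory "file"
fields = {
    "reward": (None, "5"),                      # (None, value) -> ordinary form field
    "publishText": (None, "test"),
    "image": ("image.jpg", b"\xff\xd8fake-jpeg-bytes", "application/octet-stream"),
}

prepared = requests.Request(
    "POST", "http://127.0.0.1:8090/upload", files=fields
).prepare()

content_type = prepared.headers["Content-Type"]   # multipart/form-data; boundary=...
body = prepared.body                               # the encoded multipart body (bytes)

print(content_type)
print(body.decode("latin-1"))
```

requests generates the boundary and sets the Content-Type header itself, which is why this route is less error-prone than assembling the body by hand.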

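json usage example

Purpose: carries the request body and serializes it to JSON. When a backend API accepts JSON, the json parameter is a more convenient alternative to serializing the body yourself with json.dumps. It serializes the dict you pass in and adds the JSON Content-Type header automatically.

    import requests

    url = "http://127.0.0.1:8090/demo"
    payload = {
        "age": 18,
        "desc": "post_demo",
        "name": "post_method",
        "user_id": 102,
    }
    res = requests.post(url, json=payload)
    print(res.json())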
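The difference between data and json can be checked offline by preparing each request and looking at its Content-Type and body. A minimal sketch follows; the describe helper is a hypothetical name introduced only for this illustration.

```python
import json

import requests

url = "http://127.0.0.1:8090/demo"
payload = {"age": 18, "name": "post_method"}


def describe(**kwargs):
    """Prepare (but do not send) a POST and return its Content-Type and body as text."""
    prepared = requests.Request("POST", url, **kwargs).prepare()
    body = prepared.body
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return prepared.headers.get("Content-Type"), body


form_type, form_body = describe(data=payload)            # dict via data: form-encoded
json_type, json_body = describe(json=payload)            # dict via json: JSON-encoded
raw_type, raw_body = describe(data=json.dumps(payload))  # str via data: sent as-is

print(form_type, form_body)
print(json_type, json_body)
print(raw_type, raw_body)
```

Note the third case: a string passed to data gets no Content-Type at all, which is why the earlier data example had to set the header explicitly.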

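headers usage example

Purpose: carries header information; it can be used to imitate a browser, carry authentication information, and so on. Public APIs often check the request headers as an anti-crawler measure and reject requests that do not come from a browser. Sending suitable headers makes a script look like a browser. APIs also commonly verify authentication: the request must carry a token, and the token is specified in the headers.

    import requests

    url = "http://127.0.0.1:8090/demo"
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36",
        "mtk": "xxxxx",
    }
    res = requests.get(url, headers=headers)
    print(res.json())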
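When every call needs the same token, one option is to set it once on a Session rather than repeating it per request. The sketch below only prepares the request, so it runs without a server; the User-Agent value "my-script/1.0" is an assumption, and "mtk" is the token header name used in the example above.

```python
import requests

session = requests.Session()
# Defaults sent with every request made through this session
session.headers.update({"User-Agent": "my-script/1.0", "mtk": "xxxxx"})

# Per-request headers are merged on top of the session defaults
req = requests.Request(
    "GET",
    "http://127.0.0.1:8090/demo",
    headers={"Accept": "application/json"},
)
prepared = session.prepare_request(req)

# Header lookups are case-insensitive
print(prepared.headers["mtk"], prepared.headers["user-agent"], prepared.headers["Accept"])
```

In real use you would call session.get(url) directly; prepare_request is used here only to inspect the merged headers.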

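files usage example

Purpose: uploads files. To upload a file, first open it to get a file handle, then pass the handle to files. One or more files can be uploaded at once. Opening files in binary mode is recommended, because requests may try to provide the Content-Length header for you, and that value is taken from the number of bytes in the file.

    import requests

    url = "http://127.0.0.1:8090/demo"
    # Open in binary mode; the with block closes both files after the upload
    with open("a.txt", "rb") as filea, open("b.txt", "rb") as fileb:
        res = requests.post(url, files={"file_a": filea, "file_b": fileb})
    print(res.json())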

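timeout usage example

Purpose: sets the request timeout. A timeout can be split into a connect timeout and a read timeout:

- To set them separately: timeout=(connect timeout, read timeout)
- To set both to the same value: timeout=seconds

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    res = requests.get(url, timeout=(3, 10))
    print(res.json())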

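hooks usage example

Purpose: registers hook functions. Hooks are callbacks that run a custom function at a certain point in a flow. requests supports only one hook, response, which lets you run custom functions whenever a response comes back. It can be used to print information, check the response, or attach extra information to the response object.

    import requests

    def verify_res(res, *args, **kwargs):
        res.status = "PASS" if res.status_code == 200 else "FAIL"
        print(res.status)

    url = "http://www.baidu.com"
    response = requests.get(url, hooks={"response": verify_res})
    print("result_url " + response.url)

Besides defining a hook for a single request, you can also define a hook for every request made through a session.

    import requests

    # Register the hook on a session so it runs for every response,
    # for example to raise an exception on errors
    session = requests.Session()

    def hook_func(res, *args, **kwargs):
        pass

    session.hooks["response"] = [hook_func]
    session.get("xxx")  # replace "xxx" with a real URL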
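A common use of a session-wide hook is to turn error status codes into exceptions automatically. The sketch below is self-contained: it starts a throwaway local server that always answers 404 (an assumption made purely so the example can run anywhere) and registers a hypothetical raise_on_error hook.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class NotFoundHandler(BaseHTTPRequestHandler):
    """Stand-in backend for the demo: every GET returns 404."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), NotFoundHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()


def raise_on_error(res, *args, **kwargs):
    """Response hook: raise HTTPError for 4xx/5xx responses."""
    res.raise_for_status()


session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local test
session.hooks["response"].append(raise_on_error)

try:
    session.get(f"http://127.0.0.1:{server.server_port}/demo/10", timeout=5)
    outcome = "no error"
except requests.exceptions.HTTPError as e:
    outcome = f"hook raised HTTPError for status {e.response.status_code}"
finally:
    session.close()
    server.shutdown()
    server.server_close()

print(outcome)
```

With the hook in place, callers no longer need to remember to check the status code after each request made through that session.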

Response object

Every request needs a detailed and accurate result. A requests call returns a response object, which has a rich set of attributes and methods.

The difference between content, text and json()

content returns the body as raw bytes, text returns it as a string, and json() returns the object obtained by deserializing the JSON body.

    import requests

    url = "http://127.0.0.1:8090/demo/5"
    res = requests.get(url)
    print(f"content -> type: {type(res.content)}\n value: {res.content}")
    print(f"text -> type: {type(res.text)}\n value: {res.text}")
    print(f"json() -> type: {type(res.json())}\n value: {res.json()}")
    >>>
    content -> type: <class 'bytes'>
     value: b'{\n  "age": 18,\n  "id": 5,\n  "name": "post_method",\n  "status": 0,\n  "user_id": 102\n}\n'
    text -> type: <class 'str'>
     value: {
      "age": 18,
      "id": 5,
      "name": "post_method",
      "status": 0,
      "user_id": 102
    }
    json() -> type: <class 'dict'>
     value: {"age": 18, "id": 5, "name": "post_method", "status": 0, "user_id": 102}

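The types in the output above make the differences among the three clear. JSON responses are usually the easiest to work with. Recommended usage:

- If you know for certain that the API returns a JSON string, use response.json() to get the result.
- If you do not know the format of the returned data, use response.text to get the result.

status_code and ok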

status_code is the standard HTTP response code; ok indicates whether the request succeeded. How "succeeded" is defined can be seen in the docstring of the ok property.

    @property
    def ok(self):
        """Returns True if :attr:`status_code` is less than 400, False if not."""

    import requests

    url = "http://127.0.0.1:8090/demo/5"
    res = requests.get(url)
    print(f"status code:{res.status_code}, ok: {res.ok}")

    url = "http://127.0.0.1:8090/demo/10"
    res = requests.get(url)
    print(f"status code:{res.status_code}, ok: {res.ok}")
    >>>
    status code:200, ok: True
    status code:404, ok: False

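Standard HTTP response codes:

- Informational responses (100–199)
- Successful responses (200–299)
- Redirection messages (300–399)
- Client error responses (400–499)
- Server error responses (500–599)

reason: a short description of the result

reason returns a short textual description of the result. For a 200 response it is OK, and every non-200 response likewise carries a brief explanation.

    import requests

    url = "http://127.0.0.1:8090/demo/5"
    res = requests.get(url)
    print(f"status code:{res.status_code}, reason: {res.reason}")
    >>>
    status code:404, reason: NOT FOUND

    url = "http://127.0.0.1:8090/demo/5"
    res = requests.get(url)
    print(f"status code:{res.status_code}, reason: {res.reason}")
    >>>
    status code:500, reason: INTERNAL SERVER ERROR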
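The five status-code classes and the standard reason phrases can also be looked up with the standard library alone. The sketch below uses http.HTTPStatus; describe_status is a hypothetical helper name. The reason a server actually sends may differ in wording or case (the Werkzeug backend above sends "NOT FOUND"), so treat the phrase as the standard default only.

```python
from http import HTTPStatus

# First digit of the status code -> class of response
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return (class name, standard reason phrase) for an HTTP status code."""
    category = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = ""  # not a registered status code
    return category, phrase


for code in (200, 404, 500):
    print(code, describe_status(code))
```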

Viewing headers and cookies

When calling APIs that require login, you may need the cookies and certain special header fields obtained after authentication, so you read them from the headers and cookies of the response.

    import requests

    url = "http://127.0.0.1:8090/demo/5"
    res = requests.get(url)
    print(f"header: {res.headers}")
    print(f"cookies: {res.cookies}")
    >>>
    header: {"Server": "Werkzeug/2.3.6 Python/3.9.6", "Date": "Tue, 13 Jun 2023 13:27:13 GMT", "Content-Type": "application/json", "Content-Length": "85", "Connection": "close"}
    cookies: <RequestsCookieJar[]>

Exception handling

Network requests can fail in many ways, especially HTTP requests that sit in front of complex backend APIs. Catching errors is therefore particularly important, and handling exceptions sensibly greatly improves the robustness of the code. requests provides many exception classes, including the following:

    class RequestException(IOError):
        pass

    class InvalidJSONError(RequestException):
        pass

    class JSONDecodeError(InvalidJSONError, CompatJSONDecodeError):
        pass

    class HTTPError(RequestException):
        pass

    class ConnectionError(RequestException):
        pass

    class ProxyError(ConnectionError):
        pass

    class SSLError(ConnectionError):
        pass

    class Timeout(RequestException):
        pass

    class ConnectTimeout(ConnectionError, Timeout):
        pass

    class ReadTimeout(Timeout):
        pass

    class URLRequired(RequestException):
        pass

    class TooManyRedirects(RequestException):
        pass

    class MissingSchema(RequestException, ValueError):
        pass

    class InvalidSchema(RequestException, ValueError):
        pass

    class InvalidURL(RequestException, ValueError):
        pass

    class InvalidHeader(RequestException, ValueError):
        pass

    class InvalidProxyURL(InvalidURL):
        pass

    class ChunkedEncodingError(RequestException):
        pass

    class ContentDecodingError(RequestException, BaseHTTPError):
        pass

    class StreamConsumedError(RequestException, TypeError):
        pass

    class RetryError(RequestException):
        pass

    class UnrewindableBodyError(RequestException):
        pass

The most commonly used ones are explained below.

Uncaught exceptions

If an exception is not caught, it eventually makes the program exit abnormally.

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    res = requests.get(url)
    >>>
    Traceback (most recent call last):
      File "/Users/ljk/Documents/python_env/dev/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
        conn = connection.create_connection(
      File "/Users/ljk/Documents/python_env/dev/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
        raise err
      File "/Users/ljk/Documents/python_env/dev/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 61] Connection refused

RequestException

RequestException catches every exception raised by a requests call; it is the coarsest-grained exception class.

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    try:
        res = requests.get(url)
    except requests.exceptions.RequestException as e:
        print("something error:")
        print(e)
    else:
        print(f"status code:{res.status_code}, ok: {res.ok}")
    finally:
        print("request end")
    >>>
    something error:
    HTTPConnectionPool(host="127.0.0.1", port=8090): Max retries exceeded with url: /demo/10 (Caused by NewConnectionError(": Failed to establish a new connection: [Errno 61] Connection refused"))
    request end

ConnectionError

ConnectionError catches network-related errors in a request, such as an unreachable network or a refused connection. Here it is used to catch a refused connection.

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    try:
        res = requests.get(url, timeout=1)
    except requests.exceptions.ConnectionError as e:
        print("something error:")
        print(e)
    else:
        print(f"status code:{res.status_code}, ok: {res.ok}")
    finally:
        print("request end")
    >>>
    something error:
    HTTPConnectionPool(host="127.0.0.1", port=8090): Max retries exceeded with url: /demo/10 (Caused by NewConnectionError(": Failed to establish a new connection: [Errno 61] Connection refused"))
    request end

ConnectTimeout

A refused connection means the peer was reached but refused to connect, whereas ConnectTimeout means the connection to the peer server could not be established before the timeout expired.

    import requests

    url = "http://www.facebook.com"
    try:
        res = requests.get(url, timeout=10)
    except requests.exceptions.ConnectTimeout as e:
        print("something error:")
        print(e)
    else:
        print(f"status code:{res.status_code}, ok: {res.ok}")
    finally:
        print("request end")
    >>>
    something error:
    HTTPConnectionPool(host="www.facebook.com", port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, "Connection to www.facebook.com timed out. (connect timeout=10)"))
    request end

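ReadTimeout

ReadTimeout means the connection to the peer server was established but the response did not arrive in time. Sleeping for 5s inside the endpoint artificially creates a read timeout.

    # Server-side handler (Flask); DemoTable and json_response are helpers from the author's project
    import time

    from flasgger import swag_from
    from flask.views import MethodView

    class Demo(MethodView):
        @swag_from("./apidocs/get.yml")
        def get(self, demo_id):
            """Get a single demo record"""
            # Querying the database directly also works; wrapping it in a function allows caching
            time.sleep(5)  # simulate a slow endpoint
            demo = DemoTable.get_by_demo_id(demo_id)
            return json_response(data=demo.to_dict())

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    try:
        res = requests.get(url, timeout=1)
    except requests.exceptions.ReadTimeout as e:
        print("something error:")
        print(e)
    else:
        print(f"status code:{res.status_code}, ok: {res.ok}")
    finally:
        print("request end")
    >>>
    something error:
    HTTPConnectionPool(host="127.0.0.1", port=8090): Read timed out. (read timeout=1)
    request end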
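If the Flask backend is not available, a read timeout can be reproduced with the standard library alone. The following is a sketch under assumptions: SlowHandler is a made-up stand-in endpoint, and the 1.0s delay and 0.2s read timeout are arbitrary values chosen so that the client gives up first.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Stand-in endpoint that answers more slowly than the client will wait."""

    def do_GET(self):
        time.sleep(1.0)  # longer than the 0.2s read timeout used below
        body = b'{"msg": "late"}'
        try:
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client has already given up and closed the socket

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local test

try:
    # (connect timeout, read timeout): connecting is fast, reading is not
    session.get(f"http://127.0.0.1:{server.server_port}/demo/10", timeout=(1, 0.2))
    outcome = "response received"
except requests.exceptions.ReadTimeout:
    outcome = "ReadTimeout"
except requests.exceptions.ConnectTimeout:
    outcome = "ConnectTimeout"
finally:
    session.close()
    server.shutdown()
    server.server_close()

print(outcome)
```

Because the TCP connection itself succeeds immediately, the failure is classified as a read timeout rather than a connect timeout.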

Handling API-level errors

Errors returned by the API itself never raise an exception in requests. A 404, 500 or 502, for example, does not raise on its own; it is reported through the status code instead.

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    try:
        res = requests.get(url, timeout=10)
    except requests.exceptions.RequestException as e:
        print("something error:")
        print(e)
    else:
        print(f"status code:{res.status_code}, ok: {res.ok}")
    finally:
        print("request end")
    >>>
    status code:404, ok: False
    request end
    status code:502, ok: False
    request end

As you can see, even the broadest exception class does not catch API-level errors, so they have to be detected through status_code. There are many status codes; if you do not want to write a long chain of if/else checks, call response.raise_for_status() to raise an exception instead. raise_for_status() works much like an assert: it raises HTTPError when the status code is a 4xx or 5xx error.

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    res = requests.get(url, timeout=5)
    res.raise_for_status()
    print(res.json())
    >>>
    Traceback (most recent call last):
      File "/Users/ljk/Documents/code/daily_dev/requests_demo/method_demo.py", line 166, in <module>
        res.raise_for_status()
      File "/Users/ljk/Documents/python_env/dev/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://127.0.0.1:8090/demo/10
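In practice raise_for_status() is usually paired with an except clause for HTTPError. The sketch below wraps that pattern in a hypothetical summarize helper that takes any Response object; the failed response is available on the exception as e.response.

```python
import requests


def summarize(res):
    """Turn a Response into a short status string using raise_for_status()."""
    try:
        res.raise_for_status()  # raises HTTPError only for 4xx and 5xx
    except requests.exceptions.HTTPError as e:
        # e.response is the Response that triggered the error
        return f"failed: {e.response.status_code} {e.response.reason}"
    return f"ok: {res.status_code}"
```

A typical call would be summarize(requests.get(url, timeout=5)). Note that a 3xx response does not raise, so this is not a strict "is it 200" check.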

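Ways to improve request efficiency

Multithreading

Inefficient requests: when there is a large number of requests to make, looping over them one by one with a for loop is very inefficient. The most time-consuming part of network I/O is waiting for the response, and a for loop runs sequentially: the next request can only start after the previous one has returned, so most of the time is wasted waiting on the network.

Multithreaded optimization: using multiple threads can significantly improve efficiency and reduce the total request time. The principle is that a Python thread waiting on network I/O releases the GIL, so when many request threads are running, a thread that blocks on the network yields the CPU to the others instead of holding everyone up. As a result, a large number of requests can be in flight at the same time. Comparison of for-loop requests and multithreaded requests: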

    import threading
    import time

    import requests

    # for loop
    start = time.time()
    for i in range(10):
        res = requests.get("https://www.csdn.net/", timeout=3)
    end = time.time()
    print(f"total time: {end-start}")

    # multithreading
    def get_request():
        res = requests.get("https://www.csdn.net/", timeout=3)

    start = time.time()
    t_list = []
    for i in range(10):
        t = threading.Thread(target=get_request)
        t_list.append(t)
        t.start()
    for t in t_list:
        t.join()
    end = time.time()
    print(f"total time: {end-start}")
    >>>
    total time: 6.254332065582275
    total time: 0.740969181060791

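The multithreaded version takes close to one tenth of the time of the for loop, cutting the overall request time by roughly an order of magnitude. When more than about 10 threads are needed, a thread pool is recommended, since it effectively reduces the cost of creating threads.

    from concurrent.futures import ThreadPoolExecutor

    import requests

    def get_request():
        res = requests.get("https://www.csdn.net/", timeout=3)

    with ThreadPoolExecutor(max_workers=2) as pool:
        for i in range(10):
            pool.submit(get_request)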
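The pool example above discards the results. When the responses (or the failures) matter, one approach is to keep the futures and collect them as they complete. In this sketch fetch_all and fake_fetch are hypothetical names; fake_fetch simulates network latency with time.sleep so the example runs offline, and in real code you would pass something like lambda u: requests.get(u, timeout=3).status_code instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=5):
    """Run fetch(url) for every url in a thread pool; return (results, errors) dicts."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, u): u for u in urls}
        for fut in as_completed(futures):
            u = futures[fut]
            try:
                results[u] = fut.result()  # re-raises any exception from fetch
            except Exception as exc:
                errors[u] = exc
    return results, errors


def fake_fetch(url):
    """Stand-in for a real request: sleeps briefly, fails for URLs ending in /bad."""
    time.sleep(0.1)
    if url.endswith("/bad"):
        raise TimeoutError("simulated timeout")
    return 200


urls = [f"http://example.invalid/{i}" for i in range(5)] + ["http://example.invalid/bad"]
results, errors = fetch_all(urls, fake_fetch, max_workers=6)
print(len(results), "succeeded;", len(errors), "failed")
```

Catching exceptions per future keeps one failed request from hiding the results of the others.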

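Reusing TCP connections

Every call to a module-level requests function (such as requests.get) opens a new TCP connection between the local machine and the target server and then sends the HTTP request over it. When a large number of requests are made, opening too many TCP connections not only slows the code down but also puts network read/write pressure on the target server. Using a session lets multiple requests share a TCP connection, which in high-volume scenarios effectively reduces both the run time and the load on the target server. Using a session is simple; the only extra step is to instantiate a session object, as shown below:

    import requests

    # Create a session object; its connection pool keeps TCP connections open for reuse
    s = requests.Session()
    for i in range(100):
        res = s.get(f"https://www.target.com/{i}")
        print(res.text)

    # Alternative usage: the with block closes the session when done
    with requests.Session() as s:
        s.get("https://httpbin.org/get")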

Timing comparison between plain requests and requests that reuse TCP connections:

    import threading
    import time

    import requests

    # Plain requests: each call opens its own connection
    def get_request():
        res = requests.get("https://www.csdn.net/", timeout=3)

    start = time.time()
    t_list = []
    for i in range(10):
        t = threading.Thread(target=get_request)
        t_list.append(t)
        t.start()
    for t in t_list:
        t.join()
    end = time.time()
    print(f"total time: {end-start}")

    # Reusing TCP connections through a shared session
    def get_request_session(s):
        res = s.get("https://www.csdn.net/", timeout=3)

    start = time.time()
    t_list = []
    with requests.Session() as s:
        for i in range(10):
            t = threading.Thread(target=get_request_session, args=(s,))
            t_list.append(t)
            t.start()
        for t in t_list:
            t.join()
    end = time.time()
    print(f"total time: {end-start}")
    >>>
    total time: 0.9967081546783447
    total time: 0.7688210010528564

As the numbers show, reusing TCP connections speeds things up further.
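Connection reuse can also be observed directly instead of being inferred from timings. The sketch below is an illustration under assumptions: it starts a throwaway local HTTP/1.1 server (PortRecorder is a made-up name) that records the client port of every request, so requests arriving over the same TCP connection show the same port.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests

client_ports = []  # one entry per request: the client's source port


class PortRecorder(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # needed for keep-alive connections

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), PortRecorder)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# One shared session: sequential requests go over the same connection
with requests.Session() as s:
    s.trust_env = False  # ignore proxy environment variables for this local test
    for _ in range(3):
        s.get(url, timeout=5)
session_connections = len(set(client_ports))

# A fresh session per request, which mirrors what requests.get() does internally
client_ports.clear()
for _ in range(3):
    with requests.Session() as one_shot:
        one_shot.trust_env = False
        one_shot.get(url, timeout=5)
separate_connections = len(set(client_ports))

server.shutdown()
server.server_close()
print(f"shared session: {session_connections} connection(s)")
print(f"one session per request: {separate_connections} connection(s)")
```

The shared session should report a single connection, while the per-request sessions each open their own.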

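Retry mechanism

A request that times out is usually retried a few times, and retry logic is typically implemented with a counter. You might write code like this:

    import requests

    url = "http://127.0.0.1:8090/demo/10"
    i = 0
    while i < 3:
        try:
            res = requests.get(url, timeout=5)
            break
        except requests.exceptions.Timeout:
            i += 1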

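In fact, requests already provides retry support. It offers transport adapters, through which features such as retries and heartbeat checks can be implemented. Retry mechanism: whenever a Session is initialized, default adapters are attached to it, one for HTTP and one for HTTPS. requests lets users create and use their own transport adapters to implement whatever special behavior they need. Example:

    import time

    import requests
    from requests.adapters import HTTPAdapter

    s = requests.Session()
    # Mount adapters on the session; mounting only the one that matches your URL scheme is enough
    s.mount("http://", HTTPAdapter(max_retries=3))
    s.mount("https://", HTTPAdapter(max_retries=3))

    start = time.time()
    try:
        res = s.get("http://www.facebook.com", timeout=5)
        print(res.text)
    except requests.exceptions.Timeout as e:
        print(e)
    end = time.time()
    print(end-start)
    >>>
    HTTPConnectionPool(host="www.facebook.com", port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, "Connection to www.facebook.com timed out. (connect timeout=5)"))
    20.0400869846344

Note: the code above runs for 20s in total and then raises an exception. One normal request plus three retries, each timing out after 5s, adds up to 20s. After the third retry the request still times out, so a Timeout exception is raised and caught.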
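For finer control than a plain retry count, HTTPAdapter also accepts a urllib3 Retry object as max_retries. The sketch below only configures the session and sends nothing; the specific values (3 retries, 0.5 backoff factor, retrying on 502/503/504) are assumptions to adjust for your own service.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry policy: up to 3 retries, with a growing pause between attempts,
# and also retry when the server answers with one of the listed status codes
retry = Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=[502, 503, 504],
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount("http://", adapter)
session.mount("https://", adapter)

# The adapter that will handle a given URL can be looked up for inspection
print(session.get_adapter("https://example.com/").max_retries)
```

By default urllib3 applies status-based retries only to HTTP methods it considers safe to repeat, so POST requests are typically not retried on a 503 unless the policy is widened explicitly.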

Appendix: the core code of requests

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.

        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """

        try:
            conn = self.get_connection(request.url, proxies)
        except LocationValueError as e:
            raise InvalidURL(e, request=request)

        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )

        chunked = not (request.body is None or "Content-Length" in request.headers)

        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)

        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )

        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)

        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)

            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)

            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)

            if isinstance(e.reason, _SSLError):
                # This branch is for urllib3 v1.22 and later.
                raise SSLError(e, request=request)

            raise ConnectionError(e, request=request)

        except ClosedPoolError as e:
            raise ConnectionError(e, request=request)

        except _ProxyError as e:
            raise ProxyError(e)

        except (_SSLError, _HTTPError) as e:
            if isinstance(e, _SSLError):
                # This branch is for urllib3 versions earlier than v1.22
                raise SSLError(e, request=request)
            elif isinstance(e, ReadTimeoutError):
                raise ReadTimeout(e, request=request)
            elif isinstance(e, _InvalidHeader):
                raise InvalidHeader(e, request=request)
            else:
                raise

        return self.build_response(request, resp)
