Fetching the 驾校一点通 (Jiaxiao Yidiantong) question bank with python3, requests, and multiple processes
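The approach:

1. Use the browser's developer tools to find the URL that serves the questions.
2. Analyze that URL: index is the question id, and the r value in the sample URL appears to be just a random number, so one can be generated. The full list of URLs can be produced with range or by filling a column down in Excel.
3. Analyze the fields of the returned JSON.

I split the work into two scripts, one that fetches the data and one that converts it to Excel. This post covers the first one: fetching the data with multiple processes.

Development environment: python3.9.1 / windows10 / vscode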
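The URL analysis above can be sketched as a small generator for the URL list. This is a minimal sketch under stated assumptions, not the author's script: build_urls and write_url_file are hypothetical helper names, the index range is a placeholder, and it is assumed (not confirmed by the post) that the server accepts any random float for r. The URLs are written without the http:// prefix because the fetch script prepends it.

```python
import random


def build_urls(first_index, last_index):
    """Build one scheme-less question URL per index (both ends inclusive)."""
    urls = []
    for index in range(first_index, last_index + 1):
        # Assumption: r is only a cache-busting random number in [0, 1)
        r = random.random()
        urls.append('mnks.jxedt.com/get_question?r=%.16f&index=%d' % (r, index))
    return urls


def write_url_file(path, urls):
    """Write the URLs one per line, the layout the fetch script reads."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(urls) + '\n')
```

For example, write_url_file('D:/YYFX/ip.txt', build_urls(1, 100)) would produce a file for the first 100 ids; the real upper bound has to be taken from the site.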
# coding: utf-8
import json
from concurrent.futures import ProcessPoolExecutor

import requests

# Fetch the data by URL, for example:
# url = 'http://mnks.jxedt.com/get_question?r=0.5376675619396274&index=3'

# Simulate a browser header
hea = {'User-Agent':'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'}


def get_page(url):
    # Runs in a worker process. The requests response also has a built-in
    # .json(), but the raw text is returned here and decoded in the callback.
    response = requests.get('http://%s' % url, headers=hea, timeout=30, verify=False)
    # Convert the bytes to a string
    return response.content.decode('utf-8')


def read_data(future, *args, **kwargs):
    # Done-callback: runs in the parent process once a request has finished
    response = future.result()
    state = json.loads(response)
    print(state)
    # product = state["question"] + '\n'
    with open('%s.json' % 'data', 'a', encoding='utf-8') as f:
        # ensure_ascii=False keeps the Chinese text readable instead of \u escapes
        f.write(json.dumps(state, ensure_ascii=False) + '\n')


def main():
    # ip.txt holds one URL per line without the scheme; get_page() prepends http://
    with open('D:/YYFX/ip.txt', 'r') as f:
        urls_list = [line.replace('\n', '') for line in f]
    # Process pool. It is created inside main() so that the worker processes,
    # which re-import this module on Windows, do not build pools of their own.
    # Leaving the with block calls pool.shutdown(wait=True).
    with ProcessPoolExecutor(20) as pool:
        for url in urls_list:
            done = pool.submit(get_page, url)
            done.add_done_callback(read_data)


if __name__ == '__main__':
    main()
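One weak spot in the read_data callback is that future.result() re-raises whatever went wrong in the worker (a timeout, a refused connection), and json.loads raises on a non-JSON body, so that question is lost with only a logged traceback. A more tolerant callback might look like the sketch below. The names parse_result and save_result are hypothetical, and skipping failed requests silently is an assumption about the desired behaviour, not something the original script does.

```python
import json
from concurrent.futures import Future


def parse_result(future):
    """Return the decoded JSON from a finished future, or None on any failure."""
    try:
        text = future.result()   # re-raises an exception raised in the worker
        return json.loads(text)  # raises ValueError on a non-JSON body
    except Exception as exc:
        print('skipped one result:', exc)
        return None


def save_result(future, path='data.json'):
    """Done-callback: append one JSON line for each successful response."""
    state = parse_result(future)
    if state is None:
        return
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps(state, ensure_ascii=False) + '\n')
```

It would be attached the same way as before, with done.add_done_callback(save_result).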
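The second script, the conversion to Excel, is not shown in this post. As a stand-in, the sketch below converts data.json (one JSON object per line, as written by the fetch script) into a CSV file that Excel can open. Note the swap: it uses the standard csv module rather than a real .xlsx writer, json_lines_to_csv is a hypothetical name, and it assumes every line is a flat JSON object. The only field name known from the script is "question", so the columns are taken from whatever keys appear in the data.

```python
import csv
import json


def json_lines_to_csv(json_path, csv_path):
    """Convert a file with one JSON object per line into a CSV table."""
    rows = []
    with open(json_path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:
                rows.append(json.loads(line))

    # Use the union of all keys as columns, in first-seen order
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    # utf-8-sig writes a BOM so that Excel detects the encoding of Chinese text
    with open(csv_path, 'w', encoding='utf-8-sig', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=columns, restval='')
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling json_lines_to_csv('data.json', 'data.csv') returns the number of rows written; records that lack a key get an empty cell in that column.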