Using a UA Pool and an IP Proxy Pool
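UA pool: User-Agent pool
- Purpose: disguise the requests in a Scrapy project as coming from as many different browser types as possible.
- Workflow:
    1. Intercept requests in a downloader middleware.
    2. Replace the User-Agent in each intercepted request's headers with a spoofed value.
    3. Enable the downloader middleware in the settings file.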
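Step 2 depends on how the header is written. The sketch below uses a plain dict as a stand-in for `request.headers` (an assumption made so the example runs without Scrapy) and a shortened, illustrative pool named `USER_AGENTS`. It shows that `setdefault()` leaves an existing User-Agent untouched, while direct assignment replaces it. This matters if another middleware has already set the header before yours runs.

```python
import random

# Illustrative, shortened pool; any list of User-Agent strings would do.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 "
    "(KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 "
    "(KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5",
]

# A plain dict standing in for request.headers, with a placeholder
# default value that some earlier component has already written.
headers = {"User-Agent": "Scrapy/x.y (+https://scrapy.org)"}

# setdefault() only writes when the key is missing, so nothing changes here.
headers.setdefault("User-Agent", random.choice(USER_AGENTS))
after_setdefault = headers["User-Agent"]

# Direct assignment always replaces the current value.
headers["User-Agent"] = random.choice(USER_AGENTS)
after_assignment = headers["User-Agent"]

print(after_setdefault)
print(after_assignment)
```

If the header is guaranteed to be unset when the middleware runs, both forms behave the same.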
Code:
    # Imports (the old scrapy.contrib path has been removed from current Scrapy releases)
    import random
    from scrapy.downloadermiddlewares.useragent import UserAgentMiddleware

    # UA pool: a dedicated downloader middleware class for the UA pool
    class RandomUserAgent(UserAgentMiddleware):
        def process_request(self, request, spider):
            # Pick a random UA value from the list
            ua = random.choice(user_agent_list)
            # Write the UA into the intercepted request. Direct assignment is used
            # because setdefault() would leave an already-set User-Agent unchanged.
            request.headers['User-Agent'] = ua

    user_agent_list = [
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 "
        "(KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1",
        "Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 "
        "(KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 "
        "(KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 "
        "(KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.1 "
        "(KHTML, like Gecko) Chrome/19.77.34.5 Safari/537.1",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 "
        "(KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/536.5 "
        "(KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 "
        "(KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.24 "
        "(KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 "
        "(KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24",
    ]

Proxy pool
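- Purpose: send the requests in a Scrapy project from as many different IP addresses as possible.
- Workflow:
    1. Intercept requests in a downloader middleware.
    2. Route each intercepted request through one of the proxy IPs.
    3. Enable the downloader middleware in the settings file.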
Code:
    import random

    class Proxy(object):
        def process_request(self, request, spider):
            # Check the scheme of the intercepted request's URL (http or https).
            # request.url looks like: http://www.xxx.com
            h = request.url.split(':')[0]  # scheme of the request
            if h == 'https':
                ip = random.choice(PROXY_https)
                request.meta['proxy'] = 'https://' + ip
            else:
                ip = random.choice(PROXY_http)
                request.meta['proxy'] = 'http://' + ip

    # Candidate proxy IPs
    PROXY_http = [
        '153.180.102.104:80',
        '195.208.131.189:56055',
    ]
    PROXY_https = [
        '120.83.49.90:9000',
        '95.189.112.214:35508',
    ]
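Step 3 of both workflows, enabling the middlewares in the settings file, is not shown in the code. A minimal sketch of that fragment follows. It assumes the two classes live in a module named `myproject.middlewares` (a hypothetical path; use your own project name), and the priority numbers are illustrative choices, not values from the original post.

```python
# settings.py (fragment)
# "myproject.middlewares" is a hypothetical module path for this sketch.
DOWNLOADER_MIDDLEWARES = {
    # Optional: disable the built-in User-Agent middleware so that it
    # does not write the header before the custom one runs.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    # Lower numbers run earlier in process_request; these values are illustrative.
    "myproject.middlewares.RandomUserAgent": 543,
    "myproject.middlewares.Proxy": 544,
}
```

Mapping a middleware to `None` disables it. The custom entries only take effect once their dotted paths match where the classes are actually defined.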
Reposted from: https://www.cnblogs.com/awfj/p/11114322.html