Python Scrapy: passing command-line arguments and sending POST requests with a payload
import scrapy
from selenium import webdriver

# Assumption: the original defines its Chrome options elsewhere; a default set is used here.
chrome_options = webdriver.ChromeOptions()


class SciencedirectspiderSpider(scrapy.Spider):
    name = 'sciencedirectspider'
    allowed_domains = ['sciencedirect.com']
    start_urls = ['https://www.sciencedirect.com/search?qs=kidney%20stone']

    # Arguments passed on the command line with -a arrive here as keyword arguments.
    def __init__(self, year='', search='', **kwargs):
        super().__init__(**kwargs)
        self.year = year
        self.search = search
        self.urls = 'https://www.sciencedirect.com/search?qs=' + search + '&years=' + year + '&sortBy=date'
        self.browser = webdriver.Chrome(options=chrome_options)

    def start_requests(self):
        # //*[@id="srp-pagination"]/li[1]/text()[4]
        # self.page is the parsing callback; it is not shown in this excerpt.
        yield scrapy.Request(self.urls, callback=self.page, meta={'url': self.urls})
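The spider above builds its search URL by plain string concatenation, so a value such as "kidney stone" goes into the URL with a raw space. As a minimal sketch, assuming you would rather encode the values yourself, the standard library can do it. The helper name build_search_url and the constant BASE_URL are hypothetical and not part of the original spider.

```python
from urllib.parse import quote, urlencode

# Hypothetical constant: the search endpoint used by the spider above.
BASE_URL = "https://www.sciencedirect.com/search"


def build_search_url(search: str, year: str) -> str:
    """Return the search URL with percent-encoded query values."""
    params = {"qs": search, "years": year, "sortBy": "date"}
    # quote_via=quote encodes a space as %20; the default quote_plus would use '+'.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


print(build_search_url("kidney stone", "2019"))
```

Inside __init__, self.urls could then be set from build_search_url(search, year) instead of the concatenation.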
Run command: scrapy crawl sciencedirectspider --nolog -a "search=kidney stone" -a "year=2019"
Note: each argument needs its own -a flag.
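Each -a NAME=VALUE pair ends up as a keyword argument of the spider's __init__, and the value always arrives as a string. The sketch below imitates that conversion in plain Python to make the behaviour visible. The helper spider_args_to_kwargs is hypothetical and is not Scrapy's own code.

```python
def spider_args_to_kwargs(arglist):
    """Turn ['search=kidney stone', 'year=2019'] into a kwargs dict."""
    kwargs = {}
    for item in arglist:
        if "=" not in item:
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        # Split on the first '=' only, so values may themselves contain '='.
        name, value = item.split("=", 1)
        kwargs[name] = value
    return kwargs


kwargs = spider_args_to_kwargs(["search=kidney stone", "year=2019"])
print(kwargs)
print(type(kwargs["year"]).__name__)
```

Because year is a string here, a spider that needs a number has to convert it itself, for example with int(year).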
Launching from a main script:
from scrapy.cmdline import execute

# execute(['scrapy', 'crawl', 'sciencedirectspider', '--nolog'])  # --nolog suppresses log output
execute(['scrapy', 'crawl', 'sciencedirectspider', '--nolog', '-a', 'search=kidney stone', '-a', 'year=2019'])  # --nolog suppresses log output

POST request with a payload
import json

import scrapy


class IeeexplorespiderSpider(scrapy.Spider):
    name = 'ieeexplorespider'
    allowed_domains = ['ieeexplore.ieee.org']
    start_urls = ['http://ieeexplore.ieee.org/']
    headers = {
        "Content-Type": "application/json",
        "Host": "ieeexplore.ieee.org",
        "Origin": "https://ieeexplore.ieee.org",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36"
    }

    def start_requests(self):
        # url = "https://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=Security%20Analytics"
        url = "https://ieeexplore.ieee.org/rest/search"
        data = {
            "highlight": True,
            "matchPubs": True,
            "newsearch": True,
            "pageNumber": "1",
            "queryText": "Security Analytics",
            "returnFacets": ["ALL"],
            "returnType": "SEARCH"
        }
        # The payload is sent as a JSON string in the request body, not as form data.
        yield scrapy.Request(url=url, body=json.dumps(data), method='POST', callback=self.parse, headers=self.headers)

    def parse(self, response):
        print(123)  # debug marker showing that the callback ran
        print(response.text)
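The request body in this spider is a JSON document, which is why the dictionary goes through json.dumps and the Content-Type header is application/json. The sketch below shows what that serialization produces. The helper build_search_body and its page_number parameter are hypothetical; they assume the pageNumber field from the payload above is what drives paging.

```python
import json


def build_search_body(query_text: str, page_number: int = 1) -> str:
    """Serialize the search payload to the JSON string used as the request body."""
    payload = {
        "highlight": True,
        "matchPubs": True,
        "newsearch": True,
        "pageNumber": str(page_number),  # the original payload sends the page as a string
        "queryText": query_text,
        "returnFacets": ["ALL"],
        "returnType": "SEARCH",
    }
    # json.dumps turns Python True into JSON true and keeps the list as a JSON array.
    return json.dumps(payload)


body = build_search_body("Security Analytics", page_number=2)
print(body)
```

The resulting string can be passed as body= to scrapy.Request in the same way as json.dumps(data) above.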