Scraping Dangdang's Best-Selling Book Information

Published: 2023/12/31 | Programming Q&A | 豆豆


Runtime environment: Python 3.6.0

Goal: a practice exercise that scrapes the entries on Dangdang's popular-books ranking and saves them to a local file.

import requests
import re
import threading  # only used by the commented-out threaded variant at the bottom
import json  # only used by the commented-out JSON output in save_info

base_url = 'http://bang.dangdang.com/books/fivestars/01.00.00.00.00.00-recent30-0-0-1-'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/49.0.2623.22 Safari/537.36 SE 2.X MetaSr 1.0'
}


def get_page(page):
    """
    Fetch one page of the Dangdang ranking.
    :param page: page number
    :return: page HTML, or None if the connection fails
    """
    try:
        url = base_url + str(page)
        response = requests.get(url=url, headers=headers)
        return response.text
    except requests.ConnectionError as e:
        print('Error', e.args)
        return None


def pase_info(item):
    """
    Extract the book information.
    :param item: page HTML
    :return: list of tuples, one per book, with six captured fields
    """
    # Each fragment matches one <div> inside a book's <li> element.
    # The rank is rendered as "1.", so the trailing dot is escaped.
    list_num = r'<div class="list_num.*?">(.*?)\.</div>.*?'
    pic = '<div class="pic">.*?<a.*?>.*?<img src="(.*?)" alt=.*?>.*?</a>.*?</div>.*?'
    title = '<div class="name">.*?<a href=.*?title="(.*?)">.*?</a>.*?</div>.*?'
    biaosheng = '<div class="biaosheng">(.*?)<span>(.*?)</span></div>.*?'
    price = '<div class="price">.*?<p>.*?<span.*?class="price_n">(.*?)</span>.*?</p>.*?</div>.*?'
    # re.S lets "." match newlines, since each <li> spans several lines.
    pattern = re.compile(
        '<li>.*?{}{}{}{}{}.*?</li>'.format(list_num, pic, title, biaosheng, price), re.S)
    items = re.findall(pattern, item)
    return items


def save_info(book):
    """
    Append one record to a local file.
    :param book: book information
    :return: None
    """
    with open('當當圖書.txt', 'a+', encoding='utf-8') as f:
        # f.write(json.dumps(book, ensure_ascii=False))
        f.write(str(book))
        f.write('\n')


def main(each):
    """
    Scrape and store one full page.
    :param each: page number
    :return: None
    """
    response = get_page(each)
    if response is None:
        # get_page has already printed the error; skip this page
        return
    book_info = pase_info(response)
    for book in book_info:
        info = {
            'num': book[0],
            'pic': book[1],
            'price': book[5],
            'biaosheng': book[3] + book[4],
            'title': book[2],
        }
        save_info(info)


if __name__ == '__main__':
    MIN_PAGE = 1
    MAX_PAGE = 25
    for each in range(MIN_PAGE, MAX_PAGE + 1):
        print('Page %s' % each)
        # th = threading.Thread(target=main, args=[each])
        # th.start()
        main(each)
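The commented-out json.dumps line in save_info suggests a second storage format. Writing str(book) stores a Python dict repr, which is awkward to load again later; one JSON object per line is easier to read back. A minimal sketch of that idea follows. The names save_info_jsonl and load_info_jsonl, the file name books.jsonl, and the sample record are all made up for illustration and are not part of the original script.

```python
import json
import os
import tempfile


def save_info_jsonl(book, path):
    """Append one book record to `path` as a single JSON line."""
    with open(path, 'a', encoding='utf-8') as f:
        # ensure_ascii=False keeps Chinese titles readable in the file.
        f.write(json.dumps(book, ensure_ascii=False) + '\n')


def load_info_jsonl(path):
    """Read every record back as a list of dicts."""
    with open(path, encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up sample record, for illustration only.
sample = {'num': '1', 'title': '示例图书', 'price': '¥10.00'}
path = os.path.join(tempfile.mkdtemp(), 'books.jsonl')
save_info_jsonl(sample, path)
records = load_info_jsonl(path)
```

Each line of the file can then be parsed on its own, so a partially written file is still usable up to the last complete line.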

Run result:

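PS: I originally wanted to use multithreading, but the threads finish in no fixed order, so the printed output came out unordered and so did the records written to the file.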
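One possible way around the ordering problem, sketched here as an assumption rather than as what the original script does, is to download pages in a thread pool and write the results afterwards from the main thread. Executor.map in the standard concurrent.futures module returns results in the order of its inputs, no matter which worker finishes first. The function fetch_page below is a hypothetical stand-in for the real download-and-parse step; it only sleeps to simulate pages completing out of order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(page):
    """Hypothetical stand-in for downloading and parsing one page."""
    # Later pages sleep less, so they finish before earlier ones.
    time.sleep(0.01 * (5 - page))
    return 'page %d' % page


def crawl_in_order(pages, max_workers=5):
    """Fetch pages concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of the order in which the workers complete.
        return list(pool.map(fetch_page, pages))


results = crawl_in_order(range(1, 6))
for line in results:
    print(line)
```

Because only the main thread prints and writes, the output and the saved file stay in page order while the network requests still overlap.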
