[Crawler] Scraping food inspection results

Published: 2025/7/14 | Programming Q&A | 19 | 豆豆

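# Requirement: scrape all drug inspection record information from the site
# and export it to an Excel file.
import re

import requests
from bs4 import BeautifulSoup


def find_all_a_tag():
    """
    Collect the category page URLs from the index page.
    The URL is hard-coded because this is a one-off script,
    so it is not passed in as a parameter.
    :return: list of category page URLs
    """
    url = r'http://www.nifdc.org.cn/CL0873/'
    html_t = requests.get(url)
    soup = BeautifulSoup(html_t.text, 'lxml')
    # Debugging aid: print the parsed tree to check that the page was fetched correctly
    # res = soup.prettify()
    # print(res)
    a_list = soup.find_all(id='table297')[0].find_all('a')
    path_url = 'http://www.nifdc.org.cn/'
    url_list = []
    # Find each category code (C followed by five word characters) and build the full URL
    for i in a_list:
        # print(i)
        path_url_plus = re.findall(r'C\w{5}', str(i))[0]
        p = path_url + path_url_plus + '/'
        url_list.append(p)
    return url_list


def single_page_get(url):
    """
    Fetch the page at url and return the a tags inside table5.
    :param url: category page URL
    :return: list of a tags
    """
    html_text = requests.get(url)
    soup = BeautifulSoup(html_text.text, 'lxml')
    tag = soup.find_all(id='table5')[0].find_all('a')
    return tag


def get_all_a():
    """
    Return the URLs of all a tags that point to inspection records.
    0. Drop a tags that only start/end with a C category code
    1. If the link starts with http, add it to the list as-is
    2. If the link starts with .., join it onto the category page URL
    :return: list of record URLs
    """
    all_list = []
    tags = find_all_a_tag()
    for tag in tags:
        tag_t = single_page_get(tag)
        # print(tag)
        for i in tag_t:
            # Raw strings keep \w, \d and \. from being read as string escapes
            path_url_plus = re.findall(r'C\w{5}/\d+\.html', str(i))
            path_url_plus_1 = re.findall(r'http.+\.htm', str(i))
            path_url_plus_2 = re.findall(r'attach.+\.htm', str(i))
            if path_url_plus_2:
                fin_a_path2 = ('http://www.nifdc.org.cn/' + path_url_plus_2[0])
                all_list.append(fin_a_path2)
            if path_url_plus_1:
                fin_a_path1 = path_url_plus_1[0]
                all_list.append(fin_a_path1)
            # print(i)
            a_path_url_plus = re.findall(r'\d+\.html', str(path_url_plus))
            if a_path_url_plus:
                fin_a_path3 = (tag + a_path_url_plus[0])
                all_list.append(fin_a_path3)
    return all_list


def get_re():
    # Placeholder for the data-extraction step, not implemented yet
    pass


if __name__ == '__main__':
    all_list = get_all_a()
    # print(len(all_list))
    for i in all_list:
        print(i)

Crawler source code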
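The script above builds absolute URLs by running regular expressions over the string form of each a tag and concatenating prefixes by hand. As a possible alternative, the three cases it handles (absolute http links, site-root paths such as attach/..., and relative ../ links) could be resolved in one step with urllib.parse.urljoin. The sketch below is only an illustration under assumptions: it uses the standard-library HTMLParser so that it runs offline, LinkCollector and absolute_links are hypothetical names, and the sample markup is invented rather than taken from the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def absolute_links(html, page_url):
    """Return every link in html as an absolute URL resolved against page_url."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(page_url, href) for href in collector.hrefs]


# Invented markup for illustration; the real page structure may differ.
sample_html = """
<table id="table5">
  <tr><td><a href="../CL0873/12345.html">record A</a></td></tr>
  <tr><td><a href="http://example.com/report.htm">record B</a></td></tr>
  <tr><td><a href="/attach/0/report.htm">record C</a></td></tr>
</table>
"""

links = absolute_links(sample_html, "http://www.nifdc.org.cn/CL0873/")
for link in links:
    print(link)
```

With Beautiful Soup, the same idea would be to read each tag's href attribute and pass it to urljoin together with the page URL, instead of matching patterns against str(tag).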

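This uses basic web-scraping techniques to collect all of the enterprise-related a tag content; extracting the actual record data from it is the next step. So far the script only finds all of the a tag data.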
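Since the stated goal is to get the results into Excel, one modest next step might be to save the collected links to a file that Excel can open. The sketch below is an assumption-laden illustration, not the author's method: it writes CSV with the standard-library csv module rather than a real .xlsx file, write_links_csv is a hypothetical helper, and the sample URLs are placeholders. It also drops repeated links, because the script can append the same URL more than once when several patterns match one tag.

```python
import csv
import io


def write_links_csv(links, out_file):
    """Write one row per unique link, keeping first-seen order (hypothetical helper)."""
    writer = csv.writer(out_file)
    writer.writerow(["index", "url"])
    # dict.fromkeys removes duplicates while preserving insertion order
    unique_links = list(dict.fromkeys(links))
    for index, url in enumerate(unique_links, start=1):
        writer.writerow([index, url])
    return len(unique_links)


# Placeholder URLs for illustration only.
sample_links = [
    "http://www.nifdc.org.cn/CL0873/12345.html",
    "http://www.nifdc.org.cn/attach/0/report.htm",
    "http://www.nifdc.org.cn/CL0873/12345.html",
]

# An in-memory buffer stands in for a real file so the example has no side effects.
buffer = io.StringIO()
count = write_links_csv(sample_links, buffer)
print(buffer.getvalue())
```

To write to disk instead, the buffer could be replaced with something like open('links.csv', 'w', newline='', encoding='utf-8-sig'); the utf-8-sig encoding adds a byte order mark, which generally helps Excel detect UTF-8.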

Reposted from: https://www.cnblogs.com/pandaboy1123/p/9712656.html
