Scraping the Sina News homepage with Python: the first crawler from *Python Web Crawler in Action*
Install Anaconda first; the bundled Spyder IDE makes it easy to inspect variables after a run.
1. Open a cmd console and install the dependencies (the code below also uses the `lxml` parser, so install it as well):

```
pip install beautifulsoup4
pip install requests
pip install lxml
```
2. Write the code. It is short and direct, and should run as-is with a successful result, as long as Sina's page markup still matches the selectors below. `getNewsDetail` downloads one article page and extracts its fields with CSS selectors:

```python
import requests
from bs4 import BeautifulSoup
from datetime import datetime

def getNewsDetail(newsUrl):
    # Download the article page and decode it as UTF-8
    newsWeb = requests.get(newsUrl)
    newsWeb.encoding = 'utf-8'
    soup = BeautifulSoup(newsWeb.text, 'lxml')

    result = {}
    result['title'] = soup.select('.main-title')[0].text
    result['newsSource'] = soup.select('.source')[0].text
    # e.g. "2018年01月08日 14:23"
    timeSource = soup.select('.date')[0].text
    result['datetime'] = datetime.strptime(timeSource, '%Y年%m月%d日 %H:%M')
    result['article'] = soup.select('.article')[0].text
    # strip() drops the surrounding "責任編輯:" label characters, leaving the editor's name
    result['editor'] = soup.select('.show_author')[0].text.strip('責任編輯:')
    result['comment'] = soup.select('.num')[0].text
    return result
```
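The date string on the article page uses Chinese characters as separators, which `strptime` handles directly when they are embedded as literals in the format string. A minimal offline check, using a made-up sample string in the same layout as Sina's `.date` element:

```python
from datetime import datetime

# Hypothetical date string matching the layout of Sina's .date element
timeSource = '2018年01月08日 14:23'
parsed = datetime.strptime(timeSource, '%Y年%m月%d日 %H:%M')
print(parsed)  # 2018-01-08 14:23:00
```

If the page's date format ever changes, `strptime` raises a `ValueError`, which is a quick way to notice the markup has moved.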
`parseListLinks` calls the roll-list API, strips the JSONP callback wrapper from the response, and then runs `getNewsDetail` on every article URL in the list:

```python
import json
import requests

def parseListLinks(url):
    newsDetails = []
    # The API returns JSONP: " newsloadercallback({...});" — strip the wrapper
    request = requests.get(url)
    jsonLoad = json.loads(request.text.lstrip(' newsloadercallback(').rstrip(');'))

    newsUrls = []
    for item in jsonLoad['result']['data']:
        newsUrls.append(item['url'])
    for newsUrl in newsUrls:
        newsDetails.append(getNewsDetail(newsUrl))
    return newsDetails
```
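Note that `lstrip`/`rstrip` remove *characters from a set*, not a prefix or suffix: they only work here because stripping stops at `{` and `}`, which are not in the given character sets. Slicing between the first `(` and the last `)` is a less fragile alternative; a small offline sketch with a made-up payload shaped like the roll API's response:

```python
import json

# Hypothetical JSONP payload in the same shape as the roll API's response
raw = ' newsloadercallback({"result": {"data": [{"url": "http://example.com/a.shtml"}]}});'

# Take everything between the first '(' and the last ')', then parse as JSON
body = raw[raw.index('(') + 1 : raw.rindex(')')]
payload = json.loads(body)
urls = [item['url'] for item in payload['result']['data']]
print(urls)  # ['http://example.com/a.shtml']
```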
The entry point exercises both functions:

```python
if __name__ == '__main__':
    # Fetch the details of a single news page
    newsUrl = 'http://news.sina.com.cn/s/wh/2018-01-08/doc-ifyqkarr7830426.shtml'
    newsDetail = getNewsDetail(newsUrl)

    # Fetch the details of every news page in the roll list
    rollUrl = ('http://api.roll.news.sina.com.cn/zt_list?channel=news&cat_1=gnxw'
               '&cat_2==gdxw1||=gatxw||=zs-pl||=mtjj&level==1||=2&show_ext=1&show_all=1&'
               'show_num=22&tag=1&format=json&page=23&callback=newsloadercallback&_=1515911333929')
    newsDetails = parseListLinks(rollUrl)
```
Summary

That covers the whole first crawler from *Python Web Crawler in Action*: fetching a single Sina News article, then every article in the roll list. Hopefully it helps with the problem you were trying to solve.