Python: How do I iterate over HTML element objects using LXML/Requests?
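I am trying to build a data table from a website using LXML and Requests. I need both the text inside the tags and the text contained in the tags' attributes. Here is the HTML:
HelenaHelena ValleyEast HelenaHelena ValleyHelenaHelena Valley
Based on this, I want to create a table like the following: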
^{pr2}$
Using Requests and LXML, I tried iterating over div class="houses" to get what I need, but every time I print the values, I get the following:
['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']
['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']
['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']
Here is part of my code:
import requests
from lxml import html

link = "example.com"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(link, headers=headers, allow_redirects=False)
sourceCode = response.content
htmlElem = html.document_fromstring(sourceCode)
houses = htmlElem.find_class('houses')
for house in houses:
    houseNumber = house.xpath('//input[@class="houseNumber"]/@value')
    houseState = house.xpath('//input[@class="houseState"]/@value')
    houseStatus = house.xpath('//input[@class="houseStatus"]/@value')
How can I capture the data in the table format shown above? Is there a different way to iterate over the houses object?
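One likely explanation, offered as a sketch rather than a confirmed answer: in lxml, an XPath expression that starts with `//` is evaluated from the document root even when it is called on a sub-element, so every `house` returns the values of all houses. Prefixing the expression with `.` makes it relative to the current element. The markup below is hypothetical because the original HTML was not preserved; in particular the `city` and `area` class names and the `parse_houses` helper are assumptions made for illustration.

```python
from lxml import html

# Hypothetical markup: the question's HTML was lost, so this structure
# (including the "city" and "area" class names) is an assumption.
SAMPLE = """
<html><body>
<div class="houses">
  <input class="houseNumber" value="107">
  <input class="houseState" value="MT">
  <input class="houseStatus" value="Occupied">
  <a class="city">Helena</a><a class="area">Helena Valley</a>
</div>
<div class="houses">
  <input class="houseNumber" value="237">
  <input class="houseState" value="MT">
  <input class="houseStatus" value="Occupied">
  <a class="city">East Helena</a><a class="area">Helena Valley</a>
</div>
<div class="houses">
  <input class="houseNumber" value="104">
  <input class="houseState" value="MT">
  <input class="houseStatus" value="Vacant">
  <a class="city">Helena</a><a class="area">Helena Valley</a>
</div>
</body></html>
"""


def parse_houses(source):
    """Return one row per div.houses, using XPath relative to each div."""
    doc = html.document_fromstring(source)
    rows = []
    for house in doc.find_class('houses'):
        # The leading "." restricts the search to this div's descendants.
        # string(...) yields the first match as text, or '' if nothing matches.
        rows.append([
            house.xpath('string(.//input[@class="houseNumber"]/@value)'),
            house.xpath('string(.//input[@class="houseState"]/@value)'),
            house.xpath('string(.//input[@class="houseStatus"]/@value)'),
            house.xpath('string(.//a[@class="city"])'),
            house.xpath('string(.//a[@class="area"])'),
        ])
    return rows


rows = parse_houses(SAMPLE)
for row in rows:
    print(row)
```

With the real page, `SAMPLE` would be replaced by `response.content`, and the class names would need to match the actual markup.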
Update: @efirvida, I have modified the code to the following:
import requests
from lxml import html

link = "example.com"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(link, headers=headers, allow_redirects=False)
sourceCode = response.content
htmlElem = html.document_fromstring(sourceCode)
houses = htmlElem.find_class('houses')
houseNumber = []
houseState = []
houseStatus = []
for house in houses:
    houseNumber.append(house.xpath('//input[@class="houseNumber"]/@value'))
    print(houseNumber)
    houseState.append(house.xpath('//input[@class="houseState"]/@value'))
    houseStatus.append(house.xpath('//input[@class="houseStatus"]/@value'))
data = map(list, zip(*[houseNumber,houseState,houseStatus]))
When I do this, it prints the following:
[['107', '237', '104']]
[['107', '237', '104'], ['107', '237', '104']]
[['107', '237', '104'], ['107', '237', '104'], ['107', '237', '104']]
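The growing output above is consistent with each loop pass appending the full, document-wide list once more. If the three columns are instead flat lists with one value per house, `zip(*columns)` transposes them into rows. The sketch below uses the values from the question's output and assumes they line up by position; the snake_case variable names are illustrative. Note that in Python 3 `map` returns a lazy iterator, so a list comprehension is used here to get an actual list.

```python
# Flat column lists, one entry per house, taken from the output in the question.
house_numbers = ['107', '237', '104']
house_states = ['MT', 'MT', 'MT']
house_statuses = ['Occupied', 'Occupied', 'Vacant']

# zip(*columns) groups the i-th entry of every column, turning columns into rows.
columns = [house_numbers, house_states, house_statuses]
rows = [list(row) for row in zip(*columns)]

for row in rows:
    print(row)
```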