Parsing XML Documents with Python

Published: 2024/8/23 | python | 豆豆

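Parsing XML in Python mainly relies on the XML library that ships with Python (xml.etree.ElementTree); the third-party lxml library is the second option.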
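As a rough sketch of how the two libraries relate: lxml provides an ElementTree-compatible module, so one common pattern is to try it first and fall back to the standard library. The one-line document below is made up purely for illustration.

```python
# Prefer lxml when it is installed; otherwise use the standard library.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample, only here to show that both libraries accept the same calls.
sample = "<note lang='en'>hello</note>"
root = etree.fromstring(sample)

print(root.tag, root.attrib["lang"], root.text)
```

The rest of this article only needs the standard library, so the fallback branch is enough to follow along.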

XML structure: we start with an XML document that is fairly simple but covers most features.

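<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
  <title>dive into mark</title>
  <subtitle>currently between addictions</subtitle>
  <id>tag:diveintomark.org,2001-07-29:/</id>
  <updated>2009-03-27T21:56:07Z</updated>
  <link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
  <entry>
    <author>
      <name>Mark</name>
      <uri>http://diveintomark.org/</uri>
    </author>
    <title>Dive into history, 2009 edition</title>
    <link rel='alternate' type='text/html'
      href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
    <id>tag:diveintomark.org,2009-03-27:/archives/20090327172042</id>
    <updated>2009-03-27T21:56:07Z</updated>
    <published>2009-03-27T17:20:42Z</published>
    <summary type='html'>Putting an entire chapter on one page sounds
      bloated, but consider this &amp;mdash; my longest chapter so far
      would be 75 printed pages, and it loads in under 5 seconds&amp;hellip;
      On dialup.</summary>
  </entry>
  <entry>
    <author>
      <name>Mark</name>
      <uri>http://diveintomark.org/</uri>
    </author>
    <title>Accessibility is a harsh mistress</title>
    <link rel='alternate' type='text/html'
      href='http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress'/>
    <id>tag:diveintomark.org,2009-03-21:/archives/20090321200928</id>
    <updated>2009-03-22T01:05:37Z</updated>
    <published>2009-03-21T20:09:28Z</published>
    <summary type='html'>The accessibility orthodoxy does not permit people to
      question the value of features that are rarely useful and rarely used.</summary>
  </entry>
  <entry>
    <author>
      <name>Mark</name>
    </author>
    <title>A gentle introduction to video encoding, part 1: container formats</title>
    <link rel='alternate' type='text/html'
      href='http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats'/>
    <id>tag:diveintomark.org,2008-12-18:/archives/20081218155422</id>
    <updated>2009-01-11T19:39:22Z</updated>
    <published>2008-12-18T15:54:22Z</published>
    <summary type='html'>These notes will eventually become part of a
      tech talk on video encoding.</summary>
  </entry>
</feed>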

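Let's take a quick look at the structure of this XML.

The opening <feed> tag defines a namespace: http://www.w3.org/2005/Atom

The <link> element has no text, but it does carry attributes:

<link rel='alternate' type='text/html'
  href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>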

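First, there is a single global root element, feed.

Under the root element are the child elements title, subtitle, id, updated, link, and entry.

Under each entry element are the child elements author, title, link, id, updated, published, category, and summary (call them grandchildren).

Under the author element are the child elements name and uri (great-grandchildren, I suppose, haha).

The structure is quite clear.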
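One way to check this nesting for yourself is to walk the tree recursively and record each element's depth. The sketch below uses a cut-down stand-in for the feed (an assumption, not the full sample), and the helper name outline is made up for this illustration.

```python
import xml.etree.ElementTree as etree

# Cut-down stand-in for feed.xml (an assumption, not the full sample document).
FEED = """<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
  <title>dive into mark</title>
  <entry>
    <author><name>Mark</name></author>
    <title>Dive into history, 2009 edition</title>
  </entry>
</feed>"""


def outline(element, depth=0):
    """Return (depth, local tag name) pairs for element and all its descendants."""
    local_name = element.tag.split('}', 1)[-1]  # drop the {namespace} prefix
    rows = [(depth, local_name)]
    for child in element:  # iterating an element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows


rows = outline(etree.fromstring(FEED))
for depth, name in rows:
    print('  ' * depth + name)
```

Each extra level of indentation in the printout corresponds to one generation: child, grandchild, great-grandchild.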

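Below we use Python to extract, step by step, the content between an element's opening and closing tags, as well as the attributes on the element.

The main calls are:

tree = etree.parse()    parse the XML
root = tree.getroot()   get the root element
root.tag                name of the root element
root.attrib             the element's attributes
root.findall()          find elements

The code follows, with comments and results written inline.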

import xml.etree.ElementTree as etree  # import xml.etree.ElementTree under the name etree

tree = etree.parse('feed.xml')  # parse the XML file
root = tree.getroot()
print(root)

# An element behaves like a list
print(root.tag)
# {http://www.w3.org/2005/Atom}feed
# ElementTree writes XML element names as {namespace}localname

for child in root:
    print(child)
# Only the direct children are shown; children of children are not traversed
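If you do want every level rather than just the direct children, Element.iter() visits the element itself and all of its descendants in document order. A minimal sketch, using a made-up miniature feed without namespaces:

```python
import xml.etree.ElementTree as etree

# Hypothetical miniature feed, used only for this illustration.
doc = etree.fromstring(
    "<feed><title>t</title><entry><author><name>Mark</name></author></entry></feed>"
)

direct = [child.tag for child in doc]       # one level only
everything = [el.tag for el in doc.iter()]  # the root plus every descendant

print(direct)
print(everything)
```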

# Attributes are a dictionary
print(root.attrib)
# {'{http://www.w3.org/XML/1998/namespace}lang': 'en'}

# Note that the link element under feed has attributes
print(root[4].attrib)
# {'href': 'http://diveintomark.org/', 'type': 'text/html', 'rel': 'alternate'}

print(root[3].attrib)
# {}  an empty dictionary, because the updated element has no attributes
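Since attrib is an ordinary dictionary, Element.get() is a convenient shortcut for reading a single attribute, with an optional default when it is absent. The sketch below assumes a trimmed-down feed containing only the updated and link elements; the constant XML_NS is a name chosen here for the built-in xml: namespace.

```python
import xml.etree.ElementTree as etree

# Namespace that the reserved xml: prefix always maps to.
XML_NS = '{http://www.w3.org/XML/1998/namespace}'

# Trimmed-down feed (an assumption): just enough to show attribute lookups.
feed = etree.fromstring(
    "<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>"
    "<updated>2009-03-27T21:56:07Z</updated>"
    "<link rel='alternate' type='text/html' href='http://diveintomark.org/'/>"
    "</feed>"
)

updated, link = feed[0], feed[1]

lang = feed.get(XML_NS + 'lang')      # namespaced attribute on the root
href = link.get('href')               # plain attribute
missing = updated.get('href', 'n/a')  # default returned when the attribute is absent

print(lang, href, missing)
```

Note that the xmlns declaration itself does not show up in attrib; the parser consumes it and folds the namespace into each tag name instead.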

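# Finding elements
entrylist = root.findall('{http://www.w3.org/2005/Atom}entry')
print(entrylist)
# [<Element '{http://www.w3.org/2005/Atom}entry' at 0x...>,
#  <Element '{http://www.w3.org/2005/Atom}entry' at 0x...>,
#  <Element '{http://www.w3.org/2005/Atom}entry' at 0x...>]

print(root.findall('{http://www.w3.org/2005/Atom}author'))
# []  an empty list, because author is not a direct child of feed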

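# Finding child elements
entries = tree.findall('{http://www.w3.org/2005/Atom}entry')  # first find the entry elements
title = entries[0].find('{http://www.w3.org/2005/Atom}title')  # then find the title element
print(title.text)
# Dive into history, 2009 edition

# A leading './/' searches all elements, including children and grandchildren.
# (A bare leading '//' still works on a tree but raises a FutureWarning on current Python.)
all_links = tree.findall('.//{http://www.w3.org/2005/Atom}link')
# a list of every link element in the document, the feed-level link first

print(all_links[0].attrib)  # the attribute dictionary of this link
# {'href': 'http://diveintomark.org/',
#  'type': 'text/html',
#  'rel': 'alternate'}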
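Spelling out the full {http://www.w3.org/2005/Atom} prefix in every query gets tedious. find() and findall() also accept a prefix-to-namespace mapping as a second argument, which keeps the paths short. A small sketch, assuming a shortened feed and a prefix name (atom) that is simply our own choice:

```python
import xml.etree.ElementTree as etree

# Prefix-to-URI map; the prefix name 'atom' is arbitrary.
ATOM = {'atom': 'http://www.w3.org/2005/Atom'}

# Shortened feed (an assumption) with one feed-level link and one entry.
feed = etree.fromstring(
    "<feed xmlns='http://www.w3.org/2005/Atom'>"
    "<link href='http://diveintomark.org/'/>"
    "<entry><title>Dive into history, 2009 edition</title>"
    "<link href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>"
    "</entry>"
    "</feed>"
)

entries = feed.findall('atom:entry', ATOM)              # direct children only
first_title = entries[0].find('atom:title', ATOM).text  # child of the first entry
all_links = feed.findall('.//atom:link', ATOM)          # every level, document order

print(first_title)
print([link.get('href') for link in all_links])
```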

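Those are the basic methods for parsing and searching XML documents with the XML library. Now let's put them to use in a real example.

Back to parsing WeChat XML: WeChat POSTs the user's message to your server, in roughly the following form.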

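<xml>
  <ToUserName><![CDATA[toUser]]></ToUserName>
  <FromUserName><![CDATA[fromUser]]></FromUserName>
  <CreateTime>1348831860</CreateTime>
  <MsgType><![CDATA[text]]></MsgType>
  <Content><![CDATA[this is a test]]></Content>
  <MsgId>1234567890123456</MsgId>
</xml>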

Now let's use the methods introduced above to get the 'this is a test' string out of the Content element.

import xml.etree.ElementTree as etree

weixinxml = etree.parse('weixinpost.xml')  # the POSTed message saved to a file
wroot = weixinxml.getroot()
print(wroot.tag)

for child in wroot:
    print(child.tag)

content = wroot.find('Content')  # look the element up once and reuse the result
if content is not None:
    print(content.text)
else:
    print('Nothing found')

With these few simple steps, the content we want is extracted.
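On a real server the message arrives as the body of the POST request rather than as a file, so it may be handier to parse it straight from the bytes with etree.fromstring(). The sketch below is one possible approach, not WeChat's own code: the helper name parse_message is made up, and the ToUserName, FromUserName and MsgType values in the sample body are placeholders.

```python
import xml.etree.ElementTree as etree

# Sample request body. ToUserName, FromUserName and MsgType hold placeholder values.
POST_BODY = b"""<xml>
<ToUserName><![CDATA[toUser]]></ToUserName>
<FromUserName><![CDATA[fromUser]]></FromUserName>
<CreateTime>1348831860</CreateTime>
<MsgType><![CDATA[text]]></MsgType>
<Content><![CDATA[this is a test]]></Content>
<MsgId>1234567890123456</MsgId>
</xml>"""


def parse_message(body):
    """Map each child tag of the <xml> root to its text."""
    root = etree.fromstring(body)
    return {child.tag: child.text for child in root}


message = parse_message(POST_BODY)
print(message.get('Content', 'Nothing found'))
```

Collecting the fields into a dictionary means the rest of the handler can use message.get(...) and supply its own defaults for fields that a given message type does not include.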
