Extracting XML information with Python and saving it as txt

Published: 2024/3/13

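Contents

Preface
I. XML format exported by CVAT
II. Steps
- 1. Import libraries
- 2. Read the XML file and get all image tags
- 3. Reorganize the data with numpy and save it
- 4. Result
III. Verifying the landmark coordinates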

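Preface

I recently needed to build a facial landmark dataset. After annotating the facial landmarks in CVAT, I exported the data in the CVAT for images 1.1 format, which stores the landmark coordinates in an XML file. I need the same format as the 300W dataset, so I use Python's xml.dom to extract the landmarks for each image and save them to a txt file, then draw them on the image to check that the annotations are correct.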

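I. XML format exported by CVAT

In the exported XML, the main information is under the image tag, and each image tag has a nested points tag, which holds the content we need.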
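To make the structure concrete, here is a minimal sketch of such a file parsed with xml.dom.minidom. The attribute names read here (name on image, points on points) are the ones used by the code in this post; the label attribute, the image size and the coordinates in SAMPLE_XML are made-up values for illustration, and a real CVAT export contains more elements and attributes than shown.

```python
import xml.dom.minidom

# Hypothetical, trimmed-down sample; all values are invented for illustration.
SAMPLE_XML = """<annotations>
  <image id="0" name="indoor_003.png" width="640" height="480">
    <points label="face" points="10.50,20.25;30.00,40.75"/>
  </image>
</annotations>"""

doc = xml.dom.minidom.parseString(SAMPLE_XML)
# The first image element and the points element nested inside it
image = doc.documentElement.getElementsByTagName("image")[0]
points = image.getElementsByTagName("points")[0]

image_name = image.getAttribute("name")
points_attr = points.getAttribute("points")
print(image_name, points_attr)
```

The points attribute is a single string: x and y are separated by a comma, and consecutive points by a semicolon.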

II. Steps

1. Import libraries

import os
import numpy as np
# Use the DOM parser
import xml.dom.minidom

2. Read the XML file and get all image tags

# Collected values: each image name followed by its points string
data = []
# Open the XML document with the minidom parser
DOMTree = xml.dom.minidom.parse("./data/Face_points/annotations.xml")
collection = DOMTree.documentElement
# Get all image elements in the document
images = collection.getElementsByTagName("image")
# Print the details of each image
for image in images:
    print("*****image*****")
    if image.hasAttribute("id"):
        print("id: %s" % image.getAttribute("id"))
    if image.hasAttribute("name"):
        print("name: %s" % image.getAttribute("name"))
        data.append(image.getAttribute("name"))
    if image.hasAttribute("width"):
        print("width: %s" % image.getAttribute("width"))
    if image.hasAttribute("height"):
        print("height: %s" % image.getAttribute("height"))
    points = image.getElementsByTagName('points')
    for point in points:
        if point.hasAttribute("points"):
            print("point: %s" % point.getAttribute("points"))
            data.append(point.getAttribute("points"))
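The loop above appends names and points strings to one flat list, so pairing them up afterwards only works when every image has exactly one points element. A possible sketch that keeps the pairing explicit is shown here. collect_points is a hypothetical helper, not part of CVAT or minidom, and skipping images that do not have exactly one points element is an assumption about how such cases should be handled.

```python
import xml.dom.minidom

def collect_points(xml_text):
    """Return a list of (image name, points string) pairs."""
    doc = xml.dom.minidom.parseString(xml_text)
    pairs = []
    for image in doc.documentElement.getElementsByTagName("image"):
        points = [p for p in image.getElementsByTagName("points")
                  if p.hasAttribute("points")]
        if len(points) != 1:
            # Assumption: ignore unlabeled or multiply-labeled images
            continue
        pairs.append((image.getAttribute("name"),
                      points[0].getAttribute("points")))
    return pairs

# Made-up sample: one annotated image and one image without annotations
sample = (
    '<annotations>'
    '<image id="0" name="a.png" width="4" height="4">'
    '<points points="1,2;3,4"/></image>'
    '<image id="1" name="b.png" width="4" height="4"/>'
    '</annotations>'
)
pairs = collect_points(sample)
print(pairs)
```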

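3. Reorganize the data with numpy and save it

# Each row becomes [image name, points string]; this assumes one points element per image
data = np.array(data).reshape((-1, 2))
for i in range(0, data.shape[0]):
    points_str = data[i][1]
    # "x1,y1;x2,y2;..." becomes "x1,y1,x2,y2,..."
    points_str = points_str.replace(";", ",")
    points_list = points_str.split(',')
    # One row per point: [x, y]
    points = np.array(points_list).reshape(-1, 2)
    file_name = data[i][0]
    # Save next to the image, with .png replaced by .txt
    np.savetxt(os.path.join('./data/Face_points/images/', file_name.replace(".png", ".txt")), points, fmt="%s")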
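The string handling in this step can also be isolated into small functions, which makes it easy to check on a short input. The sketch below uses only the standard library. parse_points and to_txt_lines are hypothetical names, and the "x1,y1;x2,y2" layout is the one implied by the replace, split and reshape calls above. Unlike the code above, it converts the coordinates to float instead of keeping them as strings.

```python
def parse_points(points_str):
    """Turn "x1,y1;x2,y2;..." into a list of (x, y) float tuples."""
    points = []
    for pair in points_str.split(";"):
        x, y = pair.split(",")
        points.append((float(x), float(y)))
    return points

def to_txt_lines(points):
    """Format one "x y" line per point, space-separated."""
    return ["%s %s" % (x, y) for x, y in points]

pts = parse_points("10.5,20.25;30.0,40.75")
lines = to_txt_lines(pts)
print(lines)
```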

4. Result

III. Verifying the landmark coordinates

import cv2

image = cv2.imread('./images/indoor_003.png', 1)  # 1: color, 0: grayscale
# Read the matching landmark file
with open('./images/indoor_003.txt') as file_obj:
    contents = file_obj.readlines()

landmarks = []
for line in contents:
    TT = line.strip("\n")  # strip() removes the given characters (whitespace by default) from both ends of the string
    TT_temp = TT.split(" ")
    x = float(TT_temp[0])
    y = float(TT_temp[1].strip("\r"))  # \r: carriage return
    landmarks.append((x, y))
print(landmarks, len(landmarks))

# Draw the landmarks on the image
'''
cv2.circle(image, center_coordinates, radius, color, thickness)
image: the image to draw the circle on.
center_coordinates: the center of the circle, as an (x, y) tuple.
radius: the radius of the circle.
color: the color of the circle outline, passed as a BGR tuple, e.g. (255, 0, 0) is blue.
thickness: the outline thickness in pixels; -1 fills the circle with the given color.
Return value: the image.
cv2.putText(img, text, org, fontFace, fontScale, color, thickness=None, lineType=None, bottomLeftOrigin=None)
The parameters are, in order: image, text to draw, bottom-left corner of the text, font, font scale, color, line thickness
'''
m = 0  # label counter, starts at 0
for point in landmarks:
    cv2.circle(image, (int(point[0]), int(point[1])), 2, (0, 255, 0), -1)  # BGR (0, 255, 0) is green; -1 draws a filled circle
    m += 1
    cv2.putText(image, str(m), (int(point[0]), int(point[1])), cv2.FONT_HERSHEY_SIMPLEX, 0.25, (0, 0, 255), 1)  # number each landmark
cv2.imshow("pointImg", image)
cv2.waitKey()
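Besides the visual check, a quick numeric check can catch grossly wrong annotations before anything is drawn. The sketch below reads landmark lines in the same "x y" format and reports the points that fall outside the image. read_landmarks and out_of_bounds are hypothetical helpers, the sample lines are invented, and in practice the width and height would come from the width and height attributes of the image element in the XML.

```python
def read_landmarks(lines):
    """Parse "x y" lines into (x, y) float tuples."""
    landmarks = []
    for line in lines:
        parts = line.split()  # also drops trailing \r and \n
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        landmarks.append((float(parts[0]), float(parts[1])))
    return landmarks

def out_of_bounds(landmarks, width, height):
    """Return 1-based indices of landmarks outside the image."""
    return [i for i, (x, y) in enumerate(landmarks, start=1)
            if not (0 <= x < width and 0 <= y < height)]

# Invented sample: the second point lies to the right of a 640x480 image
sample_lines = ["10.5 20.25\r\n", "700.0 40.75\n", "\n"]
landmarks = read_landmarks(sample_lines)
bad = out_of_bounds(landmarks, width=640, height=480)
print(bad)
```

The indices are 1-based so that they match the numbers drawn next to each landmark by the cv2.putText call.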
