Python lxml (objectify): XPath troubles
Problem description:
I am trying to parse an XML document, using lxml objectify and XPath to extract data. Here is a snippet of the file:
<?xml version="1.0" encoding="UTF-8"?>
<Assets>
  <asset name="Adham">
    <pos>
      <x>27913.769923</x>
      <y>5174.627773</y>
    </pos>
    <general>
      <description>Ba bla bla</description>
      <bar>(null)</bar>
    </general>
  </asset>
  <asset name="Adrian">
    <pos>
      <x>-179.477707</x>
      <y>5286.959359</y>
    </pos>
    <general>
      <commodities/>
      <description>test test test</description>
      <bar>more bla</bar>
    </general>
  </asset>
</Assets>
I have the following method:
def getALLattributesX(self, _root):
    '''Uses getattributeX and parses through the attribute dict, assigning
    values as it goes. _root is the main document root'''
    for k in self.attrib:
        self.getattributeX(_root, self.attribPaths[k], k)
...which calls this method:
def getattributeX(self, node, x_path, _attrib):
    '''Gets a value from an xml node indicated by an xpath
    and assigns it to the appropriate attribute. If the node does not
    exist it assigns "error"
    '''
    print(node.xpath(x_path)[0].text)
    try:
        self.attrib[_attrib] = node.xpath(x_path)
    except KeyError:
        self.misload = True
    #except AttributeError:
    #    self.attrib[attrib] = "error loading " + attrib
    #    self.misload = True
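The print statement is there for testing. When I execute the first method, it parses through the XML document, successfully stopping at each asset object. I have a dictionary of the variables for it to find, and a complementary dictionary of the paths for it to use, defined as follows: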
class tAssetList:
    alist = {}   # dict of assets
    tlist = []
    tree = None  # XML tree
    root = None  # root elem
    def readXML(self, _filename):
        # Load file
        fileobject = open(_filename, "r")  # read-only
        self.tree = objectify.parse(fileobject)
        self.root = self.tree.getroot()
        for elem in self.root.asset:
            temp_asset = tAsset()
            a_name = elem.get("name")  # get name, which is the key for dict
            temp_asset.getALLattributesX(elem)
            self.alist[a_name] = temp_asset
class tAsset(obs.nxObject):
    def __init__(self):
        self.attrib = {"X_pos" : None, "Y_pos" : None}
        self.attribPaths = {"X_pos" : '/pos/x', "Y_pos" : '/pos/y'}
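However, the XPath doesn't seem to work when I call it on the node (which is an objectify XML node). It just outputs [] if I assign the result directly, and if I try [0].text, it gives an index out of range error.
What is going on here?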
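Answer
/pos/x
and
/pos/y
are absolute XPath expressions, and they select no elements because the provided XML document has no pos top-level element. An expression that starts with / is evaluated from the document root, regardless of which node xpath() is called on.
Try the relative expressions instead, which are evaluated from the asset node you call them on:
pos/x
and
pos/y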
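To make the difference concrete, here is a minimal sketch, assuming lxml is installed and using a trimmed-down copy of the XML from the question. The names XML, positions and absolute_hits are made up for this illustration and are not part of the asker's code.

```python
from lxml import objectify

# Trimmed-down copy of the document from the question (illustrative only).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Assets>
  <asset name="Adham">
    <pos>
      <x>27913.769923</x>
      <y>5174.627773</y>
    </pos>
  </asset>
  <asset name="Adrian">
    <pos>
      <x>-179.477707</x>
      <y>5286.959359</y>
    </pos>
  </asset>
</Assets>"""

root = objectify.fromstring(XML)

positions = {}       # hypothetical container: asset name -> (x text, y text)
absolute_hits = []   # everything the absolute expression selects

for asset in root.asset:
    # Absolute path: evaluated from the document root. The top-level
    # element is Assets, not pos, so nothing is selected.
    absolute_hits.extend(asset.xpath("/pos/x"))

    # Relative paths: evaluated from the current asset element.
    x = asset.xpath("pos/x")[0]
    y = asset.xpath("pos/y")[0]
    positions[asset.get("name")] = (x.text, y.text)

print(absolute_hits)  # empty list: the absolute path matched nothing
print(positions)
```

Note that xpath() always returns a list, so the sketch indexes [0] and stores the .text value rather than the list itself.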
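+1 for correctly distinguishing between absolute and relative expressions. – 2011-03-29 18:43:45
I thought it might have something to do with that, but I wasn't sure of the difference. It works great, thanks! – Biosci3c 2011-03-29 19:51:30
@Biosci3c: You're welcome. – 2011-03-29 21:07:22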