Make lxml.objectify ignore XML namespaces?

Question:

I have to deal with some XML that looks like this:

<ns2:foobarResponse xmlns:ns2="http://api.example.com"> 
    <duration>206</duration> 
    <artist> 
    <tracks>...</tracks> 
    </artist> 
</ns2:foobarResponse>

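I discovered lxml and its objectify module, which lets you traverse an XML document in a Pythonic way, much like a dictionary.
The problem is that it applies the bogus XML namespace every time I try to access an element, like this: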

from lxml import objectify

tree = objectify.fromstring(xml)  # xml holds the document shown above
print(tree.artist)
# ERROR: no such child: {http://api.example.com}artist

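It tries to access <artist> using the parent's namespace, but that tag does not use the ns2 namespace.

Any ideas on how to get around this? Thanks.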

Answer

According to the lxml.objectify documentation, attribute lookups default to the namespace of their parent element.
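To make the default concrete, here is a minimal sketch of the lookup rule. The two-child document and the names `child` and `plain` are made up for illustration and are not from the question; only the namespace URI is reused.

```python
from lxml import objectify

# Hypothetical document: one child shares the parent's namespace,
# the other has no namespace at all.
DOC = (
    '<a:root xmlns:a="http://api.example.com">'
    "<a:child>1</a:child>"
    "<plain>2</plain>"
    "</a:root>"
)

root = objectify.fromstring(DOC)

# Attribute lookup uses the parent's namespace, so this finds <a:child>.
same_ns = root.child

# <plain> is in no namespace, so the same kind of lookup fails.
message = None
try:
    root.plain
except AttributeError as exc:
    message = str(exc)

print(int(same_ns))
print(message)
```

The failing lookup reports the fully qualified name it searched for, which is why the error in the question mentions the parent's namespace.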

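What you would probably want to work is this:

print(tree["{}artist"])

QName syntax like this would work if your child had a non-empty namespace ("{http://foo/}artist", for example). Unfortunately, it looks like the current source code treats an empty namespace as no namespace, so objectify's lookup machinery helpfully replaces the empty namespace with the parent's namespace, and you are out of luck.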

This is either a bug ("{}artist" should work) or an enhancement request for the lxml folks.

For the moment, the best thing to do is probably:

print(tree.xpath("artist"))

It is not clear to me how much of a performance hit you will take by using XPath here, but this definitely works.
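A slightly fuller sketch of the XPath workaround, using the document from the question. Note that `xpath()` returns a list, so the first match has to be picked out by hand; the variable names here are illustrative.

```python
from lxml import objectify

XML = """<ns2:foobarResponse xmlns:ns2="http://api.example.com">
    <duration>206</duration>
    <artist>
        <tracks>...</tracks>
    </artist>
</ns2:foobarResponse>"""

root = objectify.fromstring(XML)

# An unprefixed name in XPath matches elements in no namespace,
# so this finds <artist> even though the root element is namespaced.
matches = root.xpath("artist")

# xpath() returns a list; take the first hit if there is one.
artist = matches[0] if matches else None

# Below <artist>, plain attribute access works again, because its
# children share its (empty) namespace.
tracks_text = artist.tracks.text
print(artist.tag, tracks_text)
```

Once you hold the un-namespaced element, the rest of the tree below it can be traversed with ordinary attribute access.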

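Answer

FYI: note that since lxml 2.3 this works as expected.

From the lxml changelog:

" [...]

2.3 (2011-02-06) Features added

When looking for children, lxml.objectify takes '{}tag' as meaning an empty namespace, as opposed to the parent namespace.

[...]"

In action: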

>>> xml = """<ns2:foobarResponse xmlns:ns2="http://api.example.com"> 
... <duration>206</duration> 
... <artist> 
...  <tracks>...</tracks> 
... </artist> 
... </ns2:foobarResponse>""" 
>>> tree = objectify.fromstring(xml) 
>>> print tree['{}artist'] 
artist = None [ObjectifiedElement] 
    tracks = '...' [StringElement] 
>>>
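The same lookup as a self-contained script, assuming lxml 2.3 or later. This is only a sketch: the `getattr` spelling follows the pattern shown in the objectify documentation for namespaced children, applied here to the empty namespace.

```python
from lxml import objectify

XML = """<ns2:foobarResponse xmlns:ns2="http://api.example.com">
    <duration>206</duration>
    <artist>
        <tracks>...</tracks>
    </artist>
</ns2:foobarResponse>"""

root = objectify.fromstring(XML)

# "{}artist" asks for a child named "artist" in the empty namespace
# instead of the parent's namespace (requires lxml >= 2.3).
artist = root["{}artist"]

# getattr() accepts the same "{}tag" spelling.
duration = getattr(root, "{}duration")

print(artist.tag)
print(int(duration))
```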
