Using Python to extract XML information and output it as a list
Problem description:
I want to extract data from this XML document and have the output be a single list.
For example:
['10-Yard Fight (USA, Europe)', '1942 (Japan, USA)', .......]
I can only figure out how to make many separate lists.
For example:
['10-Yard Fight (USA, Europe)']
['1942 (Japan, USA)']
[.......]
XML sample:
<?xml version="1.0"?>
<menu>
<header>
<listname>Nintendo Entertainment System</listname>
<id>003</id>
<lastlistupdate>10/16/2014</lastlistupdate>
<listversion>1.1 Final</listversion>
<manufacturer>Nintendo</manufacturer>
<media>
<artwork></artwork>
<video></video>
</media>
<exporterversion>HyperList XML Exporter Version 1.3 Copywrite (c) 2009-2011 William Strong</exporterversion>
</header>
<game name="10-Yard Fight (USA, Europe)" index="true" image="1" id="0034232">
<description>10-Yard Fight (USA, Europe)</description>
<cloneof></cloneof>
<crc>3D564757</crc>
<manufacturer>Nintendo</manufacturer>
<year>1985</year>
<genre>Football/Sports</genre>
<rating>HSRS - GA (General Audience)</rating>
<enabled>Yes</enabled>
</game>
<game name="1942 (Japan, USA)" index="" image="">
<description>1942 (Japan, USA)</description>
<cloneof></cloneof>
<crc>171251E3</crc>
<manufacturer>Capcom</manufacturer>
<year>1986</year>
<genre>Shoot-'Em-Up</genre>
<rating>HSRS - GA (General Audience)</rating>
<enabled>Yes</enabled>
</game>
<game name="1943 - The Battle of Midway (USA)" index="" image="">
<description>1943 - The Battle of Midway (USA)</description>
<cloneof></cloneof>
<crc>12C6D5C7</crc>
<manufacturer>Capcom</manufacturer>
<year>1988</year>
<genre>Shoot-'Em-Up</genre>
<rating>HSRS - GA (General Audience)</rating>
<enabled>Yes</enabled>
</game>
</menu>
My sample Python code:
from xml.dom import minidom

def databaseGameExtraction(xml):
    xmldoc = minidom.parse(xml)
    games = xmldoc.getElementsByTagName('game')
    for game in games:
        romKey = game.attributes['name']
        roms = [romKey.value]
        print(roms)
    return roms

databaseGameExtraction('Nintendo Entertainment System.xml')
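Also, I would like the value 'Nintendo Entertainment System' to be returned as well.
In a perfect world, when called from another function, the function would return the ROMs as a list and the system name as a list.
Thanks,
- A very novice coder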
Answer
I think you need:
roms = []
for game in games:
    romKey = game.attributes['name']
    roms.append(romKey.value)
print("all roms:", roms)
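For what it's worth, the original code produces separate one-element lists because roms = [romKey.value] rebinds roms to a brand-new list on every pass, so only the last game survives to the return. A minimal sketch of the difference follows; names is a stand-in for the parsed attribute values, and rebinding and appending are hypothetical helper names, none of which appear in the original code.

```python
# Stand-in data; in the real function these come from the 'name' attributes.
names = ['10-Yard Fight (USA, Europe)', '1942 (Japan, USA)']

def rebinding(names):
    roms = []
    for name in names:
        roms = [name]      # replaces the whole list on each iteration
    return roms

def appending(names):
    roms = []
    for name in names:
        roms.append(name)  # grows one list across iterations
    return roms

print(rebinding(names))  # ['1942 (Japan, USA)']
print(appending(names))  # ['10-Yard Fight (USA, Europe)', '1942 (Japan, USA)']
```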
Answer
You need to build the roms list iteratively from the XML:
roms = []
for game in games:
    rom_key = game.attributes['name']
    roms.append(rom_key.value)
Or, better, write it as a list comprehension:
roms = [game.attributes['name'].value for game in games]
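As a quick check that the loop and the comprehension agree, here is a small sketch that parses an inline XML string instead of a file. SAMPLE is a trimmed-down stand-in for the real document, and getAttribute is used as an alternative to attributes['name'].value; both calls are part of xml.dom.minidom.

```python
from xml.dom import minidom

# Trimmed-down stand-in for the real XML file (an assumption for illustration).
SAMPLE = """<?xml version="1.0"?>
<menu>
  <game name="10-Yard Fight (USA, Europe)" index="true" image="1"/>
  <game name="1942 (Japan, USA)" index="" image=""/>
</menu>"""

xmldoc = minidom.parseString(SAMPLE)
games = xmldoc.getElementsByTagName('game')

# Loop version: append each attribute value to one list.
loop_roms = []
for game in games:
    loop_roms.append(game.attributes['name'].value)

# Comprehension version, using getAttribute instead of attributes[...].value.
comp_roms = [game.getAttribute('name') for game in games]

print(comp_roms == loop_roms)
```

One difference worth knowing: getAttribute returns an empty string when the attribute is missing, whereas attributes['name'] raises KeyError.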
You can also extract "Nintendo Entertainment System" using:
xmldoc.getElementsByTagName('listname')[0].firstChild.data
This leaves us with:
from xml.dom import minidom

def databaseGameExtraction(xml):
    xmldoc = minidom.parse(xml)
    roms = [game.attributes['name'].value
            for game in xmldoc.getElementsByTagName('game')]
    compagny = xmldoc.getElementsByTagName('listname')[0].childNodes[0].data
    return roms, compagny

roms, compagny = databaseGameExtraction('Nintendo Entertainment System.xml')
print(compagny)
print(roms)
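One caveat with childNodes[0].data: an empty element such as <listname></listname> has no child nodes, so the index lookup raises IndexError. A hedged sketch of a more defensive variant follows. extract_games and SAMPLE are hypothetical names for illustration, and the function takes the XML as a string so it can be run without the file on disk.

```python
from xml.dom import minidom

# Trimmed-down stand-in for the real XML file (an assumption for illustration).
SAMPLE = """<?xml version="1.0"?>
<menu>
  <header>
    <listname>Nintendo Entertainment System</listname>
  </header>
  <game name="10-Yard Fight (USA, Europe)" index="true" image="1"/>
  <game name="1942 (Japan, USA)" index="" image=""/>
</menu>"""

def extract_games(xml_text):
    """Return (list of game names, system name or None)."""
    xmldoc = minidom.parseString(xml_text)
    roms = [game.getAttribute('name')
            for game in xmldoc.getElementsByTagName('game')]
    # Guard against a missing or empty <listname> element.
    system = None
    listnames = xmldoc.getElementsByTagName('listname')
    if listnames and listnames[0].firstChild is not None:
        system = listnames[0].firstChild.data
    return roms, system

roms, system = extract_games(SAMPLE)
print(system)
print(roms)
```

Returning None for the system name is just one possible choice; raising a descriptive error may suit callers that always expect a header.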
Thanks, this is part of it. I couldn't get the list comprehension sample code to work, but I got the result I wanted.
@BenElder Whoops. I forgot to use '.value' to extract the string from the 'minidom.Attr' object. Fixed now.
This works too, thanks.