Reading an XML file with ElementTree

Problem description:

<root> 
    <Group>  
    <ChapterNo>1</ChapterNo>  
    <ChapterName>A</ChapterName>  
    <Line>1</Line>  
    <Content>zfsdfsdf</Content>  
    <Synonyms>fdgd</Synonyms>  
    <Translation>assdfsdfsdf</Translation>  
    </Group>  
    <Group>  
    <ChapterNo>1</ChapterNo>  
    <ChapterName>A</ChapterName>  
    <Line>2</Line>  
    <Content>ertreter</Content>  
    <Synonyms>retreter</Synonyms>  
    <Translation>erterte</Translation>  
    </Group>  
    <Group>  
    <ChapterNo>2</ChapterNo>  
    <ChapterName>B</ChapterName>  
    <Line>1</Line>  
    <Content>sadsafs</Content> 
    <Synonyms>sdfsdfsd</Synonyms> 
    <Translation>sdfsdfsd</Translation> 
    </Group> 
    <Group> 
    <ChapterNo>2</ChapterNo> 
    <ChapterName>B</ChapterName> 
    <Line>2</Line> 
    <Content>retete</Content> 
    <Synonyms>retertret</Synonyms> 
    <Translation>retertert</Translation> 
    </Group> 
</root>

I tried this:

from xml.etree import ElementTree

root = ElementTree.parse('data.xml').getroot()
ChapterNo = root.find('ChapterNo').text 
ChapterName = root.find('ChapterName').text 
GitaLine = root.find('Line').text 
Content = root.find('Content').text 
Synonyms = root.find('Synonyms').text 
Translation = root.find('Translation').text

But it raises an error:

ChapterNo=root.find('ChapterNo').text 
AttributeError: 'NoneType' object has no attribute 'text'

Now I want to get every ChapterNo, ChapterName, etc. separately using ElementTree, and I want to insert that data into a database. Can anyone help me?

Regards,

Nimmy

I tried: root = ElementTree.parse('data.xml').getroot() ChapterNo = root.find('ChapterNo').text ChapterName = root.find('ChapterName').text GitaLine = root.find('Line').text Content = root.find('Content').text Synonyms = root.find('Synonyms').text Translation = root.find('Translation').text, but it shows the error "ChapterNo = root.find('ChapterNo').text AttributeError: 'NoneType' object has no attribute 'text'" – Nimmy 2011-02-01 10:02:19

Add it to your question; it's hard to read in comments. – 2011-02-01 10:03:11

Answer

ChapterNo is not a direct child of root, so root.find('ChapterNo') returns None and the .text access fails. You need to use XPath syntax to locate the data.

In addition, ChapterNo, ChapterName, etc. each occur several times, so you should use findall and iterate over the results to get the text of each one.

chapter_nos = [e.text for e in root.findall('.//ChapterNo')]

And so on for the other tags.
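A minimal, self-contained sketch of the difference between the two calls. It assumes a shortened copy of the question's XML embedded as a string; the `XML` constant and the variable names are illustrative and not from the original post.

```python
from xml.etree import ElementTree

# Shortened copy of the question's data; a real file would be loaded with
# ElementTree.parse('data.xml').getroot() instead.
XML = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
        <Line>1</Line>
    </Group>
    <Group>
        <ChapterNo>2</ChapterNo>
        <ChapterName>B</ChapterName>
        <Line>1</Line>
    </Group>
</root>
"""

root = ElementTree.fromstring(XML)

# find() with a bare tag name only looks at direct children of root,
# and those are all <Group> elements, so this returns None.
direct = root.find('ChapterNo')

# './/ChapterNo' searches all descendants; findall() returns every match.
chapter_nos = [e.text for e in root.findall('.//ChapterNo')]
chapter_names = [e.text for e in root.findall('.//ChapterName')]

print(direct)         # None
print(chapter_nos)    # ['1', '2']
print(chapter_names)  # ['A', 'B']
```

Calling `.text` on `direct` here would reproduce the AttributeError from the question.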

Answer

Here is a small example that uses sqlalchemy to define an object which extracts the data and stores it in an sqlite database.

from sqlalchemy import create_engine, Unicode, Integer, Column, UnicodeText 
from sqlalchemy.orm import create_session 
from sqlalchemy.ext.declarative import declarative_base 

engine = create_engine('sqlite:///chapters.sqlite', echo=True) 
Base = declarative_base(bind=engine) 

class ChapterLine(Base): 
    __tablename__ = 'chapterlines' 
    chapter_no = Column(Integer, primary_key=True) 
    chapter_name = Column(Unicode(200)) 
    line = Column(Integer, primary_key=True) 
    content = Column(UnicodeText) 
    synonyms = Column(UnicodeText) 
    translation = Column(UnicodeText) 

    @classmethod 
    def from_xmlgroup(cls, element): 
        l = cls()
        l.chapter_no = int(element.find('ChapterNo').text)
        l.chapter_name = element.find('ChapterName').text
        l.line = int(element.find('Line').text)
        l.content = element.find('Content').text
        l.synonyms = element.find('Synonyms').text
        l.translation = element.find('Translation').text
        return l

Base.metadata.create_all() # creates the table

Here is how to use it:

from xml.etree import ElementTree as etree 

session = create_session(bind=engine, autocommit=False) 
doc = etree.parse('myfile.xml').getroot() 
for group in doc.findall('Group'): 
    l = ChapterLine.from_xmlgroup(group) 
    session.add(l) 

session.commit()

I have tested this code on your XML data; it works fine, and everything is inserted into the database.
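One thing the classmethod relies on is that every Group contains all six child elements. If one is missing, `element.find(...)` returns None and the AttributeError from the question comes back. As a hedged sketch, `findtext` with a default sidesteps that; the helper name `group_to_row` and the `FIELDS` tuple are hypothetical and not part of the answer above.

```python
from xml.etree import ElementTree

# Tag names taken from the question's XML.
FIELDS = ('ChapterNo', 'ChapterName', 'Line',
          'Content', 'Synonyms', 'Translation')


def group_to_row(group):
    """Return a dict of the six fields; missing child elements map to None."""
    # findtext() returns the default instead of raising when the tag is absent.
    return {tag: group.findtext(tag, default=None) for tag in FIELDS}


# A deliberately incomplete <Group> to show the behaviour.
group = ElementTree.fromstring(
    '<Group><ChapterNo>1</ChapterNo><Line>2</Line></Group>'
)
row = group_to_row(group)
print(row['ChapterNo'], row['Line'], row['Translation'])  # 1 2 None
```

Whether a missing field should become None or be rejected outright depends on the database schema, so treat the default as a choice to revisit.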

Answer

To parse your simple two-level data structure and assemble a dictionary for each Group, all you need to do is this:

>>> # what you did to get `root` 
>>> from pprint import pprint as pp 
>>> for group in root: 
...  d = {} 
...  for elem in group: 
...   d[elem.tag] = elem.text 
...  pp(d) # or whack it into a database 
... 
{'ChapterName': 'A', 
'ChapterNo': '1', 
'Content': 'zfsdfsdf', 
'Line': '1', 
'Synonyms': 'fdgd', 
'Translation': 'assdfsdfsdf'} 
{'ChapterName': 'A', 
'ChapterNo': '1', 
'Content': 'ertreter', 
'Line': '2', 
'Synonyms': 'retreter', 
'Translation': 'erterte'} 
{'ChapterName': 'B', 
'ChapterNo': '2', 
'Content': 'sadsafs', 
'Line': '1', 
'Synonyms': 'sdfsdfsd', 
'Translation': 'sdfsdfsd'} 
{'ChapterName': 'B', 
'ChapterNo': '2', 
'Content': 'retete', 
'Line': '2', 
'Synonyms': 'retertret', 
'Translation': 'retertert'} 
>>>

Look, Ma, no XPath!
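To "whack it into a database" without any third-party library, the per-group dictionaries can be handed to the standard-library sqlite3 module. This is only a sketch: the table name `chapterlines`, the column layout, and the in-memory database are assumptions made for illustration, and the XML is a shortened copy of the question's data.

```python
import sqlite3
from xml.etree import ElementTree

XML = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo><ChapterName>A</ChapterName><Line>1</Line>
        <Content>zfsdfsdf</Content><Synonyms>fdgd</Synonyms>
        <Translation>assdfsdfsdf</Translation>
    </Group>
    <Group>
        <ChapterNo>1</ChapterNo><ChapterName>A</ChapterName><Line>2</Line>
        <Content>ertreter</Content><Synonyms>retreter</Synonyms>
        <Translation>erterte</Translation>
    </Group>
</root>
"""

root = ElementTree.fromstring(XML)

# One dict per <Group>, keyed by child tag name, as in the loop above.
rows = [{elem.tag: elem.text for elem in group} for group in root]

conn = sqlite3.connect(':memory:')  # assumption: use a file path for real data
conn.execute(
    """CREATE TABLE chapterlines (
           chapter_no INTEGER, chapter_name TEXT, line INTEGER,
           content TEXT, synonyms TEXT, translation TEXT,
           PRIMARY KEY (chapter_no, line))"""
)

# Named placeholders (:ChapterNo, ...) are filled from each dict's keys.
conn.executemany(
    """INSERT INTO chapterlines
       VALUES (:ChapterNo, :ChapterName, :Line,
               :Content, :Synonyms, :Translation)""",
    rows,
)
conn.commit()

stored = conn.execute(
    'SELECT chapter_no, line, content FROM chapterlines '
    'ORDER BY chapter_no, line'
).fetchall()
print(stored)
```

Because the dictionary keys match the placeholder names, no per-field extraction code is needed; a Group with a missing tag would make the insert fail, which may or may not be the behaviour you want.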
