Reading an XML file with Java

Question:

Before reading an XML file with Java, is it necessary to know the file's structure and tags completely?

areaElement.getElementsByTagName("checked").item(0).getTextContent()

I don't know the field name "checked" before I read the file. Is there any way to list all the tags in the XML file, basically the file structure?

You might find something useful here: http://stackoverflow.com/questions/12255529/how-to-extract-xml-tag-value-without-using-the-tag-name-in-java – gowtham

Answer

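I put together this DOM parser myself; it uses recursion to walk your XML without needing to know the individual tags. It gives you the text content of each node (where present), in document order. You can uncomment the commented-out part of the code below to get node names as well. Hope it helps.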

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RecDOMP {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        DocumentBuilder db = dbf.newDocumentBuilder();

        // Replace the following path with your input XML path
        File inputFile = new File("D:\\ambuj\\ATT\\apip\\APIP_New.xml");
        // Replace the following path with your output file path
        File outputFile = new File("D:\\ambuj\\ATT\\apip\\outapip1.txt");

        // FileOutputStream creates the output file if it does not exist, so no
        // explicit createNewFile() is needed. try-with-resources closes both
        // streams even if parsing fails. UTF-8 is chosen here explicitly instead
        // of the platform default encoding.
        try (InputStream in = new FileInputStream(inputFile);
             BufferedWriter bwriter = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(outputFile), StandardCharsets.UTF_8))) {
            Document doc = db.parse(in);
            visitRecursively(doc, bwriter);
        }

        System.out.println("Done");
    }

    // Walks the tree depth-first and writes every non-blank text node, in document order
    public static void visitRecursively(Node node, BufferedWriter bw) throws IOException {
        NodeList list = node.getChildNodes();
        for (int i = 0; i < list.getLength(); i++) {
            Node childNode = list.item(i);
            if (childNode.getNodeType() == Node.TEXT_NODE) {
                // Uncomment to also print the name of the enclosing element
                // (a text node's own name is always "#text"):
                // System.out.println("Found Node: " + childNode.getParentNode().getNodeName()
                //     + " - with value: " + childNode.getNodeValue()
                //     + " Node type:" + childNode.getNodeType());

                // trim() drops surrounding whitespace and newlines but keeps
                // the spaces inside the text itself
                String nodeValue = childNode.getNodeValue().trim();
                if (!nodeValue.isEmpty()) {
                    System.out.println(nodeValue);
                    bw.write(nodeValue);
                    bw.newLine();
                }
            }
            visitRecursively(childNode, bw);
        }
    }
}

Thank you for your answer – asjr

Answer

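You should definitely check out a library such as dom4j (http://dom4j.sourceforge.net/). Libraries like this parse the whole XML document and let you not only list things like elements, but also run XPath queries and other cool stuff on it.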
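To illustrate the "list elements with an XPath query" idea without pulling in an extra dependency, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than dom4j itself. The class name XPathElementLister, the method elementNames, and the sample document are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathElementLister {

    // Returns the name of every element in the document,
    // in the order the XPath engine hands them back.
    static List<String> elementNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//*" matches every element, so no tag name has to be known in advance
        NodeList nodes = (NodeList) xpath.evaluate("//*", doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, made up for this illustration
        String xml = "<area><name>Hall</name><checked>true</checked></area>";
        System.out.println(elementNames(xml));
    }
}
```

Once the names are known, a narrower expression such as "//checked" can be passed to the same evaluate call to fetch just those elements.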

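There is a performance hit, especially with large XML documents, so you will want to check the performance for your use case before committing to a library. This is especially true if you only need to pull a small piece out of the XML document (and you already know what you are looking for).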
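If the cost of loading the whole document is a concern, one alternative that this answer does not mention is a streaming parser. The sketch below uses the JDK's StAX API (javax.xml.stream) to collect the distinct tag names while reading the input once, without building a tree. The class name StreamingTagLister and the sample document are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StreamingTagLister {

    // Collects distinct element names in first-seen order without building a tree.
    static Set<String> distinctElementNames(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        Set<String> names = new LinkedHashSet<>();
        try {
            while (reader.hasNext()) {
                // Only start tags matter here; text, comments, etc. are skipped
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, made up for this illustration
        String xml = "<areas><area><checked>true</checked></area>"
                + "<area><checked>false</checked></area></areas>";
        System.out.println(distinctElementNames(xml));
    }
}
```

Whether this is actually faster for a given file is something to measure rather than assume; the point is only that memory use does not grow with the size of the document tree.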

Answer

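The answer to your question is no: there is no need to know any element names in advance. You can, for example, walk the tree to discover the element names. But it all depends on what you actually want to do.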
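Walking the tree to discover element names could look roughly like the following sketch, which records the path of every element from the root down. The class name ElementPathWalker, the path format, and the sample document are assumptions chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementPathWalker {

    // Returns one "/root/child/..." path per element, in document order.
    static List<String> elementPaths(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> paths = new ArrayList<>();
        collect(doc.getDocumentElement(), "", paths);
        return paths;
    }

    // Depth-first walk: record this element, then recurse into child elements only.
    private static void collect(Element element, String parentPath, List<String> paths) {
        String path = parentPath + "/" + element.getNodeName();
        paths.add(path);
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, path, paths);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, made up for this illustration
        String xml = "<area><name>Hall</name><checked>true</checked></area>";
        for (String path : elementPaths(xml)) {
            System.out.println(path);
        }
    }
}
```

Repeated elements produce repeated paths; collecting them into a set instead of a list would give just the distinct structure.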

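By the way, for the vast majority of applications the Java DOM is one of the worst ways to solve the problem. But I won't comment further without knowing your project requirements.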
