Working with XML in Java: a one-article roundup

1. Introduction to XML handling in Java

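Working with XML in Java comes down to reading and writing: a document has to be read before anything in it can be searched or replaced. There are many ways to do both, and this article covers the common ones: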

Reading XML: SAX, DOM, JDOM, dom4j

Writing XML: DOM, JDOM, dom4j

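Here is a quick overview of each; the code comes later.

SAX: the Simple API for XML. It ships with the JDK, so no extra jar is needed. It streams through the document instead of building a tree in memory, which makes parsing fast and is a real advantage for large XML documents.

DOM: the JDK's implementation of the W3C DOM standard. It can both parse and write XML, but in practice it is used mostly for writing; for parsing, SAX is the recommended choice.

JDOM: a third-party library. Add jdom.jar to the classpath and it is ready to use.

dom4j: the "4j" is read "for j", not "four j", meaning DOM for Java. It is also a third-party library, so its jar has to be added as well.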

2. Reading XML with the JDK's own W3C DOM implementation (DOM)

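The JDK's implementation of the W3C DOM standard is simply called DOM. It is used mostly for writing XML, and parsing is usually left to SAX, but DOM can parse as well. The code follows.

The XML to parse is a file named msg.xml with this content: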

<?xml version="1.0" encoding="UTF-8"?>
<message>
	<creator>zhao</creator>
	<noticeId>20141213</noticeId>
	<time>2014-12-13 19:10:30</time>
	<dept>59000</dept>
	<content>
		<optionId effect="true">1001000</optionId>
		<optionName>请列举你的至少3项意见</optionName>
	</content>
</message>

Suppose the requirement is to get the text of the optionName tag. How is that done with DOM? The following Java code does it:

package zhao;
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
/**
 * Reads XML with the JDK's own W3C DOM implementation. Reading this way is rather verbose.
 * @author zhao
 */
public class XmlReader1 {
	public static void main(String[] args) throws Exception {
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		DocumentBuilder db = dbf.newDocumentBuilder();
		Document document = db.parse(new File(System.getProperty("user.dir")+File.separator+"msg.xml"));
		// Find the <content> element, then the <optionName> element inside it
		NodeList contentElList = document.getElementsByTagName("content");
		Element contentEl = (Element)contentElList.item(0);
		// The text is the first child node of <optionName>
		String textContent = contentEl.getElementsByTagName("optionName").item(0).
		getFirstChild().getNodeValue();
		System.out.println(textContent);
	}
}

Running it prints the text of the optionName element:

请列举你的至少3项意见
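
The lookup above reads the text through getFirstChild().getNodeValue(), which works because in W3C DOM the text inside an element is itself a child node. As a minimal sketch of that point, the example below parses a shortened, inlined stand-in for msg.xml and reads an element's text both ways, plus the effect attribute. The class name DomTextSketch, its helper methods, and the inlined string are assumptions made for illustration; they are not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomTextSketch {
    // Shortened, inlined stand-in for msg.xml (an assumption, to keep the sketch self-contained)
    static final String XML =
        "<message><creator>zhao</creator><content>"
        + "<optionId effect=\"true\">1001000</optionId>"
        + "</content></message>";

    // Parses an XML string into a W3C DOM Document
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Text of the first element with the given tag name, or null if there is none
    static String textOf(Document doc, String tagName) {
        Node node = doc.getElementsByTagName(tagName).item(0);
        return node == null ? null : node.getTextContent();
    }

    // Value of an attribute on the first element with the given tag name, or null
    static String attributeOf(Document doc, String tagName, String attribute) {
        Node node = doc.getElementsByTagName(tagName).item(0);
        return node == null ? null : ((Element) node).getAttribute(attribute);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Node optionId = doc.getElementsByTagName("optionId").item(0);
        // The text inside <optionId> is a child node of type TEXT_NODE
        Node first = optionId.getFirstChild();
        System.out.println(first.getNodeType() == Node.TEXT_NODE);
        System.out.println(first.getNodeValue());
        // getTextContent() returns the same text without walking the child nodes by hand
        System.out.println(textOf(doc, "optionId"));
        System.out.println(attributeOf(doc, "optionId", "effect"));
    }
}
```

getTextContent() is usually the more convenient of the two, since it does not fail when an element happens to be empty and therefore has no first child.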

3. Parsing XML with the JDK's built-in SAX

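SAX (Simple API for XML) exists purely for parsing XML, and it is simple to use. It is built on an event model. If that sounds abstract, think about which events can occur while reading XML: the document starts, an opening tag is reached, a closing tag is reached, a run of text is reached. SAX divides things up the same way, only more thoroughly. Let's see how parsing with SAX works.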

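The XML to parse is again msg.xml:

<?xml version="1.0" encoding="UTF-8"?>
<message>
	<creator>zhao</creator>
	<noticeId>20141213</noticeId>
	<time>2014-12-13 19:10:30</time>
	<dept>59000</dept>
	<content>
		<optionId effect="true">1001000</optionId>
		<optionName>请列举你的至少3项意见</optionName>
	</content>
</message>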

This time the requirement is not to find the text inside the optionName tag but to read out the whole document. How is that done with SAX? See the code below:

package zhao;
import java.io.File;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class XmlReader2 {
	public static void main(String[] args) throws Exception {
		SAXParserFactory saxParserFactory=SAXParserFactory.newInstance();
		SAXParser saxParser = saxParserFactory.newSAXParser();
		saxParser.parse(new File(System.getProperty("user.dir")+File.separator+"msg.xml"), new XmlHandler());
	}
}
class XmlHandler extends DefaultHandler{

	// Called once, before any element is read
	@Override
	public void startDocument() throws SAXException {
		System.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
	}

	// Called once, after the whole document has been read
	@Override
	public void endDocument() throws SAXException {
		System.out.print("解析完毕");
	}

	// Called for every opening tag; attributes holds that tag's attributes
	@Override
	public void startElement(String uri, String localName, String qName,
			Attributes attributes) throws SAXException {
		System.out.print("<");
		System.out.print(qName);
		if (attributes!=null) {
			for (int i = 0; i < attributes.getLength(); i++) {
				System.out.print(" "+attributes.getQName(i)+"=\""+attributes.getValue(i)+"\"");
			}
		}
		System.out.print(">");
	}

	// Called for every closing tag
	@Override
	public void endElement(String uri, String localName, String qName)
			throws SAXException {
		System.out.print("</");
		System.out.print(qName);
		// End the line after the root element so the final message starts on its own line
		if (qName.equals("message")) {
			System.out.println(">");
		}else {
			System.out.print(">");
		}
	}

	// Called for text between tags, including whitespace such as line breaks and indentation
	@Override
	public void characters(char[] ch, int start, int length) throws SAXException {
		System.out.print(new String(ch,start,length));
	}
}

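Running it prints the XML declaration, then the document exactly as it appears in msg.xml (line breaks and indentation included, since they arrive through characters), and finally the message printed by endDocument.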

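Summary:

As the example shows, SAX defines the events startDocument, endDocument, startElement, endElement and characters. In the example above, startDocument and endDocument are each called exactly once, since there is only one document and therefore one start and one end. startElement, endElement and characters are called many times; characters delivers the text inside elements.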
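
The same events can also be used to pull out a single value instead of echoing the whole document. The following is a minimal sketch, assuming the XML arrives as a string; the class name SaxTextSketch, the TextCollector handler and the textOf helper are hypothetical names introduced for illustration. One detail worth knowing is that a parser is allowed to deliver one run of text through several characters calls, so the handler appends to a buffer rather than assigning.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTextSketch {
    // Collects the concatenated text of all elements whose qualified name equals target
    static class TextCollector extends DefaultHandler {
        private final String target;
        private final StringBuilder text = new StringBuilder();
        private int depth = 0; // greater than 0 while inside a target element

        TextCollector(String target) {
            this.target = target;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (qName.equals(target)) {
                depth++;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals(target)) {
                depth--;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once for a single text run, so append instead of assigning
            if (depth > 0) {
                text.append(ch, start, length);
            }
        }

        String text() {
            return text.toString();
        }
    }

    // Returns the text found inside elements named elementName, or "" if there are none
    static String textOf(String xml, String elementName) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            TextCollector handler = new TextCollector(elementName);
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.text();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<message><creator>zhao</creator><dept>59000</dept></message>";
        System.out.println(textOf(xml, "dept"));
    }
}
```

Because nothing but the requested text is kept in memory, this style stays cheap even when the document is large.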

4. Parsing an XML file with JDOM

JDOM is a third-party Java library for working with XML. msg.xml serves as the example again; its content is:

<?xml version="1.0" encoding="UTF-8"?>
<message>
	<creator>zhao</creator>
	<noticeId>20141213</noticeId>
	<time>2014-12-13 19:10:30</time>
	<dept>59000</dept>
	<content>
		<optionId effect="true">1001000</optionId>
		<optionName>请列举你的至少3项意见</optionName>
	</content>
</message>

The requirement is the same as in the SAX section: read the full content of msg.xml, this time with JDOM. Here is the code:

package zhao;
import java.io.File;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;

import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
public class XmlReader3 {
	public static void main(String[] args) throws Exception {
		SAXBuilder saxBuilder=new SAXBuilder();
		Document document = saxBuilder.build(new File(System.getProperty("user.dir")+File.separator+"msg.xml"));
		Element rootElement = document.getRootElement();
		if (rootElement!=null) {
			System.out.print("<");
			System.out.print(rootElement.getName());
			System.out.println(">");
			// Direct child elements only
			List<Element> children = rootElement.getChildren();
			readXmlByJdom(children);
			System.out.print("</");
			System.out.print(rootElement.getName());
			System.out.println(">");
		}
	}
	
	private static void readXmlByJdom(List<Element> children){
		for (Iterator<Element> iterator = children.iterator(); iterator.hasNext();) {
			// On the first call, each element is at the level of <creator>
			Element element =  iterator.next();
			String name = element.getName();
			// getChildren() can be empty here: JDOM does not count text nodes as children, although W3C DOM does
			if (element.getChildren().size()!=0) {
				System.out.println("<"+name+">");
				// Recurse into the child elements
				 readXmlByJdom((List<Element>)element.getChildren());
				 System.out.println("</"+name+">");
			}else {
				if (!element.getAttributes().isEmpty()) {
					System.out.print("<"+name);
					for (int i = 0; i < element.getAttributes().size(); i++) {
						Attribute attribute=(Attribute) element.getAttributes().get(i);
						System.out.print(" "+attribute.getName()+"=\""+attribute.getValue()+"\"");
					}
					System.out.print(">");
				}else {
					System.out.print("<"+name+">");
					
				}
				
				// getText() returns an empty string when the element has no text
				System.out.print(element.getText());
				System.out.println("</"+name+">");
			}
			// To fetch a child element by name, for example "creator":
			//Element child = element.getChild("creator");
			//System.out.println(child.getText());
		}
	}
}

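Result: the program prints msg.xml element by element, one line per leaf element, with the effect attribute of optionId preserved.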

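As you can see, matching the SAX example with JDOM takes a little more work: you have to write the recursion that walks the XML yourself. For msg.xml alone, recursion is not strictly necessary, because its structure is known and fixed. I used recursion anyway; this is old code of mine, from around 2014, and I presumably wrote it that way to keep the code general.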

5. Parsing an XML file with dom4j

The file to parse is the same msg.xml used in the previous sections.

The code below parses msg.xml with dom4j:

package zhao;
import java.io.File;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
public class XmlReader4 {
	public static void main(String[] args) throws Exception {
		SAXReader saxReader=new SAXReader();
		Document document = saxReader.read(new File(System.getProperty("user.dir")+File.separator+"msg.xml"));
		Element root = document.getRootElement();
		System.out.println("<"+root.getName()+">");
		// Like JDOM's getChildren(), this iterates over direct child elements only
		Iterator childrenOfRoot = root.elementIterator();
		readByDom4j(childrenOfRoot);
		System.out.println("</"+root.getName()+">");
		
	}
	private static void readByDom4j(Iterator iterator){
		while(iterator.hasNext()){
			Element element = (Element) iterator.next();
			Iterator children = element.elementIterator();
			// As with JDOM, text nodes are not included in the child-element iterator
			if(children.hasNext()){
				System.out.println("<"+element.getName()+">");
				readByDom4j(children);
				System.out.println("</"+element.getName()+">");
			}else {
				List attributes = element.attributes();
				if (!attributes.isEmpty()) {
					System.out.print("<"+element.getName());
					for(Object o:attributes){
						Attribute attribute=(Attribute) o;
						System.out.print(" "+attribute.getName()+"=\""+attribute.getValue()+"\"");
					}
					System.out.print(">");
				}else {
					System.out.print("<"+element.getName()+">");
				}
				System.out.print(element.getText());
				System.out.println("</"+element.getName()+">");
			}
		}
	}
}

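Result: as with the JDOM version, the program prints msg.xml element by element, one line per leaf element, with the effect attribute of optionId preserved.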

6. Parsing an XML string with JDOM

Sometimes the XML is not a file at all. With a web service, for example, the XML may be the return value of a service call, and what we receive is a string. JDOM can parse that too: an XML string is the same thing as the contents of an XML file once the file has been read into memory. The only difference is how the org.jdom.Document object is obtained. For a file it is

Document document = saxBuilder.build(new File(System.getProperty("user.dir")+File.separator+"msg.xml"));

whereas for an XML string it becomes

StringReader sr=new StringReader(str);

InputSource is=new InputSource(sr);

Document document = saxBuilder.build(is);

Everything else stays the same. Here is a short example:

package zhao;

import java.io.StringReader;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.xpath.XPath;
import org.xml.sax.InputSource;
public class XmlReader5 {

	public static void main(String[] args) throws Exception {
		// The string returned by the web service
		String str="<?xml version=\"1.0\" encoding=\"UTF-8\"?>"+
"<message><creator>zhao</creator><noticeId>20141213</noticeId><time>2014-12-13 19:10:30</time><dept>59000</dept><content>"+
			"<optionId effect=\"true\">1001000</optionId>"+
				"<optionName>请列举你的至少3项意见</optionName></content></message>";
		SAXBuilder saxBuilder=new SAXBuilder();
		StringReader sr=new StringReader(str);
		InputSource is=new InputSource(sr);
		Document document = saxBuilder.build(is);
		Element rootElement = document.getRootElement();
		Element child = rootElement.getChild("creator");
		System.out.println(child.getName());
		System.out.println(child.getText());
		
		// XPath test
		System.out.println("---xpath test---");
		XPath xPath=XPath.newInstance("//optionName");
		// "//" searches the whole document, so the document itself is passed as the context
		@SuppressWarnings("unchecked")
		List<Element> selectNodes = xPath.selectNodes(document);
		for (int i = 0; i < selectNodes.size(); i++) {
			System.out.println(selectNodes.get(i).getName());
		}
	}
}

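Note: JDOM can also select elements with XPath; just remember to add the jar it depends on (Jaxen, for the org.jdom.xpath.XPath class in JDOM 1.x) to the classpath.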
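
For comparison, the JDK also ships an XPath API in javax.xml.xpath that needs no extra jar. The following is a minimal sketch of that alternative, not the JDOM approach used above; the class name JdkXPathSketch, the evaluate helper and the shortened inline XML are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class JdkXPathSketch {
    // Shortened, inlined stand-in for the web service response (an assumption)
    static final String XML =
        "<message><creator>zhao</creator><content>"
        + "<optionId effect=\"true\">1001000</optionId>"
        + "</content></message>";

    // Evaluates an XPath expression against an XML string and returns the result as a string.
    // An expression that matches nothing yields the empty string.
    static String evaluate(String xml, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Text of the first <optionId> anywhere in the document
        System.out.println(evaluate(XML, "//optionId"));
        // Value of its effect attribute
        System.out.println(evaluate(XML, "//optionId/@effect"));
        // Text of <creator>, addressed by its full path
        System.out.println(evaluate(XML, "/message/creator"));
    }
}
```

This form parses the string on every call, which is fine for a one-off lookup; for repeated queries it would make more sense to parse once into a DOM Document and evaluate against that.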

7. Parsing an XML string with dom4j

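As the heading says, this section shows how to parse an XML string with dom4j. It works much like the JDOM version: since dom4j can parse XML files, it can parse XML strings too. There is no big difference, so treat this as an example for consolidating what came before.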

package zhao;
import java.util.Iterator;
import javax.xml.parsers.ParserConfigurationException;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
public class XmlReader6 {
	public static void main(String[] args) throws Exception {
		String str="<?xml version=\"1.0\" encoding=\"UTF-8\"?>"+
		"<messages><message><creator>zhao</creator><noticeId>20141213</noticeId>" +
		"<noticeId>20141214</noticeId><time>2014-12-13 19:10:30</time>" +
		"<dept>59000</dept><content>"+
				"<optionId effect=\"true\">1001000</optionId>"+
				"<optionName>请列举你的至少3项意见</optionName></content></message>"+
				"<message><creator>zhao</creator><noticeId>111</noticeId>" +
				"<noticeId>2222</noticeId><time>2014-12-13 19:10:30</time><dept>59000</dept>" +
				"<content>"+
				"<optionId effect=\"true\">1001000</optionId>"+
				"<optionName>请列举你的至少3项意见</optionName></content></message></messages>";
		Document document = DocumentHelper.parseText(str);
		Element root = document.getRootElement();
		// Comparable to JDOM's getChildren("message"), but returns an iterator
		Iterator elementIterator = root.elementIterator("message");
		while(elementIterator.hasNext()){
			Element element = (Element) elementIterator.next();
			Iterator elementIterator2 = element.elementIterator("noticeId");
			while(elementIterator2.hasNext()){
				Element element2 = (Element) elementIterator2.next();
				System.out.println(element2.getName());
				System.out.println(element2.getText());
			}
		}
	}
}

The program finds every noticeId element under each message element and prints its tag name and text. Running it prints:

noticeId
20141213
noticeId
20141214
noticeId
111
noticeId
2222

8. Generating XML with the JDK's built-in DOM

The JDK's W3C DOM implementation can also generate XML. The code first:

package zhao;

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
public class XmlWrite1 {
	public static void main(String[] args) throws Exception {
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		DocumentBuilder db = dbf.newDocumentBuilder();
		Document document = db.newDocument();
		// Create the elements
		Element list = document.createElement("list");
		Element option = document.createElement("option");
		Element name = document.createElement("name");
		Element email = document.createElement("email");
		// Set text and attributes
		name.setTextContent("abc");
		name.setAttribute("effect", "true");
		email.setTextContent("[email protected]");
		// Build the parent-child relationships
		option.appendChild(name);
		option.appendChild(email);
		list.appendChild(option);
		document.appendChild(list);
		
		TransformerFactory tff=TransformerFactory.newInstance();
		Transformer tf = tff.newTransformer();
		DOMSource xmlSource=new DOMSource(document);
		StreamResult sr=new StreamResult(new File(System.getProperty("user.dir")+File.separator+"javaDomWriter.xml"));
		// Output settings: encoding and line breaks
		tf.setOutputProperty(OutputKeys.ENCODING, "utf-8");
		tf.setOutputProperty(OutputKeys.INDENT, "yes");
		tf.transform(xmlSource, sr);
	}
}

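The generated javaDomWriter.xml contains a list root element with a single option child, which in turn holds name (with effect="true" and the text abc) and email.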

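Note: tf.setOutputProperty(OutputKeys.INDENT, "yes"); makes the output break across lines. Without it, the whole document is written on a single line. The content is the same, but a single line is hard to read. Whether to include it depends on the situation: if the XML is only used for data transmission, leave it out, which also keeps the payload smaller; if people are going to read the XML, put it in.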
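
To see the effect of the INDENT setting without writing files, the document can be serialized to a string twice. This is a minimal sketch under a few assumptions: the class name IndentSketch and the serialize helper are hypothetical, the document is a trimmed version of the one built above, and the XML declaration is omitted so that only the element markup is compared.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class IndentSketch {
    // Builds a small list/option/name document and serializes it to a string,
    // with or without the INDENT output property
    static String serialize(boolean indent) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element list = document.createElement("list");
            Element option = document.createElement("option");
            Element name = document.createElement("name");
            name.setAttribute("effect", "true");
            name.setTextContent("abc");
            option.appendChild(name);
            list.appendChild(option);
            document.appendChild(list);

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> line so only the elements are compared
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            tf.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("compact:");
        System.out.println(serialize(false));
        System.out.println("indented:");
        System.out.println(serialize(true));
    }
}
```

The compact form stays on one line, while the indented form spreads the elements over several lines. The exact amount of leading whitespace in the indented form depends on the JDK's transformer implementation, so it should not be relied on.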

9. Generating XML with JDOM

This section shows how to generate an XML file with JDOM:

package zhao;
import java.io.File;
import java.io.FileOutputStream;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;
public class XmlWrite3 {
	public static void main(String[] args) throws Exception {
		// Create the elements
		Element list=new Element("list");
		Element option=new Element("option");
		Element name=new Element("name");
		Element email=new Element("email");
		
		// Set text and attributes
		name.setText("abc");
		name.setAttribute("effect", "false");
		email.setText("[email protected]");
		// Build the parent-child relationships
		option.addContent(name);
		option.addContent(email);
		list.addContent(option);
		// Create the document
		Document document=new Document(list);
		// Assumes JDOM 1.0 or later, where the encoding is configured through a Format
		Format format=Format.getRawFormat();
		format.setEncoding("gbk");
		XMLOutputter xmlOutputter=new XMLOutputter(format);
		// try-with-resources closes the stream once the file has been written
		try (FileOutputStream out=new FileOutputStream(new File(System.getProperty("user.dir")+
				File.separator+"writenByJdom.xml"))) {
			xmlOutputter.output(document, out);
		}
	}
}

The generated file writenByJdom.xml has this content:

<?xml version="1.0" encoding="gbk"?>
<list><option><name effect="false">abc</name><email>[email protected]</email></option></list>

10. Generating an XML file with dom4j

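The dom4j library can generate XML files as well. Both dom4j and JDOM combine the strengths of SAX-style reading and DOM-style writing, and they offer similar features: both can parse and generate XML. I personally prefer dom4j, for no particular reason. I have never benchmarked dom4j against JDOM, and it never seemed necessary: both are mature libraries, and unless a system has to squeeze out every last bit of performance, the comparison is wasted effort. My preference probably comes down to having learned dom4j first. The code below shows how to generate an XML file with dom4j.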

package zhao;

import java.io.File;
import java.io.FileOutputStream;

import javax.xml.parsers.ParserConfigurationException;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;
public class XmlWrite4 {
	public static void main(String[] args) throws Exception {
		// In dom4j, creating an element also attaches it to its parent
		Document document = DocumentHelper.createDocument();
		Element root = document.addElement("list");
		Element option = root.addElement("option");
		Element name = option.addElement("name");
		Element email = option.addElement("email");
		// Set text and attributes
		name.addAttribute("effect", "false");
		name.addText("abc");
		email.addText("[email protected]");
		
		// Write the document out
		OutputFormat outputFormat=OutputFormat.createPrettyPrint();
		outputFormat.setEncoding("gbk");
		XMLWriter xmlWriter=new XMLWriter(new FileOutputStream(new File
				(System.getProperty("user.dir")+File.separator+"wirtenByDom4j.xml")), outputFormat);
		xmlWriter.write(document);
		//document.write(xmlWriter);
		xmlWriter.close();
	}
}

The generated wirtenByDom4j.xml has the same content as the JDOM output above, but pretty-printed, with each element on its own line.
