How to find all tags inside a specific tag in XML
I want to extract all the nodes inside the <Page> node. So far I have been using the following two calls:
doc.getElementsByTagName("*"); //getting all the nodes
doc.getElementsByTagName("name"); //getting nodes <name>
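These find nodes across the whole XML document, but I want all the nodes inside one specific node. For example, I want all the nodes inside <Page>. Please suggest a way to do this.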
<Pages>
<Page>
<Diagram>
<Widgets>
<Image>
<Name>YmcLogo</Name>
<Rectangle>
<Rectangle X="0" Y="4" Width="130" Height="28" />
</Rectangle>
<Bold>False</Bold>
<BorderColor>Color(argb) = (255, 0, 0, 0)</BorderColor>
<BorderWidth>-1</BorderWidth>
<FillColor>Color(argb) = (255, 255, 255, 255)</FillColor>
<FontName>Arial</FontName>
<FontSize>9.75</FontSize>
<ForeColor>Color(argb) = (255, 0, 0, 0)</ForeColor>
<HorizontalAlignment>Center</HorizontalAlignment>
<Italic>False</Italic>
<Underline>False</Underline>
<VerticalAlignment>Center</VerticalAlignment>
<Widgets>
<TextPanel>
<Html><p style="font-size:13px;text-align:center;line-height:normal;"><span style="font-family:'Arial Regular', 'Arial';font-weight:400;font-style:normal;font-size:13px;color:#000000;text-align:center;line-height:normal;">&nbsp;</span></p></Html>
<Name />
<Rectangle>
<Rectangle X="2" Y="6" Width="126" Height="16" />
</Rectangle>
<Bold>False</Bold>
<BorderColor>Color(argb) = (255, 0, 0, 0)</BorderColor>
<BorderWidth>-1</BorderWidth>
<FillColor>Color(argb) = (255, 255, 255, 255)</FillColor>
<FontName>Arial</FontName>
<FontSize>9.75</FontSize>
<ForeColor>Color(argb) = (255, 0, 0, 0)</ForeColor>
<HorizontalAlignment>Center</HorizontalAlignment>
<Italic>False</Italic>
<Underline>False</Underline>
<VerticalAlignment>Center</VerticalAlignment>
</TextPanel>
</Widgets>
</Image>
<ShapeType>H2</ShapeType>
<Annotation>
<Properties>
<PropertyValue PropertyName="ContainerType">conditionContainer</PropertyValue>
</Properties>
</Annotation>
<FootnoteNumber>1</FootnoteNumber>
<Name>SCMProductGroup</Name>
<Rectangle>
<Rectangle X="72" Y="110" Width="127" Height="15" />
</Rectangle>
<Underline>False</Underline>
<VerticalAlignment>Near</VerticalAlignment>
</Shape>
<Textbox>
<Text />
<Annotation>
<Properties>
<PropertyValue PropertyName="ContainerType">conditionContainer</PropertyValue>
<PropertyValue PropertyName="field_label[多言語対応用キー][多语言对应Key]">label.scmProductGroup</PropertyValue>
<PropertyValue PropertyName="type">text</PropertyValue>
<PropertyValue PropertyName="cvcodeobjary ">scmProductGrp</PropertyValue>
<PropertyValue PropertyName="cvcontainerobjary ">scmProductGrpNm</PropertyValue>
<PropertyValue PropertyName="cvfieldstrary ">scmProductGrpName</PropertyValue>
<PropertyValue PropertyName="cvopenmethod ">scmProductGrp_ajax_codeValue</PropertyValue>
<PropertyValue PropertyName="maxlength[桁数-最大][最大位数]">3</PropertyValue>
<PropertyValue PropertyName="size">3</PropertyValue>
</Properties>
</Annotation>
</Textbox>
<Textbox>
<Text />
<Annotation>
<Properties>
<PropertyValue PropertyName="ContainerType">conditionContainer</PropertyValue>
<PropertyValue PropertyName="type">text</PropertyValue>
<PropertyValue PropertyName="datatype">String</PropertyValue>
<PropertyValue PropertyName="styleClass">display</PropertyValue>
<PropertyValue PropertyName="full-width">False</PropertyValue>
<PropertyValue PropertyName="half-width-al">True</PropertyValue>
<PropertyValue PropertyName="half-width-num">False</PropertyValue>
<PropertyValue PropertyName="half-width-other">False</PropertyValue>
</Properties>
</Annotation>
</Textbox>
<Table>
<Annotation>
<Properties>
<PropertyValue PropertyName="ContainerType">DhtmlX Grid Container</PropertyValue>
<PropertyValue PropertyName="maxlength[桁数-最大][最大位数]">3</PropertyValue>
<PropertyValue PropertyName="size">3</PropertyValue>
<PropertyValue PropertyName="group-name">1</PropertyValue>
<PropertyValue PropertyName="group-type">list</PropertyValue>
<PropertyValue PropertyName="collection">result</PropertyValue>
<PropertyValue PropertyName="edit[入出力区分][输入区分]">true</PropertyValue>
<PropertyValue PropertyName="sort">True</PropertyValue>
</Properties>
</Annotation>
<FootnoteNumber>5</FootnoteNumber>
<Name>DHTMLXgrid</Name>
<Rectangle>
<Rectangle X="20" Y="180" Width="812" Height="140" />
</Rectangle>
</Table>
</Widgets>
</Diagram>
<PackageInfo>
<Name>01::inquiry::list</Name>
</PackageInfo>
</Page>
</Pages>
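Get all nodes named <Page> (tag names are case-sensitive, so "page" would match nothing in this document):
NodeList list = doc.getElementsByTagName("Page");
If there are several, iterate over them and get the children of each. NodeList is not Iterable, so use an index loop:
for (int i = 0; i < list.getLength(); i++)
{
    Node node = list.item(i);
    // Get the direct children of this <Page> element
    NodeList childList = node.getChildNodes();
}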
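As a rough illustration of this one-level lookup, here is a self-contained sketch. The class name DirectChildren, the helper childElementNames, and the shortened sample XML are assumptions made for the example, not part of the original code. Note that getChildNodes also returns whitespace text nodes, so the sketch keeps element nodes only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DirectChildren {
    // Hypothetical helper: names of the element nodes directly under each <Page>.
    static List<String> childElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList pages = doc.getElementsByTagName("Page");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < pages.getLength(); i++) {
                NodeList children = pages.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip text nodes (for example, indentation between tags)
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        names.add(child.getNodeName());
                    }
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Pages><Page>\n"
                + "<Diagram><Widgets/></Diagram>\n"
                + "<PackageInfo/></Page></Pages>";
        // Only one level is returned: <Widgets> is not in the list
        System.out.println(childElementNames(xml)); // prints [Diagram, PackageInfo]
    }
}
```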
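If you really want every node contained in each <Page>, at any depth, you need a recursive function. This one fills the list it receives as a parameter: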
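public void getAllChildren(ArrayList<Node> list, Node parentNode)
{
    NodeList childList = parentNode.getChildNodes();
    for (int i = 0; i < childList.getLength(); i++)
    {
        Node node = childList.item(i);
        list.add(node);
        // Recurse so that grandchildren and deeper nodes are collected too
        getAllChildren(list, node);
    }
}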
To use this function:
ArrayList<Node> allNodes = new ArrayList<Node>();
// Get the first <Page> element (tag names are case-sensitive)
Node pageNode = doc.getElementsByTagName("Page").item(0);
getAllChildren(allNodes, pageNode);
// allNodes now holds every child, grandchild, and so on
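A complete, runnable version of the recursive approach might look like the sketch below. The class name RecursiveChildren, the helpers collect and namesUnderFirstPage, and the tiny sample XML are assumptions for illustration. One thing the sketch makes visible is that the recursion collects every kind of node, so text nodes show up under the name #text alongside elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RecursiveChildren {
    // Depth-first walk: adds each child, then that child's own descendants.
    static void collect(List<Node> out, Node parent) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            out.add(child);
            collect(out, child);
        }
    }

    // Hypothetical helper: node names of everything nested in the first <Page>.
    static List<String> namesUnderFirstPage(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node page = doc.getElementsByTagName("Page").item(0);
            List<Node> nodes = new ArrayList<>();
            collect(nodes, page);
            List<String> names = new ArrayList<>();
            for (Node node : nodes) {
                names.add(node.getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Pages><Page><Name>YmcLogo</Name><Bold/></Page></Pages>";
        System.out.println(namesUnderFirstPage(xml)); // prints [Name, #text, Bold]
    }
}
```

If only elements are wanted, a check such as node.getNodeType() == Node.ELEMENT_NODE before adding to the list would filter out the text nodes.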
Get the Page element, then call Element.getElementsByTagName (not Document.getElementsByTagName). For example:
Element pageElement = (Element)doc.getElementsByTagName("Page").item(0);
NodeList result = pageElement.getElementsByTagName("Name");
The result is limited to the <Name> elements inside that <Page> element.
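The same call also accepts the wildcard "*", which should return every descendant element of the Page element in document order, without any hand-written recursion. The following is a minimal sketch under that reading of the DOM API; the class name PageDescendants, the helper descendantNames, and the shortened sample XML are assumptions for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;   // the DOM Element, not javax.lang.model.element.Element
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PageDescendants {
    // Hypothetical helper: tag names of all elements nested in the first <Page>.
    static List<String> descendantNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element page = (Element) doc.getElementsByTagName("Page").item(0);
            // "*" matches every descendant element; the search is scoped to this <Page>
            NodeList all = page.getElementsByTagName("*");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                names.add(all.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Pages><Page><Diagram><Widgets><Image><Name>YmcLogo</Name></Image>"
                + "</Widgets></Diagram><PackageInfo><Name>01::inquiry::list</Name>"
                + "</PackageInfo></Page></Pages>";
        // prints [Diagram, Widgets, Image, Name, PackageInfo, Name]
        System.out.println(descendantNames(xml));
    }
}
```

Unlike the recursive getChildNodes walk, this returns elements only, so text nodes never appear in the result.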
Exception in thread "main" java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.DeferredElementImpl cannot be cast to javax.lang.model.element.Element – SOP
@ShashiRanjan, you imported the wrong class. Use 'org.w3c.dom.Element' instead. –
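Learn a little XPath. XPath is a small language designed specifically for selecting parts of an XML document.
For example, to get all the child elements of <Page> you can simply write //Page/*. Or, if you want all the descendant elements as well, use //Page//*: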
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//Page//*");
NodeList result = (NodeList)expr.evaluate(doc, XPathConstants.NODESET);
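To compare the two expressions side by side, here is a self-contained sketch that parses a small document and evaluates an XPath string against it. The class name XPathSelect, the helper select, and the sample XML are assumptions made for the example; only the standard javax.xml.xpath calls are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSelect {
    // Hypothetical helper: names of the nodes selected by an XPath expression.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Pages><Page><Diagram><Widgets/></Diagram><PackageInfo/></Page></Pages>";
        // Direct children of <Page> only
        System.out.println(select(xml, "//Page/*"));
        // All descendant elements of <Page>, at any depth
        System.out.println(select(xml, "//Page//*"));
    }
}
```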
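References:
- How to read XML using XPath in Java: I don't actually code in Java; the example code above is adapted from this link
- w3school XPath syntax: a simple introduction to basic XPath syntax; cross-check with the official documentation of XPath 1.0 if in doubt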
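That works, milez: I can extract all the Page nodes, but I want all the nodes inside a Page node. – SOP
Isn't that what childList is for? Each 'node' is a Page, and you can call getChildNodes on it to get the next level of nodes – milez
getChildNodes only returns the nodes one level below. It never returns all the nodes nested inside a node. – SOP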