Regex to match a block of text

Question:

node [ 
    id 2 
    label "node 2" 
    thisIsASampleAttribute 43 
] 
node [ 
    id 3 
    label "node 3" 
    thisIsASampleAttribute 44 
]

I want to group each node together with the content inside its brackets:

node [ 
    id 2 
    label "node 2" 
    thisIsASampleAttribute 43 
]

However, my code below groups the entire text instead:

Pattern p = Pattern.compile("node \\[\n(.*|\n)*?\\]", Pattern.MULTILINE); 

Matcher m = p.matcher(text); 

while(m.find()) 
{ 
    System.out.println(m.group()); 
}

Edit: the text is:

"node [\n" + 
"  id 2\n" + 
"  label \"node 2\"\n" + 
"  thisIsASampleAttribute 43\n" + 
" ]\n" + 
" node [\n" + 
"  id 3\n" + 
"  label \"node 3\"\n" + 
"  thisIsASampleAttribute 44\n" + 
" ]\n"

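Do you have enough slashes? – 2016-01-23 01:11:19

Answer

The problem is that you are only capturing the last thing matched by (.*|\n)*? (because the *? quantifier is outside the capture group, the group is overwritten on every repetition).

You can change the capture group into a non-capturing group, then wrap that group together with its *? quantifier in a new capture group so that everything matched is captured: ((?:.*?|\n)*?).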


Pattern p = Pattern.compile("node \\[\\n((?:.*?|\\n)*?)\\]", Pattern.MULTILINE); 
Matcher m = p.matcher(text); 
while(m.find()) 
{ 
    System.out.println(m.group(1)); 
}
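
To make the difference between the two groupings concrete, here is a minimal sketch. The class name RepeatedGroupDemo, the helper firstGroup, and the shortened sample text are assumptions made for illustration, not part of the original code. It runs both patterns against the same input and prints what group(1) holds in each case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Hypothetical helper: returns group(1) of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened sample input (an assumption, modelled on the question's text).
        String text = "node [\n  id 2\n  label \"node 2\"\n]\n";

        // Quantifier outside the capture group: the group is overwritten on each
        // repetition, so it only holds what the final repetition matched.
        String repeated = firstGroup("node \\[\\n(.*|\\n)*?\\]", text);

        // Non-capturing inner group wrapped, together with its quantifier, in one
        // capture group: the group holds everything between "node [\n" and "]".
        String wrapped = firstGroup("node \\[\\n((?:.*?|\\n)*?)\\]", text);

        System.out.println("repeated group: [" + repeated + "]");
        System.out.println("wrapped group:  [" + wrapped + "]");
    }
}
```

Pattern.MULTILINE is left out here because it only changes how ^ and $ behave, and neither pattern uses them.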

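However, the lazy alternation ((?:.*?|\n)*?) is relatively inefficient. A possibly better approach is to match the non-] characters with a negated character class, ([^\]]*).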


Pattern p = Pattern.compile("node \\[\\n([^\\]]*)\\]", Pattern.MULTILINE); 
Matcher m = p.matcher(text); 
while(m.find()) 
{ 
    System.out.println(m.group(1)); 
}
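
For completeness, here is a self-contained sketch of the negated-character-class approach run against the text from the question's edit. The class name NodeBlockExtractor and the method extractBodies are hypothetical names chosen for this example, and collecting the bodies into a list is just one way to use the matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NodeBlockExtractor {
    // [^\]]* consumes every character up to the next ']', newlines included,
    // so no DOTALL or MULTILINE flag is needed.
    private static final Pattern NODE = Pattern.compile("node \\[\\n([^\\]]*)\\]");

    // Returns the bracketed body of every "node [ ... ]" block, in order of appearance.
    static List<String> extractBodies(String text) {
        List<String> bodies = new ArrayList<>();
        Matcher m = NODE.matcher(text);
        while (m.find()) {
            bodies.add(m.group(1));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // The text from the question's edit.
        String text = "node [\n" +
                "  id 2\n" +
                "  label \"node 2\"\n" +
                "  thisIsASampleAttribute 43\n" +
                " ]\n" +
                " node [\n" +
                "  id 3\n" +
                "  label \"node 3\"\n" +
                "  thisIsASampleAttribute 44\n" +
                " ]\n";

        for (String body : extractBodies(text)) {
            System.out.println("--- node body ---");
            System.out.println(body);
        }
    }
}
```

One limitation of this sketch: it assumes no attribute value contains a ] character. If a label could contain one, the match would stop early and a different approach would be needed.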

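I'm not a Java expert, but why does it need only one slash in '\n' and two slashes in '\\['? – 2016-01-23 01:19:55

It still seems to group everything. In case it helps, I have updated the question with the text, including the escape characters – joe

@joe I added examples. Are you retrieving the first capture group, 'm.group(1)'? –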
