My regular expression doesn't match, and I can't figure out why
Question:
Here is an example of the text, held in a scalar, that I am trying to match:
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61 1
2 N [53]Annabelle [54]WB (NL) $37,134,255 - 3,185 - $11,659 $37,134,255 $6.5 1
3 1 [55]The Equalizer [56]Sony $18,750,375 -45.1% 3,236 - $5,794 $64,236,992 $55 2
4 3 [57]The Boxtrolls [58]Focus $11,979,588 -30.7% 3,464 - $3,458 $32,093,796 $60 2
5 2 [59]The Maze Runner [60]Fox $11,634,764 -33.3% 3,605 -33 $3,227 $73,556,159 $34 3
6 N [61]Left Behind (2014) [62]Free $6,300,147 - 1,825 - $3,452 $6,300,147 $16 1
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
8 5 [65]Dolphin Tale 2 [66]WB $3,422,377 -28.5% 2,790 -586 $1,227 $37,866,130 $36 4
Below is the regular expression I am using; it does not seem to match. Can anyone spot why?
if ($allData =~ /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+)\s+(\[\d+\])(.+)\s+(\$\.+)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+(\d+)\s+(\-\d+|\-|\+\d+)\s+(\$\.+)\s+(\$\.+)\s+(\.+)\s+(\d+)/g)
{
    $current[$i] = $1;
    $last[$i] = $2;
    $title[$i] = $4;
    $week[$i] = $7;
    $cume[$i] = $12;
    printf("%-4s%-4s%-35s%-10s%-10s", $current[$i], $last[$i], $title[$i], $week[$i], $cume[$i]);
    if ($last[$i] ne '-'){
        $gain = $last[$i] - $current[$i];
    }
    if ($gain < $bigloss){
        $bigloss = $gain;
        $losstitle = $title[$i];
    }
    if ($gain > $biggain){
        $biggain = $gain;
        $gaintitle = $title[$i];
    }
    if ($last[$i] eq '-'){
        if ($current[$i] < $bigdebut){
            $bigdebut = $current[$i];
            $bigdebuttitle = $title[$i];
        }
        if ($current[$i] > $weakdebut){
            $weakdebut = $current[$i];
            $weakdebuttitle = $title[$i];
        }
    }
    $i++;
}
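Answer
A possible fix is below. Compared with the original, the escaped dots (\.+), which match only literal periods, are replaced with unescaped, non-greedy ones (.+?), and group 9 now accepts commas ([\d,]+) so that values such as 3,014 match. The expanded form relies on the /x modifier so that whitespace and comments inside the pattern are ignored.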
# /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+?)\s+(\[\d+\])(.+?)\s+(\$.+?)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+([\d,]+)\s+(\-\d+|\-|\+\d+)\s+(\$.+?)\s+(\$.+?)\s+(.+?)\s+(\d+)/g
(\d+) # (1)
\s+
(\d+ | [N]) # (2)
\s+
(\[ \d+ \]) # (3)
(.+?) # (4)
\s+
(\[ \d+ \]) # (5)
(.+?) # (6)
\s+
(\$ .+?) # (7)
\s+
( # (8 start)
\-
| \+ \d+ \. \d+ %
| \- \d+ \. \d+ %
) # (8 end)
\s+
([\d,]+) # (9)
\s+
(\- \d+ | \- | \+ \d+) # (10)
\s+
(\$ .+?) # (11)
\s+
(\$ .+?) # (12)
\s+
(.+?) # (13)
\s+
(\d+) # (14)
Sample output:
** Grp 0 - (pos 506 , len 98)
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
** Grp 1 - (pos 506 , len 1)
7
** Grp 2 - (pos 508 , len 1)
4
** Grp 3 - (pos 510 , len 4)
[63]
** Grp 4 - (pos 514 , len 25)
This is Where I Leave You
** Grp 5 - (pos 540 , len 4)
[64]
** Grp 6 - (pos 544 , len 2)
WB
** Grp 7 - (pos 547 , len 10)
$4,009,345
** Grp 8 - (pos 558 , len 6)
-41.8%
** Grp 9 - (pos 565 , len 5)
2,735
** Grp 10 - (pos 571 , len 4)
-133
** Grp 11 - (pos 578 , len 6)
$1,466
** Grp 12 - (pos 585 , len 11)
$29,012,573
** Grp 13 - (pos 597 , len 5)
$19.8
** Grp 14 - (pos 603 , len 1)
3
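The pattern above can be tried with a short script along these lines. This is only a sketch under a few assumptions: the two sample rows are copied from the question, the names $row_re and @rows are made up for illustration, and the while loop (rather than the question's if) is my guess at how every row would be collected, since a scalar-context /g match inside an if runs only once per evaluation.

```perl
use strict;
use warnings;

# Two rows copied from the sample data in the question.
my $allData = <<'END';
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61 1
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
END

# Same pattern as in the answer, compiled with x so that
# whitespace and comments inside the pattern are ignored.
my $row_re = qr/
    (\d+)                   # (1) current rank
    \s+
    (\d+ | [N])             # (2) last rank, or N
    \s+
    (\[ \d+ \])             # (3)
    (.+?)                   # (4) title
    \s+
    (\[ \d+ \])             # (5)
    (.+?)                   # (6)
    \s+
    (\$ .+?)                # (7) week
    \s+
    (                       # (8 start)
         \-
      |  \+ \d+ \. \d+ %
      |  \- \d+ \. \d+ %
    )                       # (8 end)
    \s+
    ([\d,]+)                # (9)
    \s+
    (\- \d+ | \- | \+ \d+)  # (10)
    \s+
    (\$ .+?)                # (11)
    \s+
    (\$ .+?)                # (12) cume
    \s+
    (.+?)                   # (13)
    \s+
    (\d+)                   # (14)
/x;

# Collect one hash per matched row; each pass of the loop resumes
# where the previous match ended because of the g modifier.
my @rows;
while ($allData =~ /$row_re/g) {
    push @rows, {
        current => $1,
        last    => $2,
        title   => $4,
        week    => $7,
        cume    => $12,
    };
}

printf "%-4s%-4s%-35s%-14s%-14s\n", @{$_}{qw(current last title week cume)}
    for @rows;
```

The hash keys mirror the array names used in the question ($current, $last, $title, $week, $cume); the other capture groups are left unnamed because the thread does not say what those columns mean.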
This regular expression works great. – Neonjoe 2014-10-09 21:52:13
Answer
Try this regular expression:
\d\s[A-Z0-9]\s\[\d\d\][A-Z][a-z]+(\s\b\w+\b){0,}\s(\(\d+\)\s)?\[\d\d\][A-Z]+[a-z]*\s(\(\w+\)\s)?\$(\d{1,3},){2}\d{3}\s-\s?\d+[,.]\d+((%\s\d,\d{1,3}\s-\s?\$?\d{1,3}(,\d{1,3}\s)?)|\s-\s\$\d{1,3},\d{1,3}\s)\s?\$\d{1,3},\d{1,3}(,\d{1,3})*\s\$\d{1,3}(,\d{1,3})*(\.\d+)?(\s\$\d+(\.)?\d+)?\s\d
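Try using a visual regular expression tool, such as http://regex101.com/ – tinkertime 2014-10-09 19:12:12
I don't see an option for Perl. Are regular expressions in Python/Java handled the same way as in Perl? – Neonjoe 2014-10-09 19:18:13
I just set it to Python and was able to step through each capture one at a time until I had a working match. Thanks for the site; I know it will come in very handy. My wildcard characters had some misplaced escape characters in front of them, which made the regex look for literal characters instead. Thanks a bunch – Neonjoe 2014-10-09 19:26:51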
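The cause described in the last comment can be reproduced in isolation. The snippet below is a minimal sketch, not part of the original thread; the variable names are invented for illustration, and the amount is one value taken from the sample data.

```perl
use strict;
use warnings;

# One dollar amount taken from the sample rows in the question.
my $amount = '$37,513,109';

# \.+ means "one or more literal dots", so digits and commas cannot match.
my $escaped = $amount =~ /^\$\.+$/ ? 1 : 0;

# .+ means "one or more of any character except newline".
my $unescaped = $amount =~ /^\$.+$/ ? 1 : 0;

# The escaped form does match when the text really is a run of dots.
my $dots = '$...' =~ /^\$\.+$/ ? 1 : 0;

print "escaped dot on amount:   $escaped\n";
print "unescaped dot on amount: $unescaped\n";
print "escaped dot on dots:     $dots\n";
```

Only the dollar sign needs the backslash here, because an unescaped $ would be read as an anchor or as the start of a variable name.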