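Perl: counting matching strings in an array
I have an array filled with strings. I want to check whether a particular string occurs in this array more than once, and print an error warning if it does.
I am using the true function from List::MoreUtils to count my matches. Some of the strings in my array contain substrings that are identical to other strings in the same array.
So when I check whether the same string is in the array more than once, I get my error warning even though the match might just be another string that contains the same substring. I tried to fix this by adding the string length to the pattern (so that both the string and its length have to be equal before the error message pops up), but that does not work either.
My code looks like this: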
use strict;
use warnings;
use List::MoreUtils 'true';
my @list = ("one", "two", "three", "onefour", "one");
foreach my $f (@list) {
my $length = length($f);
my $count = true { $length && "$f"} @list;
if($count > 1) {
print "Error with: ", $f, " counted ", $count, " times!\n";
}
$count = 0;
}
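With this code I do not get an error warning at all, even though "one" is in the array twice. If I do not include the length in the pattern for true, the string "one" is counted three times.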
I wouldn't use true for this. It looks like what you are trying to do is pick out duplicates, and you don't care about substrings.
my %seen;
$seen{$_}++ for @list;
print grep { $seen{$_} > 1 } @list;
So, to replicate your test:
my %count_of;
$count_of{$_}++ for @list;
foreach my $duplicate ( grep { $count_of{$_} > 1 } @list) {
print "Error: $duplicate was seen $count_of{$duplicate} time\n";
}
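Note that the loop above greps over @list, so a string that occurs twice is also reported twice. A minimal sketch of a variant that reports each duplicated string once, by iterating over the hash keys instead (the sub name find_duplicates is hypothetical, introduced only for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each string that occurs
# more than once to its number of occurrences. Hash keys are compared
# exactly, so "onefour" never counts as a match for "one".
sub find_duplicates {
    my @items = @_;
    my %count_of;
    $count_of{$_}++ for @items;
    my %dupes = map { $_ => $count_of{$_} }
                grep { $count_of{$_} > 1 } keys %count_of;
    return \%dupes;
}

my @list  = ("one", "two", "three", "onefour", "one");
my $dupes = find_duplicates(@list);

# Sorted keys give a stable order, and each duplicate is reported once.
for my $duplicate (sort keys %$dupes) {
    print "Error: $duplicate was seen $dupes->{$duplicate} times\n";
}
```

Iterating over the keys loses the original list order, which is why the keys are sorted before printing.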
You are not actually matching anything. I added debug output to your code.
use feature 'say';    # say is not enabled by default
my @list = ("one", "two", "three", "onefour", "one");
foreach my $f (@list) {
say "f: $f";
my $length = length($f);
say "length: $length";
say "true { $length && $f} $_: " . ($length && "$f") for @list;
my $count = true { $length && "$f" } @list;
say "count: $count";
if ($count > 1) {
print "Error with: ", $f, " counted ", $count, " times!\n";
}
$count = 0;
}
Let's take a look at the output:
f: one
length: 3
true { 3 && one} one: one
true { 3 && one} two: one
true { 3 && one} three: one
true { 3 && one} onefour: one
true { 3 && one} one: one
count: 5
Error with: one counted 5 times!
f: two
length: 3
true { 3 && two} one: two
true { 3 && two} two: two
true { 3 && two} three: two
true { 3 && two} onefour: two
true { 3 && two} one: two
count: 5
Error with: two counted 5 times!
f: three
length: 5
true { 5 && three} one: three
true { 5 && three} two: three
true { 5 && three} three: three
true { 5 && three} onefour: three
true { 5 && three} one: three
count: 5
Error with: three counted 5 times!
f: onefour
length: 7
true { 7 && onefour} one: onefour
true { 7 && onefour} two: onefour
true { 7 && onefour} three: onefour
true { 7 && onefour} onefour: onefour
true { 7 && onefour} one: onefour
count: 5
Error with: onefour counted 5 times!
f: one
length: 3
true { 3 && one} one: one
true { 3 && one} two: one
true { 3 && one} three: one
true { 3 && one} onefour: one
true { 3 && one} one: one
count: 5
Error with: one counted 5 times!
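So you always have the length of the string $f, which is greater than 0 and therefore evaluates to true in Perl. Then you have $f. That is also true, because every string other than the empty string ('') and '0' is true.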
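The true function iterates over all elements of @list. The block never looks at the current element, so it is always true, and you always get the number of elements in @list.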
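For the block to count anything meaningful, it has to look at $_, the element currently being tested. A minimal sketch using exact string comparison with eq, written with the builtin grep in scalar context so that it has no dependencies (the sub name count_exact is hypothetical); with List::MoreUtils the same block should work as true { $_ eq $f } @list:

```perl
use strict;
use warnings;

# Hypothetical helper: counts how many items are exactly equal to $needle.
# eq compares whole strings, so "onefour" does not match "one".
sub count_exact {
    my ($needle, @items) = @_;
    # grep in scalar context returns the number of matching elements.
    return scalar grep { $_ eq $needle } @items;
}

my @list = ("one", "two", "three", "onefour", "one");

my %reported;    # so that each duplicated string is reported only once
foreach my $f (@list) {
    my $count = count_exact($f, @list);
    if ($count > 1 && !$reported{$f}++) {
        print "Error with: $f counted $count times!\n";
    }
}
```

This scans the list once per element, which is fine for short lists; the hash-based counting from the other answer does the same job in a single pass.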
If you only want to remove the duplicate occurrences, you can use a hash to count them.
my %count;
$count{$_}++ for @list;
my @unique = keys %count; # unsorted
# see Sobrique's answer with grep to keep the original order
Then there is also uniq in List::MoreUtils.
my @unique = uniq @list;
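If List::MoreUtils is not available, an order-preserving de-duplication can be sketched with grep and a %seen hash. This is a common Perl idiom rather than something taken from the answers here, and the sub name unique_in_order is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps the first occurrence of each string and
# preserves the original order of the list.
sub unique_in_order {
    my %seen;
    # $seen{$_}++ returns 0 (false) the first time a string is seen, so
    # !$seen{$_}++ is true exactly once per distinct string.
    return grep { !$seen{$_}++ } @_;
}

my @list   = ("one", "two", "three", "onefour", "one");
my @unique = unique_in_order(@list);
print "@unique\n";    # one two three onefour
```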
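If you want to know, for each element, whether it is a substring of any other element, you can use Perl's builtin index, which finds the position of one string inside another, together with grep.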
foreach my $f (@list) {
if (my @matches = grep { $_ ne $f && index($_, $f) > -1 } @list) {
warn "$f is a substr of: @matches"; # interpolation joins @matches on $" (a space by default)
}
}
__END__
one is a substr of: onefour at /code/scratch.pl line 91.
one is a substr of: onefour at /code/scratch.pl line 91.
Of course, because of the ne, this does not report that elements 0 and 4 are both "one". Note that index returns -1 if there is no match at all.
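To make the return values of index concrete, here is a small sketch. The strings come from the example list, and the sub name is_substring is hypothetical:

```perl
use strict;
use warnings;

# index(STR, SUBSTR) returns the zero-based position of the first
# occurrence of SUBSTR in STR, or -1 if SUBSTR does not occur.
print index("onefour", "one"),  "\n";    # 0  (match at the start)
print index("onefour", "four"), "\n";    # 3
print index("onefour", "two"),  "\n";    # -1 (no match)

# Hypothetical helper wrapping the test used in the grep block.
sub is_substring {
    my ($needle, $haystack) = @_;
    return index($haystack, $needle) > -1;
}
```

Because position 0 is a valid match, the result has to be compared against -1 rather than tested for truth.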
Edit after your comment on Sobrique's answer:
To get a warning only if there are duplicates (or substring duplicates), simply count them. Nothing is modified anywhere:
my @list = ("one", "two", "three", "onefour", "one");
my %count;
$count{$_}++ for @list;
warn sprintf 'Number of duplicates: %d', @list - keys %count if @list != keys %count;
my $count_substr;
foreach my $f (@list) {
$count_substr++
if grep { $_ ne $f && index($_, $f) > -1 } @list;
}
warn sprintf 'Number of substring duplicates: %d', $count_substr if $count_substr;
Do you want only "one" to be reported as a dupe? I.e. not the substring matches? – Sobrique