Perl: counting matching strings in an array

Question:

I have an array filled with strings. I want to check whether a particular string occurs in this array more than once, and print an error warning if it does.

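I used the true function from List::MoreUtils to count my matches. Some strings in my array contain substrings that are identical to other strings in the same array.
So when I check whether a string occurs more than once, I get my error warning even if the other match is only a different string containing the same substring. I tried to solve this by adding the string length to the pattern (so that both string and length must be equal before the error message is raised), but that does not work either.
My code looks like this: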

use strict; 
use warnings; 
use List::MoreUtils 'true'; 

my @list = ("one", "two", "three", "onefour", "one"); 

foreach my $f (@list) {

    my $length = length($f);
    my $count = true { $length && "$f" } @list;

    if ($count > 1) {
        print "Error with: ", $f, " counted ", $count, " times!\n";
    }
    $count = 0;
}

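With this code I get no error warning at all, even though "one" is in the array twice. If I do not include the length in the pattern for true, the string "one" is counted three times.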

Do you only want "one" to be reported as a duplicate? That is, no substring matching? – Sobrique

Answer

I wouldn't use true for this - it looks like what you are trying to do is 'pick out' the duplicates, and you don't care about substrings.

my %seen; 
$seen{$_}++ for @list; 
print grep { $seen{$_} > 1 } @list;

So, to replicate your test:

my %count_of; 
$count_of{$_}++ for @list; 
foreach my $duplicate ( grep { $count_of{$_} > 1 } @list) { 
    print "Error: $duplicate was seen $count_of{$duplicate} times\n"; 
}
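
The loop above walks @list, so a string that occurs twice is also reported twice. As a small variation that is not part of the original answer, the sketch below iterates over the hash keys instead, so each duplicated string produces a single message. The sort is only there to make the output order predictable, and @duplicates is a name introduced for this example.

```perl
use strict;
use warnings;

my @list = ("one", "two", "three", "onefour", "one");

# Count how often each exact string occurs.
my %count_of;
$count_of{$_}++ for @list;

# keys %count_of contains every distinct string exactly once,
# so each duplicate is reported only once.
my @duplicates = sort grep { $count_of{$_} > 1 } keys %count_of;
print "Error: $_ was seen $count_of{$_} times\n" for @duplicates;
```

@list itself is never modified here; grep and keys only build new lists.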

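I don't want to 'pick out' duplicates. If there are duplicates in the array, I want to print an error message, not change the array or erase the duplicates! – nieka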

This doesn't modify your array - grep creates a 'new' list, which is what you print. I've added a snippet that I _think_ does what you want? – Sobrique

Sorry for the late reply. Your answer is very good and solved my problem! Thanks a lot ;) – nieka

Answer

You are not actually matching anything. I added debug output to your code.

use strict;
use warnings;
use feature 'say';           # needed for say
use List::MoreUtils 'true';

my @list = ("one", "two", "three", "onefour", "one"); 

foreach my $f (@list) { 
    say "f: $f"; 
    my $length = length($f); 
    say "length: $length"; 
    say "true { $length && $f} $_: " . ($length && "$f") for @list; 
    my $count = true { $length && "$f" } @list; 
    say "count: $count"; 

    if ($count > 1) { 
        print "Error with: ", $f, " counted ", $count, " times!\n"; 
    } 
    $count = 0; 
}

Let's take a look:

f: one 
length: 3 
true { 3 && one} one: one 
true { 3 && one} two: one 
true { 3 && one} three: one 
true { 3 && one} onefour: one 
true { 3 && one} one: one 
count: 5 
Error with: one counted 5 times! 
f: two 
length: 3 
true { 3 && two} one: two 
true { 3 && two} two: two 
true { 3 && two} three: two 
true { 3 && two} onefour: two 
true { 3 && two} one: two 
count: 5 
Error with: two counted 5 times! 
f: three 
length: 5 
true { 5 && three} one: three 
true { 5 && three} two: three 
true { 5 && three} three: three 
true { 5 && three} onefour: three 
true { 5 && three} one: three 
count: 5 
Error with: three counted 5 times! 
f: onefour 
length: 7 
true { 7 && onefour} one: onefour 
true { 7 && onefour} two: onefour 
true { 7 && onefour} three: onefour 
true { 7 && onefour} onefour: onefour 
true { 7 && onefour} one: onefour 
count: 5 
Error with: onefour counted 5 times! 
f: one 
length: 3 
true { 3 && one} one: one 
true { 3 && one} two: one 
true { 3 && one} three: one 
true { 3 && one} onefour: one 
true { 3 && one} one: one 
count: 5 
Error with: one counted 5 times!

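So you always have the length of the string $f, which is greater than 0 and therefore evaluates as true in Perl. Then you have $f itself. That is also true, because every string other than the empty string ('') and '0' is true.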

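The true function iterates over all elements of @list. The block never looks at the current element ($_), so it is always true, and you always get the number of elements in @list.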
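
For comparison, here is a minimal sketch of a block that does depend on the element being tested: it compares $_ with $f using eq, which matches whole strings only. The snippet uses the core grep in scalar context so that it runs without extra modules; assuming List::MoreUtils is installed, true { $_ eq $f } @list should give the same counts. The hash %exact_count is a name introduced for this example.

```perl
use strict;
use warnings;

my @list = ("one", "two", "three", "onefour", "one");

my %exact_count;
foreach my $f (@list) {
    # grep in scalar context returns the number of elements for which
    # the block is true; eq compares complete strings, so "onefour"
    # is not counted as an occurrence of "one".
    $exact_count{$f} = grep { $_ eq $f } @list;
}

print "$_: $exact_count{$_}\n" for sort keys %exact_count;
```

With this comparison "one" is counted twice and every other string once, so no length check is needed.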

If you only want to remove the duplicate occurrences, you can use a hash to count them.

my %count; 
$count{$_}++ for @list; 
my @unique = keys %count; # unsorted 
# see Sobrique's answer with grep to keep the original order

There is also uniq in List::MoreUtils.

my @unique = uniq @list;
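
If the original order matters and you would rather not load a module, a common core-only idiom is sketched below. It is an alternative to the two snippets above, not what either answer used; %seen is a helper hash introduced for this example. As far as I know, uniq also keeps the order of first occurrence, while keys %count does not.

```perl
use strict;
use warnings;

my @list = ("one", "two", "three", "onefour", "one");

# $seen{$_}++ is 0 (false) the first time a string is encountered,
# so !$seen{$_}++ keeps only the first occurrence of each string.
my %seen;
my @unique = grep { !$seen{$_}++ } @list;

print "@unique\n";   # one two three onefour
```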

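If you want to know, for each element, whether it is a substring of any other element, you can use Perl's builtin index, which finds the position of one string inside another, together with grep.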

foreach my $f (@list) { 
    if (my @matches = grep { $_ ne $f && index($_, $f) > -1 } @list) { 
        warn "$f is a substr of: @matches"; # interpolation joins on $" (a space by default)
    } 
} 

__END__ 

one is a substr of: onefour at /code/scratch.pl line 91. 
one is a substr of: onefour at /code/scratch.pl line 91.

Of course, because of the ne, this does not detect that elements 0 and 4 are both "one". Note that index returns -1 if there is no match at all.
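
To make the return values of index concrete, here is a small sketch using strings from the example list. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $pos_start   = index("onefour", "one");    # 0: found at the very beginning
my $pos_middle  = index("onefour", "four");   # 3: found at offset 3
my $pos_missing = index("onefour", "two");    # -1: not found

print "$pos_start $pos_middle $pos_missing\n";

# 0 is a valid position but false in boolean context,
# which is why the test is written as "> -1" and not as a plain truth test.
print "substring found\n" if index("onefour", "one") > -1;
```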

Edit, following your comment on Sobrique's answer:

To get a warning only if there are duplicates (or substring duplicates), simply count them. Nothing is modified anywhere:

my @list = ("one", "two", "three", "onefour", "one"); 

my %count; 
$count{$_}++ for @list; 
warn sprintf 'Number of duplicates: %d', @list - keys %count if @list != keys %count; 

my $count_substr; 
foreach my $f (@list) { 
    $count_substr++ 
     if grep { $_ ne $f && index($_, $f) > -1 } @list; 
} 
warn sprintf 'Number of substring duplicates: %d', $count_substr if $count_substr;

A more comprehensive answer. I think we came to similar conclusions: true is not really the right tool here. – Sobrique

Thanks @Sobrique. :) Have you noticed that our names look very similar? That always throws me off. – simbabque
