How to append a value to the end of each line obtained with grep

Question:

I have some CSV files that I want to parse with grep (or some other command-line tool) to extract some information. They look like this:

* Comment 1 
* Comment line 2 explaining what the following numbers mean 
1000000 ; 3208105 ; 0.18 ; 0.45 ; 0.00015 ; 0.1485 ; 0.03 ; 1 ; 1 ; 5 ; 477003 ; 

* Comment 3 
* Comment 4 explaining the meaning of the following lines 

* Comment 5 
0; 706520; p; 30.4983 
1; 20859; p; 57.8 
2; 192814; p; 111.842 
3; 344542; p; 130.543 
4; 54605; p; 131.598 
5; 64746; d; 140.898 
6; 442082; p; 214.11 
7; 546701; p; 249.167 
8; 298394; p; 305.034 
9; 81188; p; 305.034 
.......

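In each file there may be at most one line whose third field is d instead of p. So either there is exactly one line containing d, or there is none.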

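I have many files like this. What I want to do is extract from each file the line containing the letter d (if it exists) and append to that line the last parameter of the first non-comment line, which in this example is 477003.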

So far I have managed to extract the two pieces I need separately.

With this I can extract every line containing d from each of my files:

grep -h -E ' d;' *.csv > output.csv

And with this I can extract exactly the number 477003 from a file like the one in the example:

grep -v -e "^*" -e " p; " -e " d; " example_file.csv | cut -d \; -f 11

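But I don't know how to put the two together.

The final output I would like to get from the example at the beginning is a single line like this: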

5; 64746; d; 140.898; 477003

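and I would like one such line for every CSV file in the current directory.

Is there a way to do this?

Please add sample input and the desired output for that sample input to your question. – Cyrus

I did. The input is the first example and the output is the last line. – jackscorrow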

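Answer

I loop over all the .csv files, assign the output of each grep to a variable, and at the end of each iteration echo the two values concatenated: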

for f in *.csv ; do value=`grep -v -e "^*" -e " p; " -e " d; " -e '^\s*$' "$f" | cut -d \; -f 11` ; line=`grep -h -E ' d;' "$f"` ; echo "$line;$value" ; done

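Edit: I also added -e '^\s*$' to the first grep, so that it returns only the first non-comment line, the one holding the value; before that it also matched empty lines.

This only echoes lines like 5; 64746; d; 140.898; 477003, the line you want. If you want to redirect it to a file (all the lines found will end up in a single output file), you can append a redirection to the echo at the end of the long command, using >> so that each iteration adds to the file instead of overwriting it:

for f in *.csv ; do value=`grep -v -e "^*" -e " p; " -e " d; " -e '^\s*$' "$f" | cut -d \; -f 11` ; line=`grep -h -E ' d;' "$f"` ; echo "$line;$value" >> output.csv ; done

For readability, the same code on multiple lines: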

for f in *.csv 
do 
    value=`grep -v -e "^*" -e " p; " -e " d; " -e '^\s*$' "$f" | cut -d \; -f 11` 
    line=`grep -h -E ' d;' "$f"`
    echo "$line;$value" 
done
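
The loop above prints a line even for files that have no d row, and the value keeps the spaces that surround it in the header line. A minimal sketch of a more defensive variant follows. The function name append_header_value is hypothetical, and the sketch assumes the file layout shown in the question (comment lines start with *, the value is field 11 of the first data line).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): for each file given,
# print its "d" row followed by field 11 of the first data line.
# Files without a "d" row are skipped.
append_header_value() {
    for f in "$@"; do
        # The "d" row, if any, with trailing whitespace removed.
        line=$(grep ' d;' "$f" | sed 's/[[:space:]]*$//')
        [ -n "$line" ] || continue

        # First line that is not a comment, a p/d row, or blank;
        # take field 11 and strip the spaces around it.
        value=$(grep -v -e '^\*' -e ' p; ' -e ' d; ' -e '^[[:space:]]*$' "$f" |
                head -n 1 | cut -d ';' -f 11 | tr -d ' ')

        printf '%s; %s\n' "$line" "$value"
    done
}
```

It could be called as append_header_value *.csv > result.txt. Writing the result to a name that does not end in .csv keeps the output file from being picked up by the *.csv glob on the next run.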

Answer

This sounds like a job for sed:

parse.sed (GNU sed)

/^ +$/d       # Ignore empty lines 
/^[ 0-9;.]+$/h     # Save first "number-only" line to hold space 
/d;/{       # Run block on lines containing ' d; ' 
    G        # Copy saved line to pattern space 
    s/\n.*; ([0-9]+) *; *$/; \1/ # Append the last number on the second line 
    p        # to the first line and print the result 
}

parse.sed (portable sed)

# Ignore empty lines 
/^ +$/d       

# Save first "number-only" line to hold space 
/^[ 0-9;.]+$/h     

# Run block on lines containing ' d; ' 
/d;/{       

    # Copy saved line to pattern space 
    G        

    # Append the last number on the second line 
    # to the first line and print the result 
    s/\n.*; ([0-9]+) *; *$/; \1/ 
    p        
}

Run it like this:

sed -Enf parse.sed infile.csv

Output:

5; 64746; d; 140.898; 477003

Note that this assumes the file has only one line consisting solely of characters from the group [ 0-9;.].
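
One way to check that assumption before running the script is to count the matching lines in each file. The sketch below is only an illustration: check_header_lines is a hypothetical name, and the pattern is slightly stricter than the one in parse.sed because it also requires at least one digit, so lines made only of spaces or dots are not counted.

```shell
#!/bin/sh
# Hypothetical check (not part of the original answer): report every file in
# which the number of "number-only" lines is not exactly one.
check_header_lines() {
    for f in "$@"; do
        # grep -c prints the count of matching lines; "|| true" keeps a
        # count of 0 from being treated as a failure.
        n=$(grep -c -E '^[ 0-9;.]*[0-9][ 0-9;.]*$' "$f" || true)
        [ "$n" -eq 1 ] || printf '%s: %s number-only lines\n' "$f" "$n"
    done
}
```

Calling check_header_lines *.csv prints nothing when every file satisfies the assumption.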

To run it on all the CSV files in the current directory, do the following:

sed -Enf parse.sed *.csv
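
When several files are passed to one sed invocation they are read as a single stream, so the saved header line carries over from one file to the next. As an alternative technique (awk rather than sed, and not taken from the answer), the sketch below resets its state at the start of every file. The function name d_lines_with_value is hypothetical, and the layout is assumed to be the one in the question: comments start with *, and the value is field 11 of the first data line.

```shell
#!/bin/sh
# Hypothetical awk-based alternative: print each row whose third field is d,
# followed by field 11 of the first data line of the same file.
d_lines_with_value() {
    awk -F';' '
        FNR == 1 { have = 0 }              # first line of each file: reset state
        /^[*]/ || /^[ \t]*$/ { next }      # skip comment lines and blank lines
        !have {                            # first data line: remember field 11
            value = $11
            gsub(/[ \t]/, "", value)       # strip the surrounding blanks
            have = 1
            next
        }
        $3 ~ /^[ \t]*d[ \t]*$/ {           # row whose third field is d
            sub(/[ \t]+$/, "")             # drop trailing blanks from the row
            print $0 "; " value
        }
    ' "$@"
}
```

It could be run as d_lines_with_value *.csv, which prints one line per file that contains a d row.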

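When I try to run it on a file it gives me the error 'sed: 1: parse.sed: extra characters at the end of d command' – jackscorrow

@jackscorrow: Sorry, I did not test the script with BSD sed. See the portable version I added. – Thor

OK, now it works. Thanks! As soon as I can I will try your solution and see which one works better – jackscorrow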
