An easy way to select certain lines from a file in a specific order

I have a text file with many lines. I also have a selection of line numbers that I want to print out in a particular order, for example "5, 3, 10, 6", in exactly that order.

Is there some easy and "canonical" way of doing this? (with "standard" Linux tools, and bash)

When I tried the answers from the question "Bash tool to get nth line from a file", the lines were always printed in the order in which they appear in the file.

6 solutions

#1

If your file is not too large, a rather efficient method is to read it all into memory as an array, one line per element, using mapfile (a builtin in Bash ≥ 4):

mapfile -t array < file.txt

Then you can echo all the lines you want in any order, e.g.,

printf '%s\n' "${array[4]}" "${array[2]}" "${array[9]}" "${array[5]}"

to print lines 5, 3, 10 and 6. You may find it a bit awkward that array indices start at 0, so that you have to offset your numbers by one. This is easily cured with the -O option of mapfile:

mapfile -t -O 1 array < file.txt

This will start assigning to array at index 1, so that you can print your lines 5, 3, 10 and 6 as:

printf '%s\n' "${array[5]}" "${array[3]}" "${array[10]}" "${array[6]}"

Finally, you may want to make a wrapper function for this:

printlines() {
    local i
    for i; do printf '%s\n' "${array[i]}"; done
}

so that you can just state:

printlines 5 3 10 6

And it's all pure Bash, no external tools!

As @glennjackmann suggests in the comments, you can make the helper function also take care of reading the file (passed as an argument):

printlinesof() {
    # $1 is filename
    # $2,... are the lines to print
    local i array
    mapfile -t -O 1 array < "$1" || return 1
    shift
    for i; do printf '%s\n' "${array[i]}"; done
}

Then you can use it as:

printlinesof file.txt 5 3 10 6

And if you also want to handle stdin:

printlinesof() {
    # $1 is filename or - for stdin
    # $2,... are the lines to print
    local i array file=$1
    [[ $file = - ]] && file=/dev/stdin
    mapfile -t -O 1 array < "$file" || return 1
    shift
    for i; do printf '%s\n' "${array[i]}"; done
}

so that

printf '%s\n' {a..z} | printlinesof - 5 3 10 6

will also work.
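
The functions above print an empty line when asked for a line number beyond the end of the file. As a minimal sketch, assuming you would rather get a warning in that case, a variant could check each number before printing. The name printlinesof_checked and the wording of the message are made up for this illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical variant of printlinesof: same mapfile approach, plus a check
# that every requested line number actually exists in the file.
printlinesof_checked() {
    # $1 is the filename
    # $2,... are the 1-based line numbers to print
    local i array status=0
    mapfile -t -O 1 array < "$1" || return 1
    shift
    for i; do
        # Accept only positive integers that are no larger than the line count
        if [[ $i =~ ^[1-9][0-9]*$ ]] && (( i <= ${#array[@]} )); then
            printf '%s\n' "${array[i]}"
        else
            printf 'printlinesof_checked: no line %s\n' "$i" >&2
            status=1
        fi
    done
    return "$status"
}
```

It is called the same way, e.g. printlinesof_checked file.txt 5 3 10 6, and returns a non-zero status if any requested line was missing.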

#2

A one-liner using sed:

for i in 5 3 10 6 ; do  sed -n "${i}p" < ff; done
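
Note that this loop starts sed once per requested line, and each run reads the file to the end. A possible refinement, sketched here under the assumption that the file is a regular file that can be reopened, is to make sed quit as soon as it has printed the wanted line. The function name sedlines is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: still one sed run per requested line,
# but each run stops reading as soon as its line has been printed.
sedlines() {
    # $1 is the filename
    # $2,... are the line numbers to print
    local file=$1 i
    shift
    for i; do
        # "N{p;q;}" prints line N, then quits without reading the rest
        sed -n "${i}{p;q;}" "$file"
    done
}
```

Usage would be sedlines ff 5 3 10 6. The arguments are pasted into the sed program unchecked, so they should be plain numbers.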

#3

Here is one way using awk:

awk -v s='5,3,10,6' 'BEGIN{split(s, a, ","); for (i=1; i<=length(a); i++) b[a[i]]=i}
        b[NR]{data[NR]=$0} END{for (i=1; i<=length(a); i++) print data[a[i]]}' file

Testing:

cat file
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
Line 11
Line 12

awk -v s='5,3,10,6' 'BEGIN{split(s, a, ","); for (i=1; i<=length(a); i++) b[a[i]]=i}
        b[NR]{data[NR]=$0} END{for (i=1; i<=length(a); i++) print data[a[i]]}' file
Line 5
Line 3
Line 10
Line 6
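
The same idea can be wrapped in a shell function. The sketch below is an adaptation rather than the answer's own code: it keeps the count returned by split() instead of calling length() on an array, which some older awk implementations do not accept, and it tests membership with "in". The function name awklines and the array names want and keep are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the awk approach shown above.
awklines() {
    # $1 is the filename
    # $2 is a comma-separated list of line numbers, e.g. 5,3,10,6
    awk -v s="$2" '
        # Remember the wanted numbers in order (want) and as a lookup set (keep)
        BEGIN { n = split(s, want, ","); for (i = 1; i <= n; i++) keep[want[i]] = 1 }
        # Store only the lines that were asked for
        (NR in keep) { data[NR] = $0 }
        # Print the stored lines in the requested order
        END { for (i = 1; i <= n; i++) print data[want[i]] }
    ' "$1"
}
```

Called as awklines file 5,3,10,6, it reads the file once and keeps only the requested lines in memory.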

#4

First, generate a sed expression that prints each wanted line prefixed with a number, which you can later use to sort the output:

#!/bin/bash
lines=(5 3 10 6)
sed=''
i=0
for line in "${lines[@]}" ; do
    sed+="${line}s/^/$((i++)) /p;"
done

for i in {a..z} ; do echo $i ; done \
    | sed -n "$sed" \
    | sort -n \
    | cut -d' ' -f2-
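
To see what the loop builds, it can help to print the generated sed program and the tagged lines before they are sorted. The snippet below repeats the loop from the script, with the variable renamed to prog purely for this illustration, and uses the letters a to j as sample input.

```shell
#!/usr/bin/env bash
lines=(5 3 10 6)
prog=''
i=0
for line in "${lines[@]}"; do
    # Each wanted line gets its position in the list as a prefix
    prog+="${line}s/^/$((i++)) /p;"
done

printf '%s\n' "$prog"
# prints: 5s/^/0 /p;3s/^/1 /p;10s/^/2 /p;6s/^/3 /p;

# Before sorting, the tagged lines still come out in file order:
printf '%s\n' {a..j} | sed -n "$prog"
# prints:
# 1 c
# 0 e
# 3 f
# 2 j
```

Sorting numerically on the prefix then restores the requested order, and cut removes the prefix again.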

I'd probably use Perl, though:

for c in {a..z} ; do echo $c ; done \
| perl -e 'undef @lines{@ARGV};
           while (<STDIN>) {
               $lines{$.} = $_ if exists $lines{$.};
           }
           print @lines{@ARGV};
          ' 5 3 10 6

You can also use Perl instead of hacking with sed in the first solution:

for c in {a..z} ; do echo $c ; done \
| perl -e ' %lines = map { $ARGV[$_], ++$i } 0 .. $#ARGV;
            while (<STDIN>) {
                print "$lines{$.} $_" if exists $lines{$.};
            }
          ' 5 3 10 6 | sort -n | cut -d' ' -f2-

#5

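l=(5 3 10 6)
# The inner printf builds the sed program "5{=;p};3{=;p};...": for each wanted
# line, sed prints the line number (=) and then the line itself (p).
# paste then joins each number/text pair into one tab-separated line.
printf "%s\n" {a..z} |
sed -n "$(printf "%d{=;p};" "${l[@]}")" |
paste - - | {
    # Store each line's text under its line number...
    while IFS=$'\t' read -r nr text; do
        line[nr]=$text
    done
    # ...then print the texts in the requested order.
    for n in "${l[@]}"; do
        echo "${line[n]}"
    done
}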

#6

You can use the nl trick: number the lines in the input and join the output with the list of wanted line numbers. Additional sorts are needed to make the join possible, as join needs sorted input (so the nl trick is used once more to number the wanted lines):

#! /bin/bash

# Named "wanted" rather than LINES, which bash itself may set to the terminal height.
wanted=(5 3 10 6)

lines=$( IFS=$'\n' ; echo "${wanted[*]}" | nl )

for c in {a..z} ; do
    echo $c
done | nl \
    | grep -E '^\s*('"$( IFS='|' ; echo "${wanted[*]}")"')\s' \
    | join -12 -21 <(echo "$lines" | sort -k2n) - \
    | sort -k2n \
    | cut -d' ' -f3-
