Capturing output of find . -print0 into a bash array

Time: 2022-06-09 15:42:34

Using find . -print0 seems to be the only safe way of obtaining a list of files in bash due to the possibility of filenames containing spaces, newlines, quotation marks etc.

However, I'm having a hard time actually making find's output useful within bash or with other command line utilities. The only way I have managed to make use of the output is by piping it to perl and setting perl's input record separator ($/) to NUL:

find . -print0 | perl -e '$/="\0"; @files=<>; print scalar(@files), "\n";'

This example prints the number of files found, avoiding the danger of newlines in filenames corrupting the count, as would occur with:

find . | wc -l

As most command line programs do not support null-delimited input, I figure the best thing would be to capture the output of find . -print0 in a bash array, like I have done in the perl snippet above, and then continue with the task, whatever it may be.

How can I do this?

This doesn't work:

find . -print0 | ( IFS=$'\0' ; array=( $( cat ) ) ; echo ${#array[@]} )
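
One reason the attempt above fails, as far as I can tell, is that a bash variable cannot hold a NUL byte: command substitution drops the NULs before IFS ever sees them. A minimal sketch, with two made-up names standing in for find's output (newer bash versions also print a warning, which is silenced here):

```bash
#!/bin/bash
# Two NUL-terminated names (hypothetical) pass through a command substitution.
# The braces let 2>/dev/null hide the "ignored null byte" warning of newer bash.
{ joined=$(printf 'one\0two\0'); } 2>/dev/null

# The NUL bytes are gone, so the two names have run together.
printf '%s\n' "$joined"
```

This prints onetwo, so there is nothing left for IFS to split on.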

A much more general question might be: How can I do useful things with lists of files in bash?

13 answers

#1


95  

Shamelessly stolen from Greg's BashFAQ:

unset a i
while IFS= read -r -d $'\0' file; do
    a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)

Note that the redirection construct used here (cmd1 < <(cmd2)) is similar to, but not quite the same as the more usual pipeline (cmd2 | cmd1) -- if the commands are shell builtins (e.g. while), the pipeline version executes them in subshells, and any variables they set (e.g. the array a) are lost when they exit. cmd1 < <(cmd2) only runs cmd2 in a subshell, so the array lives past its construction. Warning: this form of redirection is only available in bash, not even bash in sh-emulation mode; you must start your script with #!/bin/bash.
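
The difference between the two constructs can be seen in a small sketch. The scratch directory and file names below are invented for illustration, and the result assumes a default bash where the lastpipe option is off:

```bash
#!/bin/bash
# Scratch directory with awkward names; created only for this illustration.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with"$'\n'"newline"

# Pipeline: the while loop runs in a subshell, so its array is discarded.
piped=()
find "$dir" -type f -print0 | while IFS= read -r -d '' f; do piped+=("$f"); done

# Process substitution: the loop runs in the current shell, so the array survives.
kept=()
while IFS= read -r -d '' f; do kept+=("$f"); done < <(find "$dir" -type f -print0)

echo "pipeline: ${#piped[@]}, process substitution: ${#kept[@]}"
rm -rf "$dir"
```

Under those assumptions this should report 0 elements for the pipeline and 3 for process substitution.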

Also, because the file processing step (in this case, just a[i++]="$file", but you might want to do something fancier directly in the loop) has its input redirected, it cannot use any commands that might read from stdin. To avoid this limitation, I tend to use:

unset a i
while IFS= read -r -u3 -d $'\0' file; do
    a[i++]="$file"        # or however you want to process each file
done 3< <(find /tmp -type f -print0)

...which passes the file list via file descriptor 3, rather than stdin.
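
To see why the extra descriptor matters, here is a rough sketch in which cat stands in for any command that reads stdin (ssh, an interactive prompt, and so on). The directory, the file names and the exec line that empties the script's own stdin are all assumptions made to keep the demo self-contained:

```bash
#!/bin/bash
exec </dev/null            # keep the demo non-interactive: the script's own stdin is empty
dir=$(mktemp -d)           # scratch directory, hypothetical names
touch "$dir/one" "$dir/two" "$dir/three"

# List on stdin: cat swallows the remaining names after the first read.
via_stdin=0
while IFS= read -r -d '' file; do
    cat >/dev/null
    via_stdin=$((via_stdin + 1))
done < <(find "$dir" -type f -print0)

# List on descriptor 3: cat reads the (empty) stdin, and read still sees every name.
via_fd3=0
while IFS= read -r -u3 -d '' file; do
    cat >/dev/null
    via_fd3=$((via_fd3 + 1))
done 3< <(find "$dir" -type f -print0)

echo "stdin: $via_stdin, fd 3: $via_fd3"
rm -rf "$dir"
```

The first loop should run once and the second three times, which is the limitation the descriptor-3 form avoids.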

#2


7  

Maybe you are looking for xargs:

find . -print0 | xargs -r0 do_something_useful

The option -L 1 could be useful for you too; it makes xargs run do_something_useful with only one file argument at a time.
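
A small sketch of the batching behaviour follows. It uses -n 1 (a standard xargs option that caps each invocation at one argument) rather than -L 1, and a tiny sh -c one-liner as a hypothetical stand-in for do_something_useful; the directory and names are invented:

```bash
#!/bin/bash
dir=$(mktemp -d)   # scratch directory, hypothetical names
touch "$dir/plain" "$dir/with space"

# Stand-in for do_something_useful: report how many arguments each call received.
report='printf "%s arg(s)\n" "$#"'

# Default: xargs packs as many names as fit into one invocation.
batched=$(find "$dir" -type f -print0 | xargs -0 sh -c "$report" sh)

# -n 1: one invocation per name.
per_file=$(find "$dir" -type f -print0 | xargs -0 -n 1 sh -c "$report" sh)

printf 'batched:\n%s\nper file:\n%s\n' "$batched" "$per_file"
rm -rf "$dir"
```

The name containing a space arrives as a single argument in both cases, because -0 splits only on NUL.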

#3


5  

The main problem is that the NUL delimiter (\0) is useless here, because it isn't possible to assign a NUL value to IFS. So, as good programmers, we make sure that the input to our program is something it can handle.

First we create a little program, which does this part for us:

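#!/bin/bash
# Emit the name as one base64 line; base64 may wrap long output, so drop the wraps.
printf "%s" "$@" | base64 | tr -d '\n'
echo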

...and call it base64str (don't forget chmod +x)

Second, we can now use a simple and straightforward for loop:

for i in $(find . -type f -exec base64str '{}' \;)   # base64str must be on PATH
do
  file="$(printf '%s' "$i" | base64 -d)"
  # do something with file
done

So the trick is that a base64 string contains no characters that cause trouble for bash. Of course, xxd or something similar can also do the job.

#4


3  

Yet another way of counting files:

find /DIR -type f -print0 | tr -dc '\0' | wc -c 
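
As an illustration of why counting NUL bytes is safer than counting lines, here is a sketch with a made-up scratch directory holding two files, one of which has a newline in its name:

```bash
#!/bin/bash
dir=$(mktemp -d)                        # scratch directory, hypothetical names
touch "$dir/a" "$dir/b"$'\n'"c"         # two files; the second name contains a newline

naive=$(( $(find "$dir" -type f | wc -l) ))                        # counts lines
safe=$(( $(find "$dir" -type f -print0 | tr -dc '\0' | wc -c) ))   # counts NUL bytes

echo "wc -l: $naive, NUL count: $safe"
rm -rf "$dir"
```

The line count comes out as 3 while the NUL count gives the correct 2.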

#5


1  

I think more elegant solutions exist, but I'll toss this one in. This will also work for filenames with spaces and/or newlines:

i=0;
for f in *; do
  array[$i]="$f"
  ((i++))
done

You can then e.g. list the files one by one (in this case in reverse order):

for ((i = $i - 1; i >= 0; i--)); do
  ls -al "${array[$i]}"
done

This page gives a nice example, and for more see Chapter 26 in the Advanced Bash-Scripting Guide.

#6


1  

You can safely do the count with this:

find . -exec echo ';' | wc -l

(It prints a newline for every file/dir found, and then counts the newlines printed out...)

#7


1  

Avoid xargs if you can:

man ruby | less -p 777 
IFS=$'\777' 
#array=( $(find ~ -maxdepth 1 -type f -exec printf "%s\777" '{}' \; 2>/dev/null) ) 
array=( $(find ~ -maxdepth 1 -type f -exec printf "%s\777" '{}' + 2>/dev/null) ) 
echo ${#array[@]} 
printf "%s\n" "${array[@]}" | nl 
echo "${array[0]}" 
IFS=$' \t\n' 

#8


1  

I am new, but I believe that this is an answer; hope it helps someone:

STYLE="$HOME/.fluxbox/styles/"

declare -a array1

# $(...) and unquoted expansions discard the NULs, so read the names one by one instead
while IFS= read -r -d '' f; do
    array1+=("$f")
done < <(find "$STYLE" -maxdepth 1 -type f -print0)

printf '%s\n' "${array1[@]}"

#tar czvf ~/FluxieStyles.tgz "${array1[@]}"

#9


0  

This is similar to Stephan202's version, but the files (and directories) are put into an array all at once. The for loop here is just to "do useful things":

files=(*)                        # put files in current directory into an array
i=0
for file in "${files[@]}"
do
    echo "File ${i}: ${file}"    # do something useful 
    let i++
done

To get a count:

echo ${#files[@]}
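
One caveat with counting a glob this way: when nothing matches, bash by default keeps the pattern itself, so an empty directory reports one element. A minimal sketch, assuming an empty scratch directory created just for the demo, of how the nullglob option changes that:

```bash
#!/bin/bash
dir=$(mktemp -d)            # empty scratch directory

shopt -u nullglob
without=("$dir"/*)          # no match: the unexpanded pattern is kept as one element

shopt -s nullglob
with=("$dir"/*)             # no match: the array stays empty

echo "without nullglob: ${#without[@]}, with nullglob: ${#with[@]}"
shopt -u nullglob
rmdir "$dir"
```

Note also that a plain * skips dotfiles unless the dotglob option is set.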

#10


0  

Old question, but no one suggested this simple method, so I thought I would. Granted, if your filenames contain an ETX character (0x03), this doesn't solve your problem, but I suspect it serves for any real-world scenario. Trying to use null seems to run afoul of default IFS handling rules. Season to your tastes with find options and error handling.

savedFS="$IFS"
IFS=$'\x3'
filenames=(`find wherever -printf %p$'\x3'`)
IFS="$savedFS"

#11


0  

Gordon Davisson's answer is great for bash. However, a useful shortcut exists for zsh users:

First, place your string in a variable:

A="$(find /tmp -type f -print0)"

Next, split this variable and store it in an array:

B=( ${(s/^@/)A} )

There is a trick: ^@ is the NUL character. To type it, press Ctrl+V followed by Ctrl+@.

You can check that each entry of $B contains the right value:

for i in "$B[@]"; echo \"$i\"

Careful readers may notice that the call to the find command can be avoided in most cases by using the ** syntax. For example:

B=( /tmp/** )

#12


0  

Since Bash 4.4, the builtin mapfile has the -d switch (to specify a delimiter, similar to the -d switch of the read statement), and the delimiter can be the null byte. Hence, a nice answer to the question in the title

Capturing output of find . -print0 into a bash array

is:

mapfile -d '' ary < <(find . -print0)
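
A quick way to check the result is sketched below. The scratch directory and names are invented, the array is called found here rather than ary, and -t (which strips the trailing delimiter from each entry) is added out of habit; bash 4.4 or newer is assumed:

```bash
#!/bin/bash
# Needs bash 4.4 or newer for mapfile -d.
dir=$(mktemp -d)   # scratch directory, hypothetical names
touch "$dir/plain" "$dir/with space" "$dir/line"$'\n'"break"

mapfile -d '' -t found < <(find "$dir" -type f -print0)

echo "count: ${#found[@]}"
printf '%q\n' "${found[@]}"    # %q shows each name unambiguously, one per line
rm -rf "$dir"
```

The count should be 3, with the name containing a newline kept as a single element.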

#13


-1  

Bash has never been good at handling filenames (or any text really) because it uses spaces as a list delimiter.

I'd recommend using python with the sh library instead.
