How do I count the number of files in each directory?

Time: 2021-11-15 01:04:13

I can list all the directories with:

find ./ -type d

I tried to list the contents of each directory and count the files in each one with the following command:

find ./ -type d | xargs ls -l | wc -l

But this only gave the total number of lines returned by:

find ./ -type d | xargs ls -l

Is there a way I can count the number of files in each directory?


11 answers

#1


65  

Assuming you have GNU find, let it find the directories and let bash do the rest:


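shopt -s nullglob   # an empty directory yields zero entries, not a literal "*"
find . -type d -print0 | while IFS= read -r -d '' dir; do
    files=("$dir"/*)   # non-hidden entries, subdirectories included
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done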

#2


65  

This prints an entry count for each item at the top level of the current directory. Each count covers everything beneath that item, recursively, and includes directories as well as files:

du -a | cut -d/ -f2 | sort | uniq -c | sort -nr
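Because du -a lists directories too, and its line for . itself has no slash for cut to split on, the counts above are slightly inflated and one stray line appears. As a rough sketch of a variant that counts only regular files beneath each top-level subdirectory, assuming a find that supports -mindepth (GNU and BSD find both do) and no newlines in names; the function name count_by_top_level is made up for illustration:

```shell
# Hypothetical helper: count regular files beneath each top-level
# subdirectory of $1 (default: current directory), recursively.
count_by_top_level() {
    (
        # Run in a subshell so the cd does not affect the caller.
        cd "${1:-.}" || exit 1
        # -mindepth 2 skips files sitting directly in the starting directory.
        # Every remaining path looks like ./top/..., so field 2 is the
        # top-level directory name.
        find . -mindepth 2 -type f | cut -d/ -f2 | sort | uniq -c | sort -nr
    )
}
```

Calling count_by_top_level with no argument works on the current directory; the largest directories are listed first.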

#3


11  

You could arrange to find all the files, remove the file names, leaving you a line containing just the directory name for each file, and then count the number of times each directory appears:


find . -type f |
sed 's%/[^/]*$%%' |
sort |
uniq -c

The only gotcha in this is if you have any file names or directory names containing a newline character, which is fairly unlikely. If you really have to worry about newlines in file names or directory names, I suggest you find them, and fix them so they don't contain newlines (and quietly persuade the guilty party of the error of their ways).
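If renaming the offending files is not an option, one possible workaround is to use NUL-terminated records throughout. This is only a sketch: it assumes GNU find (for -printf) and GNU coreutils (for sort -z and uniq -z), and the function name count_files_nul is invented for the example.

```shell
# Hypothetical helper: per-directory file counts that survive newlines in names.
# Assumes GNU find (-printf) and GNU coreutils (sort -z, uniq -z).
count_files_nul() {
    # %h is the directory part of each file's path; \0 terminates each record,
    # so a newline inside a name can no longer split a record in two.
    find "${1:-.}" -type f -printf '%h\0' | sort -z | uniq -zc
}
```

The output is itself NUL-terminated; piping it through tr '\0' '\n' makes it readable on a terminal, at the cost of showing any embedded newlines literally.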



If you're interested in the count of the files in each sub-directory of the current directory, counting any files in any sub-directories along with the files in the immediate sub-directory, then I'd adapt the sed command to print only the top-level directory:


find . -type f |
sed -e 's%^\(\./[^/]*/\).*$%\1%' -e 's%^\.\/[^/]*$%./%' |
sort |
uniq -c

The first expression captures the start of each path (the dot, the slash, the name up to the next slash, and that slash) and replaces the whole line with just the captured part, so:

./dir1/dir2/file1

is replaced by


./dir1/

The second expression handles the files directly in the current directory; their paths contain no second slash, so they are replaced by ./. The sort and count then operate on just the list of directory names.

#4


9  

Here's one way to do it, but probably not the most efficient.


find -type d -print0 | xargs -0 -n1 bash -c 'echo -n "$1:"; ls -1 "$1" | wc -l' --

Gives output like this, with directory name followed by count of entries in that directory. Note that the output count will also include directory entries which may not be what you want.


./c/fa/l:0
./a:4
./a/c:0
./a/a:1
./a/a/b:0

#5


4  

Everyone else's solution has one drawback or another.


find -type d -readable -exec sh -c 'printf "%s " "$1"; ls -1UA "$1" | wc -l' sh {} ';'

Explanation:


  • -type d: we're interested in directories.
  • -readable: we only want them if it's possible to list the files in them. Note that find will still emit an error when it tries to search for more directories in them, but this prevents calling -exec for them.
  • -exec sh -c BLAH sh {} ';': for each directory, run this script fragment, with $0 set to sh and $1 set to the directory name.
  • printf "%s " "$1": portably and minimally print the directory name, followed by only a space, not a newline.
  • ls -1UA: list the entries, one per line, in directory order (to avoid stalling the pipe), excluding only the special directories . and ..
  • wc -l: count the lines.

#6


4  

This can also be done by looping over a shell glob and calling ls, instead of using find:

for f in */; do echo "$f -> $(ls "$f" | wc -l)"; done

Explanation:

for f in */; - loop over all directories in the current directory

do echo "$f -> - print out each directory name

$(ls "$f" | wc -l) - call ls for this directory and count the lines

#7


1  

This should return the directory name followed by the number of files in the directory.


findfiles() {
    echo "$1" $(find "$1" -maxdepth 1 -type f | wc -l)
}

export -f findfiles

find ./ -type d -exec bash -c 'findfiles "$0"' {} \;

Example output:


./ 6
./foo 1
./foo/bar 2
./foo/bar/bazzz 0
./foo/bar/baz 4
./src 4

The export -f is required because the -exec argument of find does not allow executing a bash function unless you invoke bash explicitly, and you need to export the function defined in the current scope to the new shell explicitly.
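If exporting a function is inconvenient, the same per-directory count can be sketched with an inline sh -c script instead. This is only an illustration, not the answer's own method: the name count_regular_files is made up, -maxdepth is assumed to be available (GNU and BSD find both have it), and names containing newlines would be miscounted.

```shell
# Hypothetical helper: print each directory under $1 (default: .) followed by
# the number of regular files directly inside it.
count_regular_files() {
    # "{} +" hands many directories to one sh process, so no function needs to
    # be exported and far fewer shells are started than with "\;".
    find "${1:-.}" -type d -exec sh -c '
        for dir in "$@"; do
            # -maxdepth 1 keeps the inner find from descending further.
            n=$(find "$dir" -maxdepth 1 -type f | wc -l)
            # $((n)) strips any padding that wc may add around the number.
            printf "%s %s\n" "$dir" "$((n))"
        done' sh {} +
}
```

Running count_regular_files ./ should list the same directories as the example output above, in whatever order find visits them.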


#8


1  

I am leaving this here for future reference:

ls |parallel 'echo {} && ls {}|wc -l'

#9


0  

find . -type f -printf '%h\n' | sort | uniq -c


gives for example:


  5 .
  4 ./aln
  5 ./aln/iq
  4 ./bs
  4 ./ft
  6 ./hot
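One thing to be aware of is that a directory containing no regular files never appears in this output, because only files produce lines. A possible way to report such directories with a count of 0 is sketched below; it assumes GNU find (for -printf) and names without newlines, and count_with_empty is a hypothetical name.

```shell
# Hypothetical helper: like the pipeline above, but empty directories show 0.
# Assumes GNU find (-printf) and no newlines in names.
count_with_empty() {
    # Each directory is printed once for itself (%p) and once per regular file
    # directly inside it (%h), so it appears (number of files + 1) times.
    # awk then subtracts that extra occurrence and keeps the name untouched.
    find "${1:-.}" -type d -printf '%p\n' -o -type f -printf '%h\n' |
        sort | uniq -c |
        awk '{ n = $1 - 1; sub(/^ *[0-9]+ /, ""); printf "%5d %s\n", n, $0 }'
}
```

The output keeps the count-then-path layout of the example above, so it can be sorted or filtered the same way.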

#10


0  

I tried with some of the others here but ended up with subfolders included in the file count when I only wanted the files. This prints ./folder/path<tab>nnn with the number of files, not including subfolders, for each subfolder in the current folder.


# Read line by line so directory names containing spaces stay intact.
find . -type d -print | while IFS= read -r d
do
  printf '%s\t%s\n' "$d" "$(find "$d" -maxdepth 1 -type f -print | wc -l)"
done

#11


0  

This will give the overall count.


for file in */; do echo "$file -> $(ls $file | wc -l)"; done | cut -d ' ' -f 3| py --ji -l 'numpy.sum(l)'
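The one-liner above relies on the py command and numpy for the final sum. Where those are not installed, the same total can probably be obtained with awk alone. The sketch below is a substitute technique, not the answer's own; total_entries is a made-up name, and like the original it counts whatever ls shows (non-hidden entries, subdirectories included).

```shell
# Hypothetical helper: total number of entries that ls reports across all
# immediate subdirectories of $1 (default: current directory).
total_entries() {
    for d in "${1:-.}"/*/; do
        # If there are no subdirectories the glob stays literal; skip it.
        [ -d "$d" ] || continue
        ls -- "$d" | wc -l
    done | awk '{ total += $1 } END { print total + 0 }'
}
```

The total + 0 in the END block makes awk print 0 rather than an empty line when there are no subdirectories at all.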
