Total size of the contents of all the files in a directory

Date: 2022-08-03 03:47:27

When I use ls or du, I get the amount of disk space each file is occupying.

I need the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes. Bonus points if I can get this without opening each file and counting.

12 solutions

#1


94  

If you want the 'apparent size' (that is, the number of bytes in each file) rather than the space the files take up on disk, use the -b or --bytes option (available on Linux systems with GNU coreutils):

% du -sbh <directory>
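
To make the difference between apparent size and disk usage concrete, here is a minimal sketch. It assumes GNU du; the scratch directory and the file name sample.txt are made up for illustration.

```shell
# Create a 10-byte file in a scratch directory (hypothetical names).
tmp=$(mktemp -d)
printf '0123456789' > "$tmp/sample.txt"

# Apparent size: the number of bytes stored in the file.
apparent=$(du -b "$tmp/sample.txt" | cut -f1)

# Disk usage in 1 KiB units: usually larger, since space is allocated in whole blocks.
blocks=$(du -k "$tmp/sample.txt" | cut -f1)

echo "apparent: $apparent bytes, on disk: $blocks KiB"
rm -rf "$tmp"
```

The on-disk figure depends on the filesystem, so only the apparent size is predictable here.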

#2


43  

Use du -sb:

du -sb DIR

Optionally, add the -h option for more human-readable output:

du -sbh DIR

#3


21  

cd to the directory, then:

du -sh

ftw!

Originally wrote about it here: https://ao.gl/get-the-total-size-of-all-the-files-in-a-directory/

#4


13  

Just an alternative:

$ ls -lR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'

grep -v '^d' will exclude the directories.

#5


9  

stat's "%s" format gives you the actual number of bytes in a file.

 find . -type f |
 xargs stat --format=%s |
 awk '{s+=$1} END {print s}'

Feel free to substitute your favourite method for summing numbers.

#6


4  

If you use BusyBox's du on an embedded system, you cannot get an exact byte count from du; it only reports kilobytes.

BusyBox v1.4.1 (2007-11-30 20:37:49 EST) multi-call binary

Usage: du [-aHLdclsxhmk] [FILE]...

Summarize disk space used for each FILE and/or directory.
Disk space is printed in units of 1024 bytes.

Options:
        -a      Show sizes of files in addition to directories
        -H      Follow symbolic links that are FILE command line args
        -L      Follow all symbolic links encountered
        -d N    Limit output to directories (and files with -a) of depth < N
        -c      Output a grand total
        -l      Count sizes many times if hard linked
        -s      Display only a total for each argument
        -x      Skip directories on different filesystems
        -h      Print sizes in human readable format (e.g., 1K 243M 2G )
        -m      Print sizes in megabytes
        -k      Print sizes in kilobytes(default)
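
Where du cannot report bytes, one possible workaround is to sum per-file sizes instead. This is only a sketch: it assumes the build provides find with -exec, a stat that accepts -c %s, and awk, none of which is guaranteed on a given BusyBox system. The function name total_bytes is made up for illustration.

```shell
# Sum the byte sizes of all regular files under a directory without du -b.
# Assumes stat accepts -c %s (GNU stat does; BusyBox stat only if built with format support).
total_bytes() {
    # -exec ... \; runs stat once per file: slow, but safe for names with spaces.
    find "$1" -type f -exec stat -c %s {} \; |
        awk '{ s += $1 } END { printf "%.0f\n", s }'   # %.0f avoids exponent notation
}
```

Call it as total_bytes DIR; an empty directory yields 0.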

#7


1  

du is handy, but find is useful if you want to total the size of only some files (for example, filtering by extension). Note also that find itself can print the size of each file in bytes. To calculate a total, we can pipe its output to the dc command as follows:

find . -type f -printf "%s + " | dc -e0 -f- -ep

Here find generates a sequence of commands for dc such as 123 + 456 + 11 +. The complete program, however, should be 0 123 + 456 + 11 + p (remember that dc uses postfix notation).

So, to get the complete program, we need to push 0 onto the stack before executing the sequence from stdin, and print the top of the stack afterwards (the p command at the end). We do this with dc options:

  1. -e0 is shorthand for -e '0', which pushes 0 onto the stack,
  2. -f- reads and executes commands from stdin (here, the ones generated by find),
  3. -ep prints the result (-e 'p').

To print the size in MiB, for example 284.06 MiB, use -e '2 k 1024 / 1024 / n [ MiB] p' in place of point 3 (most of the spaces are optional).
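
The filter-by-extension case mentioned at the start of this answer could look like the sketch below. It assumes GNU find (for -printf) and, for simplicity, sums with awk rather than dc; the function name size_by_pattern is made up for illustration.

```shell
# Sum the sizes of regular files whose names match a glob pattern.
# $1 = directory, $2 = pattern such as '*.log' (quote it so the shell does not expand it).
size_by_pattern() {
    find "$1" -type f -name "$2" -printf '%s\n' |
        awk '{ s += $1 } END { printf "%.0f\n", s }'
}
```

For example, size_by_pattern . '*.log' would total only the .log files under the current directory.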

#8


1  

Use:

$ du -ckx <DIR> | grep total | awk '{print $1}'

Where <DIR> is the directory you want to inspect.

The -c option makes du print a grand total, which the grep total part of the command extracts; awk then prints the count in kilobytes.

The only caveat is that any subdirectory whose path contains the text "total" will be printed as well.
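
One way around that caveat is to rely on position instead of matching text, since du -c prints the grand total as its last line. A minimal sketch, with the function name du_total_kib made up for illustration:

```shell
# Print only the grand total (in 1 KiB units) reported by du -c.
du_total_kib() {
    # The grand total is the last line of du -c output, whatever the directories are called.
    du -ckx "$1" | tail -n 1 | cut -f1
}
```

This prints a single number even when a subdirectory is named "total".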

#9


1  

This may help:

ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'

The above command sums the sizes of all files in the current directory, leaving out the sizes of the directories themselves.

#10


1  

There are at least three ways to get the "sum total of all the data in files and subdirectories" in bytes that work in both Linux/Unix and Git Bash for Windows, listed below in order from fastest to slowest on average. For your reference, they were executed at the root of a fairly deep file system (docroot in a Magento 2 Enterprise installation comprising 71,158 files in 30,027 directories).

1.

$ time find -type f -printf '%s\n' | awk '{ total += $1 }; END { print total" bytes" }'
748660546 bytes

real    0m0.221s
user    0m0.068s
sys     0m0.160s

2.

$ time echo `find -type f -print0 | xargs -0 stat --format=%s | awk '{total+=$1} END {print total}'` bytes
748660546 bytes

real    0m0.256s
user    0m0.164s
sys     0m0.196s

3.

$ time echo `find -type f -exec du -bc {} + | grep -P "\ttotal$" | cut -f1 | awk '{ total += $1 }; END { print total }'` bytes
748660546 bytes

real    0m0.553s
user    0m0.308s
sys     0m0.416s


These two also work, but they rely on commands that don't exist on Git Bash for Windows:

1.

$ time echo `find -type f -printf "%s + " | dc -e0 -f- -ep` bytes
748660546 bytes

real    0m0.233s
user    0m0.116s
sys     0m0.176s

2.

$ time echo `find -type f -printf '%s\n' | paste -sd+ | bc` bytes
748660546 bytes

real    0m0.242s
user    0m0.104s
sys     0m0.152s


If you only want the total for the current directory, then add -maxdepth 1 to find.
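
As a sketch of the -maxdepth 1 variant (GNU find assumed; the function name size_top_level is made up for illustration):

```shell
# Sum only the regular files directly inside a directory, ignoring subdirectories.
# With GNU find, -maxdepth should come before tests such as -type.
size_top_level() {
    find "$1" -maxdepth 1 -type f -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Files in subdirectories do not contribute to the result.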


Note that some of the suggested solutions don't return accurate results, so I would stick with the solutions above instead.

$ du -sbh
832M    .

$ ls -lR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'
Total: 583772525

$ find . -type f | xargs stat --format=%s | awk '{s+=$1} END {print s}'
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
4390471

$ ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'
Total 968133

#11


0  

For Win32 DOS, you can:

c:> dir /s c:\directory\you\want

and the penultimate line will tell you how many bytes the files take up.

I know this reads all files and directories, but works faster in some situations.

#12


0  

When a folder is created, many Linux filesystems allocate 4096 bytes to store some metadata about the directory itself. This space is increased by a multiple of 4096 bytes as the directory grows.

The du command (with or without the -b option) counts this space, as you can see by typing:

mkdir test && du -b test

You will get a result of 4096 bytes for an empty directory. So if you put two 10000-byte files inside it, the total reported by du -sb would be 24096 bytes.

If you read the question carefully, this is not what was asked. The questioner wanted:

the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes

which in the example above should be 20000 bytes, not 24096.

So the correct answer, IMHO, is a blend of Nelson's answer and hlovdal's suggestion for handling filenames that contain spaces:

find . -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1} END {print s}'
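
The two-file example can be reproduced with the sketch below, assuming GNU coreutils and findutils. The scratch directory and file names are made up, and the du figure depends on how the filesystem sizes directories, so only the per-file sum is predictable.

```shell
# Two 10000-byte files in a fresh directory; the names contain spaces on purpose.
demo=$(mktemp -d)
head -c 10000 /dev/zero > "$demo/file one"
head -c 10000 /dev/zero > "$demo/file two"

# du -sb also counts the directory entry itself.
du_result=$(du -sb "$demo" | cut -f1)

# Summing per-file sizes counts file contents only.
file_bytes=$(find "$demo" -type f -print0 | xargs -0 stat --format=%s |
    awk '{ s += $1 } END { print s }')

echo "du -sb: $du_result, files only: $file_bytes"
rm -rf "$demo"
```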
