Check file size without opening the file in C++?

Date: 2022-12-05 20:49:45

I'm trying to get the size of a large file (12 GB+), and I don't want to open the file to do so, as I assume this would eat a lot of resources. Is there a good API for this? I'm in a Windows environment.

5 solutions

#1


41  

You should call GetFileSizeEx which is easier to use than the older GetFileSize. You will need to open the file by calling CreateFile but that's a cheap operation. Your assumption that opening a file is expensive, even a 12GB file, is false.

You could use the following function to get the job done:

__int64 FileSize(const wchar_t* name)
{
    HANDLE hFile = CreateFile(name, GENERIC_READ, 
        FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 
        FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile==INVALID_HANDLE_VALUE)
        return -1; // error condition, could call GetLastError to find out more

    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size))
    {
        CloseHandle(hFile);
        return -1; // error condition, could call GetLastError to find out more
    }

    CloseHandle(hFile);
    return size.QuadPart;
}

There are other API calls that will return you the file size without forcing you to create a file handle, notably GetFileAttributesEx. However, it's perfectly plausible that this function will just open the file behind the scenes.

__int64 FileSize(const wchar_t* name)
{
    WIN32_FILE_ATTRIBUTE_DATA fad;
    if (!GetFileAttributesEx(name, GetFileExInfoStandard, &fad))
        return -1; // error condition, could call GetLastError to find out more
    LARGE_INTEGER size;
    size.HighPart = fad.nFileSizeHigh;
    size.LowPart = fad.nFileSizeLow;
    return size.QuadPart;
}
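
To make the two 32-bit halves concrete, here is a small portable sketch (plain C++, not Win32 code) of the same combination done by hand. The helper name `combine_size` is made up for illustration; it mirrors what assigning `HighPart` and `LowPart` and then reading `QuadPart` achieves.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Win32): joins the high and low 32-bit
// halves of a file size, as stored in nFileSizeHigh / nFileSizeLow, into
// one 64-bit value.
constexpr std::uint64_t combine_size(std::uint32_t high, std::uint32_t low)
{
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// A file of exactly 12 GiB is 3 * 2^32 bytes, so high = 3 and low = 0.
static_assert(combine_size(3, 0) == 12884901888ULL, "12 GiB = 3 * 2^32");

// The low half alone tops out just under 4 GiB, which is why the high
// half is needed for large files.
static_assert(combine_size(0, 0xFFFFFFFFu) == 4294967295ULL, "just under 4 GiB");

// Runtime variant of the same kind of check, active in debug builds.
inline void check_combine_size()
{
    assert(combine_size(1, 0) == 4294967296ULL);
}
```

For the 12 GB file in the question, ignoring the high half would report a size that is wrong by a multiple of 4 GiB, so both fields always have to be read.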

If you are compiling with Visual Studio and want to avoid calling Win32 APIs then you can use _wstat64.

Here is a _wstat64 based version of the function:

__int64 FileSize(const wchar_t* name)
{
    __stat64 buf;
    if (_wstat64(name, &buf) != 0)
        return -1; // error, could use errno to find out more

    return buf.st_size;
} 

If performance ever became an issue for you then you should time the various options on all the platforms that you target in order to reach a decision. Don't assume that the APIs that don't require you to call CreateFile will be faster. They might be but you won't know until you have timed it.
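
A minimal timing sketch along those lines, assuming wall-clock time from `std::chrono::steady_clock` is precise enough for the comparison. The helper name `average_ns` is hypothetical, and the callable passed in stands for whichever `FileSize` variant is being measured.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `query` `iterations` times and returns the
// average wall-clock time per call in nanoseconds.
template <typename Query>
double average_ns(Query query, int iterations)
{
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i)
        query();
    const auto stop = clock::now();

    const auto total =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return static_cast<double>(total.count()) / iterations;
}
```

It could be called as `average_ns([&] { FileSize(name); }, 1000)` once per variant. The first call for a given file is likely to be slower than later ones because of file system caching, so it may be worth running one untimed call first or reporting it separately.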

#2


24  

I've also lived with the fear of the price paid for opening a file and closing it just to get its size, so I decided to ask the performance counter and see how expensive the operations really are.

This is the number of cycles it took to execute one file size query on the same file with each of the three methods. I tested on two files, 150 MB and 1.5 GB, and got +/- 10% fluctuations, so the results don't seem to be affected by the actual file size. (Obviously this depends on the CPU, but it gives you a good vantage point.)

  • 190 cycles - CreateFile, GetFileSizeEx, CloseHandle
  • 40 cycles - GetFileAttributesEx
  • 150 cycles - FindFirstFile, FindClose

The gist with the code used is available here.

As we can see from this highly scientific :) test, the slowest is actually the file opener. The second slowest is the file finder, and the winner is the attributes reader. In terms of reliability, CreateFile should be preferred over the other two. But I still don't like the concept of opening a file just to read its size... Unless I'm doing size-critical stuff, I'll go for the attributes.

PS: When I have time, I'll try reading the sizes of files that are open and being written to. But not right now...

#3


9  

Another option is to use the FindFirstFile function:

#include "stdafx.h"
#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int _tmain(int argc, _TCHAR* argv[])
{
   WIN32_FIND_DATA FindFileData;
   HANDLE hFind;
   // TEXT() keeps the literal consistent with LPCTSTR in both ANSI and Unicode builds
   LPCTSTR lpFileName = TEXT("C:\\Foo\\Bar.ext");

   hFind = FindFirstFile(lpFileName, &FindFileData);
   if (hFind == INVALID_HANDLE_VALUE)
   {
      // GetLastError returns a DWORD, so print it with %lu
      printf("File not found (%lu)\n", GetLastError());
      return -1;
   }
   else
   {
      // Combine the two 32-bit halves into one 64-bit size
      ULONGLONG FileSize = FindFileData.nFileSizeHigh;
      FileSize <<= sizeof(FindFileData.nFileSizeHigh) * 8;
      FileSize |= FindFileData.nFileSizeLow;
      // %llu is needed for a 64-bit value; %u would truncate sizes of 4 GB and above
      _tprintf(TEXT("file size is %llu\n"), FileSize);
      FindClose(hFind);
   }
   return 0;
}

#4


1  

What about GetFileSize function?

#5


0  

As of C++17, there is file_size as part of the standard library. (Then the implementor gets to decide how to do it efficiently!)
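
A minimal sketch of that approach, assuming a C++17 compiler and the non-throwing overload of `std::filesystem::file_size`. The names `FileSizeStd` and `SelfCheck` and the scratch file name are made up for illustration; the -1 return on error follows the convention of the `FileSize` functions in the first answer.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

// Returns the size of the file at `path` in bytes, or -1 on error.
std::int64_t FileSizeStd(const std::filesystem::path& path)
{
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec)
        return -1; // ec.message() describes what went wrong
    return static_cast<std::int64_t>(size);
}

// Self-check: writes a small scratch file and confirms the reported size.
// The file name is arbitrary and only used for this demonstration.
void SelfCheck()
{
    const auto path =
        std::filesystem::temp_directory_path() / "file_size_demo.bin";
    {
        std::ofstream out(path, std::ios::binary);
        out << "hello"; // 5 bytes
    } // leaving the scope closes the stream and flushes the data

    assert(FileSizeStd(path) == 5);

    std::filesystem::remove(path);
    assert(FileSizeStd(path) == -1); // the file no longer exists
}
```

The overload taking a `std::error_code` is used here so that a missing file is reported through the return value; the one-argument overload throws `std::filesystem::filesystem_error` instead.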
