I have written an application that implements a file copy, shown below. I was wondering why, when copying from one network drive to another network drive, the copy times are huge (20-30 minutes for a 300 MB file) with the following code:
public static void CopyFileToDestination(string source, string dest)
{
    _log.Debug(string.Format("Copying file {0} to {1}", source, dest));
    DateTime start = DateTime.Now;
    string destinationFolderPath = Path.GetDirectoryName(dest);
    if (!Directory.Exists(destinationFolderPath))
    {
        Directory.CreateDirectory(destinationFolderPath);
    }
    if (File.Exists(dest))
    {
        File.Delete(dest);
    }
    FileInfo sourceFile = new FileInfo(source);
    if (!sourceFile.Exists)
    {
        throw new FileNotFoundException("source = " + source);
    }
    long totalBytesToTransfer = sourceFile.Length;
    if (!CheckForFreeDiskSpace(dest, totalBytesToTransfer))
    {
        throw new ApplicationException(string.Format("Unable to copy file {0}: Not enough disk space on drive {1}.",
            source, dest.Substring(0, 1).ToUpper()));
    }
    long bytesTransferred = 0;
    using (FileStream reader = sourceFile.OpenRead())
    {
        using (FileStream writer = new FileStream(dest, FileMode.OpenOrCreate, FileAccess.Write))
        {
            byte[] buf = new byte[64 * 1024];
            int bytesRead = reader.Read(buf, 0, buf.Length);
            double lastPercentage = 0;
            while (bytesRead > 0)
            {
                double percentage = ((float)bytesTransferred / (float)totalBytesToTransfer) * 100.0;
                writer.Write(buf, 0, bytesRead);
                bytesTransferred += bytesRead;
                if (Math.Abs(lastPercentage - percentage) > 0.25)
                {
                    System.Diagnostics.Debug.WriteLine(string.Format("{0} : Copied {1:#,##0} of {2:#,##0} MB ({3:0.0}%)",
                        sourceFile.Name,
                        bytesTransferred / (1024 * 1024),
                        totalBytesToTransfer / (1024 * 1024),
                        percentage));
                    lastPercentage = percentage;
                }
                bytesRead = reader.Read(buf, 0, buf.Length);
            }
        }
    }
    System.Diagnostics.Debug.WriteLine(string.Format("{0} : Done copying", sourceFile.Name));
    _log.Debug(string.Format("{0} copied in {1:#,##0} seconds", sourceFile.Name, (DateTime.Now - start).TotalSeconds));
}
However, with a simple File.Copy, the time is as expected.
Does anyone have any insight? Could it be because we are making the copy in small chunks?
2 Answers
#1
3
Changing the size of your buf variable doesn't change the size of the buffer that FileStream.Read or FileStream.Write use when communicating with the file system. To see any change with buffer size, you have to specify the buffer size when you open the file.
As I recall, the default buffer size is 4K. Performance testing I did some time ago showed that the sweet spot is somewhere between 64K and 256K, with 64K being more consistently the best choice.
You should change your File.OpenRead() to:
new FileStream(sourceFile.FullName, FileMode.Open, FileAccess.Read, FileShare.None, BufferSize)
Change the FileShare value if you don't want exclusive access, and declare BufferSize as a constant equal to whatever buffer size you want. I use 64*1024.
Also, change the way you open your output file to:
new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize)
Note that I used FileMode.Create rather than FileMode.OpenOrCreate. If you use OpenOrCreate and the source file is smaller than the existing destination file, I don't think the file is truncated when you're done writing. So the destination file would contain extraneous data.
That said, I wouldn't expect this to change your copy time from 20-30 minutes down to the few seconds that it should take. I suppose it could if every low-level read requires a network call. With the default 4K buffer, you're making 16 read calls to the file system in order to fill your 64K buffer. So by increasing your buffer size you greatly reduce the number of OS calls (and potentially the number of network transactions) your code makes.
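Pulling the two constructor calls together, a minimal sketch of a copy loop with an explicit internal buffer size might look like this (BufferSize, the class name, and the temp-file test harness are illustrative, not from the original code):

```csharp
using System;
using System.IO;

class BufferedCopyDemo
{
    // 64K internal buffer, per the answer's performance testing; adjust to taste.
    private const int BufferSize = 64 * 1024;

    public static void Copy(string source, string dest)
    {
        // Explicit buffer size on both streams so each OS-level read/write
        // moves BufferSize bytes instead of the 4K default.
        using (FileStream reader = new FileStream(source, FileMode.Open,
            FileAccess.Read, FileShare.None, BufferSize))
        using (FileStream writer = new FileStream(dest, FileMode.Create,
            FileAccess.Write, FileShare.None, BufferSize))
        {
            byte[] buf = new byte[BufferSize];
            int bytesRead;
            while ((bytesRead = reader.Read(buf, 0, buf.Length)) > 0)
            {
                writer.Write(buf, 0, bytesRead);
            }
        }
    }

    static void Main()
    {
        // Quick self-check against a 200K temp payload.
        string src = Path.GetTempFileName();
        string dst = Path.GetTempFileName();
        File.WriteAllBytes(src, new byte[200 * 1024]);
        Copy(src, dst);
        Console.WriteLine(new FileInfo(dst).Length); // should print 204800
        File.Delete(src);
        File.Delete(dst);
    }
}
```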
Finally, there's no need to check whether a file exists before you delete it. File.Delete silently ignores an attempt to delete a file that doesn't exist.
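For instance, the existence check in the question's method can simply be dropped (a tiny sketch; the file name here is made up for illustration):

```csharp
using System;
using System.IO;

class DeleteDemo
{
    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "does-not-exist.tmp");
        File.Delete(path); // no exception even if the file is absent
        File.Delete(path); // idempotent: safe to call again
        Console.WriteLine("no exception thrown");
    }
}
```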
#2
0
Call the SetLength method on your writer stream before the actual copying; this should reduce the number of operations on the target disk.
Like so:
writer.SetLength(totalBytesToTransfer);
You may need to seek the stream's position back to the start after calling this method. Check the position of the stream after calling SetLength; it should still be zero.
writer.Seek(0, SeekOrigin.Begin); // Not sure on that one
If that is still too slow, use the Win32 SetFileValidData function.
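The SetLength suggestion above can be sketched as follows (the 128K length and temp-file path are stand-ins for the real source size and destination):

```csharp
using System;
using System.IO;

class PreallocateDemo
{
    static void Main()
    {
        string dest = Path.GetTempFileName();
        long totalBytesToTransfer = 128 * 1024; // stand-in for sourceFile.Length

        using (FileStream writer = new FileStream(dest, FileMode.Create,
            FileAccess.Write, FileShare.None))
        {
            // Preallocate the full file size before the copy loop so the
            // file system extends the file once instead of on every write.
            writer.SetLength(totalBytesToTransfer);
            Console.WriteLine(writer.Position); // position is unchanged: 0
            // ... normal read/write loop goes here; it overwrites the zeros ...
        }

        Console.WriteLine(new FileInfo(dest).Length); // 131072
        File.Delete(dest);
    }
}
```

Since Position was zero before the call, SetLength leaves it at zero, so the Seek back to the start is a harmless belt-and-braces step rather than a necessity.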