I have a program that runs as a Windows Service and processes files in a specific folder.
Since it's a service, it constantly monitors a folder for new files that have been added. Part of the program's job is to perform comparisons of files in the target folder and flag non-matching files.
What I would like to do is detect a running copy operation and when it has completed, so that a file is not prematurely flagged if its matching file has not been copied over to the target folder yet.
What I was thinking of doing was using the FileSystemWatcher to watch the target folder and see if a copy operation is occurring. If there is, I put my program's main thread to sleep until the copy operation has completed, then proceed to perform the operation on the folder like normal.
I just wanted to get some insight on this approach and see if it is valid. If anyone else has any other unique approaches to this problem, it would be greatly appreciated.
UPDATE:
I apologize for the confusion. When I say target directory, I mean the source folder containing all the files I want to process. Part of my program's function is to copy the directory structure of the source directory to a destination directory and copy all valid files to that destination directory, preserving the directory structure of the original source directory. For example, a user may copy folders containing files to the source directory. I want to prevent errors by ensuring that if a new set of folders containing more subfolders and files is copied to the source directory for processing, my program will not start operating on the target directory until the copy process has completed.
4 solutions
#1
3
What you are looking for is a typical producer/consumer scenario. What you need to do is outlined in the 'Producer/consumer queue' section on this page. This will allow you to use multithreading (maybe spawn a BackgroundWorker) to copy files, so you don't block the main service thread from listening to system events, and you can perform more meaningful tasks there, like checking for new files and updating the queue. So on the main thread, check for new files; on background threads, perform the actual copying task. From personal experience (I have implemented this kind of task), there is not much performance gain from this approach unless you are running on a multi-CPU machine, but the process is very clean and smooth, and the code is logically separated nicely.
In short, what you have to do is have an object like the following:
public class File
{
    public string FullPath {get; internal set;}
    public bool CopyInProgress {get; set;} // property to make sure
    // .. other properties if desired
}
Then, following the tutorial posted above, take a lock on the File object and on the queue to update it and copy it. With this approach you can check the object's state instead of constantly monitoring for file copy completion. The important point to realize here is that your service has only one instance of the File object per actual physical file. Just make sure you (1) lock your queue when adding and removing, and (2) lock the actual File object when initializing an update.
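To make the locking idea a bit more concrete, here is a minimal sketch of such a queue. It is not the code from the linked tutorial. The names TrackedFile and FileQueue are made up for illustration (TrackedFile stands in for the File class above, renamed so it does not clash with System.IO.File), and it assumes a single lock object guarding the queue is good enough for your service.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the File class above.
public sealed class TrackedFile
{
    public TrackedFile(string fullPath) { FullPath = fullPath; }
    public string FullPath { get; private set; }
    public bool CopyInProgress { get; set; }
}

// Minimal blocking queue guarded by one lock (assumed design, not from the tutorial).
public sealed class FileQueue
{
    private readonly object _gate = new object();
    private readonly Queue<TrackedFile> _items = new Queue<TrackedFile>();

    // Producer side: the main thread calls this when it notices a new file.
    public void Enqueue(TrackedFile file)
    {
        lock (_gate)
        {
            _items.Enqueue(file);
            Monitor.Pulse(_gate); // wake one waiting consumer
        }
    }

    // Consumer side: a background thread blocks here until an item arrives.
    public TrackedFile Dequeue()
    {
        lock (_gate)
        {
            while (_items.Count == 0)
            {
                Monitor.Wait(_gate); // releases the lock while waiting
            }
            return _items.Dequeue();
        }
    }

    public int Count
    {
        get { lock (_gate) { return _items.Count; } }
    }
}
```

A background worker would then loop on Dequeue(), take a lock on the returned TrackedFile, set CopyInProgress to true, copy the file, and set it back to false before releasing the lock.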
EDIT: Above, where I say "there is not too much performance gain from this approach unless", I am referring to running this approach on a single thread. Compared to @Jason's suggestion, this approach should be noticeably faster, because @Jason's solution performs very expensive IO operations that will fail in most cases. I haven't tested this, but I'm quite sure, as my approach requires each IO operation only once: open, stream, and close the file. @Jason's approach suggests multiple open, open, open, open operations, which will all fail except the last one.
#2
11
Yup, use a FileSystemWatcher, but instead of watching for the Created event, watch for the Changed event. After every trigger, try to open the file. Something like this:
var watcher = new FileSystemWatcher(path, filter);
watcher.Changed += (sender, e) => {
    FileStream file = null;
    try {
        Thread.Sleep(100); // hack for timing issues
        file = File.Open(
            e.FullPath,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read
        );
    }
    catch(IOException) {
        // we couldn't open the file
        // this is probably because the copy operation is not done
        // just swallow the exception
        return;
    }
    // now we have a handle to the file
    // (remember to dispose it when you are done with it)
};
watcher.EnableRaisingEvents = true; // no events are raised until this is set
This is about the best that you can do, unfortunately. There is no clean way to know that the file is ready for you to use.
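If you want the same "try to open it" check outside of the event handler, it can be wrapped in a small helper that polls with a timeout. This is only a sketch: the class name FileReadiness and both method names are made up, and the timeout and poll interval are values you would have to tune for your own files.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper; the names are not from any library.
public static class FileReadiness
{
    // True if the file can be opened for reading while denying writers.
    public static bool CanOpenForRead(string path)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // Sharing violation (someone is still writing), or the file
            // does not exist yet (FileNotFoundException is an IOException).
            return false;
        }
    }

    // Polls CanOpenForRead until it succeeds or the timeout elapses.
    public static bool WaitUntilReadable(string path, TimeSpan timeout, TimeSpan pollInterval)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            if (CanOpenForRead(path)) return true;
            if (DateTime.UtcNow >= deadline) return false;
            Thread.Sleep(pollInterval);
        }
    }
}
```

Something like FileReadiness.WaitUntilReadable(e.FullPath, TimeSpan.FromSeconds(30), TimeSpan.FromMilliseconds(250)) could then be called from a worker thread. It is still the same heuristic as above, so a writer that closes and reopens the file between chunks can fool it.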
#3
2
One approach is to attempt to open the file and see if you get an error. The file will be locked if it is being copied. The call below opens the file with read sharing only, so it will conflict with a writer that already has the file open:
using(System.IO.File.Open("file", FileMode.Open,FileAccess.Read, FileShare.Read)) {}
Another is to check the file size. It would change over time if the file is being copied to.
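A rough sketch of the file size check could look like the following. The class and method names (CopyDetection, IsSizeStable) are made up, and the interval and number of checks are guesses to tune. It is only a heuristic: if the copying tool sets the final file size up front, the length would look stable before the data has actually arrived.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper; the names are not from any library.
public static class CopyDetection
{
    // True if the file length is unchanged across `checks` consecutive
    // samples taken `interval` apart. Throws if the file does not exist.
    public static bool IsSizeStable(string path, TimeSpan interval, int checks)
    {
        long last = new FileInfo(path).Length;
        for (int i = 0; i < checks; i++)
        {
            Thread.Sleep(interval);
            long current = new FileInfo(path).Length; // fresh FileInfo, so no cached length
            if (current != last)
            {
                return false; // still growing (or shrinking)
            }
            last = current;
        }
        return true;
    }
}
```

Combining this with the open attempt above is probably safer than relying on either check alone.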
It is also possible to get a list of all applications that have opened a certain file, but I don't know the API for this.
#4
1
I know this is an old question, but here's an answer I spun up after searching for an answer to just this problem. This had to be tweaked a lot to remove some of the proprietary-ness from what I was working on, so this may not compile directly, but it'll give you an idea. This is working great for me:
void BlockingFileCopySync(FileInfo original, FileInfo copyPath)
{
    FileSystemWatcher watcher = new FileSystemWatcher();
    watcher.NotifyFilter = NotifyFilters.LastWrite;
    watcher.Path = copyPath.Directory.FullName;
    watcher.Filter = "*" + copyPath.Extension;
    watcher.EnableRaisingEvents = true;
    bool fileReady = false;
    bool firsttime = true;
    DateTime previousLastWriteTime = new DateTime();
    // modify this as you think you need to...
    int waitTimeMs = 100;
    watcher.Changed += (sender, e) =>
    {
        // Get the time the file was modified
        // Check it again in 100 ms
        // When it has gone a while without modification, it's done.
        while (!fileReady)
        {
            // We need to initialize for the "first time",
            // ie. when the file was just created.
            // (Really, this could probably be initialized off the
            // time of the copy now that I'm thinking of it.)
            if (firsttime)
            {
                previousLastWriteTime = System.IO.File.GetLastWriteTime(copyPath.FullName);
                firsttime = false;
                System.Threading.Thread.Sleep(waitTimeMs);
                continue;
            }
            DateTime currentLastWriteTime = System.IO.File.GetLastWriteTime(copyPath.FullName);
            bool fileModified = (currentLastWriteTime != previousLastWriteTime);
            if (fileModified)
            {
                previousLastWriteTime = currentLastWriteTime;
                System.Threading.Thread.Sleep(waitTimeMs);
                continue;
            }
            else
            {
                fileReady = true;
                break;
            }
        }
    };
    System.IO.File.Copy(original.FullName, copyPath.FullName, true);
    // This guy here chills out until the filesystemwatcher
    // tells him the file isn't being written to anymore.
    while (!fileReady)
    {
        System.Threading.Thread.Sleep(waitTimeMs);
    }
    watcher.Dispose(); // stop watching once the copy has settled
}