Reading from a growing file in C#?

Time: 2021-11-27 21:23:20

In C#/.NET (on Windows), is there a way to read a "growing" file using a FileStream? The file will be very small when the FileStream is opened, but another thread will keep writing to it. If/when the FileStream "catches up" with the other thread (i.e. when Read() returns 0 bytes read), I want to pause to allow the file to buffer a bit, then continue reading.

I don't really want to use a FileSystemWatcher and keep creating new file streams (as was suggested for log files), since this isn't a log file (it's a video file being encoded on the fly) and performance is an issue.

Thanks,
Robert

4 solutions

#1


6  

You can do this, but you need to keep careful track of the file read and write positions using Stream.Seek, with appropriate synchronization between the threads. Typically you would use an EventWaitHandle (or a subclass of it) to signal that data is available, and you would also need to synchronize access to the FileStream object itself (probably via a lock statement).

Update: In answering this question I implemented something similar, for a situation where a file was being downloaded in the background and uploaded at the same time. I used memory buffers, and posted a gist which has working code. (It's GPL, but that might not matter for you; in any case you can use the principles to do your own thing.)
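
To make the Seek-plus-synchronization idea concrete, here is a minimal sketch. It is not the code from the gist mentioned above. The class name `SharedGrowingFile` and its members are made up for illustration, and it assumes the writer and reader live in the same process and share one FileStream.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical wrapper: one FileStream shared by a writer thread and a reader thread.
public sealed class SharedGrowingFile : IDisposable
{
    private readonly FileStream _stream;
    private readonly object _gate = new object();              // guards the FileStream and both positions
    private readonly AutoResetEvent _dataAvailable = new AutoResetEvent(false);
    private long _readPos;
    private long _writePos;

    public SharedGrowingFile(string path)
    {
        _stream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.Read);
    }

    // Called by the writer thread: append at the tracked write position, then signal the reader.
    public void Write(byte[] data, int offset, int count)
    {
        lock (_gate)
        {
            _stream.Seek(_writePos, SeekOrigin.Begin);
            _stream.Write(data, offset, count);
            _writePos += count;
        }
        _dataAvailable.Set();
    }

    // Called by the reader thread: returns bytes read, or 0 if nothing new arrived before the timeout.
    public int Read(byte[] buffer, int offset, int count, int timeoutMs)
    {
        while (true)
        {
            lock (_gate)
            {
                long available = _writePos - _readPos;
                if (available > 0)
                {
                    _stream.Seek(_readPos, SeekOrigin.Begin);
                    int n = _stream.Read(buffer, offset, (int)Math.Min(count, available));
                    _readPos += n;
                    return n;
                }
            }
            // Caught up with the writer: wait for a signal instead of spinning.
            if (!_dataAvailable.WaitOne(timeoutMs)) return 0;
        }
    }

    public void Dispose()
    {
        _stream.Dispose();
        _dataAvailable.Dispose();
    }
}
```

The lock is held only around the Seek and the actual I/O, so the reader never sleeps while holding it. A stale signal just causes one extra pass through the loop.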

#2


4  

The way I solved this was with the DirectoryWatcher / FileSystemWatcher class: when it triggers on the file you want, you open a FileStream and read it to the end. When I'm done reading, I save the position of the reader, so the next time the DirectoryWatcher / FileSystemWatcher triggers I open a stream and set the position to where I was last time.

Calling FileStream.Length is actually very slow. I have had no performance issues with my solution (I was reading a "log" ranging from about 10 MB to 50-ish MB).

To me the solution I describe is very simple and easy to maintain; I would try it and profile it. I don't think you're going to get any performance issues from it. I do this while people are playing a multi-threaded game that takes their entire CPU, and nobody has complained that my parser is more demanding than the competing parsers.
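
A rough sketch of that approach, in case it helps. The `FileTail` class and its method names are invented for this example, and whether the watcher fires often enough for your writer is something you would have to test.

```csharp
using System;
using System.IO;

// Hypothetical helper: remembers where the last read stopped and returns only newly appended bytes.
public sealed class FileTail
{
    private readonly string _path;
    private readonly object _gate = new object();
    private long _position; // saved reader position between notifications

    public FileTail(string path)
    {
        _path = path;
    }

    // Opens a fresh stream, seeks to the saved position, reads to the end, saves the new position.
    public byte[] ReadNew()
    {
        lock (_gate)
        {
            using (var fs = new FileStream(_path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            using (var ms = new MemoryStream())
            {
                fs.Seek(_position, SeekOrigin.Begin);
                fs.CopyTo(ms);
                _position = fs.Position;
                return ms.ToArray();
            }
        }
    }

    // Wires ReadNew to a FileSystemWatcher; the caller should dispose the returned watcher.
    public FileSystemWatcher Watch(Action<byte[]> onData)
    {
        string fullPath = Path.GetFullPath(_path);
        var watcher = new FileSystemWatcher(Path.GetDirectoryName(fullPath), Path.GetFileName(fullPath));
        watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size;
        watcher.Changed += (sender, e) =>
        {
            byte[] data = ReadNew();
            if (data.Length > 0) onData(data);
        };
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

Note that ReadNew never touches FileStream.Length; it simply copies until Read returns 0 and records where it ended up.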

#3


4  

This worked for me with a StreamReader wrapped around the file, using the following steps:

  1. In the program that writes to the file, open it with read sharing, like this:

    // "out" is a reserved word in C#, so the writer needs a different name.
    var writer = new StreamWriter(File.Open("logFile.txt", FileMode.OpenOrCreate, FileAccess.Write, FileShare.Read));
    
  2. In the program that reads the file, open it with read-write sharing, like this:

    using (FileStream fileStream = File.Open("logFile.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    using (var file = new StreamReader(fileStream))
    
  3. Before accessing the input stream, check whether the end has been reached, and if so, wait a while.

    while (file.EndOfStream)
    {
        Thread.Sleep(5); // back off briefly until the writer appends more data
    }
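
For binary data such as the video file in the question, the same three steps can be sketched directly on a FileStream instead of a StreamReader. The names `GrowingFileReader`, `OpenShared` and `ReadOrWait`, and the poll/timeout parameters, are assumptions made for this example only.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

public static class GrowingFileReader
{
    // Step 2 from above: open for reading while still allowing another handle to write.
    public static FileStream OpenShared(string path)
    {
        return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
    }

    // Step 3 from above: if Read returns 0 we have caught up, so sleep and retry.
    // Returns the number of bytes read, or 0 if nothing arrived within maxWaitMs.
    public static int ReadOrWait(Stream stream, byte[] buffer, int pollMs, int maxWaitMs)
    {
        var waited = Stopwatch.StartNew();
        while (true)
        {
            int n = stream.Read(buffer, 0, buffer.Length);
            if (n > 0) return n;
            if (waited.ElapsedMilliseconds >= maxWaitMs) return 0;
            Thread.Sleep(pollMs);
        }
    }
}
```

The same stream is reused for every call, so there is no cost of reopening the file. A return value of 0 only means "no data yet"; the caller still needs some other way to know that the writer has really finished.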
    

#4


2  

One other thing that might be useful is that the FileStream class has a property called ReadTimeout, which is defined as:

Gets or sets a value, in milliseconds, that determines how long the stream will attempt to read before timing out. (Inherited from Stream)

This could be useful in that when your reads catch up with your writes, the thread performing the reads may pause while the write buffer gets flushed. It would certainly be worth writing a small test to see whether this property helps your cause in any way. Check CanTimeout first, because ReadTimeout throws an InvalidOperationException on streams that do not support timeouts.
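
A small test along those lines could look like the sketch below. `TimeoutProbe` is a made-up helper name. As far as I know a plain FileStream reports CanTimeout as false, in which case this property will not help here, but it is cheap to verify on your own setup.

```csharp
using System;
using System.IO;

public static class TimeoutProbe
{
    // Reports whether the stream supports read timeouts, without triggering the exception.
    public static bool SupportsReadTimeout(Stream stream, out string detail)
    {
        if (!stream.CanTimeout)
        {
            // Touching ReadTimeout here would throw InvalidOperationException.
            detail = stream.GetType().Name + " does not support timeouts";
            return false;
        }
        detail = "ReadTimeout = " + stream.ReadTimeout + " ms";
        return true;
    }
}
```

Calling it once on the opened FileStream and printing `detail` is enough to decide whether this route is worth pursuing.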

Are the read and write operations happening on the same object? If so, you could write your own abstraction over the file and add cross-thread communication code so that the thread performing the writes notifies the thread performing the reads when it is done. That way the reading thread knows when to stop once it reaches EOF.
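
One possible shape for that notification, as a sketch only: the writer sets a ManualResetEventSlim after its final flush, and the reader treats a 0-byte read as the real end of file only if the event was already set before that read. `SignalledReader` and `ReadUntilWriterDone` are hypothetical names, and the sketch assumes the writer flushes before signalling.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class SignalledReader
{
    // Reads the whole file while it is still being written, stopping only after the writer has signalled.
    public static byte[] ReadUntilWriterDone(string path, ManualResetEventSlim writerDone, int pollMs)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var result = new MemoryStream())
        {
            var buffer = new byte[4096];
            while (true)
            {
                // Sample the flag BEFORE reading: if it was already set, a 0-byte read is the true EOF.
                bool done = writerDone.IsSet;
                int n = fs.Read(buffer, 0, buffer.Length);
                if (n > 0)
                {
                    result.Write(buffer, 0, n);
                    continue;
                }
                if (done) break;
                // Caught up: wait for either the poll interval or the completion signal.
                writerDone.Wait(pollMs);
            }
            return result.ToArray();
        }
    }
}
```

Sampling the flag before the read avoids the race where the writer appends its last bytes and signals between the reader's 0-byte read and its check of the flag.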
