I need to read and process a very large text file in PowerShell, which I am able to do using the following pattern. However, reading line by line seems inefficient to me.
$reader = [System.IO.File]::OpenText($file)
while (!$reader.EndOfStream) {
    $line = $reader.ReadLine()
    # Do something with $line
}
$reader.Close()   # release the file handle when done
So, instead of reading line by line, is it possible to read multiple lines in one go from some kind of stream object?
2 Solutions
#1
3
Why not use the built-in command for this:
Get-Content $file -ReadCount 1024 | Foreach {$_} | Where {$_ -match 'pattern'}
This reads 1024 lines at a time. Run those through a Foreach command to flatten the array of 1024 lines into single lines for processing - in this case, filtering based on a regex pattern.
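As a variation on the pipeline above, the flattening step can be skipped, because `-match` applied to an array returns the elements that match. A minimal sketch, assuming a throwaway sample file and a placeholder pattern 'error' (both made up for illustration):

```powershell
# Build a small sample file so the sketch is self-contained (hypothetical data).
$file = [System.IO.Path]::GetTempFileName()
1..5000 |
    ForEach-Object { if ($_ % 1000 -eq 0) { "error on line $_" } else { "line $_" } } |
    Set-Content -Path $file

# Each pipeline object is an array of up to 1024 lines.
# -match against an array returns the matching elements,
# so every batch is filtered in a single operation.
$hits = @(Get-Content -Path $file -ReadCount 1024 |
    ForEach-Object { $_ -match 'error' })

$hits.Count   # number of matching lines
Remove-Item -Path $file
```

Whether this is faster than flattening first depends on the file and the PowerShell version, so it is worth timing both with Measure-Command on real data.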
#2
0
You are already reading from "some kind of stream object", using a stream reader. It depends on what you want to do. If you want to process things line by line (e.g. to see whether a line contains a certain word), then what you're doing is pretty much the right way. You could read more data at once using the StreamReader.Read method: http://msdn.microsoft.com/en-us/library/9kstw824(v=vs.110).aspx You could also read all of it at once using ReadToEnd. It all depends on the level at which you want to parse things.
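The block-read approach mentioned above can be sketched as follows. This is only an illustration, assuming a temporary sample file and an arbitrary buffer of 65536 chars (both chosen for the example). Note that a block can end in the middle of a line, so line-oriented parsing would need to carry the partial line over to the next block.

```powershell
# Create a sample file so the sketch runs on its own (hypothetical data).
$file = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($file, ("some text`r`n" * 20000))

$buffer = New-Object char[] 65536   # chars per read; the size is an arbitrary choice
$totalChars = 0
$reader = [System.IO.File]::OpenText($file)
try {
    # Read returns the number of chars placed in the buffer, or 0 at end of file.
    while (($count = $reader.Read($buffer, 0, $buffer.Length)) -gt 0) {
        $chunk = New-Object string ($buffer, 0, $count)
        # Process $chunk here; it may start or end mid-line.
        $totalChars += $chunk.Length
    }
}
finally {
    $reader.Dispose()   # always release the file handle
}
$totalChars   # total number of chars read
Remove-Item -Path $file
```

Reading everything in one call with `$reader.ReadToEnd()` is simpler still, but it holds the whole file in memory, which may not be acceptable for very large files.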