JavaScript: using FileReader() to read line by line

Time: 2020-12-30 23:27:29

This question is close but not quite close enough.

My HTML5 application reads a CSV file (although it applies to text as well) and displays some of the data on screen.

The problem I have is that the CSV files can be huge (I've managed to get the business to agree to a 1 GB file size limit). The good news is that I only need to display some of the data from the CSV file at any one time.

The idea is something like this (pseudocode):

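// Pseudocode: OpenReader, hasLinesRemaining and currentLine do not exist in any browser API
var content;
var reader = OpenReader(myCsvFile);
var line = 0;

while (reader.hasLinesRemaining) {
    if (line % 10 == 1)
        content = currentLine;  // keep only every tenth line
    line++;                     // loop to next line
}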

There are plenty of articles about how to read a CSV file. I'm using:

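function openCSVFile(csvFileName){
    // Despite its name, csvFileName must be a File (or Blob) object,
    // for example one taken from an <input type="file"> element.
    var r = new FileReader();
    r.onload = function(e) {
        // e.target.result holds the whole file as a single string
        var contents = e.target.result;
        var s = "";
    };
    r.readAsText(csvFileName);
}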

but I can't see how to read one line at a time in JavaScript, or even whether it's possible.

My CSV data looks like

Some detail: date, ,
More detail: time, ,
val1, val2
val11, val12
#val11, val12
val21, val22

I need to strip out the first 2 lines, and also decide what to do with lines starting with a # (which is why I need to read through the file one line at a time).

So, other than loading the lot into memory, do I have any options for reading one line at a time?

1 solution

#1


4  

There is no readLine() method to do this as of now. However, some ideas to explore:

  • Reading from a blob does fire progress events. While it is not required by the specification, the engine might prematurely populate the .result property similar to an XMLHttpRequest.
  • The Streams API drafts a streaming .read(size) method for file readers. I don't think it is implemented anywhere yet, though.
  • Blobs do have a slice method which returns a new Blob containing a part of the original data. The spec and the synchronous nature of the operation suggest that this is done via references, not copying, and should be quite performant. This would allow you to read the huge file chunk-by-chunk.

Admittedly, none of these methods stops automatically at line endings. You will need to buffer the chunks manually, break them into lines and shift the lines out once they are complete. Also, these operations work on bytes, not on characters, so a multi-byte character may be split across two chunks and has to be handled when decoding.
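
The slice-and-buffer idea could be sketched roughly as follows. This is only an illustration under some assumptions: the names createLineSplitter, readLines and chunkSize are made up for this example, the file is assumed to be UTF-8, and the runtime is assumed to support Blob.arrayBuffer() and TextDecoder (FileReader.readAsArrayBuffer on each slice would be the older equivalent).

```javascript
// Sketch only: createLineSplitter and readLines are illustrative names,
// not part of any browser API.

// Turns a sequence of byte chunks into complete lines.
function createLineSplitter(encoding = "utf-8") {
    const decoder = new TextDecoder(encoding);
    let buffer = ""; // text received after the last line break

    return {
        // { stream: true } makes the decoder hold back a multi-byte sequence
        // that is cut off at the end of a chunk until the next chunk arrives.
        push(bytes) {
            buffer += decoder.decode(bytes, { stream: true });
            const lines = buffer.split("\n");
            buffer = lines.pop(); // the last piece may be an incomplete line
            return lines.map((line) => line.replace(/\r$/, "")); // tolerate CRLF
        },
        // Call once after the last chunk to get a final unterminated line.
        flush() {
            buffer += decoder.decode(); // flush bytes still held by the decoder
            const rest = buffer.replace(/\r$/, "");
            buffer = "";
            return rest.length > 0 ? [rest] : [];
        },
    };
}

// Reads a Blob/File slice by slice and yields one line at a time.
async function* readLines(blob, chunkSize = 64 * 1024) {
    const splitter = createLineSplitter();
    for (let offset = 0; offset < blob.size; offset += chunkSize) {
        // Only this byte range is read here, not the whole file.
        const bytes = await blob.slice(offset, offset + chunkSize).arrayBuffer();
        yield* splitter.push(bytes);
    }
    yield* splitter.flush();
}
```

With something like this, the asker's loop becomes a for await (const line of readLines(file)) loop that counts lines, skips the first two, and decides per line what to do with those starting with #. Only one chunk plus the current partial line is held in memory at a time.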

See also: Reading line-by-line file in JavaScript on client side
