What is the best/fastest way to export a large amount of data from C# to Excel?

Time: 2022-12-24 17:00:21

I have code that uses the OpenXML library to export data.

I have 20,000 rows and 22 columns, and it takes ages (about 10 minutes).

Is there a faster way to export data from C# to Excel? I am doing this from an ASP.NET MVC app, and many people's browsers are timing out.

7 solutions

#1


2  

Assuming 20,000 rows and 22 columns at about 100 bytes per cell, the raw data alone comes to roughly 44 million bytes (about 42 MiB). Add the XML tags and the formatting, and I'd say you end up zipping around 100 MB of data (an .xlsx file is nothing but several zipped XML files).

Of course this takes a while, and so does fetching the data. I recommend you use EPPlus (Excel Package Plus) instead of the Office OpenXML development kit: http://epplus.codeplex.com/

There's probably a bug or performance issue in the written-in-a-hurry-and-hope-it-doesn't-blow-up-too-soon Microsoft code.
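For illustration, an EPPlus export might look roughly like the sketch below. It assumes the data is already in a `DataTable` and that the EPPlus 4.x-style API is in use; EPPlus 5 and later require a license to be configured before first use. The class name `EpplusExport` and the sheet name are made up for this example.

```csharp
using System.Data;
using OfficeOpenXml;

public static class EpplusExport
{
    // Hypothetical helper: turns a DataTable into the bytes of an .xlsx file.
    public static byte[] ToXlsx(DataTable table)
    {
        using (var package = new ExcelPackage())
        {
            ExcelWorksheet sheet = package.Workbook.Worksheets.Add("Data");

            // Load the whole table in one call; "true" writes the column
            // names as a header row.
            sheet.Cells["A1"].LoadFromDataTable(table, true);

            return package.GetAsByteArray();
        }
    }
}
```

In an MVC action the returned bytes could be handed to `File(bytes, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "export.xlsx")`.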

#2


1  

CSV. It is a plain text file, but it can be opened by any version of Excel.

It is without doubt an easier way to export data to Excel, and a lot of websites offer data export as CSV.

All you need to do is put a comma (,) between the values and a line break between the records. Building the CSV file takes no extra resources, so it is quite fast.
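One detail the description above skips: a value that itself contains a comma, a double quote, or a line break has to be quoted, otherwise Excel splits it into the wrong columns. A minimal sketch using only the base class library; the class and method names are made up for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvExport
{
    // Quote a field only when it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    public static string Escape(string field)
    {
        if (field == null) return string.Empty;
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Write one record per line, row by row, so the whole file is never
    // built up in memory first.
    public static void Write(TextWriter writer, IEnumerable<IEnumerable<string>> rows)
    {
        foreach (IEnumerable<string> row in rows)
        {
            writer.Write(string.Join(",", row.Select(f => Escape(f))));
            writer.Write("\r\n");
        }
    }
}
```

In an ASP.NET MVC action, `writer` could be `Response.Output` with the content type set to `text/csv`, so rows go to the browser as they are produced.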

#3


1  

I wound up using an open source solution called ClosedXML, which worked great.
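The answer does not show any code. As a rough sketch of what a ClosedXML export could look like, assuming the data is available as a `DataTable` (the class name `ClosedXmlExport` and the sheet name are hypothetical):

```csharp
using System.Data;
using System.IO;
using ClosedXML.Excel;

public static class ClosedXmlExport
{
    // Hypothetical helper: turns a DataTable into the bytes of an .xlsx file.
    public static byte[] ToXlsx(DataTable table)
    {
        using (var workbook = new XLWorkbook())
        using (var stream = new MemoryStream())
        {
            // Adds a worksheet named "Data" filled from the table,
            // including a header row built from the column names.
            workbook.Worksheets.Add(table, "Data");
            workbook.SaveAs(stream);
            return stream.ToArray();
        }
    }
}
```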

#4


0  

Depending on which version of Excel you are targeting, you could expose the data as an OData service, which Excel 2010 can consume natively. Excel then handles the downloading and formatting for you.

#5


0  

I am assuming that this data needs to be sent to the client in full and has already been pre-filtered in some fashion, but still has to go back to the person who made the request.

In this case, you want to perform this particular operation 'asynchronously'. I'm not sure if this would fit your workflow, but say that a person requests this large XML-formatted document, I would: a) queue another worker thread to kick off the generation of the document while returning a 'token' (perhaps a GUID) to the requester; b) return a link to a page where the requester can click on the link (passing the token), allowing the page to look up the results.

When the thread has finished processing the document, it places it in a special folder under a unique name and adds the token to a database table together with the document's location. If the person requests that page and the token exists in the database and the document exists on the file system, they are allowed to click and download it over HTTP. If it does not exist, they are told either that it does not exist or to wait for the results. (This message can be based on the time the request was received.)

If the person downloads the document successfully (and you can do this through script), you can remove the database entry for the document with that token and delete the file from the file system.
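The token workflow above could be sketched as follows. This is only an illustration under a few assumptions: an in-memory `ConcurrentDictionary` stands in for the database table, the temp directory stands in for the special folder, and the names `ExportJobs`, `Start`, `TryGetResult`, and `Remove` are made up. In a real ASP.NET app, work started with `Task.Run` can be lost when the application pool recycles, so a more durable queue may be needed.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public static class ExportJobs
{
    // token -> path of the finished file (stand-in for the database table).
    private static readonly ConcurrentDictionary<Guid, string> Finished =
        new ConcurrentDictionary<Guid, string>();

    // Step a): start generation in the background and return the token at once.
    public static Guid Start(Action<string> generateFile, out Task job)
    {
        Guid token = Guid.NewGuid();
        string path = Path.Combine(Path.GetTempPath(), token.ToString("N") + ".xlsx");
        job = Task.Run(() =>
        {
            generateFile(path);
            Finished[token] = path; // publish only after the file is complete
        });
        return token;
    }

    // Step b): the results page calls this with the token from the link.
    public static bool TryGetResult(Guid token, out string path)
    {
        return Finished.TryGetValue(token, out path) && File.Exists(path);
    }

    // After a successful download: drop the entry and delete the file.
    public static void Remove(Guid token)
    {
        string path;
        if (Finished.TryRemove(token, out path) && File.Exists(path))
        {
            File.Delete(path);
        }
    }
}
```

A controller would call `Start` from the export action and return the token, then call `TryGetResult` from the results page to decide between "please wait" and a download link.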

I hope I read this question correctly.

#6


0  

I have found that I can speed up exporting data from a database into an Excel spreadsheet by limiting the number of export operations. By accumulating 100 lines of data before each write, the creation speed increased by a factor of at least 5 to 10.
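The answer does not say which API it batched against, so the following is only a generic sketch of the accumulate-then-write idea: a helper that groups any row sequence into chunks of a fixed size, so the expensive write call runs once per chunk instead of once per row. The names `Batching` and `InBatches` are made up.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Groups a sequence into lists of up to batchSize items, in order.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

Usage would be along the lines of `foreach (var batch in Batching.InBatches(rows, 100)) WriteBatch(batch);`, where `WriteBatch` is whatever bulk write the export library offers.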

#7


0  

The most common mistake when exporting data is in the workflow:

  • Build Model
  • Build XML DOM
  • Save XML DOM to file

This workflow creates overhead because building the XML DOM takes time, the XML DOM is kept in memory together with the Model, and only then is the whole bunch of data written to a file.

A better way to handle this is to convert your model entry by entry directly to the target format and write each entry straight to a (buffered) file.

A format with low overhead that is fast to write and readable by Excel is CSV (OK, it's legacy, it's awkward...).
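The answer suggests CSV, but if a real .xlsx file is required, the same entry-by-entry idea can also be applied with the Open XML SDK's forward-only `OpenXmlWriter`, which writes each row to the package as it is produced instead of building the DOM first. A minimal sketch, assuming the `DocumentFormat.OpenXml` package and string-only cells; the class name `StreamingXlsx` and the sheet name are made up.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class StreamingXlsx
{
    public static void Write(string path, IEnumerable<IReadOnlyList<string>> rows)
    {
        using (SpreadsheetDocument doc =
            SpreadsheetDocument.Create(path, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart workbookPart = doc.AddWorkbookPart();
            WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

            // Stream the sheet XML element by element; no Row or Cell
            // objects are kept once they have been written.
            using (OpenXmlWriter writer = OpenXmlWriter.Create(worksheetPart))
            {
                writer.WriteStartElement(new Worksheet());
                writer.WriteStartElement(new SheetData());
                foreach (IReadOnlyList<string> row in rows)
                {
                    writer.WriteStartElement(new Row());
                    foreach (string value in row)
                    {
                        writer.WriteElement(new Cell
                        {
                            DataType = CellValues.String,
                            CellValue = new CellValue(value ?? string.Empty)
                        });
                    }
                    writer.WriteEndElement(); // Row
                }
                writer.WriteEndElement(); // SheetData
                writer.WriteEndElement(); // Worksheet
            }

            // The workbook part is tiny, so building it as a DOM is fine.
            workbookPart.Workbook = new Workbook(
                new Sheets(new Sheet
                {
                    Id = workbookPart.GetIdOfPart(worksheetPart),
                    SheetId = 1U,
                    Name = "Data"
                }));
            workbookPart.Workbook.Save();
        }
    }
}
```

Because `rows` is an `IEnumerable`, it can be fed straight from a data reader, so neither the model nor the sheet XML has to be held in memory as a whole.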
