When parsing Excel files in C#, cells seem to get cut off at 255 characters... how do I stop this?

Time: 2021-06-23 05:54:39

I am parsing uploaded Excel files (xlsx) in ASP.NET with C#. I am using the following code (simplified):

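// Requires: using System.Data; using System.Data.OleDb; using System.Linq;
// Plain concatenation: string.Format with no arguments would throw if the path contained braces.
string connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\";";
DataSet ds = new DataSet();
// The using block disposes the adapter even if Fill throws.
using (OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connString))
{
    adapter.Fill(ds);
}
DataTable dt = ds.Tables[0];
// Third column (index 2) holds the long description text.
var rows = from p in dt.AsEnumerable() select new { desc = p[2] };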

This works perfectly, but if there is anything longer than 255 characters in the cell, it will get cut off. Any idea what I am doing wrong? Thank you.

EDIT: When viewing the excel sheet, it shows much more than 255 characters, so I don't believe the sheet itself is limited.

6 Answers

#1


0  

Just from a quick Googling of the subject, it appears that that's a limit of Excel.

EDIT: Possible workaround (unfortunately in VB)

#2


15  

The Solution!

I've been battling this today as well. I finally got it to work by modifying some registry keys before parsing the Excel spreadsheet.

You must update this registry key before parsing the Excel spreadsheet:

// Excel 2010
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\
or
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\

// Excel 2007
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\

// Excel 2003
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\

Change TypeGuessRows to 0 and ImportMixedTypes to Text under this key. You'll also need to update your connection string to include IMEX=1 in the extended properties:

string connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";";
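
As a rough sketch of the connection-string change above, a small helper can assemble the string so that the IMEX=1 flag is not forgotten. The class and method names (ExcelConnection.Build) and the parameter names are hypothetical and only for illustration; the provider and extended-property values are the ones used in this answer.

```csharp
using System;

public static class ExcelConnection
{
    // Hypothetical helper: builds an ACE OLEDB connection string for an .xlsx file.
    // importAsText adds IMEX=1, as recommended in this answer.
    public static string Build(string fileLocation, bool hasHeader = true, bool importAsText = true)
    {
        if (string.IsNullOrWhiteSpace(fileLocation))
            throw new ArgumentException("A file path is required.", nameof(fileLocation));

        string extended = "Excel 12.0 Xml;HDR=" + (hasHeader ? "YES" : "NO");
        if (importAsText)
            extended += ";IMEX=1";

        // Concatenate instead of string.Format so braces in the path are harmless.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation +
               ";Extended Properties=\"" + extended + "\";";
    }
}
```

The result can be passed straight to the OleDbDataAdapter constructor in place of connString. IMEX=1 on its own may not be enough; the registry values described above still control how many rows are sampled.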

References

http://blogs.vertigo.com/personal/aanttila/Blog/archive/2008/03/28/excel-and-csv-reference.aspx

http://msdn.microsoft.com/en-us/library/ms141683.aspx

...characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key....
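
If editing the registry by hand is not practical, the two values could in principle be set from code. The following is only a sketch under several assumptions: the class and method names are hypothetical, the process must run on Windows with permission to write under HKEY_LOCAL_MACHINE, and the WOW6432Node variant is only listed above for Office 14.0. The value types (a DWORD for TypeGuessRows, a string for ImportMixedTypes) are also an assumption about how the engine stores them.

```csharp
using System;
using Microsoft.Win32;

public static class ExcelEngineSettings
{
    // Hypothetical helper: composes the ACE registry key path listed in this answer.
    // officeVersion is "14.0" for Excel 2010 or "12.0" for Excel 2007.
    public static string KeyPath(string officeVersion, bool wow64)
    {
        string software = wow64 ? @"SOFTWARE\WOW6432Node" : "SOFTWARE";
        return @"HKEY_LOCAL_MACHINE\" + software + @"\Microsoft\Office\" + officeVersion +
               @"\Access Connectivity Engine\Engines\Excel";
    }

    // Writes the two values named in this answer. Windows only; needs admin rights.
    public static void Apply(string keyPath)
    {
        Registry.SetValue(keyPath, "TypeGuessRows", 0, RegistryValueKind.DWord);
        Registry.SetValue(keyPath, "ImportMixedTypes", "Text", RegistryValueKind.String);
    }
}
```

For example, ExcelEngineSettings.Apply(ExcelEngineSettings.KeyPath("14.0", false)) would target the first Excel 2010 key shown above. Changing machine-wide settings from a web application is a design trade-off; doing it once at deployment time is likely safer.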

#3


3  

I have come across this, and the solution that worked for me was to move the cells with long text to the top of the spreadsheet.

I found this comment in a forum describing the issue:

This is an issue with the Jet OLEDB provider. It looks at the first 8 rows
of the spreadsheet to determine the data type in each column. If the column does
not contain a field value over 256 characters in the first 8 rows, then it assumes the
data type is text, which has a character limit of 256. The following KB article has
more information on this issue: http://support.microsoft.com/kb/281517
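
To make the sampling behaviour described in that comment concrete, here is a simplified model of it. This is not the provider's actual code: the class name, the method, and the treatment of a sample size of 0 as "look at every row" are assumptions for illustration, and the 255-character threshold follows the question and the MSDN quote rather than the 256 mentioned in the comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypeGuessDemo
{
    // Simplified model of column type guessing: only the first typeGuessRows
    // cells decide whether the column is long text (memo) or short text.
    public static List<string> ReadColumn(IReadOnlyList<string> cells, int typeGuessRows = 8)
    {
        // Assumption: 0 is modeled as "sample every row".
        IEnumerable<string> sample = typeGuessRows > 0 ? cells.Take(typeGuessRows) : cells;
        bool isMemo = sample.Any(c => c != null && c.Length > 255);

        // A column guessed as short text has every long value cut to 255 characters.
        return cells
            .Select(c => (isMemo || c == null || c.Length <= 255) ? c : c.Substring(0, 255))
            .ToList();
    }
}
```

With ten short cells followed by one 300-character cell, the model truncates the long cell because it lies outside the sampled rows. Moving that cell into the first eight rows, or widening the sample, leaves it intact, which matches the workaround described in this answer.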

Hope this helps someone else!

#4


1  

Have you tried setting the column's data type to text within the spreadsheet? I believe doing this will allow the cells to contain much more than 255 characters.

[Edit] For what it's worth, this dialog with the MS-Excel team is an interesting read. In the comments section at the bottom they discuss the 255-character cutoff. They say Excel 12 can support 32k characters per cell.

If that is true, there must be a way to get at this data. Here are two things to consider.

  1. In the past I have used the "IMEX=1" option in my connection string to deal with columns containing mixed data showing up as empty. It's a longshot, but you might give that a try.

  2. Could you export the file to a tab delimited flat file? IMHO this is the most reliable way of dealing with Excel data, since Excel does have so many gotchas.
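
If the tab-delimited route is an option, reading the export needs nothing beyond the base class library. A minimal sketch follows, assuming the file has one record per line and no quoted fields that contain tabs or line breaks (Excel can produce those, and this sketch does not handle them). The TabFile class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TabFile
{
    // Reads tab-delimited text into rows of cells. No length limit is applied,
    // so cells longer than 255 characters come through unchanged.
    public static List<string[]> Parse(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue; // skip blank lines
            rows.Add(line.Split('\t'));
        }
        return rows;
    }
}
```

For an uploaded file, a StreamReader over the upload stream can be passed as the reader. The third column of each row is then rows[i][2], mirroring p[2] in the question.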

#5


0  

SpreadsheetGear for .NET can read and write (and more) xls and xlsx workbooks and supports the same limitations as Excel for text - in other words it will just work. There is a free evaluation if you want to give it a try.

Disclaimer: I own SpreadsheetGear LLC

#6


0  

Regarding the last post, I also use SpreadsheetGear and find that it also suffers from the 255 characters per cell limitation when reading from the older XLS (not XLSX) format.
