First, I want to say that I'm in deep water here: I'm just making some changes to code written by someone else in the company, which uses OleDbDataAdapter to "talk" to Excel, and I'm not familiar with that. There is one bug in it that I just can't track down.
I'm trying to use an OleDbDataAdapter to read in an Excel file with around 450 rows.
In the code it's done like this:
    connection = new OleDbConnection(
        "Provider=Microsoft.Jet.OLEDB.4.0;" +
        "Data Source='" + path + "';" +
        "Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1;\"");
    connection.Open();
    OleDbDataAdapter objAdapter = new OleDbDataAdapter(objCommand.CommandText, connection);
    objAdapter.Fill(objDataSet, "Excel");
    foreach (DataColumn dataColumn in objTable.Columns) {
        if (dataColumn.Ordinal > objDataSet.Tables[0].Columns.Count - 1) {
            objDataSet.Tables[0].Columns.Add();
        }
        objDataSet.Tables[0].Columns[dataColumn.Ordinal].ColumnName = dataColumn.ColumnName;
        objImport.Columns.Add(dataColumn.ColumnName);
    }
    foreach (DataRow dataRow in objDataSet.Tables[0].Rows) {
        ...
    }
Everything seems to be working fine except for one thing. The second column is filled mostly with four-digit numbers like 6739, 3920 and so on, but five rows have alphanumeric values like 8201NO and 8205NO. Those five cells are reported as blank instead of showing their alphanumeric content. I have checked in Excel, and all the cells in this column are formatted as Text.
This is an xls file by the way, and not xlsx.
Does anyone have any clue as to why these cells show up as blank in the DataRow, while the numeric ones come through fine? There are other columns with alphanumeric content that are read just fine.
3 solutions
#1
8
What's happening is that a data type is being guessed for the spreadsheet column from the first several values in that column. I suspect that if you look at the properties of that column, it will say it is a numeric column.
The problem comes when you start querying that spreadsheet through Jet. When it thinks it's dealing with a numeric column and it finds a text value, it quietly returns nothing for that cell. Not even a cryptic error message to go on.
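One way to confirm this is to look at what type the column ended up with after Fill and how many cells came back empty. This is only a rough sketch: the class and method names are made up for illustration, and it works on any DataTable, not just one loaded through Jet.

```csharp
using System;
using System.Data;

static class ColumnDiagnostics
{
    // Counts the rows whose value in the given column is DBNull.
    public static int CountNulls(DataTable table, int columnIndex)
    {
        int nulls = 0;
        foreach (DataRow row in table.Rows)
        {
            if (row.IsNull(columnIndex))
            {
                nulls++;
            }
        }
        return nulls;
    }

    // Prints the column name, the .NET type it was given, and the null count.
    public static void Report(DataTable table, int columnIndex)
    {
        DataColumn column = table.Columns[columnIndex];
        Console.WriteLine("{0}: type={1}, nulls={2} of {3}",
            column.ColumnName, column.DataType.Name,
            CountNulls(table, columnIndex), table.Rows.Count);
    }
}
```

In the question's code that would be something like ColumnDiagnostics.Report(objDataSet.Tables[0], 1) right after objAdapter.Fill(...). If the second column reports a numeric type and exactly five nulls, the type guess is the likely culprit.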
As a possible workaround, can you move one of the alphanumeric values to the first row of data and then try parsing again? I suspect you will start getting values for the alphanumeric rows then...
Take a look at this article. It goes into more detail on this issue. It also talks about a possible workaround, which is:
However, as per JET documentation, we can override the registry setting through the connection string: if we set IMEX=1 (as part of Extended Properties), JET will set all column types to UNICODE VARCHAR or ADVARWCHAR irrespective of the 'ImportMixedTypes' key value.
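To make it easier to try IMEX=1 against other values, the connection string from the question can be wrapped in a small helper. This is just a sketch: the class and method names are hypothetical, and the output has the same shape as the string in the question.

```csharp
using System;

static class ExcelConnectionStrings
{
    // Builds a Jet 4.0 connection string for an .xls file.
    // HDR says whether the first row holds column names; IMEX is passed through as given.
    public static string ForXls(string path, bool firstRowIsHeader, int imex)
    {
        string extended = string.Format(
            "Excel 8.0;HDR={0};IMEX={1};",
            firstRowIsHeader ? "Yes" : "No",
            imex);

        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';Extended Properties=\"{1}\"",
            path,
            extended);
    }
}
```

Calling ExcelConnectionStrings.ForXls(path, true, 1) gives the same string the question passes to OleDbConnection, so switching the last argument is the only change needed to compare the behaviour.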
#2
1
IMEX=1 means "Read mixed data as text."
There are some gotchas, however. Jet only samples a limited number of rows to determine whether the data is mixed, and if it so happens that these rows are all numeric, you'll get this behaviour.
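To see how many rows Jet samples on a given machine, you can read the TypeGuessRows value (the setting described in the connectionstrings.com quote) straight from the registry. This is a minimal sketch, assuming .NET 4.0 or later on Windows. The class and method names are made up for illustration. It opens the 32-bit registry view because Jet 4.0 only exists as a 32-bit provider.

```csharp
using System;
using Microsoft.Win32;

static class JetExcelSettings
{
    // Key path relative to HKEY_LOCAL_MACHINE, as given in the quote.
    public const string KeyPath = @"SOFTWARE\Microsoft\Jet\4.0\Engines\Excel";

    // Reads one value from the 32-bit view; returns null if the key or value is missing.
    public static object Read(string valueName)
    {
        using (RegistryKey hklm = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32))
        using (RegistryKey key = hklm.OpenSubKey(KeyPath))
        {
            return key == null ? null : key.GetValue(valueName);
        }
    }

    // Prints the two settings mentioned in the answers.
    public static void Print()
    {
        Console.WriteLine("TypeGuessRows    = {0}", Read("TypeGuessRows") ?? "(not found)");
        Console.WriteLine("ImportMixedTypes = {0}", Read("ImportMixedTypes") ?? "(not found)");
    }
}
```

This only reads the values. Changing them needs administrator rights and affects every application on the machine that reads Excel through Jet.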
See connectionstrings.com for details:
Check out the [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel] located registry REG_DWORD "TypeGuessRows". That's the key to not letting Excel use only the first 8 rows to guess the columns data type. Set this value to 0 to scan all rows. This might hurt performance. Please also note that adding the IMEX=1 option might cause the IMEX feature to set in after just 8 rows. Use IMEX=0 instead to be sure to force the registry TypeGuessRows=0 (scan all rows) to work.
#3
1
I would advise against using the OleDb data provider stuff to access Excel if you can help it. I've had nothing but problems, for exactly the reasons that others have pointed out. The performance tends to be atrocious as well when you are dealing with large spreadsheets.
You might try this open source solution: http://exceldatareader.codeplex.com/
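For what it's worth, reading the first sheet with that library might look roughly like this. It is a sketch under assumptions: it targets the current NuGet packages (ExcelDataReader plus ExcelDataReader.DataSet) rather than the older CodePlex release, whose API differs, and the XlsLoader class is a made-up name.

```csharp
using System.Data;
using System.IO;
using ExcelDataReader;

static class XlsLoader
{
    // Loads the first worksheet of an .xls/.xlsx file into a DataTable,
    // treating the first row as column names (like HDR=Yes in the question).
    public static DataTable LoadFirstSheet(string path)
    {
        using (FileStream stream = File.Open(path, FileMode.Open, FileAccess.Read))
        using (IExcelDataReader reader = ExcelReaderFactory.CreateReader(stream))
        {
            DataSet result = reader.AsDataSet(new ExcelDataSetConfiguration
            {
                ConfigureDataTable = _ => new ExcelDataTableConfiguration
                {
                    UseHeaderRow = true
                }
            });
            return result.Tables[0];
        }
    }
}
```

On .NET Core and later, the library also needs the code-pages encoding provider registered once at startup (System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance)) before it can read .xls files.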