I am using the following code to read data from various Excel files:
我使用以下代码从各种Excel文件中读取数据:
// IMEX=1 - to force strings on mixed data
// HDR=NO - to process all the available data
// Locale 1033 is en-US. This was my first attempt to force en-US locale.
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Locale Identifier=1033;Extended Properties=\"{1};READONLY=TRUE;HDR=NO;IMEX=1;\"";
// source type according to the
// http://www.microsoft.com/en-us/download/details.aspx?id=13255
// try determining from extension
bool isOldFormat =
Path.GetExtension(sourceExcel).Equals(".xls", StringComparison.OrdinalIgnoreCase);
bool isBinary =
Path.GetExtension(sourceExcel).Equals(".xlsb", StringComparison.OrdinalIgnoreCase);
string sourceType = isOldFormat ? "Excel 8.0" : "Excel 12.0";
if (!isOldFormat)
sourceType += " Xml"; // for some reason the new binary xlsb files also need Xml
connectionString = string.Format(connectionString, sourceExcel, sourceType);
// this was my second attempt to force Excel to use US culture
var oldCulture = Thread.CurrentThread.CurrentCulture;
Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture("en-US");
var dt = new DataTable();
try
{
using (var con = new OleDbConnection(connectionString))
{
con.Open();
// get all the available sheets
using (DataTable dataSet = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null))
{
// this was my third attempt to force Excel to use US culture
dataSet.Locale = CultureInfo.CreateSpecificCulture("en-US");
// get the sheet name in the file (will throw if out of range)
string workSheetName = dataSet.Rows[worksheetIndex]["TABLE_NAME"].ToString();//.Trim(new[] { '$' }).Replace("'", "");
string sql = String.Format("select * from [{0}]", workSheetName);
var da = new OleDbDataAdapter(sql, con);
// this was my fourth attempt to force Excel to use US culture
dt.Locale = CultureInfo.CreateSpecificCulture("en-US");
da.Fill(dt);
}
con.Close();
}
As you see, I was pretty desperate, trying to force Excel to use en-US compatible locale when importing data. I need this because my code might be executed on servers with various locales, but the data needs some additional processing which assumes that the incoming data is en-US/neutral locale.
如您所见,我非常绝望,尝试在导入数据时强制Excel使用en-US兼容的语言环境。我需要这个,因为我的代码可能在具有各种语言环境的服务器上执行,但数据需要一些额外的处理,这些处理假定传入的数据是en-US / neutral语言环境。
I tried also CultureInfo.InvariantCulture
instead of CultureInfo.CreateSpecificCulture("en-US")
.
我还尝试了CultureInfo.InvariantCulture而不是CultureInfo.CreateSpecificCulture(“en-US”)。
No matter how I try, when the server locale is set to some other locale which uses .
as thousands separator and ,
as decimal separator, I get wrong results in my dt DataTable
.
无论我如何尝试,当服务器区域设置设置为使用的其他区域设置时。作为千位分隔符,作为小数分隔符,我在我的dt DataTable中得到错误的结果。
To compare the result for a currency value -£200000.00 :
要比较货币值的结果 - £200000.00:
When the server regional settings correspond to US locale, I get "-£200,000.00"
当服务器区域设置对应于美国语言环境时,我得到“ - £200,000.00”
When the server regional settings correspond to Latvian locale, I get "-£200 000,00"
当服务器区域设置对应拉脱维亚语区域时,我得到“ - £200 000,00”
I cannot even post-process the data using the current numeric separators from Thread.CurrentThread.CurrentCulture
, because OleDb seems to ignore it completely.
我甚至无法使用Thread.CurrentThread.CurrentCulture中的当前数字分隔符对数据进行后处理,因为OleDb似乎完全忽略它。
Where does OleDb get the current culture from? Ho do I tell the OleDbConnection or Microsoft.ACE.OLEDB.12.0 provider that I need the data formatted according to en-US
or Invariant
culture?
OleDb从哪里获得当前的文化?何我告诉OleDbConnection或Microsoft.ACE.OLEDB.12.0提供商我需要根据en-US或Invariant culture格式化的数据?
1 个解决方案
#1
6
After lots of trials and errors and after reading this outdated article http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q320744 I found that the current version of OLEDB by default seems to be using the culture from HKEY_CURRENT_USER\Control Panel\International
. Unfortunately, I did not found how to call the SetVarConversionLocaleSetting
function from my C# code to force OLEDB to use the current thread culture, so I went with the principle - if I cannot adjust OLEDB for my code, then I'll adjust my code to be compatible with OLEDB culture. And after I am done with it, I can convert all the data to the invariant culture.
经过大量的试验和错误以及阅读这篇过时的文章http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q320744后,我发现当前版本的OLEDB默认似乎是在使用来自HKEY_CURRENT_USER \ Control Panel \ International的文化。不幸的是,我没有找到如何从我的C#代码调用SetVarConversionLocaleSetting函数来强制OLEDB使用当前的线程文化,所以我采用了原则 - 如果我不能为我的代码调整OLEDB,那么我将调整我的代码与OLEDB文化兼容。在完成它之后,我可以将所有数据转换为不变文化。
But there is a tricky part. You cannot just grab the decimal separator from HKEY_CURRENT_USER\Control Panel\International
, because OLEDB ignores user customized settings for number formats. OLEDB takes only the default preset values for that culture. So I had to do the following:
但是有一个棘手的部分。您不能只从HKEY_CURRENT_USER \ Control Panel \ International中获取小数点分隔符,因为OLEDB忽略了数字格式的用户自定义设置。 OLEDB仅采用该文化的默认预设值。所以我必须做以下事情:
var oldCulture = Thread.CurrentThread.CurrentCulture;
using (RegistryKey international =
Registry.CurrentUser.OpenSubKey("Control Panel\\International", false))
{
string userDefaultCulture = international.GetValue("LocaleName").ToString();
// notice: although the user might have customized his decimal/thousand separators,
// still OLEDB ignores these customizations. That is why I create a culture with default settings.
cultureToNormalize = new CultureInfo(userDefaultCulture, false);
}
// force both OLEDB and current thread cultures to match for the next ToString() etc. conversions in my function
Thread.CurrentThread.CurrentCulture = cultureToNormalize;
string decSep = cultureToNormalize.NumberFormat.NumberDecimalSeparator;
string groupSep = cultureToNormalize.NumberFormat.NumberGroupSeparator;
and now I can process the data as needed, and also I can call ToString() safely - the cultures of OLEDB and .NET stringified numbers and currencies will match. And, to be a good boy, I restore the previous culture at the end of my function.
现在我可以根据需要处理数据,而且我可以安全地调用ToString() - OLEDB和.NET字符串化数字和货币的文化将匹配。而且,为了成为一个好孩子,我在功能结束时恢复了以前的文化。
If anybody has a better solution, I'll be really grateful. But for now I'll keep it as is - all my unit tests are green now.
如果有人有更好的解决方案,我将非常感激。但是现在我会按原样保留它 - 现在所有的单元测试都是绿色的。
#1
6
After lots of trials and errors and after reading this outdated article http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q320744 I found that the current version of OLEDB by default seems to be using the culture from HKEY_CURRENT_USER\Control Panel\International
. Unfortunately, I did not found how to call the SetVarConversionLocaleSetting
function from my C# code to force OLEDB to use the current thread culture, so I went with the principle - if I cannot adjust OLEDB for my code, then I'll adjust my code to be compatible with OLEDB culture. And after I am done with it, I can convert all the data to the invariant culture.
经过大量的试验和错误以及阅读这篇过时的文章http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q320744后,我发现当前版本的OLEDB默认似乎是在使用来自HKEY_CURRENT_USER \ Control Panel \ International的文化。不幸的是,我没有找到如何从我的C#代码调用SetVarConversionLocaleSetting函数来强制OLEDB使用当前的线程文化,所以我采用了原则 - 如果我不能为我的代码调整OLEDB,那么我将调整我的代码与OLEDB文化兼容。在完成它之后,我可以将所有数据转换为不变文化。
But there is a tricky part. You cannot just grab the decimal separator from HKEY_CURRENT_USER\Control Panel\International
, because OLEDB ignores user customized settings for number formats. OLEDB takes only the default preset values for that culture. So I had to do the following:
但是有一个棘手的部分。您不能只从HKEY_CURRENT_USER \ Control Panel \ International中获取小数点分隔符,因为OLEDB忽略了数字格式的用户自定义设置。 OLEDB仅采用该文化的默认预设值。所以我必须做以下事情:
var oldCulture = Thread.CurrentThread.CurrentCulture;
using (RegistryKey international =
Registry.CurrentUser.OpenSubKey("Control Panel\\International", false))
{
string userDefaultCulture = international.GetValue("LocaleName").ToString();
// notice: although the user might have customized his decimal/thousand separators,
// still OLEDB ignores these customizations. That is why I create a culture with default settings.
cultureToNormalize = new CultureInfo(userDefaultCulture, false);
}
// force both OLEDB and current thread cultures to match for the next ToString() etc. conversions in my function
Thread.CurrentThread.CurrentCulture = cultureToNormalize;
string decSep = cultureToNormalize.NumberFormat.NumberDecimalSeparator;
string groupSep = cultureToNormalize.NumberFormat.NumberGroupSeparator;
and now I can process the data as needed, and also I can call ToString() safely - the cultures of OLEDB and .NET stringified numbers and currencies will match. And, to be a good boy, I restore the previous culture at the end of my function.
现在我可以根据需要处理数据,而且我可以安全地调用ToString() - OLEDB和.NET字符串化数字和货币的文化将匹配。而且,为了成为一个好孩子,我在功能结束时恢复了以前的文化。
If anybody has a better solution, I'll be really grateful. But for now I'll keep it as is - all my unit tests are green now.
如果有人有更好的解决方案,我将非常感激。但是现在我会按原样保留它 - 现在所有的单元测试都是绿色的。