I know questions like this have already been asked on Stack Overflow, and there are third-party libraries that are supposed to do the trick, but none of them fixes my issue at the moment. So, here is the issue.
I have an Excel workbook (.xlsx) with multiple sheets, generated by another system. I have to read the data from it via SSIS and load it into a SQL database.
The Excel file contains data, and when I open it manually it opens without any error and the data is displayed. However, when I use a Script Task with an OLEDB connection to open the file, the connection is made successfully, but when reading the data the column names are not picked up (I get F1, F2 and so on) and no data rows are read. I simply get a blank row and that's about it. I have tried HDR=YES and HDR=NO, and IMEX=1 and IMEX=0, but the result is always the same.
The funny thing is that if I open the Excel file and make some modification (for example, change a sheet name and save, then change the name back, save and close) and then run the package, the data gets picked up without any issue. I also noticed that the file size increases from 164 KB to 196 KB when I do this. Because of this, what I am trying to do is modify the file a bit and save it via code.
The first thing I tried was Microsoft.Office.Interop.Excel. It works like a charm on my machine, but there is no Office on the server, so it does not work there. And no, the IT guys are never going to install the Access engine, Excel or anything else on that server.
Then I tried modifying the file via OpenXML, via a third-party library (NPOI), and even via an OLEDB connection. With both NPOI and OLEDB the file was changed, but it still was not picked up properly by the SSIS package (I noticed that the file size did not change and remained at 164 KB). OpenXML was not able to open the file at all and threw an error saying "the document cannot be opened because there is an invalid part with an unexpected content type".
So right now I am stuck with no proper method in sight, and I would appreciate any help in solving this, either through C# code or through any other SSIS method available. The SSIS version I am using is 2008.
Edit 1
So I noticed that the script task is able to read the data from the first of the multiple sheets, but the other sheets are the problem. So somewhere the XML for these sheets is broken. Is there any way I can copy the XML configuration of the first sheet to the other ones? Just a thought...
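One way to check how each sheet is declared inside the package is to read `[Content_Types].xml` straight out of the .xlsx (which is a zip archive). The following is only a rough diagnostic sketch in Python, meant to be run on a development machine rather than inside SSIS. It assumes the worksheet parts live under `xl/worksheets/` (Excel's usual layout, though not guaranteed for files produced by other systems) and that part names match exactly in case. The function name `worksheet_content_types` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by [Content_Types].xml in Open Packaging Conventions files.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def worksheet_content_types(xlsx):
    """Map each part under /xl/worksheets/ to its declared content type.

    `xlsx` may be a file path or a binary file-like object.
    An explicit <Override> for the part wins; otherwise the <Default>
    registered for the file extension applies (or None if neither exists).
    """
    with zipfile.ZipFile(xlsx) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
        names = zf.namelist()

    defaults = {
        d.get("Extension", "").lower(): d.get("ContentType")
        for d in root.findall(f"{{{CT_NS}}}Default")
    }
    overrides = {
        o.get("PartName"): o.get("ContentType")
        for o in root.findall(f"{{{CT_NS}}}Override")
    }

    result = {}
    for name in names:
        part = "/" + name  # part names in the package start with a slash
        # Assumption: worksheets are stored as /xl/worksheets/*.xml
        if part.startswith("/xl/worksheets/") and part.endswith(".xml"):
            ext = name.rsplit(".", 1)[-1].lower()
            result[part] = overrides.get(part, defaults.get(ext))
    return result
```

Calling `worksheet_content_types("workbook.xlsx")` (a placeholder file name) returns a dictionary with one entry per sheet part, which makes it easy to see whether the sheets are declared differently from one another.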
Edit 2
So the first sheet has the ContentType "application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml", while all the other sheets have the ContentType "application/xml".
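Given that mismatch, one idea is to rewrite `[Content_Types].xml` so that every sheet part is declared with the worksheet content type, and save the result as a new file. The sketch below is an untested idea in Python (3.8 or later), not a confirmed fix: it has not been run against the workbook in question, it only changes the declared content type, and it will not help if the sheet XML itself is malformed. Whether the OLEDB provider then reads the sheets is not verified. It again assumes the sheets live under `xl/worksheets/`, and the function name `fix_worksheet_content_types` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
WORKSHEET_CT = (
    "application/vnd.openxmlformats-officedocument."
    "spreadsheetml.worksheet+xml"
)


def fix_worksheet_content_types(src, dst):
    """Copy the package `src` to `dst`, declaring every worksheet part
    with WORKSHEET_CT. `dst` must be a different file from `src`.

    Returns the part names whose declaration was added or changed.
    """
    # Keep the content-types namespace as the default (unprefixed) one.
    ET.register_namespace("", CT_NS)
    changed = []
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        root = ET.fromstring(zin.read("[Content_Types].xml"))
        overrides = {
            o.get("PartName"): o
            for o in root.findall(f"{{{CT_NS}}}Override")
        }
        for name in zin.namelist():
            part = "/" + name
            # Assumption: worksheets are stored as /xl/worksheets/*.xml
            if not (part.startswith("/xl/worksheets/")
                    and part.endswith(".xml")):
                continue
            node = overrides.get(part)
            if node is None:
                # No explicit declaration yet: add an Override for the part.
                node = ET.SubElement(
                    root, f"{{{CT_NS}}}Override", PartName=part
                )
            if node.get("ContentType") != WORKSHEET_CT:
                node.set("ContentType", WORKSHEET_CT)
                changed.append(part)

        # Copy every entry unchanged except the rewritten content-types part.
        for item in zin.infolist():
            if item.filename == "[Content_Types].xml":
                data = ET.tostring(
                    root, encoding="utf-8", xml_declaration=True
                )
            else:
                data = zin.read(item.filename)
            zout.writestr(item.filename, data)
    return changed
```

Writing to a separate output file keeps the original workbook untouched, so the result can be compared against it or discarded if the package still fails to read the sheets.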
1 Solution
#1
1
I ultimately ended up using two libraries for this. The data was read without any issue using ExcelDataReader (http://exceldatareader.codeplex.com/). With it, the data was easily read into a DataSet, which was then written to a new Excel file using EPPlus (http://epplus.codeplex.com/). After that, when the new Excel file was read via the SSIS package, the data was picked up without any issue. Hope this helps someone out there.