I have been researching and writing/rewriting a program to do this task for a week now. I need some collaboration on this to maybe bring up something I haven't thought of before. Specifically, we have an auto-generated XML file sent to us daily with ~70k records (~75 MB in size). I was asked to make a table on one of the servers (SQL) which contains this information so that it can be queried. Also, this program must update existing records (if data has changed) and insert new records daily. Records must not be deleted from the db.
Here is the list of methods I have attempted (so far) and reasons they did not work.
- SQLXMLBulkLoad - This worked excellently for importing the data. However, the limitation of the Bulk Load class is that it cannot update existing records while inserting new ones. Time for a rewrite.
- SQL OPENROWSET (using SqlCommand, etc.) - This does not work because the server, the program, and the XML file will all three be on different computers. These machines CAN be configured to allow each other access to the file (specifically the server); however, this method was deemed "Not realistic, too much overhead." Time for a rewrite.
- DataSet Merge, then TableAdapter.Update - This method initially seemed like it would definitely work. The idea is simple: use the DataSet.ReadXml() method to put the XML data into a table in the dataset, add the SQL table to the dataset (using SqlCommand, etc.), merge the two tables, and then use a TableAdapter to update/insert the merged table into the existing SQL table. This method does not seem to work because the XML file has two nodes (columns) which contain dates. Unfortunately, there is not a uniform date data type between SQL and XML. I even attempted changing all of the date values from the XML file to the SQL DateTime format, which worked, but the program still threw a data type mismatch exception when run.
At this point, I am out of ideas. This seems to be a task that has surely been done before. I am not necessarily looking for someone to write this code for me (I am fully capable of this), I just need some collaboration on the topic.
Thank You
1 Solution
#1
We do something similar with database imports received in XML format, and all I do is pass the XML directly to a stored procedure and then shred the XML using XQuery and OPENXML.
Both of those technologies allow you to query XML in SQL as if it were a table in your database. Taking that approach, you can just pass your XML to a script or stored procedure, query it in SQL, and insert the results wherever you need them. Anecdotally, OPENXML is better for processing large XML files, but you could try both and see how they work for you. Below is an example using OPENXML and a simple MERGE statement.
create procedure ImportXml
(
    @importXml xml
)
as
--with OPENXML you have to prepare the document before you use it
--this is unnecessary with XQuery
declare @idoc int
exec sp_xml_preparedocument @idoc output, @importXml;
--this is just a typical MERGE statement that will update data if it exists
--and insert it if it does not
merge NormalDataTable
using
(
    --here is where you are querying the XML document directly. You can
    --see, it works just like a SQL statement, with a special syntax for
    --specifying where to get data out of the XML document and how to map
    --it to a table structure
    select *
    from openxml(@idoc, '/Root/Element')
    with
    (
        ElementID int '@ElementID',
        ElementValueName varchar(50) '@ElementValueName'
    )
) source
on NormalDataTable.ElementID = source.ElementID
when not matched then
    insert ...
when matched then
    update ...
; --a MERGE statement must be terminated with a semicolon
--release the memory held by the prepared document
exec sp_xml_removedocument @idoc
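For comparison, here is a minimal sketch of the same import written with XQuery (the nodes() and value() methods of the xml type) instead of OPENXML, so no prepare/remove calls are needed. It reuses the placeholder names from the example above (NormalDataTable, ElementID, ElementValueName, /Root/Element); the procedure name ImportXmlXQuery, the column lists, and the "only update when changed" condition are my own assumptions for illustration, not something from the original import, so adjust them to your real schema.

```sql
-- Sketch only: table, column, and path names are placeholders taken from
-- the OPENXML example; ImportXmlXQuery is a hypothetical name.
create procedure ImportXmlXQuery
(
    @importXml xml
)
as
begin
    set nocount on;

    merge NormalDataTable as target
    using
    (
        -- nodes() returns one row per /Root/Element node;
        -- value() pulls a typed scalar out of each attribute
        select
            e.value('@ElementID', 'int') as ElementID,
            e.value('@ElementValueName', 'varchar(50)') as ElementValueName
        from @importXml.nodes('/Root/Element') as t(e)
    ) as source
    on target.ElementID = source.ElementID
    -- new records are inserted
    when not matched by target then
        insert (ElementID, ElementValueName)
        values (source.ElementID, source.ElementValueName)
    -- assumption: existing records are updated only if the value changed.
    -- Note that <> does not detect a change to or from NULL.
    when matched and target.ElementValueName <> source.ElementValueName then
        update set ElementValueName = source.ElementValueName;
end
```

Calling it with a small test document might look like this:

```sql
exec ImportXmlXQuery
    @importXml = N'<Root><Element ElementID="1" ElementValueName="first" /></Root>';
```

Because there is no "when not matched by source" clause, rows that are missing from the day's XML are left alone, which fits the requirement that records are never deleted.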