I have to shred around 25 to 30 XML files into my SQL Server 2005 database (about 10 MB in total), and I need this logic to run automatically as soon as new XML files are copied to the server.
I've read many posts on this site and on other sites, but I still can't decide what to use to shred the data.
Please let me know which option I should go with:
- SqlBulkCopy
- C# deserialization
- SSIS
I have to create C# classes for my data models anyway, so C# deserialization was my first choice. But please let me know which option is the right one from a performance perspective.
One more thing I forgot to mention: the structure of the XML files will vary. My tables will have every column that could possibly be populated, but a given XML file will not always contain all of the data.
Sample of the XML:
<?xml version="1.0" encoding="utf-8"?>
<estateList date="2012-08-06T12:17:05">
<uniqueID>22XXln</uniqueID>
<category name="Apartment" />
<listingAgent>
<name>DIW Office</name>
<telephone type="BH">96232 2345</telephone>
<telephone type="BH">9234 2399</telephone>
<email>abcd@abc.com</email>
</listingAgent>
<inspectionTimes />
<description>AVAILABLE NOW. </description>
<price>0</price>
<address display="yes">
<street>Lachlsan Street</street>
<ImagesContainer>
<img id="m" modTime="2012-08-06-12:17:05" url="http://images/2409802.jpg" format="jpg" />
<img id="a" modTime="2012-08-06-12:17:05" />
</ImagesContainer>
</address>
</estateList>
Thanks.
1 answer
Given that you have your XML in a SQL variable, you can parse out most of the information fairly easily in plain T-SQL, using the XQuery support that was added in SQL Server 2005.
Try something like:
-- SQL Server 2005 cannot initialize a variable inside DECLARE, so assign it with SET
DECLARE @Input XML
SET @Input = '<estateList date="2012-08-06T12:17:05">
<uniqueID>22XXln</uniqueID>
<category name="Apartment" />
<listingAgent>
<name>DIW Office</name>
<telephone type="BH">96232 2345</telephone>
<telephone type="BH">9234 2399</telephone>
<email>abcd@abc.com</email>
</listingAgent>
<inspectionTimes />
<description>AVAILABLE NOW. </description>
<price>0</price>
<address display="yes">
<street>Lachlsan Street</street>
<ImagesContainer>
<img id="m" modTime="2012-08-06-12:17:05" url="http://images/2409802.jpg" format="jpg" />
<img id="a" modTime="2012-08-06-12:17:05" />
</ImagesContainer>
</address>
</estateList>'
SELECT
EstateListDate = EstL.value('@date', 'datetime'),
UniqueID = EstL.value('(uniqueID)[1]', 'varchar(20)'),
Category = EstL.value('(category/@name)[1]', 'varchar(20)'),
ListingAgentName = EstL.value('(listingAgent/name)[1]', 'varchar(50)'),
ListingAgentTel = EstL.value('(listingAgent/telephone)[1]', 'varchar(50)'),
ListingAgentEMail = EstL.value('(listingAgent/email)[1]', 'varchar(250)'),
[Description] = EstL.value('(description)[1]', 'varchar(250)'),
Price = EstL.value('(price)[1]', 'decimal(14,2)'),
DisplayAddress = EstL.value('(address/@display)[1]', 'varchar(10)'),
AddressStreet = EstL.value('(address/street)[1]', 'varchar(100)')
FROM @Input.nodes('/estateList') AS Tbl(EstL)
This should return a single row with one column for each value extracted by the query.
This data can easily be inserted into a table. The query can also be run against any number of XML files on disk with a fairly simple SSIS package (enumerate the XML files, load each one into a SQL variable, parse it, insert the data into tables, and so on).
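As a rough sketch of the load, parse, and insert steps for a single file: the table `dbo.EstateListing` and the path `C:\Import\listing1.xml` below are hypothetical names chosen for illustration, not something taken from the question. `OPENROWSET(BULK ..., SINGLE_BLOB)` is available in SQL Server 2005 and reads the whole file as one binary value, which is then cast to XML.

```sql
-- Hypothetical target table; adjust columns and sizes to the real data model
CREATE TABLE dbo.EstateListing (
    UniqueID       varchar(20)   NOT NULL,
    EstateListDate datetime      NULL,
    Category       varchar(20)   NULL,
    Price          decimal(14,2) NULL,
    AddressStreet  varchar(100)  NULL
)

DECLARE @Input XML

-- Read the whole file as a single binary value and convert it to XML.
-- The file path is an assumption and must be a literal, not a variable.
SELECT @Input = CAST(BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\Import\listing1.xml', SINGLE_BLOB) AS FileData

-- Shred the XML and insert the values in one statement
INSERT INTO dbo.EstateListing
    (UniqueID, EstateListDate, Category, Price, AddressStreet)
SELECT
    EstL.value('(uniqueID)[1]', 'varchar(20)'),
    EstL.value('@date', 'datetime'),
    EstL.value('(category/@name)[1]', 'varchar(20)'),
    EstL.value('(price)[1]', 'decimal(14,2)'),
    EstL.value('(address/street)[1]', 'varchar(100)')
FROM @Input.nodes('/estateList') AS Tbl(EstL)
```

Because `OPENROWSET` only accepts a literal path, looping over many files would need dynamic SQL or something like an SSIS Foreach Loop container. The file must also be readable from the SQL Server machine, and the login needs bulk-load permission.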
BUT: the challenging part is going to be questions like:
- Can there be more than one listing agent? If so, how should that be handled?
- Can there be more than one phone number, and how should that be dealt with?
- What should be done with the multiple images per address?
and so forth.
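One possible way to handle repeating elements such as several telephone numbers is to give each one its own row, using a nested `.nodes()` call through `CROSS APPLY` (available in SQL Server 2005). This is only a sketch under that assumption; the trimmed-down XML and the column aliases are illustrative.

```sql
DECLARE @Input XML
SET @Input = '<estateList>
  <uniqueID>22XXln</uniqueID>
  <listingAgent>
    <name>DIW Office</name>
    <telephone type="BH">96232 2345</telephone>
    <telephone type="BH">9234 2399</telephone>
  </listingAgent>
</estateList>'

-- One output row per <telephone>, repeated for every <listingAgent>
SELECT
    UniqueID  = EstL.value('(uniqueID)[1]', 'varchar(20)'),
    AgentName = Agent.value('(name)[1]', 'varchar(50)'),
    TelType   = Tel.value('@type', 'varchar(10)'),
    TelNumber = Tel.value('.', 'varchar(50)')
FROM @Input.nodes('/estateList') AS T1(EstL)
CROSS APPLY EstL.nodes('listingAgent') AS T2(Agent)
CROSS APPLY Agent.nodes('telephone') AS T3(Tel)
```

For the sample data this yields two rows, one per phone number. An agent without any `<telephone>` element would drop out of the result with `CROSS APPLY`; `OUTER APPLY` would keep that agent with NULL phone columns.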
Update: this query extracts the UniqueID and the information from each complete <img> tag in that XML input and displays it (or it could insert it into another table):
SELECT
UniqueID = @Input.value('(/estateList/uniqueID)[1]', 'varchar(20)'),
ImageID = Images.value('@id', 'varchar(20)'),
ImageModTime = Images.value('@modTime', 'varchar(50)'),
ImageFormat = Images.value('@format', 'varchar(20)'),
ImageURL = Images.value('@url', 'varchar(250)')
FROM
-- one row per <img> element, so every image is returned, not just the first
@Input.nodes('/estateList/address/ImagesContainer/img') AS Tbl(Images)
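Since the XML files will not always carry every element, it may help to see how the XML methods behave when something is missing. The following is a small sketch with a deliberately incomplete, made-up document: `.value()` returns NULL when its path matches nothing, and `.exist()` can be used to test for an element explicitly.

```sql
DECLARE @Partial XML
SET @Partial = '<estateList><uniqueID>22XXln</uniqueID></estateList>'

SELECT
    UniqueID = EstL.value('(uniqueID)[1]', 'varchar(20)'),
    -- no <price> element in this document, so this column is NULL
    Price    = EstL.value('(price)[1]', 'decimal(14,2)'),
    -- exist() returns 1 if the path matches at least one node, otherwise 0
    HasAgent = EstL.exist('listingAgent')
FROM @Partial.nodes('/estateList') AS Tbl(EstL)
```

This means the same shredding query can be reused across files with different subsets of elements, as long as the target columns allow NULL.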