Does it make more sense (except of server side validation XML/schema/dtd) to store XML in XML type instead of text/varchar/ntext? I'm not planning to do any XML manipulation on database side.
是否更有意义(服务器端验证XML / schema / dtd除外)以XML类型而不是text / varchar / ntext存储XML?我不打算在数据库端进行任何XML操作。
The purpose of my investigation is to decrease database size. Can I use XML data type for untyped XML for this purpose? What are the pros and cons?
我调查的目的是减少数据库大小。为此,我可以将XML数据类型用于非类型化XML吗?优缺点都有什么?
I found an article related to the topic, but I am not sure if the authors assumptions/conclusions are correct.
我发现了一篇与该主题相关的文章,但我不确定作者的假设/结论是否正确。
4 个解决方案
#1
8
If you store xml in an xml typed column, the data will not get stored as simple text, as in the nvarchar case, it will be stored in some sort of parsed data tree, which in turn will be smaller than the unparsed xml version. This not only decreases the database size, but gives you other advantages, like validation, easy manipulation etc. (even though you're not using any of these, still, they are there for future use).
如果将xml存储在xml类型的列中,则数据将不会以简单文本形式存储,如在nvarchar情况下,它将存储在某种解析数据树中,而后者将比未解析的xml版本小。这不仅减少了数据库的大小,而且还为您提供了其他优势,例如验证,易于操作等等(即使您没有使用其中任何一个,仍然可以将它们用于将来使用)。
On the other hand, the server will have to parse the data upon insertion, which will probably slow your database down - you have to make a decision of speed vs. size.
另一方面,服务器必须在插入时解析数据,这可能会减慢数据库速度 - 您必须决定速度与大小。
Edit:
编辑:
Personally, I think that data in the database should be stored as xml only when it has structure which is hard to implement in a relational model, e.g. layouts, style descriptions etc. Usually that means that there won't be much data and speed is not an issue, thus added xml features, like data validation and manipulation ability (also, last but not least, the ability to click on the value in managment studio and see formatted xml - I really love that feature!), outweight the costs.
就个人而言,我认为数据库中的数据只有在具有难以在关系模型中实现的结构时才应存储为xml。布局,样式描述等。通常这意味着没有太多的数据和速度不是问题,因此添加了xml功能,如数据验证和操作能力(同样,最后但并非最不重要的,点击值的能力在管理工作室,看到格式化的xml - 我真的很喜欢这个功能!),超重成本。
I don't have direct experience in storing large amounts of xml in the database and I wouldn't do that if I had the option, since it is almost always slower that a relational model, but if that would be the case, I'd recommend profiling both options, and choosing between size and speed that best suit your needs.
我没有在数据库中存储大量xml的直接经验,如果我有选项,我不会这样做,因为它几乎总是比关系模型慢,但如果是这样,我' d建议分析两种选项,并在最适合您需求的尺寸和速度之间进行选择。
#2
7
My quick investigation shows that MS SQL 2005 (Express Edition)
我的快速调查显示MS SQL 2005(Express Edition)
Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Aug 5 2008 12:31:12 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 6.0 (Build 6000: )
Microsoft SQL Server 2005 - 9.00.3073.00(Intel X86)2008年8月5日12:31:12版权所有(c)1988-2005 Microsoft Corporation在Windows NT 6.0上的Express Edition(Build 6000:)
store XML with overhead about 70% (possible for faster processing/parsing).
以大约70%的开销存储XML(可以更快地处理/解析)。
My data before conversion: rows=160320, reserved=178576 KB, data=178184 KB, index_size=272 KB, unused=120 KB
转换前的数据:rows = 160320,reserved = 178576 KB,data = 178184 KB,index_size = 272 KB,unused = 120 KB
My data after the conversion: rows=160320, reserved=309702 KB, data=307216 KB, index_size=1672 KB, unused=184 KB
转换后的数据:rows = 160320,reserved = 309702 KB,data = 307216 KB,index_size = 1672 KB,unused = 184 KB
So it hasn't any sense to store XML data in XML data type if you are not planning use XML technology on database side.
因此,如果您没有计划在数据库端使用XML技术,那么将XML数据存储在XML数据类型中没有任何意义。
#3
2
I am using xml extensively for communicating to handheld devices and am using XQuery in most all my stored procs to retrieve only the data from the XML that I need. It works GREAT!! I am just worried about storage space because with only a hundred thousand records or so, the DB size is 1 to 2 GB. We are expecting going live with millions of records stored for logging and customer usage.. so it will be worrisome until I see what it's actually going to do in terms of space usage.
我正在广泛使用xml与手持设备进行通信,并且在大多数所有存储过程中都使用XQuery来仅检索我需要的XML数据。它很棒!!我只是担心存储空间,因为只有十万条记录左右,数据库大小为1到2 GB。我们希望能够存储数百万条用于记录和客户使用的记录。所以在看到空间使用方面的实际情况之前,这将是令人担忧的。
#4
2
It makes sense to use XML data type to store XML data as do you have to deal with UTF8 to UTF18 and vice versa conversion in MS SQL Servers.
使用XML数据类型来存储XML数据是有意义的,就像在MS SQL Server中处理UTF8到UTF18和反之亦然一样。
#1
8
If you store xml in an xml typed column, the data will not get stored as simple text, as in the nvarchar case, it will be stored in some sort of parsed data tree, which in turn will be smaller than the unparsed xml version. This not only decreases the database size, but gives you other advantages, like validation, easy manipulation etc. (even though you're not using any of these, still, they are there for future use).
如果将xml存储在xml类型的列中,则数据将不会以简单文本形式存储,如在nvarchar情况下,它将存储在某种解析数据树中,而后者将比未解析的xml版本小。这不仅减少了数据库的大小,而且还为您提供了其他优势,例如验证,易于操作等等(即使您没有使用其中任何一个,仍然可以将它们用于将来使用)。
On the other hand, the server will have to parse the data upon insertion, which will probably slow your database down - you have to make a decision of speed vs. size.
另一方面,服务器必须在插入时解析数据,这可能会减慢数据库速度 - 您必须决定速度与大小。
Edit:
编辑:
Personally, I think that data in the database should be stored as xml only when it has structure which is hard to implement in a relational model, e.g. layouts, style descriptions etc. Usually that means that there won't be much data and speed is not an issue, thus added xml features, like data validation and manipulation ability (also, last but not least, the ability to click on the value in managment studio and see formatted xml - I really love that feature!), outweight the costs.
就个人而言,我认为数据库中的数据只有在具有难以在关系模型中实现的结构时才应存储为xml。布局,样式描述等。通常这意味着没有太多的数据和速度不是问题,因此添加了xml功能,如数据验证和操作能力(同样,最后但并非最不重要的,点击值的能力在管理工作室,看到格式化的xml - 我真的很喜欢这个功能!),超重成本。
I don't have direct experience in storing large amounts of xml in the database and I wouldn't do that if I had the option, since it is almost always slower that a relational model, but if that would be the case, I'd recommend profiling both options, and choosing between size and speed that best suit your needs.
我没有在数据库中存储大量xml的直接经验,如果我有选项,我不会这样做,因为它几乎总是比关系模型慢,但如果是这样,我' d建议分析两种选项,并在最适合您需求的尺寸和速度之间进行选择。
#2
7
My quick investigation shows that MS SQL 2005 (Express Edition)
我的快速调查显示MS SQL 2005(Express Edition)
Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Aug 5 2008 12:31:12 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 6.0 (Build 6000: )
Microsoft SQL Server 2005 - 9.00.3073.00(Intel X86)2008年8月5日12:31:12版权所有(c)1988-2005 Microsoft Corporation在Windows NT 6.0上的Express Edition(Build 6000:)
store XML with overhead about 70% (possible for faster processing/parsing).
以大约70%的开销存储XML(可以更快地处理/解析)。
My data before conversion: rows=160320, reserved=178576 KB, data=178184 KB, index_size=272 KB, unused=120 KB
转换前的数据:rows = 160320,reserved = 178576 KB,data = 178184 KB,index_size = 272 KB,unused = 120 KB
My data after the conversion: rows=160320, reserved=309702 KB, data=307216 KB, index_size=1672 KB, unused=184 KB
转换后的数据:rows = 160320,reserved = 309702 KB,data = 307216 KB,index_size = 1672 KB,unused = 184 KB
So it hasn't any sense to store XML data in XML data type if you are not planning use XML technology on database side.
因此,如果您没有计划在数据库端使用XML技术,那么将XML数据存储在XML数据类型中没有任何意义。
#3
2
I am using xml extensively for communicating to handheld devices and am using XQuery in most all my stored procs to retrieve only the data from the XML that I need. It works GREAT!! I am just worried about storage space because with only a hundred thousand records or so, the DB size is 1 to 2 GB. We are expecting going live with millions of records stored for logging and customer usage.. so it will be worrisome until I see what it's actually going to do in terms of space usage.
我正在广泛使用xml与手持设备进行通信,并且在大多数所有存储过程中都使用XQuery来仅检索我需要的XML数据。它很棒!!我只是担心存储空间,因为只有十万条记录左右,数据库大小为1到2 GB。我们希望能够存储数百万条用于记录和客户使用的记录。所以在看到空间使用方面的实际情况之前,这将是令人担忧的。
#4
2
It makes sense to use XML data type to store XML data as do you have to deal with UTF8 to UTF18 and vice versa conversion in MS SQL Servers.
使用XML数据类型来存储XML数据是有意义的,就像在MS SQL Server中处理UTF8到UTF18和反之亦然一样。