Rationale for storing XML in a relational database

Time: 2021-06-26 16:56:17

I need to explain why we need to store XML documents in a database.

On the up side:

  1. No effort to shred individual elements into tables and attributes into columns.
  2. No effort to maintain relations between tables, as they are self-contained within the XML.
  3. Portable across systems that share the XML.
  4. Should there be a need, literally all DBMSs support XML operations to query XML as a relational entity.

On the down side:

  1. Network payload is considerably larger than the RDBMS counterpart.
  2. Requires client applications to shred the documents into usable components.
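
To make "shredding" on the client concrete, here is a minimal Python sketch using only the standard library. The order document, its element names, and the `shred_order` helper are made up for illustration and are not part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, as a client might receive it from an XML column.
ORDER_XML = """
<order id="42">
  <customer>Alice</customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>
"""

def shred_order(xml_text):
    """Split one order document into flat, row-like tuples."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    customer = root.findtext("customer")
    # One tuple per <item>, shaped like a row in an order_items table.
    return [
        (order_id, customer, item.get("sku"), int(item.get("qty")))
        for item in root.findall("item")
    ]

rows = shred_order(ORDER_XML)
print(rows)
```

Every client that needs these values has to repeat this parsing step, which is the cost the second point refers to.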

Are these justifications valid? Can anyone think of any more?

3 answers

#1


4  

There isn't really a definitive pros-and-cons list; it depends on what you're trying to do. But here are a few more for you to think about:

  1. Not all SQL databases support XML XPath queries (beyond treating the document as a blob and matching it with LIKE '%xxx%'). Perhaps you are stuck on an older version of a database that doesn't have XML support functions (e.g., MySQL 4). Lighter SQL databases such as SQLite and HSQLDB also fall into this camp.
  2. Even when XML can be searched in the database, it's not optimal. SQL searches of XML generally can't take advantage of the database server's built-in search optimizations (e.g., ordinary column indexes).
  3. Depending on the database you use, the XML document also cannot take advantage of the database server's validation and type features. For instance, Oracle can do XML schema validation, and I don't see that MySQL can.
  4. Performance of the queries you can do won't compare to standard column queries.
  5. Database size. If you store XML in your database, it's going to be bigger. You could compress it, but then querying it would be hard or impossible.
  6. Normalization issues may become relevant. Perhaps you don't expect to use SQL to query the XML at first, but at a later point it's decided that some field is actually needed. You may need to pull that field out of the XML and populate an actual column just to get the desired performance, in which case you now have redundant information in your database.
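
As a rough illustration of the first and last points, here is a sketch using Python's built-in `sqlite3` module. The `docs` table, its columns, and the `insert_doc` helper are hypothetical names chosen for this example. It stores the XML as plain text, searches it with a substring match, and then shows the "pull the field out into a real column" workaround.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# `docs`, `body`, and `customer` are hypothetical names for this sketch.
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, customer TEXT)")
conn.execute("CREATE INDEX idx_docs_customer ON docs (customer)")

def insert_doc(xml_text):
    # Copy the searchable field out of the XML into a real column.
    # The value now lives in two places, which is the redundancy noted above.
    customer = ET.fromstring(xml_text).findtext("customer")
    conn.execute(
        "INSERT INTO docs (body, customer) VALUES (?, ?)", (xml_text, customer)
    )

insert_doc("<order><customer>Alice</customer><total>10</total></order>")
insert_doc("<order><customer>Bob</customer><total>25</total></order>")

# Without XML functions, the only in-database search is a substring match on the text.
by_like = conn.execute(
    "SELECT id FROM docs WHERE body LIKE ?", ("%<customer>Bob</customer>%",)
).fetchall()

# The promoted column supports an ordinary equality search that can use the index.
by_column = conn.execute(
    "SELECT id FROM docs WHERE customer = ?", ("Bob",)
).fetchall()

print(by_like, by_column)
```

Both queries find the same row here, but the substring match breaks as soon as the document's whitespace or attribute layout changes, while the column query does not depend on how the XML is written.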

The pros and cons really depend on what you are going to be storing, and what it's for.

  1. If it's essentially binary or configuration information that you just need to stick someplace, and for whatever reason you prefer to stick it in your SQL database, then considerations regarding queries aren't relevant. In that case, the important issues concern space and how to minimize it (e.g., compression).
  2. If there is any possibility that the XML will need to be searched regularly, then you run the risk of slow queries and the redundancy issues I mentioned above. In that case, you should consider your long-term design very carefully up front: do you really need to store this data as XML? Would it be better to construct XML from that data?
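
For the first case (opaque configuration data), a compressed blob might look like the sketch below. It assumes Python's standard `sqlite3` and `zlib` modules; the `settings` table and the sample document are invented for illustration.

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
# `settings` is a hypothetical table for opaque configuration documents.
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, body BLOB)")

# A deliberately repetitive document; XML tends to compress well because tag names repeat.
xml_text = "<config>" + '<entry key="color" value="blue"/>' * 200 + "</config>"
raw = xml_text.encode("utf-8")
packed = zlib.compress(raw)
conn.execute("INSERT INTO settings (name, body) VALUES (?, ?)", ("ui", packed))

# Reading it back means decompressing on the client before any parsing or searching.
stored = conn.execute(
    "SELECT body FROM settings WHERE name = ?", ("ui",)
).fetchone()[0]
restored = zlib.decompress(stored).decode("utf-8")

print(len(raw), len(packed), restored == xml_text)
```

The space saving comes at the price mentioned earlier: the database only sees compressed bytes, so any search has to fetch and decompress the document first.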

#2


3  

There are pros and cons in both cases, and it depends on your usage scenario.

The main disadvantage of storing the data as XML is that we won't be able to search quickly for a particular piece of data. To perform a search, we would have to retrieve and parse all the XML files.

We encountered a similar situation in one of our projects. After discussing it, we went for a middle-ground approach: all the main information (information that needs to be queried quickly) was stored in related tables. We stored the XML documents as well, but instead of storing the XML in the database itself, we saved it to disk and stored the file path in the table.
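
A minimal sketch of that middle-ground layout, assuming Python's standard library and SQLite, could look like this. The `orders` table, its columns, the file naming scheme, and the `save_order` helper are all hypothetical; the answer does not describe the actual schema used in that project.

```python
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

store_dir = Path(tempfile.mkdtemp())  # stand-in for the real document directory
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, total REAL, xml_path TEXT)"
)

def save_order(order_id, xml_text):
    """Write the full XML to disk and keep only the queryable fields in the table."""
    root = ET.fromstring(xml_text)
    path = store_dir / f"order_{order_id}.xml"
    path.write_text(xml_text, encoding="utf-8")
    conn.execute(
        "INSERT INTO orders (id, customer, total, xml_path) VALUES (?, ?, ?, ?)",
        (order_id, root.findtext("customer"), float(root.findtext("total")), str(path)),
    )

save_order(1, "<order><customer>Alice</customer><total>10.5</total></order>")
save_order(2, "<order><customer>Bob</customer><total>25</total></order>")

# Search on the relational columns, then load the full document only when needed.
path = conn.execute(
    "SELECT xml_path FROM orders WHERE customer = ?", ("Bob",)
).fetchone()[0]
full_doc = Path(path).read_text(encoding="utf-8")
print(full_doc)
```

One thing to watch with this layout is that the database transaction does not cover the file system, so a row and its file can drift apart if one write succeeds and the other fails.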

#3


3  

Discussing your comments:

  1. Not storing individual elements also means not enforcing the constraints on them.
  2. Again, constraints between tables are not stored.
  3. Only portable if the target system conforms to the same schema.
  4. Yes, but the performance will vary.
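
The first point can be seen in a small sketch, again assuming Python's built-in `sqlite3` module. The `items` and `item_docs` tables are hypothetical. A column constraint rejects a bad value, while the same value hidden inside an XML text column is accepted without complaint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relational form: the quantity is a real column with a CHECK constraint.
conn.execute(
    "CREATE TABLE items (sku TEXT NOT NULL, qty INTEGER NOT NULL CHECK (qty > 0))"
)
# XML form: the database only knows that `body` is some text.
conn.execute("CREATE TABLE item_docs (body TEXT NOT NULL)")

# The XML column accepts a negative quantity because nothing inspects the content.
conn.execute("INSERT INTO item_docs (body) VALUES (?)", ('<item sku="A1" qty="-5"/>',))

# The same value is rejected when it is stored in a constrained column.
try:
    conn.execute("INSERT INTO items (sku, qty) VALUES (?, ?)", ("A1", -5))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

accepted_docs = conn.execute("SELECT COUNT(*) FROM item_docs").fetchone()[0]
print(rejected, accepted_docs)
```

Databases that can validate XML against a schema narrow this gap, but as noted in the first answer, not every database offers that.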
