Is using a rowguid as the unique key a good idea in database design?

Time: 2022-06-03 11:15:14

SQL Server provides the uniqueidentifier type, often used as a [rowguid] column. I like to use this as a unique primary key to identify a row for update. The benefit shows up if you dump the table and reload it: no mess with SerialNo (identity) columns.

In the special case of distributed databases, such as offline copies on notebooks, nothing else works.

What do you think? Too much overhead?


4 Solutions

#1


18  

As a primary key in the logical sense (uniquely identifying your rows) - yes, absolutely, makes total sense.


BUT: in SQL Server, the primary key is by default also the clustering key on your table, and using a ROWGUID as the clustering key is a really really bad idea. See Kimberly Tripp's excellent article "GUIDs as PRIMARY KEYs and/or the clustering key" for in-depth reasons why not to use GUIDs for clustering.

Since the GUID is by definition random, new rows land at arbitrary positions in the index. You'll get horrible index fragmentation and thus really really bad performance on insert, update, delete and select statements.

Also, since the clustering key is added to each and every entry of each and every non-clustered index on your table, you're wasting a lot of space - both on disk and in server RAM - when using a 16-byte GUID vs. a 4-byte INT.
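To put a rough number on the space argument, here is a back-of-the-envelope sketch in Python. It assumes the key is stored once in the clustered index and once per row in every non-clustered index, and it ignores row overhead, page fill and compression. The function name and the table size are made up for illustration.

```python
# Rough estimate of the extra space a wide clustering key costs.
# Assumption: every non-clustered index row carries one copy of the
# clustering key; per-row overhead and page fill are ignored.

GUID_BYTES = 16  # size of a uniqueidentifier value
INT_BYTES = 4    # size of an INT value


def extra_key_bytes(row_count: int, nonclustered_indexes: int) -> int:
    """Extra bytes from using a 16-byte key instead of a 4-byte key."""
    # One copy in the clustered index plus one per non-clustered index.
    copies_per_row = 1 + nonclustered_indexes
    return row_count * copies_per_row * (GUID_BYTES - INT_BYTES)


if __name__ == "__main__":
    # Hypothetical table: 10 million rows, 5 non-clustered indexes.
    extra = extra_key_bytes(10_000_000, 5)
    print(f"{extra:,} extra bytes")
```

The difference is only 12 bytes per copy, but it is multiplied by the row count and by the number of indexes, which is why it matters most for narrow tables with many indexes.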

So: yes, as a primary key, a ROWGUID has its merits - but if you do use it, definitely avoid using that column as your clustering key in the table! Use an INT IDENTITY column or something similar for that.

For a clustering key, ideally you should look for four features:


  • stable (never changing)
  • unique
  • as small as possible
  • ever-increasing

INT IDENTITY suits that need ideally. And yes - the clustering key must be unique, since it's used to physically locate a row in the table. If you pick a column that can't be guaranteed to be unique, SQL Server will actually add a four-byte uniquifier to duplicate values of your clustering key - again, not something you want to have.

Check out "The Clustered Index Debate Continues" - another wonderful and insightful article by Kim Tripp (the "Queen of SQL Server Indexing") in which she explains all these requirements very nicely and thoroughly.

Marc

#2


6  

The problem with rowguid is that if you use it for your clustered index, you end up constantly splitting and reorganizing table pages on record inserts. A sequential GUID (NEWSEQUENTIALID()) often works better.
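A toy simulation makes the difference visible. This is only a sketch of the idea, not how SQL Server's storage engine actually works: pages are plain sorted lists with a made-up capacity of 100 rows, a full page splits in half on a mid-index insert, and an insert past the end of the index simply starts a new page. Random 128-bit integers stand in for NEWID() values, and an increasing counter stands in for NEWSEQUENTIALID() or IDENTITY.

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical number of rows per page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys into a toy clustered index; return (splits, avg_fill)."""
    pages = []  # leaf pages in key order, each a sorted list of keys
    lows = []   # lows[i] is the smallest key on pages[i]
    splits = 0
    for key in keys:
        if not pages:
            pages.append([key])
            lows.append(key)
            continue
        # Find the page whose key range should hold this key.
        idx = max(0, bisect.bisect_right(lows, key) - 1)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            lows[idx] = page[0]
        elif idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end: start a fresh page, no split.
            pages.append([key])
            lows.append(key)
        else:
            # Insert into a full page in the middle: split it in half.
            mid = capacity // 2
            left, right = page[:mid], page[mid:]
            bisect.insort(left if key < right[0] else right, key)
            pages[idx:idx + 1] = [left, right]
            lows[idx:idx + 1] = [left[0], right[0]]
            splits += 1
    fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fill


if __name__ == "__main__":
    rng = random.Random(42)
    random_keys = [rng.getrandbits(128) for _ in range(20_000)]
    for name, keys in (("random", random_keys), ("sequential", range(20_000))):
        splits, fill = simulate_inserts(keys)
        print(f"{name:10s} splits={splits:4d} average page fill={fill:.0%}")
```

In this model the ever-increasing keys never split a page and leave every page full, while the random keys split pages repeatedly and leave them partly empty, which is the fragmentation the answers above are describing.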

#3


1  

Our offline application is used in branch offices, and we have a central database in our main office. To synchronize the branch databases into the central database, we use a rowguid column in all tables. Maybe there are better solutions, but this is easier for us. We have not faced any major problems in the last 3 years.

#4


1  

Contrary to the accepted answer, the uniqueidentifier datatype in SQL Server is indeed a good candidate for a primary clustering key, so long as you keep it sequential.

This is easily accomplished using (newsequentialid()) as the default value for the column.


If you actually read Kimberly Tripp's article, you will find that sequentially generated GUIDs are actually a good candidate for primary clustering keys in terms of fragmentation, and the only downside is size.

If you have large rows with few indexes, the extra few bytes in a GUID may be negligible. Sure, the issue compounds if you have short rows with numerous indexes, but this is something you have to weigh up depending on your own situation.

Using sequential uniqueidentifiers makes a lot of sense when you're going to use merge replication, especially when dealing with identity seeding and the woes that ensue.


Server class storage isn't cheap, but I'd rather have a database that uses a bit more space than one that screeches to a halt when your automatically assigned identity ranges overlap.
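The overlap problem is easy to reproduce outside the database. The sketch below is a Python illustration, not replication code: two hypothetical branch offices create rows offline and are then merged into one central table. The helper names are made up, and uuid4() stands in for GUID generation in general, so it shows uniqueness only and says nothing about ordering.

```python
import uuid


def merge(*sites):
    """Merge per-site {key: row} dicts; return (merged, colliding_keys)."""
    merged, collisions = {}, []
    for site in sites:
        for key, row in site.items():
            if key in merged:
                collisions.append(key)  # same key already used elsewhere
            else:
                merged[key] = row
    return merged, collisions


def identity_site(rows, seed=1):
    """Keys assigned like an identity column, independently at each site."""
    return {seed + i: row for i, row in enumerate(rows)}


def guid_site(rows):
    """Keys generated locally as random GUIDs."""
    return {uuid.uuid4(): row for row in rows}


if __name__ == "__main__":
    branch_a = ["order A1", "order A2"]
    branch_b = ["order B1", "order B2"]

    _, clashes = merge(identity_site(branch_a), identity_site(branch_b))
    print("identity collisions:", clashes)

    merged, clashes = merge(guid_site(branch_a), guid_site(branch_b))
    print("guid collisions:", clashes, "rows kept:", len(merged))
```

With identity-style keys both branches hand out the same values, so the merge has to renumber rows or carve up ranges in advance. With GUIDs each site can generate keys on its own and the merge is a plain union.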
