What are the downsides of using a composite/compound primary key?
8 Answers
#1
13
- Could cause more problems for normalisation: partial dependencies, which violate 2NF, are only possible when a candidate key is composite ("Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF").
- More unnecessary data duplication. If your composite key consists of 3 columns, you will need to create the same 3 columns in every table where it is used as a foreign key.
- Generally avoidable with the help of surrogate keys (read about their advantages and disadvantages)
- I can imagine a good scenario for a composite key: a table representing an N:N relation, like Students - Classes, where the key in the intermediate table is (StudentID, ClassID). But if you need to store more information about each pair (like a history of all marks of a student in a class), then you'll probably introduce a surrogate key.
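To make the Students - Classes case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names beyond StudentID and ClassID (Enrolments, Marks, Mark, and so on) are made up for illustration, and other database products may behave differently in detail. The link table keeps the composite key, while the hypothetical Marks table shows the duplication cost from the list above: it has to repeat both key columns to reference an enrolment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
    CREATE TABLE Classes  (ClassID   INTEGER PRIMARY KEY, Title TEXT NOT NULL);

    -- N:N link table: the composite key is the natural choice here.
    CREATE TABLE Enrolments (
        StudentID INTEGER NOT NULL REFERENCES Students (StudentID),
        ClassID   INTEGER NOT NULL REFERENCES Classes (ClassID),
        PRIMARY KEY (StudentID, ClassID)
    );

    -- Hypothetical history table: it must repeat both key columns.
    CREATE TABLE Marks (
        MarkID    INTEGER PRIMARY KEY,
        StudentID INTEGER NOT NULL,
        ClassID   INTEGER NOT NULL,
        Mark      INTEGER NOT NULL,
        FOREIGN KEY (StudentID, ClassID)
            REFERENCES Enrolments (StudentID, ClassID)
    );
""")

conn.execute("INSERT INTO Students VALUES (1, 'Ada')")
conn.execute("INSERT INTO Classes VALUES (10, 'Databases')")
conn.execute("INSERT INTO Enrolments VALUES (1, 10)")
conn.executemany(
    "INSERT INTO Marks (StudentID, ClassID, Mark) VALUES (?, ?, ?)",
    [(1, 10, 70), (1, 10, 85)],
)

# The composite key rejects a second enrolment of the same pair.
try:
    conn.execute("INSERT INTO Enrolments VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Every join back to the link table needs both columns.
marks = [row[0] for row in conn.execute(
    "SELECT m.Mark FROM Marks AS m "
    "JOIN Enrolments AS e "
    "  ON e.StudentID = m.StudentID AND e.ClassID = m.ClassID "
    "ORDER BY m.MarkID"
)]
print(duplicate_rejected, marks)
```

With a 3-column key, both the Marks definition and the join condition would grow by another column each.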
#2
5
There's nothing wrong with having a compound key per se, but a primary key should ideally be as small as possible (in terms of number of bytes required). If the primary key is long and is also the clustering key, every non-clustered index has to carry a copy of it, so those indexes become bloated.
Bear in mind that the order of the columns in the primary key is important. The first column should be as selective as possible, i.e. as 'unique' as possible. Searches on the first column will be able to seek on the key's index, but searches on just the second column will have to scan, unless there is also a non-clustered index on the second column.
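The effect of column order can be observed with a rough sketch like the following. It uses Python's sqlite3 module rather than SQL Server, so the wording differs (SQLite reports SEARCH and SCAN instead of seek and scan), and the OrderLines table, its columns and the index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE OrderLines (
        OrderID INTEGER NOT NULL,
        LineNo  INTEGER NOT NULL,
        Qty     INTEGER NOT NULL,
        PRIMARY KEY (OrderID, LineNo)
    )
""")


def plan(sql):
    """Return the query-plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # 4th column is the detail text


# Leading key column: the primary key's index can be searched.
by_first = plan("SELECT * FROM OrderLines WHERE OrderID = 1")

# Second key column only: the index order does not help, so the table is scanned.
by_second = plan("SELECT * FROM OrderLines WHERE LineNo = 1")

# A separate index on the second column turns the scan into a search.
conn.execute("CREATE INDEX ix_orderlines_lineno ON OrderLines (LineNo)")
by_second_indexed = plan("SELECT * FROM OrderLines WHERE LineNo = 1")

for label, text in [("first column", by_first),
                    ("second column", by_second),
                    ("second column, indexed", by_second_indexed)]:
    print(label, "->", text)
```

The exact plan text varies between SQLite versions, so the sketch only relies on whether each plan begins with SEARCH or SCAN.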
#3
4
I think this is a specialisation of the synthetic key debate (whether to use meaningful keys or an arbitrary synthetic primary key). I come down almost completely on the synthetic key side of this debate for a number of reasons. These are a few of the more pertinent ones:
- You have to keep dependent child tables on the other end of a foreign key up to date. If you change the value of one of the primary key fields (which can happen - see below) you have to somehow change all of the dependent tables whose key values include these fields. This is a bit tricky because changing key values will invalidate FK relationships with child tables, so you may (depending on the constraint validation options available on your platform) have to resort to tricks like copying the record to a new one and deleting the old records.
- On a deep schema the keys can get quite wide - I once saw one with 8 columns.
- Changes in primary key values can be troublesome to identify in ETL processes loading from the system. The example I once had occasion to see was an MIS application extracting from an insurance underwriting system. On some occasions a policy entry would be re-used by the customer, changing the policy identifier, which was part of the primary key of the table. When this happens the warehouse load is not aware of what the old value was, so it cannot match the new data to it. The developer had to go searching through audit logs to identify the changed value.
Most of the issues with non-synthetic primary keys revolve around what happens when the PK values of records change. The most useful applications of non-synthetic values are where a database schema is intended to be used directly by people, such as an M.I.S. application where report writers are using the tables directly. In this case short values with fixed domains, such as currency codes or dates, might reasonably be placed directly on the table for convenience.
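The key-change problem can be illustrated with a small sketch, again using Python's sqlite3 module. The Policies and Claims tables are hypothetical stand-ins for the underwriting example, and ON UPDATE CASCADE is only one of the platform-dependent options alluded to above; not every product or schema can rely on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Policies (
        Branch   TEXT NOT NULL,
        PolicyNo TEXT NOT NULL,
        PRIMARY KEY (Branch, PolicyNo)
    );

    -- Child with a plain composite FK: parent key changes are blocked.
    CREATE TABLE ClaimsStrict (
        ClaimID  INTEGER PRIMARY KEY,
        Branch   TEXT NOT NULL,
        PolicyNo TEXT NOT NULL,
        FOREIGN KEY (Branch, PolicyNo)
            REFERENCES Policies (Branch, PolicyNo)
    );

    -- Child that asks the engine to propagate parent key changes.
    CREATE TABLE ClaimsCascade (
        ClaimID  INTEGER PRIMARY KEY,
        Branch   TEXT NOT NULL,
        PolicyNo TEXT NOT NULL,
        FOREIGN KEY (Branch, PolicyNo)
            REFERENCES Policies (Branch, PolicyNo)
            ON UPDATE CASCADE
    );
""")
conn.executemany("INSERT INTO Policies VALUES (?, ?)",
                 [("LON", "P-1"), ("LON", "P-2")])
conn.execute("INSERT INTO ClaimsStrict (Branch, PolicyNo) VALUES ('LON', 'P-1')")
conn.execute("INSERT INTO ClaimsCascade (Branch, PolicyNo) VALUES ('LON', 'P-2')")

# Changing a referenced key value without a cascade violates the FK.
try:
    conn.execute("UPDATE Policies SET PolicyNo = 'P-1x' WHERE PolicyNo = 'P-1'")
    strict_update_blocked = False
except sqlite3.IntegrityError:
    strict_update_blocked = True

# With ON UPDATE CASCADE the child rows follow the new key value.
conn.execute("UPDATE Policies SET PolicyNo = 'P-2x' WHERE PolicyNo = 'P-2'")
cascaded = conn.execute("SELECT Branch, PolicyNo FROM ClaimsCascade").fetchall()
print(strict_update_blocked, cascaded)
```

Note that a cascade keeps the tables consistent but does nothing for the ETL problem: a downstream warehouse still sees only the new value.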
#4
1
I would recommend a generated primary key in those cases, with a unique, not-null constraint on the natural composite key.
If you use the natural key as the primary key, then you will most likely have to include every column of it in each foreign key reference to make sure you are identifying the correct record.
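A minimal sketch of that recommendation, using Python's sqlite3 module; the Enrolments and Marks tables and their columns are hypothetical names chosen for the example. The generated key becomes the primary key, the natural composite key stays enforced through UNIQUE and NOT NULL, and child tables reference a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Enrolments (
        EnrolmentID INTEGER PRIMARY KEY,      -- generated surrogate key
        StudentID   INTEGER NOT NULL,
        ClassID     INTEGER NOT NULL,
        UNIQUE (StudentID, ClassID)           -- natural key is still enforced
    );

    CREATE TABLE Marks (
        MarkID      INTEGER PRIMARY KEY,
        EnrolmentID INTEGER NOT NULL REFERENCES Enrolments (EnrolmentID),
        Mark        INTEGER NOT NULL
    );
""")

cur = conn.execute("INSERT INTO Enrolments (StudentID, ClassID) VALUES (1, 10)")
enrolment_id = cur.lastrowid  # value generated by the database

# The child row needs only the single surrogate column.
conn.execute("INSERT INTO Marks (EnrolmentID, Mark) VALUES (?, 85)",
             (enrolment_id,))

# The unique constraint still stops a duplicate natural key.
try:
    conn.execute("INSERT INTO Enrolments (StudentID, ClassID) VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

mark_count = conn.execute("SELECT COUNT(*) FROM Marks").fetchone()[0]
print(enrolment_id, duplicate_rejected, mark_count)
```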
#5
1
Take the example of a table with two candidate keys: one simple (single-column) and one compound (multi-column). Your question in that context seems to be, "What disadvantage may I suffer if I choose to promote one key to be 'primary' and I choose the compound key?"
First, consider whether you actually need to promote a key at all: "the very existence of the PRIMARY KEY in SQL seems to be an historical accident of some kind. According to author Chris Date the earliest incarnations of SQL didn't have any key constraints and PRIMARY KEY was only later added to the SQL standards. The designers of the standard obviously took the term from E.F. Codd who invented it, even though Codd's original notion had been abandoned by that time! (Codd originally proposed that foreign keys must only reference one key - the primary key - but that idea was forgotten and ignored because it was widely recognised as a pointless limitation)." [source: David Portas' Blog: Down with Primary Keys?]
Second, what criteria would you apply to choose which key in a table should be 'primary'? In SQL, the choice of which key to make the PRIMARY KEY is arbitrary and product specific. In ACE/Jet (a.k.a. MS Access) the two main and often competing factors are whether you want to use PRIMARY KEY to favour clustering on disk or whether you want the columns comprising the key to appear in bold in the 'Relationships' picture in the MS Access user interface; I'm in the minority by thinking that index strategy trumps pretty picture :) In SQL Server, you can specify the clustered index independently of the PRIMARY KEY and there seems to be no product-specific advantage afforded. The only remaining advantage seems to be the fact that you can omit the columns of the PRIMARY KEY when creating a foreign key in SQL DDL. That is SQL-92 Standard behaviour and anyhow doesn't seem such a big deal to me (perhaps another one of the things they added to the Standard because it was a feature already widespread in SQL products?). So, it's not a case of looking for drawbacks; rather, you should be looking to see what advantage, if any, your SQL product gives the PRIMARY KEY. Put another way, the only drawback to choosing the wrong key is that you may be missing out on a given advantage.
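The foreign key shorthand mentioned above can be seen in a short sketch. It runs on SQLite through Python's sqlite3 module, which accepts a REFERENCES clause without a column list; whether a given product supports this is something to check in its own documentation. The Parent and Child tables and their columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Parent (
        A INTEGER NOT NULL,
        B INTEGER NOT NULL,
        C INTEGER NOT NULL UNIQUE,   -- a second candidate key
        PRIMARY KEY (A, B)           -- the key that was 'promoted'
    );

    -- No column list after the table name: the PRIMARY KEY is implied.
    CREATE TABLE ChildOfPrimary (
        X INTEGER NOT NULL,
        Y INTEGER NOT NULL,
        FOREIGN KEY (X, Y) REFERENCES Parent
    );

    -- Any other candidate key has to be named explicitly.
    CREATE TABLE ChildOfUnique (
        Z INTEGER NOT NULL REFERENCES Parent (C)
    );
""")

conn.execute("INSERT INTO Parent VALUES (1, 2, 3)")
conn.execute("INSERT INTO ChildOfPrimary VALUES (1, 2)")  # matches (A, B)
conn.execute("INSERT INTO ChildOfUnique VALUES (3)")      # matches C

# (2, 1) is not an existing (A, B) pair, so the implied reference rejects it.
try:
    conn.execute("INSERT INTO ChildOfPrimary VALUES (2, 1)")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

print(mismatch_rejected)
```

Both child tables are equally well constrained; the only thing the 'primary' designation bought here is a slightly shorter DDL statement.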
Third, are you rather alluding to using an artificial/synthetic/surrogate key to implement in your physical model a candidate key from your logical model because you are concerned there will be performance penalties if you use the natural key in foreign keys and table joins? That's an entirely different question and largely depends on your 'religious' stance on the issue of natural keys in SQL.
#6
0
Need more specificity.
Taken too far, it can overcomplicate inserts (every key MUST exist) and documentation, and your joined reads could be suspect if the join is incomplete.
Sometimes it can indicate a flawed data model (is a composite key REALLY what's described by the data?)
I don't believe there is a performance cost...it just can go really wrong really easily.
#7
0
- Diagrams that use it are less readable.
- Query joins that use it are less readable.
- When you use it as a foreign key, you have to add a check constraint requiring the attributes to be either all null or all not null (if only one is null, the key is not checked).
- It usually needs more storage when used as a foreign key.
- Some tools don't handle composite keys.
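The null-handling point in the list above is easy to miss, so here is a small sketch of it using Python's sqlite3 module. SQLite skips the foreign key check when any referencing column is NULL; other products may offer different matching rules, so treat this as an illustration rather than universal behaviour. The Parent and Child tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Parent (
        A INTEGER NOT NULL,
        B INTEGER NOT NULL,
        PRIMARY KEY (A, B)
    );

    CREATE TABLE ChildUnchecked (
        A INTEGER,
        B INTEGER,
        FOREIGN KEY (A, B) REFERENCES Parent (A, B)
    );

    CREATE TABLE ChildChecked (
        A INTEGER,
        B INTEGER,
        FOREIGN KEY (A, B) REFERENCES Parent (A, B),
        CHECK ((A IS NULL) = (B IS NULL))   -- all null or none null
    );
""")

# Parent is empty, yet a half-null reference slips through unchecked.
conn.execute("INSERT INTO ChildUnchecked VALUES (NULL, 999)")
orphans = conn.execute("SELECT COUNT(*) FROM ChildUnchecked").fetchone()[0]

# The CHECK constraint closes that gap.
try:
    conn.execute("INSERT INTO ChildChecked VALUES (NULL, 999)")
    half_null_rejected = False
except sqlite3.IntegrityError:
    half_null_rejected = True

# A fully null reference is still allowed (an optional relationship).
conn.execute("INSERT INTO ChildChecked VALUES (NULL, NULL)")
checked_rows = conn.execute("SELECT COUNT(*) FROM ChildChecked").fetchone()[0]
print(orphans, half_null_rejected, checked_rows)
```

Declaring the referencing columns NOT NULL avoids the issue altogether when the relationship is mandatory.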
#8
0
The main downside of using a compound primary key is that you will confuse the hell out of typical ORM code generators.