Further to my question "Why to use ´not null primary key´ in TSQL?"...
继我的问题“为什么在TSQL中使用'not null primary key'?”......
As I understood from other discussions, some RDBMS (for example SQLite, MySQL) permit "unique" NULL in the primary key.
正如我从其他讨论中所理解的,一些RDBMS(例如SQLite,MySQL)允许主键中的“唯一”NULL。
Why is this allowed and how might it be useful?
为什么允许这样做以及它有何用处?
Background: I believe it is beneficial for communication with colleagues and database professionals to know the differences in fundamental concepts, approaches and their implementations in different DBMS.
背景:我认为与同事和数据库专业人员进行沟通有助于了解不同DBMS中基本概念,方法及其实现的差异。
Notes
- MySQL is rehabilitated and returned to the "NOT NULL PK" list.
- MySQL被修复并返回到“NOT NULL PK”列表。
- SQLite has been added (thanks to Paul Hadfield) to "NULL PK" list:
- SQLite已被添加(感谢Paul Hadfield)到“NULL PK”列表:
For the purposes of determining the uniqueness of primary key values, NULL values are considered distinct from all other values, including other NULLs.
为了确定主键值的唯一性,NULL值被认为与所有其他值不同,包括其他NULL。
If an INSERT or UPDATE statement attempts to modify the table content so that two or more rows feature identical primary key values, it is a constraint violation. According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a long-standing coding oversight, this is not the case in SQLite.
如果INSERT或UPDATE语句尝试修改表内容,以便两个或多个行具有相同的主键值,则会违反约束。根据SQL标准,PRIMARY KEY应始终表示NOT NULL。不幸的是,由于长期的编码监督,在SQLite中并非如此。
Unless the column is an INTEGER PRIMARY KEY SQLite allows NULL values in a PRIMARY KEY column. We could change SQLite to conform to the standard (and we might do so in the future), but by the time the oversight was discovered, SQLite was in such wide use that we feared breaking legacy code if we fixed the problem.
除非该列是INTEGER PRIMARY KEY,否则SQLite允许在PRIMARY KEY列中使用NULL值。我们可以改变SQLite以符合标准(我们将来可能会这样做),但是当发现疏忽时,SQLite的使用范围很广,如果我们解决问题,我们担心会破坏遗留代码。
So for now we have chosen to continue allowing NULLs in PRIMARY KEY columns. Developers should be aware, however, that we may change SQLite to conform to the SQL standard in future and should design new programs accordingly.
所以现在我们已经选择在PRIMARY KEY列中继续允许NULL。但是,开发人员应该意识到,我们可能会在未来更改SQLite以符合SQL标准,并应相应地设计新程序。
— SQL As Understood By SQLite: CREATE TABLE
- SQLite理解的SQL:CREATE TABLE
6 个解决方案
#1
22
Suppose you have a primary key containing a nullable column Kn.
假设您有一个包含可空列Kn的主键。
If you want to have a second row rejected on the ground that in that second row, Kn is null and the table already contains a row with Kn null, then you are actually requiring that the system would treat the comparison "row1.Kn = row2.Kn" as giving TRUE (because you somehow want the system to detect that the key values in those rows are indeed equal). However, this comparison boils down to the comparison "null = null", and the standard already explicitly specifies that null doesn't compare equal to anything, including itself.
如果你想让第二行被拒绝,那么在第二行中Kn是null并且表已经包含一个Kn null的行,那么你实际上要求系统将对比“row1.Kn = row2 .Kn“赋予TRUE(因为你不知何故希望系统检测到那些行中的键值确实相等)。但是,这种比较归结为比较“null = null”,并且标准已经明确指定null不比较等于任何东西,包括它自己。
To allow for what you want, would thus amount to SQL deviating from its own principles regarding the treatment of null. There are innumerable inconsistencies in SQL, but this particular one never got past the committee.
为了满足你的需要,因此SQL会偏离自己关于null处理的原则。 SQL中存在无数的不一致,但这个特殊的一个从未超越委员会。
#2
5
I don't know whether older versions of MySQL differ on this, but as of modern versions a primary key must be on columns that are not null. See the manual page on CREATE TABLE
: "A PRIMARY KEY
is a unique index where all key columns must be defined as NOT NULL
. If they are not explicitly declared as NOT NULL
, MySQL declares them so implicitly (and silently)."
我不知道MySQL的旧版本是否与此不同,但是从现代版本开始,主键必须是非空的列。请参阅CREATE TABLE上的手册页:“PRIMARY KEY是一个唯一索引,其中所有键列必须定义为NOT NULL。如果未将它们显式声明为NOT NULL,则MySQL会隐式声明(并默默地)。”
#3
4
As far as relational database theory is concerned:
就关系数据库理论而言:
- The primary key of a table is used to uniquely identify each and every row in the table
- 表的主键用于唯一标识表中的每一行
- A NULL value in a column indicates that you don't konw what the value is
- 列中的NULL值表示您不知道该值是什么
- Therefore, you should never use the value of "I don't know" to uniquely identify a row in a table.
- 因此,您不应该使用“我不知道”的值来唯一标识表中的行。
Depending upon the data you are modelling, a "made up" value can be used instead of NULL. I've used 0, "N/A", 'Jan 1, 1980', and similar values to represent dummy "known to be missing" data.
根据您建模的数据,可以使用“组合”值而不是NULL。我使用了0,“N / A”,“1980年1月1日”,以及类似的值来表示虚拟的“已知缺失”数据。
Most, if not all, DB engines do allow for a UNIQUE constraint or index, which does allow for NULL column values, though (ideally) only one row may be assigned the value null (otherwise it wouldn't be a unique value). This can be used to support the irritatingly pragmatic (but occasionally necessary) situations that don't fit neatly into relational theory.
大多数(如果不是全部)数据库引擎都允许使用UNIQUE约束或索引,它允许NULL列值,但(理想情况下)只能为一行赋值null(否则它不是唯一值)。这可以用来支持那些不适合关系理论的烦人的(但偶尔必要的)情境。
#4
3
Well, it could allow you to implement the Null Object Pattern natively within the database. So if you were using something similar in code, which interacted very closely with the DB, you could just look up the object corresponding to the key without having to special-case a null check.
好吧,它可以允许您在数据库中本机实现Null对象模式。因此,如果您在代码中使用类似的东西,它与数据库密切相关,您可以只查找与密钥对应的对象,而无需特殊情况下进行空检查。
Now whether this is worthwhile functionality I'm not sure, but it's really a question of whether the pros of disallowing null pkeys in absolutely all cases outweigh the cons of obstructing someone who (for better or worse) actually wants to use null keys. This would only be worth it if you could demonstrate some non-trivial improvement (such as faster key lookup) from being able to guarantee that keys are non-null. Some DB engines would show this, others might not. And if there aren't any real pros from forcing this, why artificially restrict your clients?
现在这是否是值得的功能我不确定,但这确实是一个问题,在绝对所有情况下禁止空pkeys的优点是否超过妨碍某人(无论好坏)实际上想要使用空键的缺点。如果你能够保证密钥非空,那么你可以证明一些非平凡的改进(例如更快的密钥查找),这是值得的。有些数据库引擎会显示此信息,有些则可能没有。如果没有任何真正的专业人士强迫这一点,为什么人为地限制你的客户呢?
#5
1
As discussed in other answers, NULL was intended to mean "the information that should go in this column is unknown". However, it is also frequently used to indicate an alternative meaning of "this attribute does not exist". This is a particularly useful interpretation when looking at timestamp fields that are interpreted as the time some particular event occurred, in which case NULL is often used to indicate that the event has not yet occurred.
正如其他答案中所讨论的,NULL意在表示“本栏中应该包含的信息未知”。但是,它也经常用于表示“此属性不存在”的替代含义。当查看被解释为某个特定事件发生的时间的时间戳字段时,这是一种特别有用的解释,在这种情况下,NULL通常用于指示事件尚未发生。
It is a problem that SQL doesn't support this interpretation very well -- for this to work properly, it really needs to have a separate value (something like "never") that doesn't behave as null does ("never" should be equal to "never" and should compare as higher than all other values). But as SQL lacks this notion, and there is no convenient way to add it, using null for this purposes is often the best choice.
这是一个问题,SQL不能很好地支持这种解释 - 为了使其正常工作,它确实需要一个单独的值(类似于“never”),它不像null一样(“never”应该等于“从不”,并且应该比所有其他值更高。但由于SQL缺乏这一概念,并且没有方便的方法来添加它,因此为此目的使用null通常是最佳选择。
This leaves the problem that when a timestamp of an event that may have not occurred should be part of the primary key of a table (a common requirement perhaps being the use of a natural key along with a deletion timestamp when using soft deletion with a requirement for the ability to recreate the item after deletion) you really want the primary key to have a nullable column. Alas, this is not allowed in most databases, and instead you have to resort to an artificial primary key (e.g. a row sequence number) and a UNIQUE constraint for what should otherwise have been your actual primary key.
这留下了一个问题,即当一个可能没有发生的事件的时间戳应该是表的主键的一部分时(一个常见的要求可能是当使用带有要求的软删除时使用自然键和删除时间戳)为了能够在删除后重新创建项目,你真的希望主键具有可为空的列。唉,这在大多数数据库中是不允许的,相反,你必须求助于一个人工主键(例如一个行序列号)和一个UNIQUE约束,否则它应该是你的实际主键。
An example scenario, in order to clarify this: I have a users
table. As I require each user to have a distinct username, I decide to use username
as the primary key. I want to support user deletion, but as I need to track the existence of users historically for auditing purposes I use soft deletion (in the first version of the schema, I add a 'deleted' flag to the user, and ensure that the deleted flag is checked in all queries where only active users are expected).
一个示例场景,为了澄清这一点:我有一个用户表。由于我要求每个用户都有一个不同的用户名,我决定使用用户名作为主键。我想支持用户删除,但由于我需要在历史上跟踪用户的存在以进行审计,我使用软删除(在架构的第一个版本中,我向用户添加'已删除'标志,并确保删除在只有活动用户的所有查询中检查标志)。
An additional requirement, however, is that if a username is deleted, it should be available for new users to register. An attractive way to achieve this would be to have the deleted flag change to a nullable timestamp (where nulls indicate that the user has not been deleted) and put this in the primary key. Were primary keys to allow nullable columns, this would have the following effect:
但是,另一个要求是,如果删除了用户名,则新用户可以注册该用户名。实现这一目标的一种有吸引力的方法是将已删除的标志更改为可以为空的时间戳(其中空值表示用户尚未被删除)并将其置于主键中。如果是允许可空列的主键,则会产生以下影响:
- Creating a new user with an existing username when that user's
deleted
column is null would be denied as a duplicate key entry - 当该用户的已删除列为空时,使用现有用户名创建新用户将被拒绝作为重复键条目
- Deleting a user changes its key (which requires changes to cascade to foreign keys that reference the user, which is suboptimal but if deletions are rare is acceptable) so that the
deleted
column is a timestamp for the when the deletion occurred - 删除用户会更改其密钥(需要更改为级联到引用用户的外键,这是次优的但是如果删除很少是可接受的),那么删除的列是删除发生时的时间戳
- Now a new user (which would have a null
deleted
timestamp) can be successfully created. - 现在可以成功创建新用户(将具有空删除的时间戳)。
However, this cannot actually be achieved with standard SQL, so instead one must use a different primary key (probably a generated numeric user id in this case) and use a UNIQUE constraint to enforce the uniqueness of (username
,deleted
).
但是,使用标准SQL实际上无法实现这一点,因此必须使用不同的主键(在这种情况下可能是生成的数字用户ID)并使用UNIQUE约束来强制(username,deleted)的唯一性。
#6
0
Having primary key null can be beneficial in some scenarios. In one of my projects I used this feature during synchronisation of databases: one on server and many on different users devices. Considering the fact that not all users have access to the Internet all the time, I decided that only the main database will be able to give ids to my entities. SQLite has its own mechanism for numbering rows. Had I used additional id field I would use more bandwith. Having null as id not only notifies me that an entity has been created on clients device when he hadn't access to the Internet, but also decreases code complexity. The only drawback is that on clients device I can't get an entity by it's id unless it was previously synchronised with main database. However thats not an issue since my user cares for entities for their parameters, not their unique id.
在某些情况下,使主键为null可能是有益的。在我的一个项目中,我在数据库同步期间使用了此功能:一个在服务器上,另一个在不同的用户设备上。考虑到并非所有用户都能一直访问Internet,我决定只有主数据库才能为我的实体提供ID。 SQLite有自己的行编号机制。如果我使用额外的id字段,我会使用更多带宽。将null作为id不仅通知我在客户端设备上没有访问Internet时已经创建了实体,而且还降低了代码复杂性。唯一的缺点是,在客户端设备上,我无法通过它的id获取实体,除非它先前与主数据库同步。然而,这不是问题,因为我的用户关心实体的参数,而不是他们唯一的ID。
#1
22
Suppose you have a primary key containing a nullable column Kn.
假设您有一个包含可空列Kn的主键。
If you want to have a second row rejected on the ground that in that second row, Kn is null and the table already contains a row with Kn null, then you are actually requiring that the system would treat the comparison "row1.Kn = row2.Kn" as giving TRUE (because you somehow want the system to detect that the key values in those rows are indeed equal). However, this comparison boils down to the comparison "null = null", and the standard already explicitly specifies that null doesn't compare equal to anything, including itself.
如果你想让第二行被拒绝,那么在第二行中Kn是null并且表已经包含一个Kn null的行,那么你实际上要求系统将对比“row1.Kn = row2 .Kn“赋予TRUE(因为你不知何故希望系统检测到那些行中的键值确实相等)。但是,这种比较归结为比较“null = null”,并且标准已经明确指定null不比较等于任何东西,包括它自己。
To allow for what you want, would thus amount to SQL deviating from its own principles regarding the treatment of null. There are innumerable inconsistencies in SQL, but this particular one never got past the committee.
为了满足你的需要,因此SQL会偏离自己关于null处理的原则。 SQL中存在无数的不一致,但这个特殊的一个从未超越委员会。
#2
5
I don't know whether older versions of MySQL differ on this, but as of modern versions a primary key must be on columns that are not null. See the manual page on CREATE TABLE
: "A PRIMARY KEY
is a unique index where all key columns must be defined as NOT NULL
. If they are not explicitly declared as NOT NULL
, MySQL declares them so implicitly (and silently)."
我不知道MySQL的旧版本是否与此不同,但是从现代版本开始,主键必须是非空的列。请参阅CREATE TABLE上的手册页:“PRIMARY KEY是一个唯一索引,其中所有键列必须定义为NOT NULL。如果未将它们显式声明为NOT NULL,则MySQL会隐式声明(并默默地)。”
#3
4
As far as relational database theory is concerned:
就关系数据库理论而言:
- The primary key of a table is used to uniquely identify each and every row in the table
- 表的主键用于唯一标识表中的每一行
- A NULL value in a column indicates that you don't konw what the value is
- 列中的NULL值表示您不知道该值是什么
- Therefore, you should never use the value of "I don't know" to uniquely identify a row in a table.
- 因此,您不应该使用“我不知道”的值来唯一标识表中的行。
Depending upon the data you are modelling, a "made up" value can be used instead of NULL. I've used 0, "N/A", 'Jan 1, 1980', and similar values to represent dummy "known to be missing" data.
根据您建模的数据,可以使用“组合”值而不是NULL。我使用了0,“N / A”,“1980年1月1日”,以及类似的值来表示虚拟的“已知缺失”数据。
Most, if not all, DB engines do allow for a UNIQUE constraint or index, which does allow for NULL column values, though (ideally) only one row may be assigned the value null (otherwise it wouldn't be a unique value). This can be used to support the irritatingly pragmatic (but occasionally necessary) situations that don't fit neatly into relational theory.
大多数(如果不是全部)数据库引擎都允许使用UNIQUE约束或索引,它允许NULL列值,但(理想情况下)只能为一行赋值null(否则它不是唯一值)。这可以用来支持那些不适合关系理论的烦人的(但偶尔必要的)情境。
#4
3
Well, it could allow you to implement the Null Object Pattern natively within the database. So if you were using something similar in code, which interacted very closely with the DB, you could just look up the object corresponding to the key without having to special-case a null check.
好吧,它可以允许您在数据库中本机实现Null对象模式。因此,如果您在代码中使用类似的东西,它与数据库密切相关,您可以只查找与密钥对应的对象,而无需特殊情况下进行空检查。
Now whether this is worthwhile functionality I'm not sure, but it's really a question of whether the pros of disallowing null pkeys in absolutely all cases outweigh the cons of obstructing someone who (for better or worse) actually wants to use null keys. This would only be worth it if you could demonstrate some non-trivial improvement (such as faster key lookup) from being able to guarantee that keys are non-null. Some DB engines would show this, others might not. And if there aren't any real pros from forcing this, why artificially restrict your clients?
现在这是否是值得的功能我不确定,但这确实是一个问题,在绝对所有情况下禁止空pkeys的优点是否超过妨碍某人(无论好坏)实际上想要使用空键的缺点。如果你能够保证密钥非空,那么你可以证明一些非平凡的改进(例如更快的密钥查找),这是值得的。有些数据库引擎会显示此信息,有些则可能没有。如果没有任何真正的专业人士强迫这一点,为什么人为地限制你的客户呢?
#5
1
As discussed in other answers, NULL was intended to mean "the information that should go in this column is unknown". However, it is also frequently used to indicate an alternative meaning of "this attribute does not exist". This is a particularly useful interpretation when looking at timestamp fields that are interpreted as the time some particular event occurred, in which case NULL is often used to indicate that the event has not yet occurred.
正如其他答案中所讨论的,NULL意在表示“本栏中应该包含的信息未知”。但是,它也经常用于表示“此属性不存在”的替代含义。当查看被解释为某个特定事件发生的时间的时间戳字段时,这是一种特别有用的解释,在这种情况下,NULL通常用于指示事件尚未发生。
It is a problem that SQL doesn't support this interpretation very well -- for this to work properly, it really needs to have a separate value (something like "never") that doesn't behave as null does ("never" should be equal to "never" and should compare as higher than all other values). But as SQL lacks this notion, and there is no convenient way to add it, using null for this purposes is often the best choice.
这是一个问题,SQL不能很好地支持这种解释 - 为了使其正常工作,它确实需要一个单独的值(类似于“never”),它不像null一样(“never”应该等于“从不”,并且应该比所有其他值更高。但由于SQL缺乏这一概念,并且没有方便的方法来添加它,因此为此目的使用null通常是最佳选择。
This leaves the problem that when a timestamp of an event that may have not occurred should be part of the primary key of a table (a common requirement perhaps being the use of a natural key along with a deletion timestamp when using soft deletion with a requirement for the ability to recreate the item after deletion) you really want the primary key to have a nullable column. Alas, this is not allowed in most databases, and instead you have to resort to an artificial primary key (e.g. a row sequence number) and a UNIQUE constraint for what should otherwise have been your actual primary key.
这留下了一个问题,即当一个可能没有发生的事件的时间戳应该是表的主键的一部分时(一个常见的要求可能是当使用带有要求的软删除时使用自然键和删除时间戳)为了能够在删除后重新创建项目,你真的希望主键具有可为空的列。唉,这在大多数数据库中是不允许的,相反,你必须求助于一个人工主键(例如一个行序列号)和一个UNIQUE约束,否则它应该是你的实际主键。
An example scenario, in order to clarify this: I have a users
table. As I require each user to have a distinct username, I decide to use username
as the primary key. I want to support user deletion, but as I need to track the existence of users historically for auditing purposes I use soft deletion (in the first version of the schema, I add a 'deleted' flag to the user, and ensure that the deleted flag is checked in all queries where only active users are expected).
一个示例场景,为了澄清这一点:我有一个用户表。由于我要求每个用户都有一个不同的用户名,我决定使用用户名作为主键。我想支持用户删除,但由于我需要在历史上跟踪用户的存在以进行审计,我使用软删除(在架构的第一个版本中,我向用户添加'已删除'标志,并确保删除在只有活动用户的所有查询中检查标志)。
An additional requirement, however, is that if a username is deleted, it should be available for new users to register. An attractive way to achieve this would be to have the deleted flag change to a nullable timestamp (where nulls indicate that the user has not been deleted) and put this in the primary key. Were primary keys to allow nullable columns, this would have the following effect:
但是,另一个要求是,如果删除了用户名,则新用户可以注册该用户名。实现这一目标的一种有吸引力的方法是将已删除的标志更改为可以为空的时间戳(其中空值表示用户尚未被删除)并将其置于主键中。如果是允许可空列的主键,则会产生以下影响:
- Creating a new user with an existing username when that user's
deleted
column is null would be denied as a duplicate key entry - 当该用户的已删除列为空时,使用现有用户名创建新用户将被拒绝作为重复键条目
- Deleting a user changes its key (which requires changes to cascade to foreign keys that reference the user, which is suboptimal but if deletions are rare is acceptable) so that the
deleted
column is a timestamp for the when the deletion occurred - 删除用户会更改其密钥(需要更改为级联到引用用户的外键,这是次优的但是如果删除很少是可接受的),那么删除的列是删除发生时的时间戳
- Now a new user (which would have a null
deleted
timestamp) can be successfully created. - 现在可以成功创建新用户(将具有空删除的时间戳)。
However, this cannot actually be achieved with standard SQL, so instead one must use a different primary key (probably a generated numeric user id in this case) and use a UNIQUE constraint to enforce the uniqueness of (username
,deleted
).
但是,使用标准SQL实际上无法实现这一点,因此必须使用不同的主键(在这种情况下可能是生成的数字用户ID)并使用UNIQUE约束来强制(username,deleted)的唯一性。
#6
0
Having primary key null can be beneficial in some scenarios. In one of my projects I used this feature during synchronisation of databases: one on server and many on different users devices. Considering the fact that not all users have access to the Internet all the time, I decided that only the main database will be able to give ids to my entities. SQLite has its own mechanism for numbering rows. Had I used additional id field I would use more bandwith. Having null as id not only notifies me that an entity has been created on clients device when he hadn't access to the Internet, but also decreases code complexity. The only drawback is that on clients device I can't get an entity by it's id unless it was previously synchronised with main database. However thats not an issue since my user cares for entities for their parameters, not their unique id.
在某些情况下,使主键为null可能是有益的。在我的一个项目中,我在数据库同步期间使用了此功能:一个在服务器上,另一个在不同的用户设备上。考虑到并非所有用户都能一直访问Internet,我决定只有主数据库才能为我的实体提供ID。 SQLite有自己的行编号机制。如果我使用额外的id字段,我会使用更多带宽。将null作为id不仅通知我在客户端设备上没有访问Internet时已经创建了实体,而且还降低了代码复杂性。唯一的缺点是,在客户端设备上,我无法通过它的id获取实体,除非它先前与主数据库同步。然而,这不是问题,因为我的用户关心实体的参数,而不是他们唯一的ID。