First, can anyone explain how a unique index works in a database?
Suppose I have a User model with a name column, and I add a unique index on it, but in the model (user.rb) I only have a presence validator on the name field.
Now, when I try to create two users with the same name, I get a PGError:
duplicate key value violates unique constraint "index_users_on_name"
So it looks to me like the unique index works the same way as a uniqueness validator (?)
If so, then what about foreign keys?
Let's say I have a Post model with a belongs_to :user association, and User has_many :posts, and the foreign key user_id in the posts table has a unique index. Then multiple posts cannot have the same user_id.
Can someone explain how a unique index works?
I'm on Rails 4 with Ruby 2.0.0.
3 Solutions
#1
15
Here is the difference between a unique index and validates_uniqueness_of:
This is a patch to enable ActiveRecord to identify db-generated errors for unique constraint violations. For example, it makes the following work without declaring a validates_uniqueness_of:
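# Schema: the unique index is the only uniqueness rule.
create_table "users" do |t|
  t.string "email", :null => false
end
add_index "users", ["email"], :unique => true

# Model: no validates_uniqueness_of declared.
class User < ActiveRecord::Base
end

User.create!(:email => 'abc@abc.com')
u = User.create(:email => 'abc@abc.com')  # duplicate, rejected by the index
u.errors[:email]
# => "has already been taken"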
The benefits are speed, ease of use, and completeness:
Speed
With this approach you don't need to do a db lookup to check for uniqueness when saving (which can sometimes be quite slow when the index is missed -- https://rails.lighthouseapp.com/projects/8994/tickets/2503-validate... ). If you really care about validating uniqueness you're going to have to use database constraints anyway so the database will validate uniqueness no matter what and this approach removes an extra query. Checking the index twice isn't a problem for the DB (it's cached the 2nd time around), but saving a DB round-trip from the application is a big win.
Ease of use
Given that you have to have db constraints for true uniqueness anyway, this approach will let everything just happen automatically once the db constraints are in place. You can still use validates_uniqueness_of if you want to.
Completeness
validates_uniqueness_of has always been a bit of a hack -- it can't handle race conditions properly and results in exceptions that must be handled using somewhat redundant error handling logic. (See "Concurrency and integrity" section in http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMe...)
validates_uniqueness_of is not sufficient to ensure the uniqueness of a value. The reason for this is that in production, multiple worker processes can cause race conditions:
- Two concurrent requests try to create a user with the same name (and we want user names to be unique)
- The requests are accepted on the server by two worker processes, which will now process them in parallel
- Both requests scan the users table and see that the name is available
- Both requests pass validation and create a user with the seemingly available name
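The interleaving in the list above can be replayed deterministically in plain Ruby. This is only a sketch under assumptions: `UsersTable`, `name_taken?`, `insert`, and `replay_race` are hypothetical stand-ins for the users table, the SELECT issued by the validation, and the INSERT. None of them are ActiveRecord APIs, and no real database is involved.

```ruby
# Hypothetical in-memory stand-in for the users table (not ActiveRecord).
class UsersTable
  class DuplicateKey < StandardError; end

  attr_reader :rows

  def initialize(unique_index: false)
    @rows = []
    @unique_index = unique_index
  end

  # What a uniqueness validation does: look the value up before saving.
  def name_taken?(name)
    @rows.include?(name)
  end

  # The INSERT itself. Only a unique index rejects a duplicate here.
  def insert(name)
    if @unique_index && @rows.include?(name)
      raise DuplicateKey, "duplicate key value: #{name}"
    end
    @rows << name
  end
end

# Replays the race: both requests validate first, then both try to save.
def replay_race(table, name)
  a_valid = !table.name_taken?(name) # request A checks
  b_valid = !table.name_taken?(name) # request B checks before A has saved
  [a_valid, b_valid].map do |valid|
    next :invalid unless valid
    begin
      table.insert(name)
      :saved
    rescue UsersTable::DuplicateKey
      :rejected_by_index
    end
  end
end

without_index = UsersTable.new
with_index = UsersTable.new(unique_index: true)
replay_race(without_index, "alice") # both requests save, leaving two "alice" rows
replay_race(with_index, "alice")    # the second save is rejected by the index
```

In this sketch both requests pass validation either way; the only difference is whether the second insert is stopped at the table.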
For a clearer understanding, please check this:
If you create a unique index for a column it means you’re guaranteed the table won’t have more than one row with the same value for that column. Using only validates_uniqueness_of validation in your model isn’t enough to enforce uniqueness because there can be concurrent users trying to create the same data.
Imagine that two users try to register an account with the same email, and you have added validates_uniqueness_of :email to your user model. If they hit the “Sign up” button at the same time, Rails will look in the user table for that email for each request, find nothing, and report that it is OK to save the record. Rails will then save both records to the user table with the same email, and now you have a really nasty problem to deal with.
To avoid this you need to create a unique constraint at the database level as well:
class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users do |t|
      t.string :email
      ...
    end
    add_index :users, :email, unique: true
  end
end
So by creating the index_users_on_email unique index you get two very nice benefits: data integrity, and good performance, because lookups on a unique index tend to be very fast.
If you put unique: true on the user_id index of your posts table, the database will not allow two posts with the same user_id.
#2
2
A DB unique index, to quote from this SO question, is:
Unique Index in a database is an index on that column that also enforces the constraint that you cannot have two equal values in that column in two different rows
The Rails uniqueness validation is meant to do the same thing, but at the application level, which means the following scenario can happen, rarely but easily:
- User A submits form
- Rails checks database for existing ID for User A: none found
- User B submits form
- Rails checks database for existing ID for User B: none found
- Rails saves user A record
- Rails saves user B record
This happened to me a month ago, and I was advised to solve it with a DB unique index in this SO question.
By the way, this workaround is well documented in Rails:
The best way to work around this problem is to add a unique index to the database table using ActiveRecord::ConnectionAdapters::SchemaStatements#add_index. In the rare case that a race condition occurs, the database will guarantee the field’s uniqueness
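When the index is the final guard, the application still has to deal with the error the database raises. In a Rails app that error surfaces as ActiveRecord::RecordNotUnique; the sketch below shows the rescue pattern in plain Ruby so it runs without Rails. `RecordNotUnique`, `FakeUsersTable`, and `create_user` are hypothetical names invented for this illustration, and the error message string is simply chosen to match what a failed uniqueness validation would report.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class RecordNotUnique < StandardError; end

# Hypothetical table whose Hash keys act as the unique index on email.
class FakeUsersTable
  def initialize
    @emails = {}
  end

  def insert(email)
    if @emails.key?(email)
      raise RecordNotUnique, "duplicate key value violates unique constraint"
    end
    @emails[email] = true
  end
end

# Saves a user and returns a list of error messages (empty on success).
def create_user(table, email)
  table.insert(email)
  []
rescue RecordNotUnique
  # The index caught a duplicate that a validation could have missed in a
  # race; report it the way a failed uniqueness validation would.
  ["Email has already been taken"]
end

table = FakeUsersTable.new
create_user(table, "abc@abc.com") # => []
create_user(table, "abc@abc.com") # => ["Email has already been taken"]
```

The design choice here is to treat the database error as an ordinary validation failure, so the user sees the same message whether the duplicate was caught early or only at insert time.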
#3
2
As far as uniqueness goes,
Uniqueness validates that the attribute's value is unique right before the object gets saved. It does not create a uniqueness constraint in the database, so it may happen that two different database connections create two records with the same value for a column that you intend to be unique. To avoid that, you must create a unique index on that column in your database.
Also, if you only have validates_uniqueness_of at the model level, duplicate records are blocked only when they are written through Rails, not at the database level. SQL INSERT queries run through dbconsole would insert duplicate records without any problem.
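To make the "Rails side only" point concrete, here is a small plain-Ruby sketch. `BareTable` and `ValidatingModel` are hypothetical stand-ins for a table without a unique index and a model with a uniqueness check; they are not Rails classes, and `insert` only mimics what a raw SQL INSERT would do.

```ruby
# Hypothetical table with no unique index: it stores whatever it is given.
class BareTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Like a raw INSERT typed into dbconsole: no checks at all.
  def insert(name)
    @rows << name
  end
end

# Hypothetical model that checks uniqueness itself before inserting.
class ValidatingModel
  def initialize(table)
    @table = table
  end

  # Returns true if the record was saved, false if the name was taken.
  def create(name)
    return false if @table.rows.include?(name)
    @table.insert(name)
    true
  end
end

table = BareTable.new
model = ValidatingModel.new(table)
model.create("bob") # saved
model.create("bob") # refused by the model-level check
table.insert("bob") # bypasses the model, so the table now holds a duplicate
```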
When you say that you created a foreign key with an index on "user_id" in the "posts" table, note that by default Rails only creates a plain index on it and NOT a unique index. If you have a one-to-many relationship, a unique index makes no sense in your case.
If you had unique: true on "user_id" in your posts table, then there is no way that a second record with the same "user_id" would go through.
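The difference between a plain index and a unique index on "user_id" can be sketched in plain Ruby. `UserIdIndex` is a hypothetical in-memory model written for this illustration, not something Rails or PostgreSQL provides; it only tracks which post ids belong to which user_id.

```ruby
# Hypothetical in-memory model of an index on posts.user_id.
class UserIdIndex
  class DuplicateKey < StandardError; end

  def initialize(unique:)
    @unique = unique
    @post_ids_by_user = Hash.new { |hash, user_id| hash[user_id] = [] }
  end

  # Registers a post under its user_id; a unique index refuses a second one.
  def add(post_id, user_id)
    if @unique && @post_ids_by_user.key?(user_id)
      raise DuplicateKey, "user_id #{user_id} already has a post"
    end
    @post_ids_by_user[user_id] << post_id
  end

  # Returns a copy of the post ids stored for this user_id.
  def posts_for(user_id)
    @post_ids_by_user.fetch(user_id, []).dup
  end
end

plain = UserIdIndex.new(unique: false)
plain.add(1, 42)
plain.add(2, 42)    # fine: one user, many posts
plain.posts_for(42) # => [1, 2]

unique = UserIdIndex.new(unique: true)
unique.add(1, 42)
begin
  unique.add(2, 42) # second post for user 42
rescue UserIdIndex::DuplicateKey => e
  e.message         # => "user_id 42 already has a post"
end
```

With the unique variant each user can own at most one post, which is effectively a one-to-one relationship rather than has_many.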