Case-insensitive search in a Rails model

Time: 2021-01-14 19:24:08

My Product model contains some items:

 Product.first
 => #<Product id: 10, name: "Blue jeans" >

I'm now importing some product parameters from another dataset, but there are inconsistencies in the spelling of the names. For instance, in the other dataset, Blue jeans could be spelled Blue Jeans.

I wanted to call Product.find_or_create_by_name("Blue Jeans"), but this will create a new product, almost identical to the first. What are my options if I want to find and compare by the lowercased name?

Performance is not really important here: there are only 100-200 products, and I want to run this as a migration that imports the data.

Any ideas?

17 answers

#1


312  

You'll probably have to be more verbose here

name = "Blue Jeans"
model = Product.where('lower(name) = ?', name.downcase).first 
model ||= Product.create(:name => name)

#2


90  

This is a complete setup in Rails, for my own reference. I'm happy if it helps you too.

the query:

Product.where("lower(name) = ?", name.downcase).first

the validator:

validates :name, presence: true, uniqueness: {case_sensitive: false}

the index (answer from Case-insensitive unique index in Rails/ActiveRecord?):

execute "CREATE UNIQUE INDEX index_products_on_lower_name ON products USING btree (lower(name));"

I wish there were a more elegant way to do the first and the last, but then again, Rails and ActiveRecord are open source, so we shouldn't complain: we can implement it ourselves and send a pull request.

#3


20  

You might want to use the following:

validates_uniqueness_of :name, :case_sensitive => false

Please note that case-insensitive matching is not the default (older Rails versions default to :case_sensitive => true, and newer ones follow the database collation), so you do need to pass this option explicitly.

Find more at: http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of

#4


17  

If you are using Postgres and Rails 4+, then you have the option of using the column type CITEXT, which allows case-insensitive queries without having to write out the query logic.

The migration:

def change
  enable_extension :citext
  change_column :products, :name, :citext
  add_index :products, :name, unique: true # If you want to index the product names
end

And to test it out you should expect the following:

Product.create! name: 'jOgGers'
=> #<Product id: 1, name: "jOgGers">

Product.find_by(name: 'joggers')
=> #<Product id: 1, name: "jOgGers">

Product.find_by(name: 'JOGGERS')
=> #<Product id: 1, name: "jOgGers">

#5


11  

In Postgres:

 user = User.where('username ~* ?', 'regedarek').first

#6


9  

Quoting from the SQLite documentation:

Any other character matches itself or its lower/upper case equivalent (i.e. case-insensitive matching)

...which I didn't know. But it works:

sqlite> create table products (name string);
sqlite> insert into products values ("Blue jeans");
sqlite> select * from products where name = 'Blue Jeans';
sqlite> select * from products where name like 'Blue Jeans';
Blue jeans

So you could do something like this:

name = 'Blue jeans'
if prod = Product.where('name LIKE ?', name).first
    # update product or whatever
else
    prod = Product.create(:name => name)
end

Not #find_or_create, I know, and it may not be very cross-database friendly, but worth looking at?
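One caveat with the LIKE approach above: % and _ in the name are treated as wildcards, so a name such as 50%_off would match more than itself. Below is a minimal sketch of escaping them before binding the value. escape_like is a hypothetical helper written in plain Ruby (recent Rails versions ship a similar sanitize_sql_like class method; prefer that if yours has it), and it assumes backslash is the LIKE escape character, which is the default in PostgreSQL and MySQL but needs an explicit ESCAPE clause in SQLite.

```ruby
# Hypothetical helper: escape LIKE wildcards so the value matches literally.
# Assumes "\" is the LIKE escape character for your database.
def escape_like(value, escape_char = "\\")
  pattern = Regexp.union(escape_char, "%", "_")
  value.gsub(pattern) { |match| escape_char + match }
end

puts escape_like("Blue jeans")
puts escape_like("50%_off")
```

It would then be used as Product.where('name LIKE ?', escape_like(name)).first.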

#7


6  

In ASCII, upper- and lowercase letters differ only by a single bit, so the most efficient way to search them is to ignore this bit rather than convert to lower or upper case. See the COLLATION keyword for MS SQL, NLS_SORT=BINARY_CI if using Oracle, etc.
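To make the single-bit claim concrete: each ASCII letter's upper- and lowercase forms differ only in bit 0x20. A small plain-Ruby sketch follows; the helper names are made up for illustration, and it holds for ASCII letters only, which is why databases rely on collations for the general case.

```ruby
# ASCII upper- and lowercase letters differ only in this bit.
CASE_BIT = 0x20

# True if the byte is an ASCII letter (A-Z or a-z).
def ascii_letter?(byte)
  (byte | CASE_BIT).between?('a'.ord, 'z'.ord)
end

# Hypothetical helper: compares two ASCII strings, ignoring the case bit for letters only.
def ascii_ci_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).all? do |x, y|
    x == y || (ascii_letter?(x) && (x ^ y) == CASE_BIT)
  end
end

p 'a'.ord ^ 'A'.ord
p ascii_ci_equal?("Blue jeans", "Blue Jeans")
```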

#8


6  

Another approach that no one has mentioned is to add case-insensitive finders to ActiveRecord::Base. Details can be found here. The advantage of this approach is that you don't have to modify every model, and you don't have to add the lower() clause to all your case-insensitive queries; you just use a different finder method instead.

#9


5  

Several comments refer to Arel, without providing an example.

Here is an Arel example of a case-insensitive search:

Product.where(Product.arel_table[:name].matches('Blue Jeans'))

The advantage of this type of solution is that it is database-agnostic - it will use the correct SQL commands for your current adapter (matches will use ILIKE for Postgres, and LIKE for everything else).

#10


4  

find_or_create is now deprecated; you should use an AR relation plus first_or_create instead, like so:

TombolaEntry.where("lower(name) = ?", self.name.downcase).first_or_create(name: self.name)

This will return the first matched object, or create one for you if none exists.

#11


2  

Case-insensitive searching comes built-in with Rails. It accounts for differences in database implementations. Use either the built-in Arel library, or a gem like Squeel.

#12


2  

There are lots of great answers here, particularly @oma's. But one other thing you could try is to use custom column serialization. If you don't mind everything being stored lowercase in your db then you could create:

# lib/serializers/downcasing_string_serializer.rb
module Serializers
  class DowncasingStringSerializer
    def self.load(value)
      value
    end

    def self.dump(value)
      value.downcase
    end
  end
end

Then in your model:

# app/models/my_model.rb
serialize :name, Serializers::DowncasingStringSerializer
validates_uniqueness_of :name, :case_sensitive => false

The benefit of this approach is that you can still use all the regular finders (including find_or_create_by) without using custom scopes, functions, or having lower(name) = ? in your queries.

The downside is that you lose casing information in the database.
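One more thing to watch in the serializer above: dump calls downcase on whatever it receives, so a nil name would raise NoMethodError. A nil-safe variant might look like the sketch below. The class name is made up for illustration, and the class can be exercised without Rails.

```ruby
module Serializers
  # Hypothetical nil-safe variant of the downcasing serializer.
  class NilSafeDowncasingStringSerializer
    # Values come back from the database unchanged.
    def self.load(value)
      value
    end

    # Downcase on the way in; pass nil through instead of raising.
    def self.dump(value)
      value&.downcase
    end
  end
end

coder = Serializers::NilSafeDowncasingStringSerializer
p coder.dump("Blue Jeans")
p coder.dump(nil)
```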

#13


0  

Assuming that you use mysql, you could use fields that are not case sensitive: http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html

#14


0  

user = Product.where(email: /^#{email}$/i).first
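The one-liner above interpolates the raw value into a regular expression, so characters like . or + in it act as metacharacters, and ^ and $ in Ruby match at line boundaries rather than at the ends of the string. Here is a hedged sketch of building a safer pattern in plain Ruby. exact_ci_regex is a hypothetical helper, and whether your data layer accepts a Regexp as a condition (as the one-liner assumes) depends on the ORM.

```ruby
# Hypothetical helper: a case-insensitive regex that matches the whole string literally.
def exact_ci_regex(text)
  /\A#{Regexp.escape(text)}\z/i
end

email = "a.b@example.com"
naive = /^#{email}$/i          # "." still means "any character" here
safe  = exact_ci_regex(email)

p naive.match?("aXb@example.com")
p safe.match?("aXb@example.com")
p safe.match?("A.B@Example.COM")
```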

#15


0  

Some people show using LIKE or ILIKE, but those allow wildcard pattern matching (% and _). Also, you don't need to downcase in Ruby; you can let the database do it for you. I think it may be faster. Also, first_or_create can be used after where.

# app/models/product.rb
class Product < ActiveRecord::Base

  # case insensitive name
  def self.ci_name(text)
    where("lower(name) = lower(?)", text)
  end
end

# first_or_create can be used after a where clause
Product.ci_name("Blue Jeans").first_or_create
# Product Load (1.2ms)  SELECT  "products".* FROM "products"  WHERE (lower(name) = lower('Blue Jeans'))  ORDER BY "products"."id" ASC LIMIT 1
# => #<Product id: 1, name: "Blue jeans", created_at: "2016-03-27 01:41:45", updated_at: "2016-03-27 01:41:45"> 

#16


0  

You can also use a scope like the one below, put it in a concern, and include it in the models that need it:

scope :ci_find, lambda { |column, value| where("lower(#{column}) = ?", value.downcase) }

Then use it like this: Model.ci_find('column', 'value').first
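Because the scope above interpolates the column name straight into the SQL string, it should never receive user input. One way to guard it is to check the column against an allow-list first. A plain-Ruby sketch follows; the constant, the helper name and the listed columns are all assumptions for illustration.

```ruby
# Hypothetical allow-list of columns that may be searched case-insensitively.
CI_SEARCHABLE_COLUMNS = %w[name email].freeze

# Builds a [sql, bind] pair suitable for where(*pair); rejects unknown columns.
def ci_condition(column, value)
  column = column.to_s
  unless CI_SEARCHABLE_COLUMNS.include?(column)
    raise ArgumentError, "column not allowed: #{column}"
  end

  ["lower(#{column}) = ?", value.downcase]
end

p ci_condition(:name, "Blue Jeans")
```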

#17


-7  

So far, I've made a solution using Ruby. Place this inside the Product model:

  # Return the first matching product (id and name only, to minimize memory consumption)
  def self.custom_find_by_name(product_name)
    @@product_names ||= Product.select(:id, :name).to_a
    @@product_names.find { |p| p.name.downcase == product_name.downcase }
  end

  #remember a way to flush finder cache in case you run this from console
  def self.flush_custom_finder_cache!
    @@product_names = nil
  end

This will give me the first product where names match. Or nil.

>> Product.create(:name => "Blue jeans")
=> #<Product id: 303, name: "Blue jeans">

>> Product.custom_find_by_name("Blue Jeans")
=> nil

>> Product.flush_custom_finder_cache!
=> nil

>> Product.custom_find_by_name("Blue Jeans")
=> #<Product id: 303, name: "Blue jeans">
>>
>> #SUCCESS! I found you :)
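The stale-cache surprise in the session above comes from memoizing the product list in a class variable. For a one-off import of a few hundred rows, an alternative is to build a lookup keyed by the downcased name once and consult it for each imported row. Here is a plain-Ruby sketch, using a Struct as a stand-in for the Product model (everything named here is illustrative, not part of Rails):

```ruby
# Stand-in for the ActiveRecord model, so the sketch runs without Rails.
ProductRow = Struct.new(:id, :name)

existing = [ProductRow.new(10, "Blue jeans"), ProductRow.new(11, "Red shirt")]

# Index existing products by their downcased name.
by_name = existing.each_with_object({}) { |prod, index| index[prod.name.downcase] = prod }

# Find the existing product for an imported name, or register a new, unsaved one.
def find_or_add(by_name, name)
  by_name[name.downcase] ||= ProductRow.new(nil, name)
end

p find_or_add(by_name, "Blue Jeans").id
p find_or_add(by_name, "Green hat").name
```

New entries land in the same hash, so a later row spelled "GREEN HAT" resolves to the entry created for "Green hat" instead of producing another duplicate.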
