I am trying to check a title string against an array of strings and see which strings from the array match.
My code works fine but I am not sure if it is the most efficient way to do this.
The important thing is that the strings in the array do not have to match a phrase in the title exactly. They can be in any order as long as every word is in the title. Any help would be great.
Example:
title = "Apple Iphone 4 Verizon"
array = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
I need it to return ["iphone apple", "verizon iphone", "iphone 4"]. The words in the strings "verizon iphone" and "iphone apple" are in the title; the order does not matter.
results = []
# Loop through all the pids to see if they are found in the title
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t iphone"]
title = "Apple Iphone 4 White Verizon"
all_pids.each do |pid|
  match = []
  split_id = pid.downcase.split(' ')
  split_id.each do |name|
    in_title = title.downcase.include?(name)
    if in_title == true
      match << name
    end
  end
  final = match.join(" ")
  if final.strip == pid.strip
    results << pid
  end
end
print results
When I run this, it prints what I need: ["iphone white 4", "iphone verizon"]
2 Solutions
#1
2
You could do something like the following:
>> require 'set'
=> true
>> title = "Apple Iphone 4 Verizon"
=> "Apple Iphone 4 Verizon"
>> all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
=> ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
>> title_set = Set.new(title.downcase.split)
=> #<Set: {"apple", "iphone", "4", "verizon"}>
>> all_pids.select { |pid| Set.new(pid.downcase.split).subset? title_set }
=> ["iphone apple", "verizon iphone", "iphone 4"]
You can do something very similar with array differences, but sets might be faster since they are implemented as hashes.
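For reuse outside irb, the set approach above could be wrapped in a small method. This is only a sketch: the method name `matching_pids` is hypothetical, and it assumes that words are separated by whitespace and that case should be ignored.

```ruby
require 'set'

# Hypothetical helper (name not from the original answer).
# Returns the pids whose words all appear in the title, in any order.
def matching_pids(title, pids)
  # Build the title's word set once so each pid is checked against it.
  title_words = Set.new(title.downcase.split)
  pids.select { |pid| Set.new(pid.downcase.split).subset?(title_words) }
end

title = "Apple Iphone 4 Verizon"
all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
p matching_pids(title, all_pids)
# => ["iphone apple", "verizon iphone", "iphone 4"]
```

Building the title set once, outside the block, avoids splitting the title again for every pid.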
#2
2
It looks to me like you want to find the strings whose words all appear among the words in the title.
Array#- performs set difference operations: [2] - [1,2,3] = [] and [1,2,3] - [2] = [1,3]
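Applied to words rather than numbers, the same difference shows which words of a pid are missing from the title. A minimal sketch, where the helper name `leftover_words` is made up for illustration:

```ruby
# Hypothetical helper: the words of pid that do not appear in title.
def leftover_words(pid, title)
  pid.downcase.split - title.downcase.split
end

title = "Apple Iphone 4 White Verizon"
p leftover_words("iphone white 4", title)  # => []        every word is in the title
p leftover_words("black iphone", title)    # => ["black"] "black" is missing
```

An empty result means every word of the pid was found in the title.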
title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t iphone"]
set_of_strings_in_title = title.downcase.split
all_pids.find_all do |pid|
  set_of_strings_not_in_title = pid.downcase.split - set_of_strings_in_title
  set_of_strings_not_in_title.empty?
end
EDIT: Changed #find to #find_all to return all matches, not just the first.
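One difference from the code in the question may be worth noting: `String#include?` matches substrings, while splitting the title compares whole words. The sketch below contrasts the two; both method names are hypothetical, and `substring_match?` only approximates the question's loop.

```ruby
# Approximates the question's check: each word only has to occur
# somewhere inside the title string, even as part of a longer word.
def substring_match?(pid, title)
  pid.downcase.split.all? { |word| title.downcase.include?(word) }
end

# Whole-word check using Array#- as in this answer.
def whole_word_match?(pid, title)
  (pid.downcase.split - title.downcase.split).empty?
end

title = "Apple Iphone 3GS"
p substring_match?("iphone 3g", title)   # => true  ("3g" is a substring of "3gs")
p whole_word_match?("iphone 3g", title)  # => false ("3g" is not a word in the title)
```

Which behavior is wanted depends on the data; if "iphone 3g" should not match a "3GS" title, the whole-word version is probably the safer choice.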