How do I loop through a string with an array of strings to find matches?

Time: 2021-12-02 15:41:07

I am trying to loop through a title string with an array of strings and see which ones from the array match.

My code works fine but I am not sure if it is the most efficient way to do this.

The important thing is that the strings in the array do not have to match a phrase in the title exactly. They can be in any order as long as every word is in the title. Any help would be great.

Example:
title = "Apple Iphone 4 Verizon"
array = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]

I need it to return ["iphone apple", "verizon iphone", "iphone 4"]. The words in the strings "verizon iphone" and "iphone apple" are all in the title; the order does not matter.

results = []

# Loop through all the pids to see if they are found in the title
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]
title = "Apple Iphone 4 White Verizon"
all_pids.each do |pid|
  match = []
  split_id = pid.downcase.split(' ')
  split_id.each do |name|
    in_title = title.downcase.include?(name)
    match << name if in_title
  end

  final = match.join(" ")

  # Keep the pid only if the matched words rebuild the original pid string
  results << pid if final.strip == pid.strip
end

print results

When I run this, it prints what I need: ["iphone white 4", "iphone verizon"].

2 Answers

#1


You could do something like the following:

>> require 'set'
=> true
>> title = "Apple Iphone 4 Verizon"
=> "Apple Iphone 4 Verizon"
>> all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
=> ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
>> title_set = Set.new(title.downcase.split)
=> #<Set: {"apple", "iphone", "4", "verizon"}>
>> all_pids.select { |pid| Set.new(pid.downcase.split).subset? title_set }
=> ["iphone apple", "verizon iphone", "iphone 4"]

You can do something very similar with array differences, but sets might be faster since they are implemented as hashes.
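
The irb session above can be wrapped into a small helper so that the title is split only once and reused for every pid. This is only a sketch: the method name `matching_pids` is made up for illustration and is not part of the answer. It also shows one difference from the loop in the question: sets compare whole words, whereas `String#include?` matches substrings, so a pid such as "phone" would match the title in the question's code but not here.

```ruby
require 'set'

# Hypothetical helper (the name is an assumption, not from the answer above).
# Returns the pids whose words all appear as whole words in the title.
def matching_pids(title, pids)
  title_words = Set.new(title.downcase.split)
  pids.select { |pid| Set.new(pid.downcase.split).subset?(title_words) }
end

title = "Apple Iphone 4 Verizon"
all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]

p(matching_pids(title, all_pids))
# prints ["iphone apple", "verizon iphone", "iphone 4"]

# Whole-word comparison: "phone" is a substring of the title but not one of its words.
p(matching_pids(title, ["phone"]))
# prints []
```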

#2


It looks to me like you want to find the strings whose words are all contained in the set of words in the title.

Array#- performs set difference operations. [2] - [1,2,3] = [] and [1,2,3] - [2] = [1,3]
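
A quick check of how `Array#-` behaves may help here. Two details of standard Ruby behavior are worth knowing: every occurrence of a removed element is dropped, and the remaining elements keep their order. The variable names below are chosen only for illustration.

```ruby
# Array#- returns a new array without any element found in the right operand.
p([2] - [1, 2, 3])   # prints []
p([1, 2, 3] - [2])   # prints [1, 3]

# All occurrences are removed, and the remaining order is kept.
with_duplicates = [3, 2, 1, 2] - [2]
p(with_duplicates)   # prints [3, 1]

# Applied to words: an empty difference means every pid word is in the title.
pid_words   = "verizon iphone".split
title_words = "apple iphone 4 verizon".split
leftover    = pid_words - title_words
p(leftover.empty?)   # prints true
```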

title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]
set_of_strings_in_title = title.downcase.split
all_pids.find_all do |pid|
  set_of_strings_not_in_title = pid.downcase.split - set_of_strings_in_title 
  set_of_strings_not_in_title.empty?
end

EDIT: Changed #find to #find_all to return all matches, not just the first.
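
To illustrate the edit note: `find` returns only the first element for which the block is true, while `find_all` returns every such element. Below is a small sketch using the same data; the lambda name `all_words_in_title` is just for illustration.

```ruby
title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]
title_words = title.downcase.split

# Illustrative name: true when no word of the pid is missing from the title.
all_words_in_title = ->(pid) { (pid.downcase.split - title_words).empty? }

first_match = all_pids.find(&all_words_in_title)
all_matches = all_pids.find_all(&all_words_in_title)

p(first_match)   # prints "iphone white 4"
p(all_matches)   # prints ["iphone white 4", "iphone verizon"]
```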
