Simple text matching algorithm for a stored procedure

Time: 2022-09-20 23:22:03

I have a table with two fields in a SQL Server database. My ASP.NET application calls a stored procedure with a '@SearchString' parameter, and the stored procedure finds all records where the @SearchString value is found in the concatenation of the two fields, call them 'Field1' and 'Field2'.

So the logic looks like this (I have simplified the actual query):

CREATE PROCEDURE [dbo].[sp_FindMatches] @SearchString varchar(30)
AS
  SELECT * FROM Table1 WHERE Field1+Field2 LIKE @SearchString

I would like to improve this rather basic matching algorithm so that it is less restrictive in the records it matches. For example, if the user enters "DOG HOUSE" as a parameter, the logic in the existing SP returns only records that contain that exact string. I would like it to also return records that contain both "DOG" and "HOUSE", even if the two words aren't right next to each other.

Better still would be a way to rank the records by 'best match': if "DOG HOUSE" is found, that is an exact match; if both "DOG" and "HOUSE" are found, that is the second-best match; if "DOG" but not "HOUSE", or "HOUSE" but not "DOG", is found, that is the third-best match; and so on.

Is there a generic algorithm that does much of what I want?

2 Solutions

#1


You should look at Full-Text Search. It's specifically designed to do exactly what you're asking for, and it does so very well.
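
As a rough illustration of what that could look like for the question's table, here is a minimal sketch of a ranked search procedure. It assumes a full-text index already exists on `Field1` and `Field2` of `Table1`, and that the table has a key column called `Id`; both the `Id` column and the procedure name `FindMatchesRanked` are hypothetical. `FREETEXTTABLE` matches rows containing any of the words in the search string and returns a relevance `RANK` for each, so rows are not limited to the exact phrase.

```sql
-- Sketch only. Assumptions: a full-text index exists on Table1 (Field1, Field2),
-- and Id (hypothetical) is the column behind the full-text key index.
CREATE PROCEDURE dbo.FindMatchesRanked
    @SearchString nvarchar(30)
AS
BEGIN
    SET NOCOUNT ON;

    -- FREETEXTTABLE returns one row per matching record, with [KEY] and [RANK] columns.
    SELECT t.*, ft.[RANK] AS MatchRank
    FROM FREETEXTTABLE(dbo.Table1, (Field1, Field2), @SearchString) AS ft
    INNER JOIN dbo.Table1 AS t
        ON t.Id = ft.[KEY]
    ORDER BY ft.[RANK] DESC;   -- highest relevance first
END
```

The `RANK` values are only meaningful relative to each other within one query, so they will not necessarily reproduce the exact "phrase, then both words, then one word" tiers from the question. Also note that a phrase split across `Field1` and `Field2` is not seen as one phrase when the columns are indexed separately.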

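The way you've implemented this in plain T-SQL prevents SQL Server from seeking on any indexes on the affected columns: the LIKE predicate is applied to the concatenated expression Field1+Field2 rather than to an indexed column, so every row has to be evaluated.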

And don't be scared off by Full-Text Search - it's surprisingly simple to set up.
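
To give a sense of how little setup is involved, the following is a minimal sketch, assuming the Full-Text Search feature is installed on the instance and that `Table1` has a unique, single-column, non-nullable index. The names `ftSearchCatalog` and `PK_Table1` are hypothetical; substitute the real key index name.

```sql
-- Sketch only. ftSearchCatalog and PK_Table1 are hypothetical names.
-- Create a catalog to hold full-text indexes and make it the database default.
CREATE FULLTEXT CATALOG ftSearchCatalog AS DEFAULT;
GO

-- Index both searchable columns; KEY INDEX must name a unique,
-- single-column, non-nullable index on the table.
CREATE FULLTEXT INDEX ON dbo.Table1 (Field1, Field2)
    KEY INDEX PK_Table1
    ON ftSearchCatalog
    WITH CHANGE_TRACKING AUTO;   -- keep the index updated as rows change
GO
```

`GO` is a batch separator understood by tools such as SSMS and sqlcmd, not a T-SQL statement. The initial population runs in the background, so queries issued immediately after creating the index may return incomplete results.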

#2


I would create a view which concatenates the two columns, and use full text searching on that view through your stored procedure.
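
A minimal sketch of that idea follows. To carry a full-text index, the view has to be an indexed view, which means it needs `SCHEMABINDING` and a unique clustered index. The names `vTable1Search`, `SearchText`, `IX_vTable1Search_Id`, and the key column `Id` are all hypothetical, and the example assumes a default full-text catalog already exists and that both fields are non-nullable character columns.

```sql
-- Sketch only. Id, vTable1Search, SearchText and the index name are hypothetical.
CREATE VIEW dbo.vTable1Search
WITH SCHEMABINDING
AS
SELECT Id,
       Field1 + ' ' + Field2 AS SearchText   -- space keeps the two values as separate words
FROM dbo.Table1;
GO

-- An indexed view needs a unique clustered index; it also serves as the full-text key.
CREATE UNIQUE CLUSTERED INDEX IX_vTable1Search_Id
    ON dbo.vTable1Search (Id);
GO

-- Uses the database's default full-text catalog because no ON clause is given.
CREATE FULLTEXT INDEX ON dbo.vTable1Search (SearchText)
    KEY INDEX IX_vTable1Search_Id
    WITH CHANGE_TRACKING AUTO;
GO
```

The stored procedure would then query the view with `CONTAINSTABLE` or `FREETEXTTABLE` instead of the base table. The inserted space is a design choice that differs from the question's plain `Field1+Field2`; drop it if the two fields are meant to join into a single word.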
