Using Full-Text Search across multiple tables and columns in SQL Server 2008

Date: 2022-05-08 09:28:56

I need to search across multiple columns from two tables in my database using Full-Text Search. The two tables in question have the relevant columns full-text indexed.

The reasons I'm opting for Full-Text Search:

  1. To be able to search accented words easily (cafè)
  2. To be able to rank results according to word proximity, etc.
  3. "Did you mean XXX?" functionality

Here is a dummy table structure, to illustrate the challenge:

Table Book
BookID
Name (Full-text indexed)
Notes (Full-text indexed)

Table Shelf
ShelfID
BookID

Table ShelfAuthor
AuthorID
ShelfID

Table Author
AuthorID
Name (Full-text indexed)

I need to search across Book Name, Book Notes and Author Name.

I know of two ways to accomplish this:

  1. Using a full-text indexed view: This would have been my preferred method, but I can't use it. For a view to be full-text indexed, it must be schema-bound, contain no outer joins, and have a unique index. The view I need to get my data does not satisfy these constraints (it joins many other tables I need data from).

  2. Using joins in a stored procedure: The problem with this approach is that I need the results sorted by rank. If I join multiple tables, SQL Server won't search across columns from different tables in a single full-text query. I can combine two individual CONTAINS queries on the two linked tables, but I don't know of a way to extract a combined rank from the two search queries. For example, if I search for 'Arthur', the results of both the Book query and the Author query should be taken into account and weighted accordingly.

6 answers

#1


15  

Using FREETEXTTABLE, you just need to design some algorithm to calculate the merged rank on each joined table result. The example below skews the result towards hits from the book table.

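-- Book hits count at full weight, author hits at half weight.
-- ISNULL keeps the sum from turning NULL when the LEFT JOIN finds no author hit.
-- BookID is assumed to be the full-text key column of Book.
SELECT b.Name AS BookName, a.Name AS AuthorName, bkt.[RANK] + ISNULL(akt.[RANK], 0) / 2 AS [Rank]
FROM Book b
INNER JOIN Author a ON b.AuthorID = a.AuthorID
INNER JOIN FREETEXTTABLE(Book, Name, @criteria) bkt ON b.BookID = bkt.[KEY]
LEFT JOIN FREETEXTTABLE(Author, Name, @criteria) akt ON a.AuthorID = akt.[KEY]
ORDER BY [Rank] DESC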

Note that I simplified your schema for this example.
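
For the unsimplified schema in the question, the same weighting idea could be sketched roughly as follows. This is only an illustration under assumptions: BookID and AuthorID are taken to be the full-text key columns, the 1.0 and 0.5 weights are arbitrary, and the CTE name Hits and the CombinedRank alias are made up for the example. Author hits are mapped to books through ShelfAuthor and Shelf, and the per-book ranks are then summed.

```sql
DECLARE @criteria nvarchar(200) = N'Arthur';

WITH Hits AS (
    -- Hits on the book's own columns count at full weight.
    SELECT bkt.[KEY] AS BookID, bkt.[RANK] * 1.0 AS WeightedRank
    FROM FREETEXTTABLE(dbo.Book, (Name, Notes), @criteria) AS bkt

    UNION ALL

    -- Hits on the author name count at half weight (arbitrary choice)
    -- and are mapped to books via ShelfAuthor -> Shelf.
    SELECT s.BookID, akt.[RANK] * 0.5
    FROM FREETEXTTABLE(dbo.Author, Name, @criteria) AS akt
    INNER JOIN dbo.ShelfAuthor sa ON sa.AuthorID = akt.[KEY]
    INNER JOIN dbo.Shelf s ON s.ShelfID = sa.ShelfID
)
SELECT b.BookID, b.Name, SUM(h.WeightedRank) AS CombinedRank
FROM Hits h
INNER JOIN dbo.Book b ON b.BookID = h.BookID
GROUP BY b.BookID, b.Name
ORDER BY CombinedRank DESC;
```

Because the two result sets are combined with UNION ALL rather than joined, a book that matches only through its author still appears in the output. A book that sits on several shelves with the same author would have that author's rank added once per shelf, so the author branch may need de-duplicating depending on the data.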

#2


3  

I don't think the accepted answer solves the problem. If you want to find all the books by a certain author and therefore use the author's name (or part of it) as the search criteria, the only books the query returns are those that contain the search criteria in their own names.

The only way I see around this problem is to replicate the Author columns you want to search into the Book table and index those columns (or a single column, since it would probably be smart to store the author's relevant information in one XML column in the Book table).
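
A rough sketch of that denormalization against the question's schema might look like the following. Everything new here is an assumption made for illustration: the AuthorNames column is hypothetical, it is a plain nvarchar(max) column rather than the XML column mentioned above, Book is assumed to already have a full-text index keyed on BookID, and Author.Name is assumed to be an nvarchar column. The FOR XML PATH construct is used to concatenate names because SQL Server 2008 has no built-in string aggregate.

```sql
-- Hypothetical column holding the names of all authors linked to a book.
ALTER TABLE dbo.Book ADD AuthorNames nvarchar(max) NULL;
GO

-- Fill the column by concatenating the author names for each book.
UPDATE b
SET AuthorNames = STUFF((
        SELECT DISTINCT N' ' + a.Name
        FROM dbo.Shelf s
        INNER JOIN dbo.ShelfAuthor sa ON sa.ShelfID = s.ShelfID
        INNER JOIN dbo.Author a ON a.AuthorID = sa.AuthorID
        WHERE s.BookID = b.BookID
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, N'')
FROM dbo.Book b;
GO

-- Add the new column to the existing full-text index on Book.
ALTER FULLTEXT INDEX ON dbo.Book ADD (AuthorNames);
GO

-- One search now covers book and author text and yields a single rank.
DECLARE @criteria nvarchar(200) = N'Arthur';

SELECT b.BookID, b.Name, ft.[RANK]
FROM FREETEXTTABLE(dbo.Book, (Name, Notes, AuthorNames), @criteria) AS ft
INNER JOIN dbo.Book b ON b.BookID = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```

The copied column has to be kept in sync whenever authors or shelf assignments change, and the full-text index is populated asynchronously, so a search run immediately after the update may not see the new text yet.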

#3


3  

I had the same problem as you, but mine involved 10 tables (a Users table and several others holding related information).

My first query used FREETEXT in the WHERE clause for each table, but it took far too long.

I then saw several replies suggesting FREETEXTTABLE instead, checking for non-null values in the key column of each table, but that also took too long to execute.

I fixed it by using a combination of FREETEXTTABLE and UNION selects:

-- Collect the matching UserIds from every table, then join back to Users once.
-- KEY is a reserved word, so the column has to be written as [KEY].
SELECT Users.* FROM Users INNER JOIN
(SELECT Users.UserId FROM Users INNER JOIN FREETEXTTABLE(Users, (column1, column2), @variableWithSearchTerm) UsersFT ON Users.UserId = UsersFT.[KEY]
UNION
SELECT Table1.UserId FROM Table1 INNER JOIN FREETEXTTABLE(Table1, TextColumn, @variableWithSearchTerm) Table1FT ON Table1.UserId = Table1FT.[KEY]
UNION
SELECT Table2.UserId FROM Table2 INNER JOIN FREETEXTTABLE(Table2, TextColumn, @variableWithSearchTerm) Table2FT ON Table2.UserId = Table2FT.[KEY]
-- ... same for all tables
) fts ON Users.UserId = fts.UserId

This proved to be dramatically faster.

I hope it helps.

#4


1  

I would use a stored procedure. The full-text functions (FREETEXTTABLE, for example) return a rank that you can sort by. I am not sure how the ranks from different tables should be weighted against each other, but I'm sure you could tinker for a while and figure it out. For example:

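-- The first argument of FREETEXTTABLE is the table name (myTable here); * searches all of its full-text indexed columns.
SELECT SearchResults.[KEY], SearchResults.[RANK] FROM FREETEXTTABLE(myTable, *, @searchString) AS SearchResults ORDER BY SearchResults.[RANK] DESC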

#5


1  

FWIW, in a similar situation our DBA created DML triggers to maintain a dedicated full-text search table. It was not possible to use a materialized view because of its many restrictions.
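
As a minimal sketch of what such a setup could look like for the Book table in the question (not our DBA's actual code): the table BookSearch, the trigger trg_Book_SyncSearch, and the catalog ftBookSearch are all hypothetical names, BookID is assumed to be an int, and Name and Notes are assumed to be nvarchar columns. Pulling in author names would need similar triggers on Author, ShelfAuthor and Shelf, which are left out here.

```sql
-- Dedicated search table: one row per book, all searchable text in one column.
CREATE TABLE dbo.BookSearch (
    BookID     int           NOT NULL CONSTRAINT PK_BookSearch PRIMARY KEY,
    SearchText nvarchar(max) NOT NULL
);
GO

-- Keep the search table in sync with every insert, update and delete on Book.
CREATE TRIGGER dbo.trg_Book_SyncSearch ON dbo.Book
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Drop the search rows of all affected books (old and new versions).
    DELETE bs
    FROM dbo.BookSearch bs
    WHERE bs.BookID IN (SELECT BookID FROM deleted
                        UNION
                        SELECT BookID FROM inserted);

    -- Re-create the search rows for inserted or updated books.
    INSERT INTO dbo.BookSearch (BookID, SearchText)
    SELECT i.BookID, ISNULL(i.Name, N'') + N' ' + ISNULL(i.Notes, N'')
    FROM inserted i;
END;
GO

-- Full-text index on the combined column, keyed on the primary key index.
CREATE FULLTEXT CATALOG ftBookSearch;
GO
CREATE FULLTEXT INDEX ON dbo.BookSearch (SearchText)
    KEY INDEX PK_BookSearch
    ON ftBookSearch
    WITH CHANGE_TRACKING AUTO;
GO
```

Rows that already exist in Book when the trigger is created are not copied automatically, so the search table would need a one-off initial load. Searches then run against dbo.BookSearch with FREETEXTTABLE or CONTAINSTABLE and join back to Book on BookID.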

#6


0  

This answer is well overdue, but if you cannot modify the primary tables, one way to do this is to create a new table in which the searchable fields are combined into a single column.

Then create a full-text index on that column and query it.

Example

SELECT 
    FT_TBL.[EANHotelID]                 AS HotelID, 
    ISNULL(FT_TBL.[Name],'-')           AS HotelName,
    ISNULL(FT_TBL.[Address1],'-')       AS HotelAddress,
    ISNULL(FT_TBL.[City],'-')           AS HotelCity,
    ISNULL(FT_TBL.[StateProvince],'-')  AS HotelCountyState,
    ISNULL(FT_TBL.[PostalCode],'-')     AS HotelPostZipCode,
    ISNULL(FT_TBL.[Latitude],0.00)      AS HotelLatitude,
    ISNULL(FT_TBL.[Longitude],0.00)     AS HotelLongitude,
    ISNULL(FT_TBL.[CheckInTime],'-')    AS HotelCheckinTime,
    ISNULL(FT_TBL.[CheckOutTime],'-')   AS HotelCheckOutTime,
    ISNULL(b.[CountryName],'-')         AS HotelCountry,
    ISNULL(c.PropertyDescription,'-')   AS HotelDescription,
    KEY_TBL.RANK 

    FROM [EAN].[dbo].[tblactivepropertylist] AS FT_TBL INNER JOIN
     CONTAINSTABLE ([EAN].[dbo].[tblEanFullTextSearch], FullTextSearchColumn, @s)
      AS KEY_TBL
    ON FT_TBL.EANHotelID = KEY_TBL.[KEY]
    INNER JOIN [EAN].[dbo].[tblCountrylist] b
    ON FT_TBL.Country = b.CountryCode
    INNER JOIN [EAN].[dbo].[tblPropertyDescriptionList] c
    ON FT_TBL.[EANHotelID] = c.EANHotelID

In the code above, [EAN].[dbo].[tblEanFullTextSearch] is the new table and FullTextSearchColumn is the column that holds the combined fields. You can now query the new table and join it to the tables you want to display data from.

Hope this helps.
