We have a problem we hope the good folks of Stack Overflow can help us with. We're running SQL Server 2008 R2, and a query takes a very long time to run on a moderate data set of about 100,000 rows. We use CONTAINS to search through XML files, and LIKE on another column to support leading wildcards.
We’ve reproduced the problem with the following small query that takes about 35 seconds to run:
SELECT something FROM table1
WHERE (CONTAINS(TextColumn, '"WhatEver"') OR
DescriptionColumn LIKE '%WhatEver%')
Query plan:
If we modify the query above to use UNION instead, the running time drops from 35 seconds to under 1 second. We would like to avoid using this approach to solve the issue.
(SELECT something FROM table1 WHERE CONTAINS(TextColumn, '"WhatEver"'))
UNION
(SELECT something FROM table1 WHERE DescriptionColumn LIKE '%WhatEver%')
Query plan:
The column that we search with CONTAINS has type image and holds XML files ranging from 1k to 20k in size.
We have no good theories as to why the first query is so slow so we were hoping someone here would have something wise to say on the matter. The query plans don’t show anything out of the ordinary as far as we can tell. We've also rebuilt the indexes and statistics.
Is there anything blatantly obvious we’re overlooking here?
Thanks in advance for your time!
3 Solutions
#1
4
Why are you using DescriptionColumn LIKE '%WhatEver%' instead of CONTAINS(DescriptionColumn, '"WhatEver"')?
CONTAINS is a full-text predicate and will use the SQL Server Full-Text engine to filter the search results. LIKE, however, is a "normal" SQL Server keyword, so SQL Server will not use the Full-Text engine to assist with this query. In this case, because the LIKE term begins with a wildcard, SQL Server will also be unable to use any indexes to help with the query, which will most likely result in a table scan and/or poorer performance than using the Full-Text engine.
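As a rough sketch of what that could look like: the statements below assume that table1 already has a full-text index (it must, for CONTAINS on TextColumn to work) and that adding DescriptionColumn to it is acceptable. Keep in mind that this is not a drop-in replacement for the LIKE: full-text search matches whole words and word prefixes, not arbitrary substrings, so '%WhatEver%' style matches in the middle of a word would no longer be found.

```sql
-- Assumption: table1 already has a full-text index that covers TextColumn.
-- Add DescriptionColumn to that existing index (by default this starts a population).
ALTER FULLTEXT INDEX ON table1 ADD (DescriptionColumn);

-- Search both columns with a single full-text predicate.
-- "WhatEver*" is a prefix term: it matches words that start with WhatEver.
SELECT something
FROM table1
WHERE CONTAINS((TextColumn, DescriptionColumn), '"WhatEver*"');
```

Whether the loss of substring matching is acceptable depends on why the leading wildcard was needed in the first place.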
It's impossible to tell without an execution plan, but my guess at what's happening would be:
- The UNION variation of the query is performing a table scan against table1. The table scan is not fast, but because there are relatively few rows in the table it is not performing that slowly (compared to a 35s benchmark).
- In the OR variation of the query, SQL Server first uses the Full-Text engine to filter based on the CONTAINS predicate and then performs a RID lookup on each matching row to filter based on the LIKE predicate. For some reason, however, SQL Server has massively underestimated the number of rows (this can happen with certain types of predicate), so it ends up performing several thousand RID lookups, which is incredibly slow (a table scan would have been much quicker).
To really understand what's going on you need to get a query plan.
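One way to capture that, sketched here against the query from the question, is to turn on the session statistics options before running it. With SET STATISTICS XML ON, SQL Server returns the actual execution plan alongside the results, so the estimated and actual row counts on each operator can be compared to check the underestimation theory above.

```sql
-- Report I/O and timing per statement in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Return the actual execution plan as XML with each result set.
SET STATISTICS XML ON;

-- The slow OR variation from the question.
SELECT something FROM table1
WHERE (CONTAINS(TextColumn, '"WhatEver"') OR
DescriptionColumn LIKE '%WhatEver%');

-- Switch the options off again for the rest of the session.
SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Running the UNION variation the same way gives a second plan to compare against.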
#2
1
Did you guys try this:
SELECT *
FROM table
WHERE CONTAINS((column1, column2, column3), '"*keyword*"')
Instead of this:
SELECT *
FROM table
WHERE CONTAINS(column1, '"*keyword*"')
OR CONTAINS(column2, '"*keyword*"')
OR CONTAINS(column3, '"*keyword*"')
The first one is a lot faster.
#3
1
I just ran into this. This is reportedly a bug in SQL Server 2008 R2:
http://www.arcomit.co.uk/support/kb.aspx?kbid=000060
Your approach of using a UNION of two selects instead of an OR is the workaround they recommend in that article.