Hello all and thanks in advance. I have a view that when queried with no where clause takes just over 0 seconds to return ~8600 rows. However, when I query with a where clause such as:
SELECT * FROM myView WHERE myID = 123
depending on what constant I put in place of 123 the query execution time changes considerably.
Now, "considerably" in this case means the difference between just above 0 seconds and 3 to 4 seconds. But the view is called frequently and repeatedly for certain tasks which makes 3 seconds turn into 30 or more seconds.
While I cannot give the code for the view itself, what I can confirm is that:
- The view is composed of joins between 6 standard tables (no special qualities).
- While there may not always be records in table A that link up with table B, thus creating null columns in the results, I have confirmed that such instances do not consistently produce either the longer or the shorter query times.
- The view itself has no clauses beyond the standard SELECT, FROM, and LEFT OUTER JOIN clauses.
- Certain IDs always result in long query times, and the others always result in short query times.
- I have dropped and re-created the view between queries on the off chance that a sub-optimal execution plan had been cached.
If these known variables are not enough to reduce the possibilities down to 2 or 3 possible causes I would still like to know what THEORETICAL problems might be causing this issue just to expand my understanding.
Thanks Again,
ProtoNoob
1 Solution
#1
I would assume that the statistics for the tables are outdated and no longer match the real content of the tables. The optimizer relies on these statistics, so it might, e.g., assume that a value you use in the WHERE clause hardly occurs in the data and that the result set will be small, while in reality many rows contain it. Or the other way round: relying on the statistics, the optimizer could assume that, say, 20% of the rows of the table have this value, and hence that a full table scan is better than first reading index pages to evaluate the WHERE condition, then jumping to a data page for almost every index entry, and in the end reading nearly all pages anyway. In reality, though, the value may not be in the table at all, so the scan is the wrong plan. Outdated statistics could also make the optimizer access the tables in the wrong order, and so on.
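To check whether the slow IDs are simply the ones with many rows, one could count the rows per ID. This is only a sketch: it reuses the myView and myID names from the question and assumes nothing else about the schema.

```sql
-- Count how many rows the view returns for each ID, largest first.
-- myView and myID are the names used in the question.
SELECT myID,
       COUNT(*) AS row_count
FROM myView
GROUP BY myID
ORDER BY row_count DESC;
```

If the consistently slow IDs sit at the top of this list, the data is skewed, and an optimizer working from stale statistics could easily misjudge how many rows those IDs return.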
One hint pointing to outdated statistics would be a query plan that shows a huge difference between the estimated and the actual number of rows.
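Assuming SQL Server (the question does not say which DBMS is in use), one way to get both numbers is to request the actual execution plan together with the query. A minimal sketch, using the query from the question:

```sql
-- Return the actual execution plan as XML alongside the result set.
SET STATISTICS XML ON;

-- Run this once with a fast ID and once with a slow ID.
SELECT * FROM myView WHERE myID = 123;

SET STATISTICS XML OFF;
```

In the returned plan, compare each operator's estimated row count with its actual row count. In SQL Server Management Studio, the "Include Actual Execution Plan" option shows the same information graphically.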
Which DBMS are you using? If SQL Server, then you can see the current statistics using DBCC SHOW_STATISTICS and refresh the statistics for selected columns and tables using the UPDATE STATISTICS statement. There are more views and procedures around this subject; most of them are linked from one of these two articles.
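As a rough illustration of those two commands on SQL Server: the table name dbo.TableA and the index name IX_TableA_myID below are hypothetical placeholders, since the question does not show the tables behind the view.

```sql
-- When were the statistics on this table last updated, and how many
-- row modifications have accumulated since then?
-- (dbo.TableA is a hypothetical name.)
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.TableA');

-- Show the header, density vector, and histogram of one statistics object.
-- (IX_TableA_myID is a hypothetical index name.)
DBCC SHOW_STATISTICS ('dbo.TableA', IX_TableA_myID);

-- Refresh that one statistics object by scanning every row...
UPDATE STATISTICS dbo.TableA IX_TableA_myID WITH FULLSCAN;

-- ...or refresh all statistics on the table using default sampling.
UPDATE STATISTICS dbo.TableA;
```

Repeating this for each of the six tables behind the view and then re-running one of the slow IDs should show whether stale statistics were the cause.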