Suppose I have a database with a gazillion rows and 5 fields: State (A), City (B), Category (C), Subcategory (D), and ID (E). I will hit it with a gazillion queries like SELECT ID, with WHERE clauses that can contain the following, connected with ANDs:
A
A,B
A,C
A,B,C
A,C,D
A,B,C,D
C
C,D
In other words, it will only include B if it includes A, and only include D if it includes C, due to the nature of the hierarchy of the columns. Each of these would return a list of IDs, which may be many rows.
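The eight filter shapes above can all come from one small query builder. Here is a minimal sketch in Python, assuming a hypothetical table called items with columns A, B, C, D and ID. The function name build_query, the table name and the sample values are made up for illustration and are not part of the question.

```python
def build_query(filters, table="items"):
    """Build a parameterized SELECT for the allowed filter combinations.

    `filters` maps column names (A, B, C, D) to the values to match.
    The table name `items` is a hypothetical placeholder.
    """
    allowed = ("A", "B", "C", "D")
    unknown = set(filters) - set(allowed)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    if not filters:
        raise ValueError("at least one of A or C is required")
    # Hierarchy rule from the question: B needs A, and D needs C.
    if "B" in filters and "A" not in filters:
        raise ValueError("B may only be used together with A")
    if "D" in filters and "C" not in filters:
        raise ValueError("D may only be used together with C")
    cols = [c for c in allowed if c in filters]  # keep a stable A,B,C,D order
    where = " AND ".join(f"{c} = ?" for c in cols)
    params = [filters[c] for c in cols]
    return f"SELECT ID FROM {table} WHERE {where}", params


# Sample values are invented; this is the "A,C" shape.
sql, params = build_query({"C": "Food", "A": "TX"})
print(sql)
print(params)
```

Emitting the columns in a fixed order keeps the number of distinct statement texts down to the eight shapes, which can help with plan caching on engines that cache by statement text.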
Would the following technique be beneficial?
- Create two tables, one (X) with a compound clustered index on (A,B), and the other (Y) with a compound clustered index on (C,D).
- Take the part of my query on {A,B} (if any) and hit that against X; take the part of my query on {C,D} (if any) and hit that against Y.
- If I hit both tables (i.e. the query included both parts of {A,B} and {C,D}), then intersect both tables on ID.
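To make the three steps concrete, here is a rough sketch using Python's sqlite3 module as a stand-in for the real database. Some assumptions: SQLite's WITHOUT ROWID tables play the role of the clustered indexes, ID is appended to each key so the key is unique, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- WITHOUT ROWID stores the table in primary-key order, which is
    -- SQLite's closest equivalent to a clustered index.
    -- ID is appended so the key is unique (an assumption of this sketch).
    CREATE TABLE X (A TEXT, B TEXT, ID INTEGER, PRIMARY KEY (A, B, ID)) WITHOUT ROWID;
    CREATE TABLE Y (C TEXT, D TEXT, ID INTEGER, PRIMARY KEY (C, D, ID)) WITHOUT ROWID;
""")

rows = [  # (ID, A, B, C, D); made-up sample data
    (1, "TX", "Austin", "Food", "Pizza"),
    (2, "TX", "Dallas", "Food", "Sushi"),
    (3, "TX", "Austin", "Retail", "Shoes"),
    (4, "CA", "Fresno", "Food", "Pizza"),
]
conn.executemany("INSERT INTO X (ID, A, B) VALUES (?, ?, ?)",
                 [(r[0], r[1], r[2]) for r in rows])
conn.executemany("INSERT INTO Y (ID, C, D) VALUES (?, ?, ?)",
                 [(r[0], r[3], r[4]) for r in rows])

# Query shape "A,C": one lookup per table, then intersect on ID.
ids = [r[0] for r in conn.execute(
    "SELECT ID FROM X WHERE A = ? "
    "INTERSECT "
    "SELECT ID FROM Y WHERE C = ? "
    "ORDER BY ID",
    ("TX", "Food"),
)]
print(ids)
```

Each side of the INTERSECT can return many IDs before the intersection shrinks the result, and that is the cost to weigh against a single-table query.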
Would this be more efficient than just doing the full query against the entire table? Should I also make a secondary non-clustered index for ID on X and Y?
1 Answer
If you create an index on A,B,C,D, then it will help with these queries:
A
A,B
A,B,C
A,B,C,D
It will somewhat help with these queries, since the index can seek on A but then has to check C and D row by row within that range:
A,C
A,C,D
It will not help with these queries:
C
C,D
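The three groups above follow from the leading-prefix rule for composite indexes: an equality seek can only use index columns from the left, up to the first column the query does not filter on. Here is a small Python model of that rule. It is only a sketch: the function name index_support and the labels "full", "partial" and "none" are mine, and a real optimizer weighs more than this.

```python
def index_support(index_cols, filter_cols):
    """Classify how well a composite index serves a set of equality filters.

    Returns "full" if every filtered column is in the seekable leading
    prefix of the index, "partial" if only some are, and "none" if the
    first index column is not filtered at all.
    """
    prefix = []
    for col in index_cols:
        if col not in filter_cols:
            break  # a gap ends the seekable prefix
        prefix.append(col)
    if not prefix:
        return "none"
    return "full" if len(prefix) == len(filter_cols) else "partial"


index = ["A", "B", "C", "D"]
shapes = ["A", "AB", "AC", "ABC", "ACD", "ABCD", "C", "CD"]
for shape in shapes:
    # set("AC") is {"A", "C"}: one single-letter column per character.
    print(shape, index_support(index, set(shape)))
```

Calling it with ["C", "D"] or ["A", "C", "D"] as the index shows which extra shapes a second or third index would cover.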
If you create a second index on C,D, it will help with these queries:
C
C,D
If you create a third index on A,C,D, it will help with these queries:
A
A,C
A,C,D
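One way to check which index a given query shape picks up is to ask the engine for its plan. The sketch below does that with Python's sqlite3 module and EXPLAIN QUERY PLAN. SQLite is only a stand-in here; the table name items and the index names are made up, and another engine (or the same one with different statistics) may choose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (ID INTEGER PRIMARY KEY, A TEXT, B TEXT, C TEXT, D TEXT);
    CREATE INDEX idx_abcd ON items (A, B, C, D);
    CREATE INDEX idx_cd   ON items (C, D);
    CREATE INDEX idx_acd  ON items (A, C, D);
""")


def plan(where, params):
    """Return SQLite's plan description for SELECT ID ... WHERE <where>."""
    sql = f"EXPLAIN QUERY PLAN SELECT ID FROM items WHERE {where}"
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in conn.execute(sql, params))


plans = {
    "A,B":   plan("A = ? AND B = ?", ("TX", "Austin")),
    "C,D":   plan("C = ? AND D = ?", ("Food", "Pizza")),
    "A,C,D": plan("A = ? AND C = ? AND D = ?", ("TX", "Food", "Pizza")),
}
for shape, detail in plans.items():
    print(shape, "->", detail)
```

On the real database the equivalent would be that engine's own plan output, checked against realistic data volumes.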
So, I would keep everything in one table and have three indexes on it. As for which index to make clustered, it's hard to say. I'd start with a simple index on ID, which would be the primary key as well.
Everything above assumes that your queries filter all of these columns with an "equality" filter, like this:
WHERE A='ValueA' AND B='ValueB' AND C='ValueC'
If some of the filters are not =, but > or >=, then the indexes may not be as effective. It depends on the actual filter expression.
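The difference shows up if you compare plans for an equality filter and a range filter on the leading column. Below is a minimal sketch with Python's sqlite3 module as a stand-in engine; the table items and index idx_ab are hypothetical. In SQLite the text in parentheses at the end of a plan line lists the conditions used for the index seek, and other engines report this in their own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (ID INTEGER PRIMARY KEY, A TEXT, B TEXT);
    CREATE INDEX idx_ab ON items (A, B);
""")


def plan(where, params):
    """Return SQLite's plan description for SELECT ID ... WHERE <where>."""
    sql = f"EXPLAIN QUERY PLAN SELECT ID FROM items WHERE {where}"
    return " | ".join(row[-1] for row in conn.execute(sql, params))


# Equality on both columns: both can take part in the index seek.
equality_plan = plan("A = ? AND B = ?", ("TX", "Austin"))
# Range on the leading column: the seek stops at A, and B is checked afterwards.
range_plan = plan("A > ? AND B = ?", ("TX", "Austin"))
print(equality_plan)
print(range_plan)
```

In the second plan the index is still used, but only the condition on A narrows the range that has to be read.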