Most efficient way to select with hierarchical queries - SQL Server 2014

Suppose I have a table with a gazillion rows and 5 columns: State (A), City (B), Category (C), Subcategory (D), and ID (E). I will hit it with a gazillion queries like SELECT ID, with WHERE clauses that can contain the following, connected with ANDs:

A
A,B
A,C
A,B,C
A,C,D
A,B,C,D
C
C,D

In other words, it will only include B if it includes A, and only include D if it includes C, due to the nature of the hierarchy of the columns. Each of these would return a list of IDs, which may be many rows.

Would the following technique be beneficial?

Create two tables, one (X) with a compound clustered index on (A,B), and the other (Y) with a compound clustered index on (C,D)

Take the part of my query on {A,B} (if any) and hit that against X; take the part of my query against {C,D} (if any) and hit that against Y.

If I hit both tables (i.e. the query included both parts of {A,B} and {C,D}), then intersect both tables on ID.
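
A minimal sketch of the proposal above, for illustration only. The table names X and Y come from the question; the column types, index names, and literal values are assumptions. Note that INTERSECT also removes duplicate IDs, which may or may not be what is wanted.

```sql
-- Hypothetical types and index names; only the column roles (A..E) come from the question.
CREATE TABLE dbo.X (
    State varchar(50) NOT NULL,  -- A
    City  varchar(50) NOT NULL,  -- B
    ID    int         NOT NULL   -- E
);
CREATE CLUSTERED INDEX CIX_X_State_City ON dbo.X (State, City);

CREATE TABLE dbo.Y (
    Category    varchar(50) NOT NULL,  -- C
    Subcategory varchar(50) NOT NULL,  -- D
    ID          int         NOT NULL   -- E
);
CREATE CLUSTERED INDEX CIX_Y_Category_Subcategory ON dbo.Y (Category, Subcategory);

-- A query with both parts: filter each table separately, then intersect on ID.
SELECT ID FROM dbo.X WHERE State = 'ValueA' AND City = 'ValueB'
INTERSECT
SELECT ID FROM dbo.Y WHERE Category = 'ValueC' AND Subcategory = 'ValueD';
```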

Would this be more efficient than just running the full query against the entire table? Should I also create a secondary non-clustered index on ID for X and Y?

1 solution

#1

If you create an index on A,B,C,D, then it will help with these queries:

A
A,B
A,B,C
A,B,C,D

It will somewhat help with these queries (the index can seek on A, but C and D then have to be checked on every row in that range):

A,C
A,C,D

It will not help with these queries:

C
C,D
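
To make the three cases concrete, here is a rough sketch. The table name dbo.Items, the index name, and the literal values are assumptions; the comments describe what the optimizer can do with this index, not what it is guaranteed to choose.

```sql
-- dbo.Items is a hypothetical table with columns
-- State (A), City (B), Category (C), Subcategory (D), ID (E).
CREATE NONCLUSTERED INDEX IX_Items_State_City_Category_Subcategory
    ON dbo.Items (State, City, Category, Subcategory);

-- A,B: both filters are on the leading key columns, so an index seek is possible.
SELECT ID FROM dbo.Items WHERE State = 'ValueA' AND City = 'ValueB';

-- A,C: the seek can use State only; Category is checked on each row in that range.
SELECT ID FROM dbo.Items WHERE State = 'ValueA' AND Category = 'ValueC';

-- C: there is no filter on the leading column, so no seek on this index is possible.
SELECT ID FROM dbo.Items WHERE Category = 'ValueC';
```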

If you create a second index on C,D, it will help with these queries:

C
C,D

If you create a third index on A,C,D, it will help with these queries:

A
A,C
A,C,D

So, I would keep everything in one table and have three indexes on it. As for which index to make clustered, it's hard to say. I would start with a simple index on ID, which would also be the primary key.
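
A sketch of that layout, assuming the index on ID is made the clustered primary key. The table name, column types, and index names are hypothetical. With a clustered key on ID, SQL Server stores ID in every non-clustered index automatically, so SELECT ID can be answered from these indexes without an explicit INCLUDE.

```sql
-- Hypothetical table and index names; column types are assumptions.
CREATE TABLE dbo.Items (
    ID          int         NOT NULL,  -- E
    State       varchar(50) NOT NULL,  -- A
    City        varchar(50) NOT NULL,  -- B
    Category    varchar(50) NOT NULL,  -- C
    Subcategory varchar(50) NOT NULL,  -- D
    CONSTRAINT PK_Items PRIMARY KEY CLUSTERED (ID)
);

-- Index 1: serves A / A,B / A,B,C / A,B,C,D
CREATE NONCLUSTERED INDEX IX_Items_State_City_Category_Subcategory
    ON dbo.Items (State, City, Category, Subcategory);

-- Index 2: serves C / C,D
CREATE NONCLUSTERED INDEX IX_Items_Category_Subcategory
    ON dbo.Items (Category, Subcategory);

-- Index 3: serves A,C / A,C,D
CREATE NONCLUSTERED INDEX IX_Items_State_Category_Subcategory
    ON dbo.Items (State, Category, Subcategory);
```

The trade-off is that every insert, update, and delete has to maintain all three indexes, so this suits a read-heavy workload best.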

Everything above assumes that your queries are using all these columns with "equality" filter, like this:

WHERE A = 'ValueA' AND B = 'ValueB' AND C = 'ValueC'

If some of the filters are not =, but > or >=, then the indexes may be less effective. It depends on the actual filter expression.
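
For example, here is a query with a range filter in the middle, assuming the same hypothetical dbo.Items table with an index keyed on (State, City, Category, Subcategory). This is only a sketch of the general behaviour; the actual plan depends on the data and statistics.

```sql
-- Equality on State, range on City: the seek can use State plus the City range,
-- but Category comes after the range column in the key, so it can no longer
-- narrow the seek and is checked on each row the seek returns.
SELECT ID
FROM dbo.Items
WHERE State = 'ValueA'
  AND City >= 'ValueB'
  AND Category = 'ValueC';
```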
