Speeding up a SQL query using indexes

Time: 2022-09-20 14:24:44

I have a table called Products.

This table contains over 3 million entries. Every day approximately 5,000 new entries are added, all within a 2-minute window during the night.

But this table gets queried perhaps more than 20,000 times every night with this query:

SELECT Price 
FROM Products 
WHERE Code = @code 
  AND Company = @company 
  AND CreatedDate = @createdDate

Table structure:

Code          nvarchar(50)
Company       nvarchar(10)
CreatedDate   datetime

I can see that this query takes about a second to return a result from the Products table.

There is no productId column in the table as it is not needed. So there is no primary key in the table.

I would like to somehow improve this query to return the result faster.

I have never used indexes before. What would be the best way to use indexes on this table?

If I add a primary key, do you think it would speed up the query? Keep in mind that I will still have to query the table by providing 3 parameters, as in:

WHERE Code = @code 
  AND Company = @company 
  AND CreatedDate = @createdDate

This is mandatory.

As I mentioned, the table gets its new entries within a 2-minute window every night. How would this affect the indexes?

If I use indexes, which column would be best to index, and should I use a clustered or a non-clustered index?

1 solution

#1


9  

The best thing to do would depend on what other fields the table has and what other queries run against that table.

Without more details, a non-clustered index on (code, company, createddate) that includes the "price" column will almost certainly improve the performance of this query.

CREATE NONCLUSTERED INDEX IX_code_company_createddate
ON Products(code, company, createddate)
INCLUDE (price);

That's because, with that index in place, SQL Server does not need to access the actual table at all when running the query. It can seek directly to the rows with a given (code, company, createddate), since those fields form the index key, and the index also stores the "price" value for each of those rows.
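
One way to check this, as a rough sketch: run the query with I/O and timing statistics turned on, once before and once after creating the index, and compare the numbers. The parameter values below are made up for illustration; the declared types follow the table structure in the question.

```sql
-- Hypothetical parameter values; substitute ones that exist in your data.
DECLARE @code        nvarchar(50) = N'ABC123';
DECLARE @company     nvarchar(10) = N'ACME';
DECLARE @createdDate datetime     = '2022-09-19T00:00:00';

SET STATISTICS IO ON;    -- report logical reads per table
SET STATISTICS TIME ON;  -- report CPU and elapsed time

SELECT Price
FROM Products
WHERE Code = @code
  AND Company = @company
  AND CreatedDate = @createdDate;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

With the index in place I would expect the actual execution plan to show an Index Seek on IX_code_company_createddate with no lookup into the table, and the logical reads to drop from a scan of the whole table to a handful of pages.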

Regarding the inserts: for each row added, SQL Server also has to add an entry to the index, so insert performance will be affected. I think you should expect the gains in SELECT performance to outweigh the impact on the inserts, but you should test that.
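
A minimal way to test that, ideally on a copy of the table rather than the live one, is to time a batch of roughly the nightly size with and without the index. This sketch assumes the table has only the four columns mentioned in the question (or that any others allow NULL or have defaults); the inserted rows are just copies of existing ones with the date shifted, and the transaction is rolled back so nothing is kept.

```sql
-- Rough timing of a nightly-sized batch; the data is purely illustrative.
SET STATISTICS TIME ON;

BEGIN TRANSACTION;

-- Copy 5000 existing rows, moving CreatedDate forward one day.
INSERT INTO Products (Code, Company, CreatedDate, Price)
SELECT TOP (5000) Code, Company, DATEADD(DAY, 1, CreatedDate), Price
FROM Products;

ROLLBACK TRANSACTION;  -- undo the test rows

SET STATISTICS TIME OFF;
```

Run it once before creating the index and once after, and compare the elapsed times reported for the INSERT.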

Also, you will use more disk space: the index stores its own copy of those fields for each row, in addition to the space used by the table itself.
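
To see how much space that actually is, something like the following should work once the index exists. It assumes the table lives in the dbo schema, which the question does not state.

```sql
-- Space used per index on the table (pages are 8 KB each).
-- The row with index_id 0 (HEAP) or 1 (CLUSTERED) is the table itself.
SELECT i.index_id,
       i.name      AS index_name,
       i.type_desc AS index_type,
       SUM(ps.used_page_count) * 8 / 1024.0 AS used_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID(N'dbo.Products')  -- assumed schema name
GROUP BY i.index_id, i.name, i.type_desc
ORDER BY i.index_id;
```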

As others have noted in the comments, adding a PK to your table (even if that means adding a ProductId column you don't actually need) might be a good idea as well.
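
If you go that way, a sketch of one option is an identity column used as a clustered primary key. The constraint name PK_Products and the dbo schema are my own choices here, not something from the question.

```sql
-- Adds a surrogate key and makes it the clustered primary key.
-- Existing rows get their ProductId values assigned automatically.
ALTER TABLE dbo.Products
ADD ProductId int IDENTITY(1, 1) NOT NULL
    CONSTRAINT PK_Products PRIMARY KEY CLUSTERED;
```

On a table with 3 million rows this rewrites the whole table, so run it outside the nightly load and query window. The lookup by (code, company, createddate) would still rely on the non-clustered index above.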
