Performance hit of using 'where id IN [table]' when [table] has only one row?

Time: 2021-09-23 19:17:36

I have a stored procedure that takes a comma-delimited string of IDs. I split them, put them into a temporary table, and pull records from another table using where id IN [table].

Is it OK to use this same procedure when only one ID is passed in for the parameter? I could write a second stored procedure that does exactly the same thing but uses where id = @id instead.

I have MANY stored procedures where multiple IDs or just one could be passed in. Should I reuse the existing procedures or write new ones? Is the performance hit significant?

6 answers

#1


You might like to try a JOIN instead of WHERE id IN, although I think you will get the same query plan.

So I assume you are doing:

SELECT Col1, Col2, ...
FROM MyTable
WHERE id IN (SELECT id FROM @MyTempTable)

in which case the equivalent JOIN syntax would be:

SELECT T1.Col1, T1.Col2, ...
FROM MyTable AS T1
    JOIN @MyTempTable AS T2
        ON T2.id = T1.id

In the second case, whether there is one row or many, it will be very efficient provided [id] is indexed (I am assuming it is the PK on your table and uses a clustered index).

(Beware that if you have duplicate IDs in @MyTempTable, the JOIN will return duplicate rows from MyTable as well, which IN does not.)
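
To make the duplicate caveat concrete, here is a minimal sketch. The table variables @MyTable and @Ids and their contents are made up for illustration; @MyTable merely stands in for the real MyTable.

```sql
-- Hypothetical stand-ins for MyTable and the ID list (not from the original question)
DECLARE @MyTable TABLE (id int NOT NULL PRIMARY KEY, Col1 varchar(10) NOT NULL);
DECLARE @Ids     TABLE (id int NOT NULL);   -- no PK, so duplicates are allowed

INSERT INTO @MyTable (id, Col1) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @Ids (id) VALUES (2), (2), (3); -- note the duplicate 2

-- IN is a semi-join: each matching row of @MyTable is returned once (2 rows)
SELECT T1.id, T1.Col1
FROM @MyTable AS T1
WHERE T1.id IN (SELECT id FROM @Ids);

-- JOIN returns one row per matching pair: id 2 comes back twice (3 rows)
SELECT T1.id, T1.Col1
FROM @MyTable AS T1
    JOIN @Ids AS T2
        ON T2.id = T1.id;

-- De-duplicating the ID list first makes the JOIN safe again (2 rows)
SELECT T1.id, T1.Col1
FROM @MyTable AS T1
    JOIN (SELECT DISTINCT id FROM @Ids) AS T2
        ON T2.id = T1.id;
```

Declaring [id] as the primary key of the ID table rules duplicates out altogether, but then the INSERT fails with a key violation unless the list is de-duplicated first.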

For best performance it would be worth explicitly declaring [id] as the PK on your temporary table (but given that it only holds a few rows, it probably won't make much difference):

DECLARE @MyTempTable TABLE
(
    id int NOT NULL PRIMARY KEY
)

#2


I wouldn't worry about the performance hit of IN with only one item until I had observed a performance problem with it. The query optimizer is smart and may very well handle the one-item IN efficiently, but even if it doesn't, your routines will probably be slower elsewhere.

I would look at the performance of the string parsing, the temp table creation, and the insertion into the temp table. Making those as fast as possible will have a bigger effect on overall performance than whether you use IN or = for the one-item case.
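
As one possible way to keep the parsing step cheap, the sketch below uses the built-in STRING_SPLIT function. This assumes SQL Server 2016 or later (database compatibility level 130 or higher); on older versions you would need your own split function. The variable @ids, the temp table #Ids, and the sample value are made up for illustration.

```sql
-- Assumption: SQL Server 2016+; @ids stands in for the procedure's parameter
DECLARE @ids varchar(8000) = '3,17,42';

CREATE TABLE #Ids (id int NOT NULL PRIMARY KEY);

-- Split, trim, convert, and de-duplicate in one set-based statement
INSERT INTO #Ids (id)
SELECT DISTINCT TRY_CAST(LTRIM(RTRIM(value)) AS int)
FROM STRING_SPLIT(@ids, ',')
WHERE LTRIM(RTRIM(value)) <> ''                         -- skip empty pieces, which would otherwise cast to 0
  AND TRY_CAST(LTRIM(RTRIM(value)) AS int) IS NOT NULL; -- skip pieces that are not integers

SELECT id FROM #Ids;

DROP TABLE #Ids;
```

Whether this is actually faster than an existing split routine is something to measure on your own data rather than assume.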

#3


You could use the same procedure, but use a conditional statement to determine whether to use the IN clause.

There is a performance hit with IN; the execution plan should detail this for you.

#4


As rde6173 says, perform a COUNT on the temporary table to determine which SELECT query to use.
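
A minimal sketch of that idea follows. The temp tables #your_table and #Ids and their contents are hypothetical stand-ins for the real table and the already-parsed ID list.

```sql
-- Hypothetical stand-ins: #your_table for the real table, #Ids for the parsed ID list
CREATE TABLE #your_table (id int NOT NULL PRIMARY KEY, Col1 varchar(10) NOT NULL);
CREATE TABLE #Ids (id int NOT NULL PRIMARY KEY);

INSERT INTO #your_table (id, Col1) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO #Ids (id) VALUES (2);

IF (SELECT COUNT(*) FROM #Ids) = 1
BEGIN
    -- exactly one ID: copy it to a scalar and use a plain equality predicate
    DECLARE @single_id int = (SELECT id FROM #Ids);

    SELECT id, Col1
    FROM #your_table
    WHERE id = @single_id;
END
ELSE
BEGIN
    -- zero or many IDs: fall back to the set-based form
    SELECT id, Col1
    FROM #your_table
    WHERE id IN (SELECT id FROM #Ids);
END

DROP TABLE #Ids;
DROP TABLE #your_table;
```

By the time the COUNT runs, the splitting and inserting have already been paid for, so the branch only saves whatever difference exists between the two SELECTs. It is worth comparing the execution plans before adopting it.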

#5


Since you've specified that it's a comma-delimited list, you can do something like this in your sproc:

IF (CHARINDEX(',', @id) = 0)
BEGIN
    -- the @id parameter contains a single value
    SELECT *
    FROM your_table
    WHERE id = @id  -- maybe need to cast @id if the column isn't a string
END
ELSE
BEGIN
    -- the @id parameter contains a comma-delimited list
    -- only perform the expensive splitting logic at this point
    -- e.g. INSERT INTO @yourTempTable (id) SELECT id FROM dbo.SplitCommaDelimitedIDsIntoTable(@id)
    SELECT *
    FROM your_table
    WHERE id IN (SELECT id FROM @yourTempTable)
END

#6


When you create a temporary table (not a table variable), it has statistics. As such, the optimizer will determine the best plan, and the best plan for one ID might be the same as for 10 IDs, but for 50K IDs it may choose a different plan. So, I would not try to optimize it further unless you have performance concerns.
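
A rough sketch of the temp-table variant, for comparison with a table variable. #MyTable is a hypothetical stand-in for the real table and #Ids is a made-up name for the ID list.

```sql
-- Hypothetical stand-in for the real table
CREATE TABLE #MyTable (id int NOT NULL PRIMARY KEY, Col1 varchar(10) NOT NULL);
INSERT INTO #MyTable (id, Col1) VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Temp table (# prefix) rather than a table variable (@ prefix):
-- SQL Server can maintain statistics on it
CREATE TABLE #Ids (id int NOT NULL PRIMARY KEY);
INSERT INTO #Ids (id) VALUES (2);   -- one ID here; could equally be thousands

-- Same statement for one ID or many; the optimizer picks the plan
SELECT T1.id, T1.Col1
FROM #MyTable AS T1
WHERE T1.id IN (SELECT id FROM #Ids);

DROP TABLE #Ids;
DROP TABLE #MyTable;
```

If you stay with a table variable, adding OPTION (RECOMPILE) to the SELECT is a commonly used way to let the optimizer see its actual row count, at the cost of compiling the statement on every call.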
