We have been using User-Defined Table Types to pass lists of integers to our stored procedures.
We then join these to other tables in the stored procedure queries.
For example:
    CREATE PROCEDURE [dbo].[sp_Name]
    (
        @Ids [dbo].[OurTableType] READONLY
    )
    AS
    SET NOCOUNT ON;
    SELECT
        *
    FROM
        SOMETABLE
        INNER JOIN @Ids [OurTableType] ON [OurTableType].Id = SOMETABLE.Id;
We have seen very poor performance from this with larger datasets.
One approach we've used to speed things up is to dump the contents into a temp table and join against that instead.
For example:
    CREATE PROCEDURE [dbo].[sp_Name]
    (
        @Ids [dbo].[OurTableType] READONLY
    )
    AS
    SET NOCOUNT ON;
    CREATE TABLE #TempTable (Id INT);
    INSERT INTO #TempTable
    SELECT Id FROM @Ids;
    SELECT
        *
    FROM
        SOMETABLE
        INNER JOIN #TempTable ON #TempTable.Id = SOMETABLE.Id;
    DROP TABLE #TempTable;
This does improve performance significantly, but I wanted to get some opinions on this approach and on any consequences we haven't considered. An explanation of why this improves performance would also be useful.
N.B. Sometimes we may need to pass in more than just an integer, which is why we don't use a comma-separated list or something similar.
1 Answer
This topic has been discussed before. The primary reason for the poor performance of the JOIN is that the Table-Valued Parameter (TVP) is a Table Variable. Table Variables do not keep statistics and appear to the Query Optimizer to have only 1 row. Hence they are fine for something like INSERT INTO Table (column_list) SELECT column_list FROM @TVP; but not for a JOIN.
There are a few things you can try to get around this:
- Dump everything to a local temporary table (you are already doing this). A technical downside here is that you are duplicating the data passed into the TVP in tempdb (where both the TVP and the temp table store their data).
- Maybe try defining the User-Defined Table Type to have a Clustered Primary Key. You can do this inline on the [Id] field:

    [ID] INT NOT NULL PRIMARY KEY

Not sure how much this helps performance, but it is worth a try.
- You could try adding OPTION (RECOMPILE) to the query. This is a way of getting the Query Optimizer to see how many rows are in a Table Variable so that it can produce proper estimates.

    SELECT column_list
    FROM SOMETABLE
    INNER JOIN @Ids [OurTableType] ON [OurTableType].Id = SOMETABLE.Id
    OPTION (RECOMPILE);

The downside here is the RECOMPILE itself, which takes additional time each time this proc is called. But it might still be an overall net gain.
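As a rough sketch of how the last two suggestions might be combined, the script below defines a table type with a clustered primary key and a procedure that joins to it with OPTION (RECOMPILE). The names OurTableTypeKeyed and usp_GetByIds are hypothetical and chosen only for illustration; SOMETABLE and its Id column are taken from the question. Note that a primary key on the type means an insert into the TVP fails if the list contains duplicate Ids.

```sql
-- Hypothetical type name; same shape as the question's type, plus a clustered primary key.
CREATE TYPE [dbo].[OurTableTypeKeyed] AS TABLE
(
    [Id] INT NOT NULL PRIMARY KEY CLUSTERED
);
GO

-- Hypothetical procedure name; SOMETABLE is the table from the question.
CREATE PROCEDURE [dbo].[usp_GetByIds]
(
    @Ids [dbo].[OurTableTypeKeyed] READONLY
)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT st.Id              -- list the columns you actually need here
    FROM SOMETABLE AS st
    INNER JOIN @Ids AS ids
        ON ids.Id = st.Id
    OPTION (RECOMPILE);       -- recompile this statement on every execution
END;
GO

-- Example call: fill a variable of the table type and pass it in.
DECLARE @list [dbo].[OurTableTypeKeyed];
INSERT INTO @list (Id) VALUES (1), (2), (3);
EXEC [dbo].[usp_GetByIds] @Ids = @list;
```

Whether this beats the temp table approach depends on the data, so it is worth comparing the estimated and actual row counts in the execution plans for both versions.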
PS. Don't do SELECT *. Always specify a column list, unless you are doing something like IF EXISTS(SELECT * FROM)...
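To illustrate the exception mentioned in the PS, here is a minimal sketch of an existence check. EXISTS only tests whether the subquery returns any row, so the select list inside it does not matter. It assumes the [dbo].[OurTableType] type and SOMETABLE from the question, each with an Id column.

```sql
-- Assumes the table type and SOMETABLE from the question exist.
DECLARE @Ids [dbo].[OurTableType];
INSERT INTO @Ids (Id) VALUES (1), (2);

-- SELECT * is harmless here: EXISTS only checks whether any row comes back.
IF EXISTS (SELECT * FROM SOMETABLE AS st INNER JOIN @Ids AS ids ON ids.Id = st.Id)
BEGIN
    PRINT 'At least one matching row exists.';
END;
```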