在sql server中什么时候使用表变量vs临时表?

时间:2021-08-01 00:13:24

I'm learning more details in table variable. It says that temp tables are always on disk, and table variables are in memory, that is to say, the performance of table variable is better than temp table because table variable uses less IO operations than temp table.

我正在学习表变量的更多细节。它说临时表总是在磁盘上,表变量在内存中,也就是说,表变量的性能比临时表好,因为表变量使用的IO操作比临时表少。

But sometimes, if there are too many records in a table variable that can not be contained in memory, the table variable will be put on disk like the temp table.

但有时,如果表变量中有太多的记录不能包含在内存中,那么表变量将像临时表一样放在磁盘上。

But I don't know what the "too many records" is. 100,000 records? or 1000,000 records? How can I know if a table variable I'm using is in memory or is on disk? Is there any function or tool in SQL Server 2005 to measure the scale of the table variable or letting me know when the table variable is put on disk from memory?

但我不知道“太多的唱片”是什么。100000条记录吗?或1000,000条记录?我如何知道我正在使用的表变量是在内存中还是在磁盘上?SQL Server 2005中是否有任何函数或工具来度量表变量的规模,或者让我知道何时将表变量从内存放到磁盘上?

6 个解决方案

#1


295  

Your question shows you have succumbed to some of the common misconceptions surrounding table variables and temporary tables.

您的问题表明您已经屈从于围绕表变量和临时表的一些常见误解。

I have written quite an extensive answer on the DBA site looking at the differences between the two object types. This also addresses your question about disc vs memory (I didn't see any significant difference in behaviour between the two).

我在DBA站点上写了相当详细的回答,讨论了这两种对象类型之间的差异。这也解决了你关于磁盘与内存的问题(我没有发现两者之间的行为有任何显著差异)。

Regarding the question in the title though as to when to use a table variable vs a local temporary table you don't always have a choice. In functions, for example, it is only possible to use a table variable and if you need to write to the table in a child scope then only a #temp table will do (table-valued parameters allow readonly access).

关于标题中的问题,当使用表变量与本地临时表时,您并不总是可以选择。例如,在函数中,只能使用一个表变量,如果需要在子范围中写入表,那么只有一个#temp表可以这样做(表值参数允许只读访问)。

Where you do have a choice some suggestions are below (though the most reliable method is to simply test both with your specific workload).

在您有选择的地方,以下是一些建议(尽管最可靠的方法是使用特定的工作负载对两者进行测试)。

  1. If you need an index that cannot be created implicitly through a UNIQUE or PRIMARY KEY constraint then you need a #temporary table as it is not possible to create these on table variables. (Examples of such indexes are non-unique ones, filtered indexes or indexes with INCLUDEd columns). NB: SQL Server 2014 will allow non-unique indexes to be declared inline for table variables.
  2. 如果您需要一个不能通过唯一或主键约束隐式创建的索引,那么您需要一个#临时表,因为在表变量上创建这些表是不可能的。(此类索引的例子是非惟一的、过滤的索引或包含列的索引)。NB: SQL Server 2014允许为表变量声明非唯一索引。
  3. If you will be repeatedly adding and deleting large numbers of rows from the table then use a #temporary table. That supports TRUNCATE (which is more efficient than DELETE for large tables) and additionally subsequent inserts following a TRUNCATE can have better performance than those following a DELETE as illustrated here.
  4. 如果您将不断地从表中添加和删除大量的行,那么使用#临时表。它支持truncatetable(对于大型表来说,它比DELETE更有效),此外,在truncatetable后面的后续插入也比在DELETE后面的插入具有更好的性能。
  5. If you will be deleting or updating a large number of rows then the temp table may well perform much better than a table variable - if it is able to use rowset sharing (see "Effects of rowset sharing" below for an example).
  6. 如果要删除或更新大量的行,那么temp表可能比表变量执行得更好——如果它能够使用行集共享(参见下面的“行集共享的影响”示例)。
  7. If the optimal plan using the table will vary dependent on data then use a #temporary table. That supports creation of statistics which allows the plan to be dynamically recompiled according to the data (though for cached temporary tables in stored procedures the recompilation behaviour needs to be understood separately).
  8. 如果使用该表的最优计划会因数据而异,那么使用#临时表。它支持创建统计信息,允许根据数据动态地重新编译计划(尽管对于存储过程中缓存的临时表,重新编译行为需要分别理解)。
  9. If the optimal plan for the query using the table is unlikely to ever change then you may consider a table variable to skip the overhead of statistics creation and recompiles (would possibly require hints to fix the plan you want).
  10. 如果使用该表的查询的最优计划不太可能更改,那么您可以考虑使用一个表变量来跳过创建和重新编译统计信息的开销(可能需要提示来修复您想要的计划)。
  11. If the source for the data inserted to the table is from a potentially expensive SELECT statement then consider that using a table variable will block the possibility of this using a parallel plan.
  12. 如果插入到表中的数据的源来自一个潜在的昂贵的SELECT语句,那么请考虑使用表变量将阻止使用并行计划进行此操作的可能性。
  13. If you need the data in the table to survive a rollback of an outer user transaction then use a table variable. A possible use case for this might be logging the progress of different steps in a long SQL batch.
  14. 如果您需要表中的数据才能在外部用户事务回滚后继续运行,那么请使用表变量。一个可能的用例可能是在长SQL批处理中记录不同步骤的进度。
  15. When using a #temp table within a user transaction locks can be held longer than for table variables (potentially until the end of transaction vs end of statement dependent on the type of lock and isolation level) and also it can prevent truncation of the tempdb transaction log until the user transaction ends. So this might favour the use of table variables.
  16. 当使用#临时表在一个用户事务锁可以举行超过表变量(可能在年底前结束事务和语句依赖于类型的锁和隔离级别),也可以防止tempdb的截断事务日志,直到用户事务结束。这可能有利于使用表变量。
  17. Within stored routines, both table variables and temporary tables can be cached. The metadata maintenance for cached table variables is less than that for #temporary tables. Bob Ward points out in his tempdb presentation that this can cause additional contention on system tables under conditions of high concurrency. Additionally, when dealing with small quantities of data this can make a measurable difference to performance.
  18. 在存储的例程中,可以缓存表变量和临时表。缓存表变量的元数据维护比#临时表的元数据维护要少。Bob Ward在他的tempdb报告中指出,这可能导致在高并发条件下的系统表上出现额外的争用。此外,当处理少量数据时,这可以对性能产生可度量的影响。

Effects of rowset sharing

rowset的分享

DECLARE @T TABLE(id INT PRIMARY KEY, Flag BIT);

CREATE TABLE #T (id INT PRIMARY KEY, Flag BIT);

INSERT INTO @T 
output inserted.* into #T
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY @@SPID), 0
FROM master..spt_values v1, master..spt_values v2

SET STATISTICS TIME ON

/*CPU time = 7016 ms,  elapsed time = 7860 ms.*/
UPDATE @T SET Flag=1;

/*CPU time = 6234 ms,  elapsed time = 7236 ms.*/
DELETE FROM @T

/* CPU time = 828 ms,  elapsed time = 1120 ms.*/
UPDATE #T SET Flag=1;

/*CPU time = 672 ms,  elapsed time = 980 ms.*/
DELETE FROM #T

DROP TABLE #T

#2


56  

Use a table variable if for a very small quantity of data (thousands of bytes)

如果数据量很小(数千字节),则使用表变量

Use a temporary table for a lot of data

对大量数据使用临时表

Another way to think about it: if you think you might benefit from an index, automated statistics, or any SQL optimizer goodness, then your data set is probably too large for a table variable.

另一种考虑方法是:如果您认为可以从索引、自动统计或任何SQL优化器的优点中获益,那么对于表变量来说,您的数据集可能太大了。

In my example, I just wanted to put about 20 rows into a format and modify them as a group, before using them to UPDATE / INSERT a permanent table. So a table variable is perfect.

在我的示例中,我只想将大约20行放入一个格式中,并将它们作为一个组进行修改,然后使用它们更新/插入一个永久表。表变量是完美的。

But I am also running SQL to back-fill thousands of rows at a time, and I can definitely say that the temporary tables perform much better than table variables.

但是,我同时也在运行SQL来填充数千行,我可以肯定地说,临时表的性能要比表变量好得多。

This is not unlike how CTE's are a concern for a similar size reason - if the data in the CTE is very small, I find a CTE performs as good as or better than what the optimizer comes up with, but if it is quite large then it hurts you bad.

这不是与CTE的如何关心一个类似规模的原因——如果CTE中的数据非常小,我发现一个CTE执行一样好或者比优化器提出了,但是如果它是相当大的那么疼你坏。

My understanding is mostly based on http://www.developerfusion.com/article/84397/table-variables-v-temporary-tables-in-sql-server/, which has a lot more detail.

我的理解主要是基于http://www.developerfusion.com/article/84397/table- variabls -v- temporal - tables-insql -server/,它有更多的细节。

#3


28  

Microsoft says here

微软说这里

Table variables does not have distribution statistics, they will not trigger recompiles. Therefore, in many cases, the optimizer will build a query plan on the assumption that the table variable has no rows. For this reason, you should be cautious about using a table variable if you expect a larger number of rows (greater than 100). Temp tables may be a better solution in this case.

表变量没有分布统计信息,它们不会触发重新编译。因此,在许多情况下,优化器将基于表变量没有行这一假设构建查询计划。由于这个原因,如果您希望有更多的行(大于100),那么应该谨慎使用表变量。在这种情况下,临时表可能是更好的解决方案。

#4


12  

I totally agree with Abacus (sorry - don't have enough points to comment).

我完全同意Abacus的观点(对不起——没有足够的观点可以评论)。

Also, keep in mind it doesn't necessarily come down to how many records you have, but the size of your records.

另外,记住这并不一定取决于你有多少条记录,而是你记录的大小。

For instance, have you considered the performance difference between 1,000 records with 50 columns each vs 100,000 records with only 5 columns each?

例如,您是否考虑过1,000条记录(每个记录有50列)与100,000条记录(每个记录只有5列)之间的性能差异?

Lastly, maybe you're querying/storing more data than you need? Here's a good read on SQL optimization strategies. Limit the amount of data you're pulling, especially if you're not using it all (some SQL programmers do get lazy and just select everything even though they only use a tiny subset). Don't forget the SQL query analyzer may also become your best friend.

最后,也许您正在查询/存储比您需要的更多的数据?下面是关于SQL优化策略的详细阅读。限制所拉数据的数量,特别是如果您没有使用所有数据(一些SQL程序员确实会变得懒惰,即使只使用很小的子集,也只选择所有数据)。不要忘记SQL查询分析器也可能成为您最好的朋友。

#5


3  

Variable table is available only to the current session, for example, if you need to EXEC another stored procedure within the current one you will have to pass the table as Table Valued Parameter and of course this will affect the performance, with temporary tables you can do this with only passing the temporary table name

变量表只提供给当前会话,例如,如果您需要执行另一个存储过程在当前你将不得不通过表如表值参数,当然,这将影响性能,与临时表你可以只通过临时表的名字

To test a Temporary table:

测试临时表:

  • Open management studio query editor
  • Open management studio查询编辑器
  • Create a temporary table
  • 创建一个临时表
  • Open another query editor window
  • 打开另一个查询编辑器窗口。
  • Select from this table "Available"
  • 从该表中选择“可用”

To test a Variable table:

测试变量表:

  • Open management studio query editor
  • Open management studio查询编辑器
  • Create a Variable table
  • 表创建一个变量
  • Open another query editor window
  • 打开另一个查询编辑器窗口。
  • Select from this table "Not Available"
  • 从该表中选择“不可用”

Something else I have experienced is: If your schema doesn't have GRANT privilege to create tables then use variable tables. Moreover the temporary tables located in tempdb database and the table variables can be located on both memory and the disk of tempdb database. See here: https://www.linkedin.com/pulse/highlighted-differences-between-sql-server-temporary-tables-andrew

我还经历了另一件事:如果您的模式没有创建表的GRANT权限,那么请使用可变表。此外,位于tempdb数据库中的临时表和表变量可以同时位于tempdb数据库的内存和磁盘上。在这里看到的:https://www.linkedin.com/pulse/highlighted-differences-between-sql-server-temporary-tables-andrew

#6


1  

writing data in tables declared declare @tb and after joining with other tables, I realized that the response time compared to temporary tables tempdb .. # tb is much higher.

在声明为declare @tb的表中编写数据,并与其他表连接后,我意识到与临时表tempdb相比,响应时间要长。# tb高得多。

When I join them with @tb the time is much longer to return the result, unlike #tm, the return is almost instantaneous.

当我加入@tb时,返回结果的时间要长得多,不像#tm,返回几乎是瞬间的。

I did tests with a 10,000 rows join and join with 5 other tables

我使用10,000行连接和5个其他表进行了测试

#1


295  

Your question shows you have succumbed to some of the common misconceptions surrounding table variables and temporary tables.

您的问题表明您已经屈从于围绕表变量和临时表的一些常见误解。

I have written quite an extensive answer on the DBA site looking at the differences between the two object types. This also addresses your question about disc vs memory (I didn't see any significant difference in behaviour between the two).

我在DBA站点上写了相当详细的回答,讨论了这两种对象类型之间的差异。这也解决了你关于磁盘与内存的问题(我没有发现两者之间的行为有任何显著差异)。

Regarding the question in the title though as to when to use a table variable vs a local temporary table you don't always have a choice. In functions, for example, it is only possible to use a table variable and if you need to write to the table in a child scope then only a #temp table will do (table-valued parameters allow readonly access).

关于标题中的问题,当使用表变量与本地临时表时,您并不总是可以选择。例如,在函数中,只能使用一个表变量,如果需要在子范围中写入表,那么只有一个#temp表可以这样做(表值参数允许只读访问)。

Where you do have a choice some suggestions are below (though the most reliable method is to simply test both with your specific workload).

在您有选择的地方,以下是一些建议(尽管最可靠的方法是使用特定的工作负载对两者进行测试)。

  1. If you need an index that cannot be created implicitly through a UNIQUE or PRIMARY KEY constraint then you need a #temporary table as it is not possible to create these on table variables. (Examples of such indexes are non-unique ones, filtered indexes or indexes with INCLUDEd columns). NB: SQL Server 2014 will allow non-unique indexes to be declared inline for table variables.
  2. 如果您需要一个不能通过唯一或主键约束隐式创建的索引,那么您需要一个#临时表,因为在表变量上创建这些表是不可能的。(此类索引的例子是非惟一的、过滤的索引或包含列的索引)。NB: SQL Server 2014允许为表变量声明非唯一索引。
  3. If you will be repeatedly adding and deleting large numbers of rows from the table then use a #temporary table. That supports TRUNCATE (which is more efficient than DELETE for large tables) and additionally subsequent inserts following a TRUNCATE can have better performance than those following a DELETE as illustrated here.
  4. 如果您将不断地从表中添加和删除大量的行,那么使用#临时表。它支持truncatetable(对于大型表来说,它比DELETE更有效),此外,在truncatetable后面的后续插入也比在DELETE后面的插入具有更好的性能。
  5. If you will be deleting or updating a large number of rows then the temp table may well perform much better than a table variable - if it is able to use rowset sharing (see "Effects of rowset sharing" below for an example).
  6. 如果要删除或更新大量的行,那么temp表可能比表变量执行得更好——如果它能够使用行集共享(参见下面的“行集共享的影响”示例)。
  7. If the optimal plan using the table will vary dependent on data then use a #temporary table. That supports creation of statistics which allows the plan to be dynamically recompiled according to the data (though for cached temporary tables in stored procedures the recompilation behaviour needs to be understood separately).
  8. 如果使用该表的最优计划会因数据而异,那么使用#临时表。它支持创建统计信息,允许根据数据动态地重新编译计划(尽管对于存储过程中缓存的临时表,重新编译行为需要分别理解)。
  9. If the optimal plan for the query using the table is unlikely to ever change then you may consider a table variable to skip the overhead of statistics creation and recompiles (would possibly require hints to fix the plan you want).
  10. 如果使用该表的查询的最优计划不太可能更改,那么您可以考虑使用一个表变量来跳过创建和重新编译统计信息的开销(可能需要提示来修复您想要的计划)。
  11. If the source for the data inserted to the table is from a potentially expensive SELECT statement then consider that using a table variable will block the possibility of this using a parallel plan.
  12. 如果插入到表中的数据的源来自一个潜在的昂贵的SELECT语句,那么请考虑使用表变量将阻止使用并行计划进行此操作的可能性。
  13. If you need the data in the table to survive a rollback of an outer user transaction then use a table variable. A possible use case for this might be logging the progress of different steps in a long SQL batch.
  14. 如果您需要表中的数据才能在外部用户事务回滚后继续运行,那么请使用表变量。一个可能的用例可能是在长SQL批处理中记录不同步骤的进度。
  15. When using a #temp table within a user transaction locks can be held longer than for table variables (potentially until the end of transaction vs end of statement dependent on the type of lock and isolation level) and also it can prevent truncation of the tempdb transaction log until the user transaction ends. So this might favour the use of table variables.
  16. 当使用#临时表在一个用户事务锁可以举行超过表变量(可能在年底前结束事务和语句依赖于类型的锁和隔离级别),也可以防止tempdb的截断事务日志,直到用户事务结束。这可能有利于使用表变量。
  17. Within stored routines, both table variables and temporary tables can be cached. The metadata maintenance for cached table variables is less than that for #temporary tables. Bob Ward points out in his tempdb presentation that this can cause additional contention on system tables under conditions of high concurrency. Additionally, when dealing with small quantities of data this can make a measurable difference to performance.
  18. 在存储的例程中,可以缓存表变量和临时表。缓存表变量的元数据维护比#临时表的元数据维护要少。Bob Ward在他的tempdb报告中指出,这可能导致在高并发条件下的系统表上出现额外的争用。此外,当处理少量数据时,这可以对性能产生可度量的影响。

Effects of rowset sharing

rowset的分享

DECLARE @T TABLE(id INT PRIMARY KEY, Flag BIT);

CREATE TABLE #T (id INT PRIMARY KEY, Flag BIT);

INSERT INTO @T 
output inserted.* into #T
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY @@SPID), 0
FROM master..spt_values v1, master..spt_values v2

SET STATISTICS TIME ON

/*CPU time = 7016 ms,  elapsed time = 7860 ms.*/
UPDATE @T SET Flag=1;

/*CPU time = 6234 ms,  elapsed time = 7236 ms.*/
DELETE FROM @T

/* CPU time = 828 ms,  elapsed time = 1120 ms.*/
UPDATE #T SET Flag=1;

/*CPU time = 672 ms,  elapsed time = 980 ms.*/
DELETE FROM #T

DROP TABLE #T

#2


56  

Use a table variable if for a very small quantity of data (thousands of bytes)

如果数据量很小(数千字节),则使用表变量

Use a temporary table for a lot of data

对大量数据使用临时表

Another way to think about it: if you think you might benefit from an index, automated statistics, or any SQL optimizer goodness, then your data set is probably too large for a table variable.

另一种考虑方法是:如果您认为可以从索引、自动统计或任何SQL优化器的优点中获益,那么对于表变量来说,您的数据集可能太大了。

In my example, I just wanted to put about 20 rows into a format and modify them as a group, before using them to UPDATE / INSERT a permanent table. So a table variable is perfect.

在我的示例中,我只想将大约20行放入一个格式中,并将它们作为一个组进行修改,然后使用它们更新/插入一个永久表。表变量是完美的。

But I am also running SQL to back-fill thousands of rows at a time, and I can definitely say that the temporary tables perform much better than table variables.

但是,我同时也在运行SQL来填充数千行,我可以肯定地说,临时表的性能要比表变量好得多。

This is not unlike how CTE's are a concern for a similar size reason - if the data in the CTE is very small, I find a CTE performs as good as or better than what the optimizer comes up with, but if it is quite large then it hurts you bad.

这不是与CTE的如何关心一个类似规模的原因——如果CTE中的数据非常小,我发现一个CTE执行一样好或者比优化器提出了,但是如果它是相当大的那么疼你坏。

My understanding is mostly based on http://www.developerfusion.com/article/84397/table-variables-v-temporary-tables-in-sql-server/, which has a lot more detail.

我的理解主要是基于http://www.developerfusion.com/article/84397/table- variabls -v- temporal - tables-insql -server/,它有更多的细节。

#3


28  

Microsoft says here

微软说这里

Table variables does not have distribution statistics, they will not trigger recompiles. Therefore, in many cases, the optimizer will build a query plan on the assumption that the table variable has no rows. For this reason, you should be cautious about using a table variable if you expect a larger number of rows (greater than 100). Temp tables may be a better solution in this case.

表变量没有分布统计信息,它们不会触发重新编译。因此,在许多情况下,优化器将基于表变量没有行这一假设构建查询计划。由于这个原因,如果您希望有更多的行(大于100),那么应该谨慎使用表变量。在这种情况下,临时表可能是更好的解决方案。

#4


12  

I totally agree with Abacus (sorry - don't have enough points to comment).

我完全同意Abacus的观点(对不起——没有足够的观点可以评论)。

Also, keep in mind it doesn't necessarily come down to how many records you have, but the size of your records.

另外,记住这并不一定取决于你有多少条记录,而是你记录的大小。

For instance, have you considered the performance difference between 1,000 records with 50 columns each vs 100,000 records with only 5 columns each?

例如,您是否考虑过1,000条记录(每个记录有50列)与100,000条记录(每个记录只有5列)之间的性能差异?

Lastly, maybe you're querying/storing more data than you need? Here's a good read on SQL optimization strategies. Limit the amount of data you're pulling, especially if you're not using it all (some SQL programmers do get lazy and just select everything even though they only use a tiny subset). Don't forget the SQL query analyzer may also become your best friend.

最后,也许您正在查询/存储比您需要的更多的数据?下面是关于SQL优化策略的详细阅读。限制所拉数据的数量,特别是如果您没有使用所有数据(一些SQL程序员确实会变得懒惰,即使只使用很小的子集,也只选择所有数据)。不要忘记SQL查询分析器也可能成为您最好的朋友。

#5


3  

Variable table is available only to the current session, for example, if you need to EXEC another stored procedure within the current one you will have to pass the table as Table Valued Parameter and of course this will affect the performance, with temporary tables you can do this with only passing the temporary table name

变量表只提供给当前会话,例如,如果您需要执行另一个存储过程在当前你将不得不通过表如表值参数,当然,这将影响性能,与临时表你可以只通过临时表的名字

To test a Temporary table:

测试临时表:

  • Open management studio query editor
  • Open management studio查询编辑器
  • Create a temporary table
  • 创建一个临时表
  • Open another query editor window
  • 打开另一个查询编辑器窗口。
  • Select from this table "Available"
  • 从该表中选择“可用”

To test a Variable table:

测试变量表:

  • Open management studio query editor
  • Open management studio查询编辑器
  • Create a Variable table
  • 表创建一个变量
  • Open another query editor window
  • 打开另一个查询编辑器窗口。
  • Select from this table "Not Available"
  • 从该表中选择“不可用”

Something else I have experienced is: If your schema doesn't have GRANT privilege to create tables then use variable tables. Moreover the temporary tables located in tempdb database and the table variables can be located on both memory and the disk of tempdb database. See here: https://www.linkedin.com/pulse/highlighted-differences-between-sql-server-temporary-tables-andrew

我还经历了另一件事:如果您的模式没有创建表的GRANT权限,那么请使用可变表。此外,位于tempdb数据库中的临时表和表变量可以同时位于tempdb数据库的内存和磁盘上。在这里看到的:https://www.linkedin.com/pulse/highlighted-differences-between-sql-server-temporary-tables-andrew

#6


1  

writing data in tables declared declare @tb and after joining with other tables, I realized that the response time compared to temporary tables tempdb .. # tb is much higher.

在声明为declare @tb的表中编写数据,并与其他表连接后,我意识到与临时表tempdb相比,响应时间要长。# tb高得多。

When I join them with @tb the time is much longer to return the result, unlike #tm, the return is almost instantaneous.

当我加入@tb时,返回结果的时间要长得多,不像#tm,返回几乎是瞬间的。

I did tests with a 10,000 rows join and join with 5 other tables

我使用10,000行连接和5个其他表进行了测试