Ok. Here's what I try to run:
好。这是我尝试运行的内容:
USE tempdb;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
CROSS JOIN sys.objects s4;
This is one of those "make me a numbers table" queries.
这是“让我成为数字表”查询之一。
Here's the problem. If I run this immediately after the SQL Server service is (re)started, it takes forever. Not forever as in ten seconds and I want it faster. Forever as in, I let it go over two hours once by accident and still had to kill it. I'm thinking it just never ever comes back. And ordinarily it takes under two seconds on my machine to run this.
这是问题所在。如果我在(重新)启动SQL Server服务后立即运行它,它将永远需要。不是十分之一,我希望它更快。永远如此,我让它一次意外地超过两个小时,仍然不得不杀死它。我在想它永远不会回来。通常我的机器需要不到两秒的时间来运行它。
However, if I do this instead:
但是,如果我这样做:
USE tempdb;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3;
DROP TABLE Numbers;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
CROSS JOIN sys.objects s4;
Then it works as you would expect — the first SELECT
runs in under two seconds, as does the second. Why don't I just use the three-table version? Because there aren't enough entries in sys.objects
for that number cubed to equal a million result rows. But that's not even the point anymore.
然后它就像你期望的那样工作 - 第一个SELECT在两秒内运行,第二个也是如此。为什么我不使用三表版本?因为sys.objects中没有足够的条目用于该数字的立方数等于一百万个结果行。但这甚至不再是重点。
Anyway, from here on, I can repeat that second DROP
/ SELECT…INTO
as much as I want, no problem. Somehow that first three-table version made it ok forever. At least, till the next time the service is restarted and/or the machine rebooted. At which point, running that last SELECT
again never comes back. Again.
无论如何,从这里开始,我可以尽可能多地重复第二次DROP / SELECT ... INTO,没问题。不知何故,第一个三桌版本让它永远存在。至少,直到下次重新启动服务和/或重新启动机器。此时,再次运行最后一个SELECT永远不会回来。再次。
Here's where it starts getting even weirder. If I pare that first SELECT
back to a two-table version:
这是它开始变得更加怪异的地方。如果我把第一个SELECT削减回两个表版本:
USE tempdb;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2;
DROP TABLE Numbers;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
CROSS JOIN sys.objects s4;
This also makes the second SELECT
run forever. As does a one-table version. Somehow, that three-table version is magical!
这也使第二个SELECT永远运行。和单表版本一样。不知何故,三表版本是神奇的!
What is going on here? Why is this slow?
这里发生了什么?为什么这么慢?
(And before anyone points out that I'm creating a permanent table in tempdb
, yes, I know. Changing to actual temp tables doesn't make any difference.)
(并且在任何人指出我在tempdb中创建一个永久表之前,是的,我知道。更改为实际临时表没有任何区别。)
Added info:
补充信息:
- This is SQL Server 2012 Developer Edition
- 这是SQL Server 2012 Developer Edition
- Output of
EXEC sp_WhoIsActive @find_block_leaders = 1, @sort_order = '[blocked_session_count] DESC'
(scripted as XML so it can be read here) is: - 输出EXEC sp_WhoIsActive @find_block_leaders = 1,@ sort_order ='[blocked_session_count] DESC'(编写为XML,因此可以在此处阅读)是:
<?xml version="1.0" ?> <RESULTS1> <RECORD> <dd hh:mm:ss.mss>00 00:10:45.066</dd hh:mm:ss.mss> <session_id>52</session_id> <sql_text><?query -- SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number INTO Numbers FROM sys.objects s1 CROSS JOIN sys.objects s2 CROSS JOIN sys.objects s3 CROSS JOIN sys.objects s4; --?></sql_text> <login_name>my own login name redacted</login_name> <wait_info>(99ms)LCK_M_X</wait_info> <CPU> 9,750</CPU> <tempdb_allocations> 713</tempdb_allocations> <tempdb_current> 702</tempdb_current> <blocking_session_id>NULL</blocking_session_id> <blocked_session_count> 0</blocked_session_count> <reads> 583,273</reads> <writes> 537</writes> <physical_reads> 50</physical_reads> <used_memory> 3</used_memory> <status>suspended</status> <open_tran_count> 2</open_tran_count> <percent_complete>NULL</percent_complete> <host_name>my own machine name redacted</host_name> <database_name>tempdb</database_name> <program_name>Microsoft SQL Server Management Studio - Query</program_name> <start_time>2013-11-23 23:48:19.473</start_time> <login_time>2013-11-23 23:47:47.060</login_time> <request_id>0</request_id> <collection_time>2013-11-23 23:59:04.560</collection_time> </RECORD> </RESULTS1>
More added info:
更多信息:
Why I'm putting this in tempdb is that it's part of a script intended to be run on virgin installations, and tempdb is guaranteed to be there. As I said, changing to global temp tables does no different.
为什么我把它放在tempdb中是因为它是一个打算在原始安装上运行的脚本的一部分,并且tempdb保证在那里。正如我所说,更改为全局临时表并没有什么不同。
2 个解决方案
#1
21
I
can
could also reproduce this 100% of the time on my machine. (see note at end)
我也可以在我的机器上100%重现这个。 (见末尾注)
The gist of the problem is that you are taking out S
locks on system table rows in tempdb
that can conflict with the locks needed for internal tempdb
cleanup transactions.
问题的关键在于,您正在对tempdb中的系统表行取出S锁,这可能与内部tempdb清理事务所需的锁冲突。
When this clean up work is allocated to the same session that owns the S
lock an indefinite hang can occur.
当此清理工作分配给拥有S锁的同一会话时,可能会发生无限期挂起。
To avoid this problem for certain you need to stop referencing the system
objects inside tempdb
.
为了避免这个问题,您需要停止引用tempdb内的系统对象。
It is possible to create a numbers table without referencing any external tables at all. The following needs to read no base table rows and thus also takes no locks.
可以在不引用任何外部表的情况下创建数字表。以下需要不读取基表行,因此也不需要锁定。
WITH Ten(N) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM Ten T10,
Ten T100,
Ten T1000,
Ten T10000,
Ten T100000,
Ten T1000000
Steps to Reproduce
First create a procedure
首先创建一个过程
CREATE PROC P
AS
SET NOCOUNT ON;
DECLARE @T TABLE (X INT)
GO
Then restart the SQL Service and in one connection execute
然后重新启动SQL Service并在一个连接中执行
WHILE NOT EXISTS(SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id)
BEGIN
/*This will cause the problematic droptemp transactions*/
EXEC sp_recompile 'P'
EXEC P
END;
SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id
Then in another connection run
然后在另一个连接运行
USE tempdb;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO #T
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
CROSS JOIN sys.objects s4;
DROP TABLE #T
The query populating the Numbers table seems to manage to get into a live lock situation with the internal system transactions that clean up temporary objects such as table variables.
填充Numbers表的查询似乎设法进入实时锁定状态,内部系统事务清理临时对象(如表变量)。
I managed to get session id 53 blocked in this manner. It is blocked indefinitely. The output of sp_WhoIsActive
shows that this spid spends almost all of the time suspended. In consecutive runs the numbers in the reads
column increases but the values in the other columns remain largely the same.
我设法以这种方式阻止会话ID 53。它被无限期阻止。 sp_WhoIsActive的输出显示此spid几乎在所有时间都处于暂停状态。在连续运行中,读取列中的数字会增加,但其他列中的值保持大致相同。
The wait duration doesn't show an increasing pattern though indicating that it must get unblocked periodically before getting blocked again.
等待持续时间并未显示增加的模式,但表示必须在再次被阻止之前定期取消阻止。
SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id
Returns
返回
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
| waiting_task_address | session_id | exec_context_id | wait_duration_ms | wait_type | resource_address | blocking_task_address | blocking_session_id | blocking_exec_context_id | resource_description |
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
| 0x00000002F2C170C8 | 53 | 0 | 86 | LCK_M_X | 0x00000002F9B13040 | 0x00000002F2C170C8 | 53 | NULL | keylock hobtid=281474978938880 dbid=2 id=lock2f9ac8880 mode=U associatedObjectId=281474978938880 |
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
Using the id in the resource description
在资源描述中使用id
SELECT o.name
FROM sys.allocation_units au WITH (NOLOCK)
INNER JOIN sys.partitions p WITH (NOLOCK)
ON au.container_id = p.partition_id
INNER JOIN sys.all_objects o WITH (NOLOCK)
ON o.object_id = p.object_id
WHERE allocation_unit_id = 281474978938880
Returns
返回
+------------+
| name |
+------------+
| sysschobjs |
+------------+
Running
运行
SELECT resource_description,request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 53 AND request_status <> 'GRANT'
Returns
返回
+----------------------+----------------+
| resource_description | request_status |
+----------------------+----------------+
| (246708db8c1f) | CONVERT |
+----------------------+----------------+
Connecting via the DAC and running
通过DAC连接并运行
SELECT id,name
FROM tempdb.sys.sysschobjs WITH (NOLOCK)
WHERE %%LOCKRES%% = '(246708db8c1f)'
Returns
返回
+-------------+-----------+
| id | name |
+-------------+-----------+
| -1578606288 | #A1E86130 |
+-------------+-----------+
Curious about what that is
对这是什么感到好奇
SELECT name,user_type_id
FROM tempdb.sys.columns
WHERE object_id = -1578606288
Returns
返回
+------+--------------+
| name | user_type_id |
+------+--------------+
| X | 56 |
+------+--------------+
This is the column name in the table variable used by the stored proc.
这是存储过程使用的表变量中的列名。
Running
运行
SELECT request_mode,
request_status,
request_session_id,
request_owner_id,
lock_owner_address,
t.transaction_id,
t.name,
t.transaction_begin_time
FROM sys.dm_tran_locks l
JOIN sys.dm_tran_active_transactions t
ON l.request_owner_id = t.transaction_id
WHERE resource_description = '(246708db8c1f)'
Returns
返回
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
| request_mode | request_status | request_session_id | request_owner_id | lock_owner_address | transaction_id | name | transaction_begin_time |
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
| U | GRANT | 53 | 227647 | 0x00000002F1EF6800 | 227647 | droptemp | 2013-11-24 18:36:28.267 |
| S | GRANT | 53 | 191790 | 0x00000002F9B16380 | 191790 | SELECT INTO | 2013-11-24 18:21:30.083 |
| X | CONVERT | 53 | 227647 | 0x00000002F9B12FC0 | 227647 | droptemp | 2013-11-24 18:36:28.267 |
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
So the SELECT INTO
transaction is holding an S
lock on the row in tempdb.sys.sysschobjs
pertaining to the table variable #A1E86130
. The droptemp
transaction cannot get an X
lock on this row because of this conflicting S
lock.
因此,SELECT INTO事务在tempdb.sys.sysschobjs中与表变量#A1E86130相关的行上持有S锁。由于这种冲突的S锁,droptemp事务无法在此行上获得X锁定。
Running this query repeatedly reveals that the transaction_id
for the droptemp
transaction repeatedly changes.
重复运行此查询会显示droptemp事务的transaction_id重复更改。
I speculate that SQL Server must allocate these internal transactions on user spids and prioritise them before doing the user work. So the session id 53 is stuck in a constant cycle where it starts up a droptemp
transaction, is blocked by the user transaction running on the same spid. Rolls back the internal transaction then repeats the process indefinitely.
我推测SQL Server必须在用户spid上分配这些内部事务,并在执行用户工作之前对它们进行优先级排序。因此会话ID 53停留在一个恒定的循环中,它启动一个droptemp事务,被在同一个spid上运行的用户事务阻塞。回滚内部事务,然后无限期地重复该过程。
This is borne out by tracing the various locking and transaction events in SQL Server Profiler after the spid becomes hung.
通过在spid挂起后跟踪SQL Server Profiler中的各种锁定和事务事件来证实这一点。
I also traced the locking events prior to that.
在此之前我还跟踪了锁定事件。
Lock Events Blocking
Most of the shared key locks taken out by the SELECT INTO
transaction on keys in sysschobjs
get released immediately. The exception is the first lock on (246708db8c1f)
.
由sysschobjs中的键上的SELECT INTO事务取出的大多数共享密钥锁立即被释放。例外是第一次锁定(246708db8c1f)。
This makes some sense as the plan shows nested loops scans of [sys].[sysschobjs].[clst] [o]
and because temporary objects get given negative objectids they will be the first rows encountered in scan order.
这是有道理的,因为计划显示[sys]。[sysschobjs]。[clst] [o]的嵌套循环扫描,并且因为临时对象被给予否定对象,它们将是扫描顺序中遇到的第一行。
I also encountered the situation described in the OP where running a three way cross join first seems to allow the four way one to succeed.
我还遇到了OP中描述的情况,其中首先运行三向交叉连接似乎允许四路成功。
The first few events in the trace for the SELECT INTO
transaction there are an entirely different pattern.
SELECT INTO事务跟踪的前几个事件有一个完全不同的模式。
This was after a service restart so the lock resource values in the text data column are not directly comparable.
这是在服务重新启动之后,因此文本数据列中的锁定资源值无法直接比较。
Instead of retaining the lock on the first key and then a pattern of acquiring and releasing subsequent keys it seems to acquire a lot more locks without releasing them initially.
而不是保留第一个键上的锁,然后获取和释放后续键的模式,它似乎获得了更多的锁,而不是最初释放它们。
I presume there must be some variance in execution strategy that avoids the issue.
我认为执行策略必须有一些差异才能避免这个问题。
Update
更新
The Connect Item I raised about this has not been marked as fixed but I am now on SQL Server 2012 SP2 and can now only reproduce temporary self blocking rather than permanent. I still get the self blocking but after some amount of failed attempts to execute the droptemp
transaction successfully it appears to go back to processing the user transaction. After that commits the system transaction then is executed successfully. Still on the same spid. (eight attempts in one example run. I'm not sure if this will be consistently repeated)
我提出的关于这个的连接项目没有被标记为已修复但我现在在SQL Server 2012 SP2上,现在只能重现临时自我阻止而不是永久阻止。我仍然得到自我阻止但是在成功执行droptemp事务的一些失败尝试之后,它似乎又回到处理用户事务。之后提交系统事务然后成功执行。仍在同一个spid上。 (在一个示例中运行了八次尝试。我不确定这是否会一直重复)
#2
9
Rather than chase this problem, why don't you just create the table once in the model
database, then it will be created for you in tempdb
automatically?
而不是追逐这个问题,为什么不在模型数据库中创建一次表,然后它会在tempdb中自动创建?
For the actual problem, we don't know. My first guess would be that your initial size for your tempdb file(s) is very small (like, 1MB). So, when you create the table, it has to expand the file(s) to accommodate it. This can be quite expensive especially if you don't have instant file initialization enabled, and it can also be very expensive to grow the log to accommodate the activity required there as well.
对于实际问题,我们不知道。我的第一个猜测是你的tempdb文件的初始大小非常小(比如,1MB)。因此,在创建表时,必须扩展文件以适应它。这可能非常昂贵,特别是如果您没有启用即时文件初始化,并且增长日志以适应其中所需的活动也可能非常昂贵。
Other than that, we could keep guessing, but you will be better suited to investigate what is actually happening. Questions you'll want to ask:
除此之外,我们可以继续猜测,但你会更适合调查实际发生的事情。你想问的问题:
- For the spid that is trying to create the table, what does
sys.dm_exec_requests
say forwait_type
? - 对于尝试创建表的spid,sys.dm_exec_requests对wait_type说了什么?
- Does it have a
blocking_session_id
? - 它有一个blocking_session_id吗?
- If so, what is that session doing?
- 如果是这样,该会话在做什么?
#1
21
I
can
could also reproduce this 100% of the time on my machine. (see note at end)
我也可以在我的机器上100%重现这个。 (见末尾注)
The gist of the problem is that you are taking out S
locks on system table rows in tempdb
that can conflict with the locks needed for internal tempdb
cleanup transactions.
问题的关键在于,您正在对tempdb中的系统表行取出S锁,这可能与内部tempdb清理事务所需的锁冲突。
When this clean up work is allocated to the same session that owns the S
lock an indefinite hang can occur.
当此清理工作分配给拥有S锁的同一会话时,可能会发生无限期挂起。
To avoid this problem for certain you need to stop referencing the system
objects inside tempdb
.
为了避免这个问题,您需要停止引用tempdb内的系统对象。
It is possible to create a numbers table without referencing any external tables at all. The following needs to read no base table rows and thus also takes no locks.
可以在不引用任何外部表的情况下创建数字表。以下需要不读取基表行,因此也不需要锁定。
WITH Ten(N) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO Numbers
FROM Ten T10,
Ten T100,
Ten T1000,
Ten T10000,
Ten T100000,
Ten T1000000
Steps to Reproduce
First create a procedure
首先创建一个过程
CREATE PROC P
AS
SET NOCOUNT ON;
DECLARE @T TABLE (X INT)
GO
Then restart the SQL Service and in one connection execute
然后重新启动SQL Service并在一个连接中执行
WHILE NOT EXISTS(SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id)
BEGIN
/*This will cause the problematic droptemp transactions*/
EXEC sp_recompile 'P'
EXEC P
END;
SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id
Then in another connection run
然后在另一个连接运行
USE tempdb;
SELECT TOP 1000000 IDENTITY(INT, 1, 1) Number
INTO #T
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
CROSS JOIN sys.objects s4;
DROP TABLE #T
The query populating the Numbers table seems to manage to get into a live lock situation with the internal system transactions that clean up temporary objects such as table variables.
填充Numbers表的查询似乎设法进入实时锁定状态,内部系统事务清理临时对象(如表变量)。
I managed to get session id 53 blocked in this manner. It is blocked indefinitely. The output of sp_WhoIsActive
shows that this spid spends almost all of the time suspended. In consecutive runs the numbers in the reads
column increases but the values in the other columns remain largely the same.
我设法以这种方式阻止会话ID 53。它被无限期阻止。 sp_WhoIsActive的输出显示此spid几乎在所有时间都处于暂停状态。在连续运行中,读取列中的数字会增加,但其他列中的值保持大致相同。
The wait duration doesn't show an increasing pattern though indicating that it must get unblocked periodically before getting blocked again.
等待持续时间并未显示增加的模式,但表示必须在再次被阻止之前定期取消阻止。
SELECT *
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id
Returns
返回
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
| waiting_task_address | session_id | exec_context_id | wait_duration_ms | wait_type | resource_address | blocking_task_address | blocking_session_id | blocking_exec_context_id | resource_description |
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
| 0x00000002F2C170C8 | 53 | 0 | 86 | LCK_M_X | 0x00000002F9B13040 | 0x00000002F2C170C8 | 53 | NULL | keylock hobtid=281474978938880 dbid=2 id=lock2f9ac8880 mode=U associatedObjectId=281474978938880 |
+----------------------+------------+-----------------+------------------+-----------+--------------------+-----------------------+---------------------+--------------------------+--------------------------------------------------------------------------------------------------+
Using the id in the resource description
在资源描述中使用id
SELECT o.name
FROM sys.allocation_units au WITH (NOLOCK)
INNER JOIN sys.partitions p WITH (NOLOCK)
ON au.container_id = p.partition_id
INNER JOIN sys.all_objects o WITH (NOLOCK)
ON o.object_id = p.object_id
WHERE allocation_unit_id = 281474978938880
Returns
返回
+------------+
| name |
+------------+
| sysschobjs |
+------------+
Running
运行
SELECT resource_description,request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 53 AND request_status <> 'GRANT'
Returns
返回
+----------------------+----------------+
| resource_description | request_status |
+----------------------+----------------+
| (246708db8c1f) | CONVERT |
+----------------------+----------------+
Connecting via the DAC and running
通过DAC连接并运行
SELECT id,name
FROM tempdb.sys.sysschobjs WITH (NOLOCK)
WHERE %%LOCKRES%% = '(246708db8c1f)'
Returns
返回
+-------------+-----------+
| id | name |
+-------------+-----------+
| -1578606288 | #A1E86130 |
+-------------+-----------+
Curious about what that is
对这是什么感到好奇
SELECT name,user_type_id
FROM tempdb.sys.columns
WHERE object_id = -1578606288
Returns
返回
+------+--------------+
| name | user_type_id |
+------+--------------+
| X | 56 |
+------+--------------+
This is the column name in the table variable used by the stored proc.
这是存储过程使用的表变量中的列名。
Running
运行
SELECT request_mode,
request_status,
request_session_id,
request_owner_id,
lock_owner_address,
t.transaction_id,
t.name,
t.transaction_begin_time
FROM sys.dm_tran_locks l
JOIN sys.dm_tran_active_transactions t
ON l.request_owner_id = t.transaction_id
WHERE resource_description = '(246708db8c1f)'
Returns
返回
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
| request_mode | request_status | request_session_id | request_owner_id | lock_owner_address | transaction_id | name | transaction_begin_time |
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
| U | GRANT | 53 | 227647 | 0x00000002F1EF6800 | 227647 | droptemp | 2013-11-24 18:36:28.267 |
| S | GRANT | 53 | 191790 | 0x00000002F9B16380 | 191790 | SELECT INTO | 2013-11-24 18:21:30.083 |
| X | CONVERT | 53 | 227647 | 0x00000002F9B12FC0 | 227647 | droptemp | 2013-11-24 18:36:28.267 |
+--------------+----------------+--------------------+------------------+--------------------+----------------+-------------+-------------------------+
So the SELECT INTO
transaction is holding an S
lock on the row in tempdb.sys.sysschobjs
pertaining to the table variable #A1E86130
. The droptemp
transaction cannot get an X
lock on this row because of this conflicting S
lock.
因此,SELECT INTO事务在tempdb.sys.sysschobjs中与表变量#A1E86130相关的行上持有S锁。由于这种冲突的S锁,droptemp事务无法在此行上获得X锁定。
Running this query repeatedly reveals that the transaction_id
for the droptemp
transaction repeatedly changes.
重复运行此查询会显示droptemp事务的transaction_id重复更改。
I speculate that SQL Server must allocate these internal transactions on user spids and prioritise them before doing the user work. So the session id 53 is stuck in a constant cycle where it starts up a droptemp
transaction, is blocked by the user transaction running on the same spid. Rolls back the internal transaction then repeats the process indefinitely.
我推测SQL Server必须在用户spid上分配这些内部事务,并在执行用户工作之前对它们进行优先级排序。因此会话ID 53停留在一个恒定的循环中,它启动一个droptemp事务,被在同一个spid上运行的用户事务阻塞。回滚内部事务,然后无限期地重复该过程。
This is borne out by tracing the various locking and transaction events in SQL Server Profiler after the spid becomes hung.
通过在spid挂起后跟踪SQL Server Profiler中的各种锁定和事务事件来证实这一点。
I also traced the locking events prior to that.
在此之前我还跟踪了锁定事件。
Lock Events Blocking
Most of the shared key locks taken out by the SELECT INTO
transaction on keys in sysschobjs
get released immediately. The exception is the first lock on (246708db8c1f)
.
由sysschobjs中的键上的SELECT INTO事务取出的大多数共享密钥锁立即被释放。例外是第一次锁定(246708db8c1f)。
This makes some sense as the plan shows nested loops scans of [sys].[sysschobjs].[clst] [o]
and because temporary objects get given negative objectids they will be the first rows encountered in scan order.
这是有道理的,因为计划显示[sys]。[sysschobjs]。[clst] [o]的嵌套循环扫描,并且因为临时对象被给予否定对象,它们将是扫描顺序中遇到的第一行。
I also encountered the situation described in the OP where running a three way cross join first seems to allow the four way one to succeed.
我还遇到了OP中描述的情况,其中首先运行三向交叉连接似乎允许四路成功。
The first few events in the trace for the SELECT INTO
transaction there are an entirely different pattern.
SELECT INTO事务跟踪的前几个事件有一个完全不同的模式。
This was after a service restart so the lock resource values in the text data column are not directly comparable.
这是在服务重新启动之后,因此文本数据列中的锁定资源值无法直接比较。
Instead of retaining the lock on the first key and then a pattern of acquiring and releasing subsequent keys it seems to acquire a lot more locks without releasing them initially.
而不是保留第一个键上的锁,然后获取和释放后续键的模式,它似乎获得了更多的锁,而不是最初释放它们。
I presume there must be some variance in execution strategy that avoids the issue.
我认为执行策略必须有一些差异才能避免这个问题。
Update
更新
The Connect Item I raised about this has not been marked as fixed but I am now on SQL Server 2012 SP2 and can now only reproduce temporary self blocking rather than permanent. I still get the self blocking but after some amount of failed attempts to execute the droptemp
transaction successfully it appears to go back to processing the user transaction. After that commits the system transaction then is executed successfully. Still on the same spid. (eight attempts in one example run. I'm not sure if this will be consistently repeated)
我提出的关于这个的连接项目没有被标记为已修复但我现在在SQL Server 2012 SP2上,现在只能重现临时自我阻止而不是永久阻止。我仍然得到自我阻止但是在成功执行droptemp事务的一些失败尝试之后,它似乎又回到处理用户事务。之后提交系统事务然后成功执行。仍在同一个spid上。 (在一个示例中运行了八次尝试。我不确定这是否会一直重复)
#2
9
Rather than chase this problem, why don't you just create the table once in the model
database, then it will be created for you in tempdb
automatically?
而不是追逐这个问题,为什么不在模型数据库中创建一次表,然后它会在tempdb中自动创建?
For the actual problem, we don't know. My first guess would be that your initial size for your tempdb file(s) is very small (like, 1MB). So, when you create the table, it has to expand the file(s) to accommodate it. This can be quite expensive especially if you don't have instant file initialization enabled, and it can also be very expensive to grow the log to accommodate the activity required there as well.
对于实际问题,我们不知道。我的第一个猜测是你的tempdb文件的初始大小非常小(比如,1MB)。因此,在创建表时,必须扩展文件以适应它。这可能非常昂贵,特别是如果您没有启用即时文件初始化,并且增长日志以适应其中所需的活动也可能非常昂贵。
Other than that, we could keep guessing, but you will be better suited to investigate what is actually happening. Questions you'll want to ask:
除此之外,我们可以继续猜测,但你会更适合调查实际发生的事情。你想问的问题:
- For the spid that is trying to create the table, what does
sys.dm_exec_requests
say forwait_type
? - 对于尝试创建表的spid,sys.dm_exec_requests对wait_type说了什么?
- Does it have a
blocking_session_id
? - 它有一个blocking_session_id吗?
- If so, what is that session doing?
- 如果是这样,该会话在做什么?