I'm currently merging large data sets in Windows Azure SQL Database. Is there a way to avoid the following error?
40552 : The session has been terminated because of excessive transaction log space usage. Try modifying fewer rows in a single transaction.
My data sets are larger than 15 million records.
1 Solution
#1
From the application, count the records that need to be imported and, based on the batch size, calculate the number of batches to execute. Then, for each batch, create a new SQL command that executes the merge stored procedure with the right offset index. As long as the application does not wrap these calls in one outer transaction, each call commits on its own, so the log space used by any single transaction is bounded by the batch size. Be sure to insert a short delay between commands: if they run too fast, Windows Azure SQL Database might raise error 40501 due to an excessive number of requests.
CREATE PROCEDURE [dbo].[MergeCustomerTables]
    @offsetIndex int = 0,      -- zero-based batch number
    @batchSize int = 10000     -- rows merged per call
AS
BEGIN
    -- Convert before multiplying so the product cannot overflow int.
    DECLARE @offset bigint = CAST(@offsetIndex AS bigint) * @batchSize;

    MERGE Customers AS Target
    USING (
        -- One page of the staging table; OFFSET/FETCH requires ORDER BY.
        -- Paging is only stable if [Email] is unique in CustomerInsertTable
        -- and the table does not change between calls.
        SELECT a.[Name], a.[LastName], a.[Email]
        FROM CustomerInsertTable AS a
        ORDER BY a.[Email]
        OFFSET @offset ROWS
        FETCH NEXT @batchSize ROWS ONLY
    ) AS Source
    ON (Target.[Email] = Source.[Email])
    WHEN MATCHED THEN
        UPDATE SET Target.[Name] = Source.[Name],
                   Target.[LastName] = Source.[LastName]
    WHEN NOT MATCHED BY TARGET THEN
        INSERT ([Name], [LastName], [Email])
        VALUES (Source.[Name], Source.[LastName], Source.[Email]);
END
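The application-side loop described above could look roughly like the following Python sketch. It is only an illustration: the names `batch_count`, `run_merge_batches`, and `execute_batch` are hypothetical, the 0.5 second delay is an assumed value to tune, and the database call itself is passed in as a callable so the sketch does not depend on any particular driver.

```python
import time
from typing import Callable


def batch_count(total_rows: int, batch_size: int) -> int:
    """Number of batches needed to cover total_rows (the last may be partial)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division.
    return (total_rows + batch_size - 1) // batch_size


def run_merge_batches(
    total_rows: int,
    batch_size: int,
    execute_batch: Callable[[int, int], None],
    delay_seconds: float = 0.5,  # assumed value, not from the original answer
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call execute_batch(offset_index, batch_size) once per batch.

    execute_batch is a hypothetical callable that runs
    dbo.MergeCustomerTables for one batch and commits it.
    """
    batches = batch_count(total_rows, batch_size)
    for offset_index in range(batches):
        execute_batch(offset_index, batch_size)
        # Pause between commands, but not after the final one.
        if offset_index < batches - 1:
            sleep(delay_seconds)
    return batches
```

In a real application, `execute_batch` would open a command that runs `EXEC dbo.MergeCustomerTables @offsetIndex = ..., @batchSize = ...` and commits before returning, and `total_rows` would come from a `SELECT COUNT(*)` on `CustomerInsertTable`. With 15 million rows and the default batch size of 10000, this works out to 1500 calls.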