Adding a new NOT NULL column to a table with a large volume of data

Time: 2022-12-23 15:53:44

 

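For this release of the list-cleaning feature, the database had to be upgraded: the MailingList table gets an IfCleaning column, and every t_User* table gets an IfCleaned column.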

 

The scripts were as follows.

Run against every t_User table:

alter table t_User** add IfCleaned bit default(0) not null

Run against the MailingList table:

alter table t_MailingList add IfCleanning bit default(0) not null

 

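Two very simple statements, yet while they were running the Solution production machine became unreachable: remote connections failed, and the Solution application could not connect either. In the end we had to phone the Xiaoshan data center and ask them to force a reboot of the machine.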

 

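Our analysis at the time was that the script had too many t_User tables to process and those tables held too much data, so the database probably exhausted memory or disk resources. Some of the t_User tables contain tens of millions of rows.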
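Before running a schema change like this, it can help to see how large the affected tables actually are. The query below is a minimal sketch, assuming the tables follow the t_User* naming used in this post; it reads approximate row counts from catalog views and does not scan the tables themselves.

```sql
-- Approximate row counts for all t_User* tables, largest first.
-- sys.partitions.rows is an estimate maintained by the engine, so this is cheap to run.
SELECT t.name       AS table_name,
       SUM(p.rows)  AS approx_rows
FROM sys.tables AS t
JOIN sys.partitions AS p
    ON p.object_id = t.object_id
WHERE t.name LIKE N't[_]User%'      -- [_] matches a literal underscore
  AND p.index_id IN (0, 1)          -- heap (0) or clustered index (1) only, to avoid double counting
GROUP BY t.name
ORDER BY approx_rows DESC;
```

Tables near the top of this list are the ones where a single-statement ALTER TABLE is most likely to cause trouble.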

 

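From now on we need to be especially careful with any operation on large tables. Whether the operation comes from application code or from a database upgrade script, it must be tested thoroughly against a large data volume before it is released.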

 

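After discussion and research we arrived at the new script below. It splits the original single step into several steps, and the third step is itself split into many batches. The approach mainly follows this * article: http://*.com/questions/287954/how-do-you-add-a-not-null-column-to-a-large-table-in-sql-server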

 

Step 1: add the new column with a default value, but allow NULL.

alter table t_User***** add NewColumnIfCleaned bit default(0)

This way, every existing row in the table has NULL in the new column, while any row inserted afterwards gets the default value 0.
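The behaviour described in step 1 can be checked on a small scratch table. This is only an illustrative sketch: #t_UserDemo is a hypothetical temporary table, not one of the real t_User tables, and GO is the batch separator understood by SSMS and sqlcmd.

```sql
-- Hypothetical scratch table standing in for one t_User table.
CREATE TABLE #t_UserDemo (Id int NOT NULL PRIMARY KEY);
INSERT INTO #t_UserDemo (Id) VALUES (1), (2);      -- rows that exist before the column is added
GO
-- Step 1: nullable column with a default. Existing rows are not rewritten.
ALTER TABLE #t_UserDemo ADD IfCleaned bit DEFAULT (0);
GO
INSERT INTO #t_UserDemo (Id) VALUES (3);           -- inserted after the column exists
GO
-- Rows 1 and 2 show NULL; row 3 shows 0 from the default.
SELECT Id, IfCleaned FROM #t_UserDemo ORDER BY Id;
DROP TABLE #t_UserDemo;
```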

 

Step 2: add a NOT NULL check constraint, created WITH NOCHECK.

alter table t_User***** with nocheck add constraint NewColumnIfCleaned_NotNull check (NewColumnIfCleaned is not null)

This way, existing rows can remain NULL, while newly inserted rows are subject to the NOT NULL constraint.
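To see the effect of WITH NOCHECK in isolation, the following sketch uses a hypothetical scratch table and a hypothetical constraint name. It assumes the column already exists as nullable with a default, as it would after step 1.

```sql
-- Hypothetical scratch table; IfCleaned is nullable with a default, as after step 1.
CREATE TABLE #t_UserDemo (
    Id        int NOT NULL PRIMARY KEY,
    IfCleaned bit NULL DEFAULT (0)
);
INSERT INTO #t_UserDemo (Id, IfCleaned) VALUES (1, NULL);   -- stands in for an old row
GO
-- WITH NOCHECK skips validation of the rows that already exist.
ALTER TABLE #t_UserDemo WITH NOCHECK
    ADD CONSTRAINT CK_t_UserDemo_IfCleaned_NotNull CHECK (IfCleaned IS NOT NULL);
GO
INSERT INTO #t_UserDemo (Id) VALUES (2);                    -- succeeds, the default 0 is used
GO
INSERT INTO #t_UserDemo (Id, IfCleaned) VALUES (3, NULL);   -- rejected by the CHECK constraint
GO
-- Only rows 1 (NULL, left untouched) and 2 (0) are present.
SELECT Id, IfCleaned FROM #t_UserDemo ORDER BY Id;
DROP TABLE #t_UserDemo;
```

Because the old row is never validated, adding the constraint this way does not need to read the existing data.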

 

Step 3: update the existing rows to 0 in batches, 3000 rows at a time. The full script is given below.

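The GO 1000 approach from the original article does not work for us, because the number of repetitions is not known in advance; it has to be computed from the number of rows in each t_User table.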
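For a single table, one alternative to computing the batch count up front is to keep updating until a batch touches no rows. The sketch below is not the script we ran; it uses a hypothetical scratch table and the same batch size of 3000 to show the idea with @@ROWCOUNT.

```sql
-- Hypothetical scratch table with 10000 rows that still need backfilling.
CREATE TABLE #t_UserDemo (Id int IDENTITY(1,1) PRIMARY KEY, IfCleaned bit NULL);
INSERT INTO #t_UserDemo (IfCleaned)
SELECT TOP (10000) CAST(NULL AS bit)
FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;   -- just a convenient row source

DECLARE @batchSize int = 3000;   -- same batch size as in this post
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    -- Each UPDATE is its own short transaction touching at most @batchSize rows.
    UPDATE TOP (@batchSize) #t_UserDemo
    SET IfCleaned = 0
    WHERE IfCleaned IS NULL;

    SET @rows = @@ROWCOUNT;      -- 0 once nothing is left to update

    IF @rows > 0
        WAITFOR DELAY '00:00:01';   -- pause between batches to leave room for other work
END;

SELECT COUNT(*) AS remaining_nulls FROM #t_UserDemo WHERE IfCleaned IS NULL;
DROP TABLE #t_UserDemo;
```

The loop stops by itself, so no row count or division by the batch size is needed.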

 

declare @i int, @strSql nvarchar(2000), @table nvarchar(200), @start int, @strNum nvarchar(100), @preUpate int, @totalCount int, @goCount int, @siteDelay datetime, @dbDelay datetime, @tbDelay datetime;

 

select @preUpate=3000, @siteDelay='00:00:02', @dbDelay='00:05:00', @tbDelay='00:00:01';

 

declare @time1 datetime;

select @time1=GETDATE();

 

-- Update the Site**** database

use [Comm100.Site****]

select @totalCount=0,@goCount=0,@i=0,@start=10000000;

while @i<500

begin

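-- Reset the count for each table; otherwise a missing table would reuse the previous table's count.
select @totalCount=0, @strNum= CONVERT(nvarchar(50),@start+@i);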

select @table='t_User'+@strNum;

select @strSql='if exists (select top 1 * from [dbo].[sysobjects] where [Id]=object_id(N''[dbo].['+@table+']'') and objectproperty(id, N''IsUserTable'') = 1)

begin

if exists(select * from syscolumns where id=OBJECT_ID('''+@table+''') and name=''IfCleaned'')

begin

select @totalCount1=count(0) from '+@table+' where IfCleaned is null;

end

end

';

 

--print @strSql;

exec sp_executesql @strSql,N'@totalCount1 int output',@totalCount output;

 

if(@totalCount>0)

begin

select @goCount=@totalCount / @preUpate +1;

        while (@goCount>0)

         begin      

select @strSql='       

update top('+CONVERT(nvarchar(10),@preUpate)+') '+@table+' set IfCleaned=0 where IfCleaned is null;

';

--print @strSql;       

exec(@strSql);

select @goCount=@goCount-1;

waitfor delay @tbDelay;

end

end

set @i=@i+1;

waitfor delay @siteDelay;

end

 

select DATEDIFF(MILLISECOND,@time1,GETDATE());
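Once the backfill has finished, it may be worth confirming that no NULLs remain and, optionally, having SQL Server validate the constraint from step 2 so that it is marked as trusted. The following is a sketch on a hypothetical scratch table with a hypothetical constraint name; note that on a real table the WITH CHECK validation reads every row, so it should be tried on a large copy first.

```sql
-- Hypothetical scratch table in the state left by steps 1 and 2.
CREATE TABLE #t_UserDemo (Id int NOT NULL PRIMARY KEY, IfCleaned bit NULL DEFAULT (0));
INSERT INTO #t_UserDemo (Id, IfCleaned) VALUES (1, NULL), (2, NULL);
ALTER TABLE #t_UserDemo WITH NOCHECK
    ADD CONSTRAINT CK_t_UserDemo_IfCleaned_NotNull CHECK (IfCleaned IS NOT NULL);
GO
-- Stands in for the batched backfill of step 3.
UPDATE #t_UserDemo SET IfCleaned = 0 WHERE IfCleaned IS NULL;
GO
-- 1. Confirm that nothing was missed.
SELECT COUNT(*) AS remaining_nulls FROM #t_UserDemo WHERE IfCleaned IS NULL;

-- 2. Re-validate the existing rows so the constraint becomes trusted.
ALTER TABLE #t_UserDemo WITH CHECK CHECK CONSTRAINT CK_t_UserDemo_IfCleaned_NotNull;

-- 3. is_not_trusted is 0 when the validation succeeded.
SELECT name, is_not_trusted
FROM tempdb.sys.check_constraints
WHERE parent_object_id = OBJECT_ID(N'tempdb..#t_UserDemo');

DROP TABLE #t_UserDemo;
```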

 

Note: in the newer SQL Server 2012, adding a NOT NULL column to a table appears to behave differently:

Adding NOT NULL Columns as an Online Operation

In SQL Server 2012 Enterprise Edition, adding a NOT NULL column with a default value is an online operation when the default value is a runtime constant. This means that the operation is completed almost instantaneously regardless of the number of rows in the table. This is because the existing rows in the table are not updated during the operation; instead, the default value is stored only in the metadata of the table and the value is looked up as needed in queries that access these rows. This behavior is automatic; no additional syntax is required to implement the online operation beyond the ADD COLUMN syntax. A runtime constant is an expression that produces the same value at runtime for each row in the table regardless of its determinism. For example, the constant expression "My temporary data", or the system function GETUTCDATETIME() are runtime constants. In contrast, the functions NEWID() or NEWSEQUENTIALID() are not runtime constants because a unique value is produced for each row in the table. Adding a NOT NULL column with a default value that is not a runtime constant is always performed offline and an exclusive (SCH-M) lock is acquired for the duration of the operation.

While the existing rows reference the value stored in metadata, the default value is stored on the row for any new rows that are inserted and do not specify another value for the column. The default value stored in metadata is moved to an existing row when the row is updated (even if the actual column is not specified in the UPDATE statement), or if the table or clustered index is rebuilt.

Columns of type varchar(max), nvarchar(max), varbinary(max), xml, text, ntext, image, hierarchyid, geometry, geography, or CLR UDTS, cannot be added in an online operation. A column cannot be added online if doing so causes the maximum possible row size to exceed the 8,060 byte limit. The column is added as an offline operation in this case.
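Whether the behaviour quoted above applies depends on the server version and edition. The query below is a rough sketch for checking both; the column alias metadata_only_add_possible is just an illustrative name, and the check reflects only the conditions stated in the quoted text (version 11 is SQL Server 2012, and EngineEdition 3 covers Enterprise, Developer and Evaluation).

```sql
-- Rough check of version and edition against the conditions in the quoted documentation.
SELECT
    SERVERPROPERTY('ProductVersion') AS product_version,
    SERVERPROPERTY('Edition')        AS edition,
    CASE
        WHEN CAST(PARSENAME(CAST(SERVERPROPERTY('ProductVersion') AS nvarchar(128)), 4) AS int) >= 11
         AND CAST(SERVERPROPERTY('EngineEdition') AS int) = 3
        THEN 1 ELSE 0
    END AS metadata_only_add_possible;

-- If the result is 1 and the default is a runtime constant such as 0,
-- the original single-statement form is the one the quoted text describes as online:
-- alter table t_User** add IfCleaned bit default(0) not null
```

Even when the check returns 1, it is still safer to try the statement on a copy of a large table before running it in production.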