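Problem description:
The problem table holds tens of millions of rows, has no primary key and no unique column, and contains large fields. For business and performance reasons, a primary key column now has to be added to it. The key must increase in order across the existing rows and auto-increment for rows inserted afterwards.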
Solution:
1. First add an ordinary int column named ID to the problem table;
2. Set an ID value for the existing records;
set @rownum=0;
update xx_rel a set a.id=(@rownum:=@rownum+1);
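Because the table contains large fields, this update generates a large volume of binary log data. That volume exceeded a size limit set by a server parameter, so the statement failed with an error.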
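The original does not say which server parameter was hit. As a rough sketch, the binlog-related settings below are ones that could be inspected when a large single-statement update fails; treating any of them as the actual cause here is an assumption.

```sql
-- Assumption: the source does not name the parameter; these are only candidates to inspect.
SHOW VARIABLES LIKE 'binlog_format';          -- ROW format logs row images, including large columns
SHOW VARIABLES LIKE 'max_binlog_cache_size';  -- upper bound on binlog cache for one transaction
SHOW VARIABLES LIKE 'binlog_cache_size';      -- per-session in-memory binlog cache
```

Whichever limit applies, splitting the update into smaller transactions keeps each one's binary log footprint lower.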
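So the rows were first counted by how they are distributed over time (by month of UPDATED_AT), and then updated in batches, one segment at a time: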
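The counting query itself is not shown in the original. A grouping along the following lines would produce a result with the same column headers as the output that follows; the table name tt is taken from the later statements, and the ORDER BY is an addition for readability.

```sql
-- Sketch of the per-month count; the original query is not shown in the source.
SELECT DATE_FORMAT(UPDATED_AT,'%Y%m'), count(*)
FROM tt
GROUP BY DATE_FORMAT(UPDATED_AT,'%Y%m')
ORDER BY DATE_FORMAT(UPDATED_AT,'%Y%m');
```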
+--------------------------------+----------+
| DATE_FORMAT(UPDATED_AT,'%Y%m') | count(*) |
+--------------------------------+----------+
| 201604                         |       10 |
| 201605                         |      146 |
| 201606                         |      441 |
| 201607                         |      505 |
| 201608                         |     1317 |
| 201609                         |    14485 |
| 201610                         |    17110 |
| 201611                         |    35426 |
| 201612                         |    29740 |
+--------------------------------+----------+

-- Number each month separately; every starting offset leaves room for that month's row count.
set @rownum = 0;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201604';
set @rownum = 100;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201605';
set @rownum = 300;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201606';
set @rownum = 1000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201607';
set @rownum = 1600;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201608';
set @rownum = 3000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201609';
set @rownum = 20000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201610';
set @rownum = 40000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201611';
set @rownum = 80000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where DATE_FORMAT(UPDATED_AT,'%Y%m')='201612';

-- Check for rows that still have no ID.
SELECT COUNT(*) FROM tt a WHERE a.ID IS NULL;
-- result: 0

-- Check for duplicate IDs.
SELECT a.ID,COUNT(*) FROM tt a GROUP BY a.ID HAVING COUNT(*)>1;
-- result: 0

-- Inspect any duplicated IDs together with their timestamps.
select id,updated_at from tt where id in( SELECT a.ID FROM tt a GROUP BY a.ID HAVING COUNT(*)>1);

-- Number any rows not covered by the monthly batches.
set @rownum = 110000;
UPDATE tt a SET a.ID = (@rownum:=@rownum+1) where a.ID IS NULL;

-- Turn ID into the auto-increment primary key (the column comment '主键' means "primary key").
ALTER TABLE tt MODIFY COLUMN ID bigint(20) NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT '主键';
ALTER TABLE tt AUTO_INCREMENT=120000;
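As a possible variation on one of the monthly batches above, the sketch below swaps the DATE_FORMAT filter for a plain range on UPDATED_AT and adds ORDER BY so the numbering follows the timestamp within the batch. It assumes UPDATED_AT is a DATE/DATETIME/TIMESTAMP column; the range form can use an index on that column if one exists, which the source does not state. Note also that assigning user variables inside expressions, as both the original statements and this sketch do, is deprecated in MySQL 8.0.

```sql
-- Variation on the 201605 batch (a sketch, not the author's statement).
SET @rownum = 100;
UPDATE tt
SET ID = (@rownum := @rownum + 1)
WHERE UPDATED_AT >= '2016-05-01'
  AND UPDATED_AT <  '2016-06-01'
ORDER BY UPDATED_AT;   -- single-table UPDATE processes rows in this order

-- Before setting AUTO_INCREMENT, confirm the chosen value is above the current maximum.
SELECT MAX(ID) FROM tt;
```

If the value given to AUTO_INCREMENT is not greater than the current maximum ID, InnoDB continues from the maximum plus one, so the check is mainly a sanity step.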
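Summary:
1. Store large fields either as files or in a separate table; do not keep them in the same table as the other columns. Otherwise the large fields weigh on every operation against that table.
2. A table must always have a primary key;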
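To illustrate both points, here is a minimal sketch of a layout that keeps a large field in its own table and gives every table a primary key. All table and column names (doc, doc_body, title, body) are hypothetical and do not come from the original.

```sql
-- Hypothetical schema: the frequently accessed columns live in doc,
-- and the large field lives in doc_body, keyed by the same ID.
CREATE TABLE doc (
  id         BIGINT       NOT NULL AUTO_INCREMENT,
  title      VARCHAR(200) NOT NULL,
  updated_at DATETIME     NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE doc_body (
  doc_id BIGINT NOT NULL,
  body   LONGTEXT,
  PRIMARY KEY (doc_id),
  CONSTRAINT fk_doc_body_doc FOREIGN KEY (doc_id) REFERENCES doc (id)
) ENGINE=InnoDB;

-- The large field is read only when it is actually needed.
SELECT d.id, d.title, b.body
FROM doc d
JOIN doc_body b ON b.doc_id = d.id
WHERE d.id = 1;
```

With this split, bulk updates on doc touch only the narrow rows and leave the large field out of the picture.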