I want to calculate interest over millions of records in MySQL, so I am using the event scheduler:
create event cal_interest
on schedule every 1 day
do
update userTable
set interest=(money*rate)/100;
My questions are:
1. Is it possible to update millions of records simultaneously?
2. Is there any possibility that some records are updated and others fail?
3. If this is the wrong way to calculate interest over many records, please suggest how I can do this.
1 solution
1. In general, updating millions of rows at once isn't a very good idea, especially if you have a database cluster (there will almost certainly be replication delays). A better strategy is to split the update into batches.
2. Yes. There is always a possibility of failure.
3. See 1 :) Split your table into batches of N records (N from 100 to 1000) and update them batch by batch. One strategy is to write a client job that initiates and monitors these batch updates (one possible way: add an indexed field to store the date of the last update, then choose N rows which have last_sync_date < current_date).
Comment: by "splitting the table" I didn't mean physically splitting it, just the following:
- add an indexed field where you keep the date of the last sync (e.g. last_sync_date);
- when the job starts, repeat the following in a loop:
  - retrieve the IDs of the next N records (e.g. N=500) with last_sync_date < curdate();
  - if you didn't get anything, you are done, so exit the loop;
  - otherwise, set interest=(money*rate)/100, last_sync_date = curdate() for the records with these IDs.
I would rather do it as a job written outside of MySQL and scheduled via e.g. cron (because then it's easier to monitor how the job runs and kill it if necessary), but you can, in theory, do it in MySQL too, for example (untested) something like this (I assume that your records have unique IDs stored in the field id):
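delimiter |
create event cal_interest
on schedule every 1 day
do
begin
  -- declarations must come first inside begin ... end
  declare keep_sync int default 1;
  create temporary table if not exists temp_ids(id int primary key) engine=memory;
  repeat
    -- pick the next batch of rows not yet updated today
    -- (rows where last_sync_date is NULL do not match this condition)
    truncate table temp_ids;
    insert into temp_ids(id)
      select id from userTable where last_sync_date < curdate() limit 500;
    select count(*) into keep_sync from temp_ids;
    update userTable
      set interest=(money*rate)/100, last_sync_date = curdate()
      where id in (select id from temp_ids);
  -- stop once a pass finds no rows left to update
  until keep_sync = 0 end repeat;
  drop temporary table temp_ids;
end |
delimiter ;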
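For the external-job variant preferred above, here is a minimal sketch of the same loop in Python. It is an illustration under stated assumptions, not a tested production job: it uses the standard-library sqlite3 module as a stand-in so the example runs on its own, whereas a real job would connect through a MySQL driver (whose placeholder style is usually %s rather than ?) and could use curdate() instead of passing the date in. The function name update_interest_in_batches and the index name idx_last_sync are hypothetical; the table and column names follow the steps above. Unlike the event above, the sketch also treats a NULL last_sync_date as "never synced".

```python
import sqlite3
from datetime import date


def update_interest_in_batches(conn, today, batch_size=500):
    """Update interest batch by batch; return the number of rows updated."""
    total = 0
    while True:
        # Step 1: fetch the IDs of the next batch not yet synced today.
        # The IS NULL test covers rows that have never been synced.
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM userTable "
            "WHERE last_sync_date IS NULL OR last_sync_date < ? "
            "ORDER BY id LIMIT ?",
            (today, batch_size),
        )]
        # Step 2: nothing left means the job is done.
        if not ids:
            return total
        # Step 3: update only this batch, one transaction per batch.
        placeholders = ",".join("?" * len(ids))
        with conn:  # commits on success, rolls back if the UPDATE raises
            conn.execute(
                "UPDATE userTable "
                "SET interest = (money * rate) / 100, last_sync_date = ? "
                f"WHERE id IN ({placeholders})",
                [today, *ids],
            )
        total += len(ids)


if __name__ == "__main__":
    # In-memory demo table; a real job would connect to the MySQL server.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userTable (id INTEGER PRIMARY KEY, money REAL, "
        "rate REAL, interest REAL, last_sync_date TEXT)"
    )
    conn.execute("CREATE INDEX idx_last_sync ON userTable (last_sync_date)")
    conn.executemany(
        "INSERT INTO userTable (money, rate) VALUES (?, ?)",
        [(1000.0, 5.0)] * 1200,
    )
    conn.commit()
    today = date.today().isoformat()
    print(update_interest_in_batches(conn, today))  # rows updated by the first run
    print(update_interest_in_batches(conn, today))  # rows updated by a re-run
```

Because each batch is committed separately and marked with last_sync_date, a job that dies halfway can simply be started again: it only picks up the rows that were not yet updated that day.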