I want to:
- Select N rows from a table for processing where flag=0
- Do some work on a second table using values from these N rows
- Update these N rows and set flag=1
I have parallel processes doing this same work at the same time, and I want to ensure that each one works on its own unique rows. How do I ensure that?
2 solutions
#1
5
I assume you are running on SQL Server (because of the tag); if not, my answer does not apply. Locking alone is not enough. If you use plain row locking, SQL Server will block other processes that try to access the locked row, so in effect the processes end up handling one row at a time. The solution is to combine row locking with the READPAST hint, so that rows locked by someone else are skipped. Here's what each process should do:
- select next unlocked row for processing and lock it
- do the work
- update the row and end transaction
begin transaction
select top 1 id, ... from TheTable with (updlock, readpast) where flag = 0
-- do the work now (the update lock is held until the transaction ends)
update TheTable set flag = 1 where id = <previously retrieved id>
commit transaction
The nice thing here is that selecting the next unlocked row and locking it is a single atomic operation, so it guarantees that no one else can select the same row.
#2
0
One way is to have a master program hand out segments to the child threads.
一种方法是让主程序将子句分发给子线程。
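A minimal sketch of the master-hands-out-segments idea, assuming the master has already read the ids of the rows with flag = 0. The function name split_into_segments and the round-robin split are illustrative choices, not something this answer specifies:

```python
def split_into_segments(row_ids, workers):
    """Deal row ids round-robin into `workers` disjoint lists."""
    return [row_ids[i::workers] for i in range(workers)]


# Hypothetical ids of rows that currently have flag = 0.
pending_ids = list(range(1, 11))
segments = split_into_segments(pending_ids, 3)
print(segments)  # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
```

Each child then works only on the ids in its own segment, so no locking between children is needed as long as only the master assigns work.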
Another way is to lock the table, get CEIL(N/#processes) rows where flag = 0, update the flag to 2, then release the lock. The next process then proceeds once it acquires the lock, and since those rows now have flag = 2 it won't pick them up.
You have two ways to take the lock: you can either lock the whole table, or do SELECT ... FOR UPDATE with a limit (so as not to get too many rows). Note that SQL Server has no standalone SELECT ... FOR UPDATE; the UPDLOCK table hint plays that role. See: SELECT FOR UPDATE with SQL Server
Even better than setting the flag to 2 is to set the flag to the process_id. Then all you have to do is update the rows up front to distribute the process IDs, then let the processes go to work, each checking only its own rows.
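The claim-by-process-id idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, the table name work and the helpers claim_rows and mark_done are hypothetical, and worker ids are assumed to be values other than 0 (unclaimed) and 1 (done).

```python
import sqlite3


def claim_rows(conn, worker_id, n):
    """Stamp up to n unclaimed rows (flag = 0) with worker_id, return this worker's ids."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "UPDATE work SET flag = ? WHERE id IN "
            "(SELECT id FROM work WHERE flag = 0 ORDER BY id LIMIT ?)",
            (worker_id, n),
        )
    rows = conn.execute(
        "SELECT id FROM work WHERE flag = ? ORDER BY id", (worker_id,)
    )
    return [r[0] for r in rows]


def mark_done(conn, worker_id):
    """Set flag = 1 on every row this worker had claimed."""
    with conn:
        conn.execute("UPDATE work SET flag = 1 WHERE flag = ?", (worker_id,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work (id INTEGER PRIMARY KEY, flag INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO work (id) VALUES (?)", [(i,) for i in range(1, 11)])
conn.commit()

print(claim_rows(conn, 101, 3))  # [1, 2, 3]
print(claim_rows(conn, 102, 3))  # [4, 5, 6]
mark_done(conn, 101)
```

Because the claim is a single UPDATE statement, two workers cannot stamp the same row. On SQL Server the same shape would be written in T-SQL, and the locking behaviour under concurrency differs from SQLite, so treat this as a model of the logic rather than a drop-in implementation.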