I am trying to implement a transaction system for Cassandra with the help of ZooKeeper. Since I don't have much experience in database implementation, I would like to know whether my idea would work in principle, or whether it has any major flaw.
Here is a high-level description of the steps:
- Identify all the rows (keys) and columns to be edited. Let the keys be [K0..Kn].
- Acquire a write lock on every row involved (the locks are an in-memory ZooKeeper implementation).
- Copy the old values to separate locations in Cassandra, uniquely identified by the keys [K'0..K'n].
- Store [K'0..K'n] and their mapping to [K0..Kn] in ZooKeeper using persistent mode.
- Apply the update to the data.
- Delete the entries in ZooKeeper.
- Unlock the rows.
- Delete the entries [K'0..K'n] lazily on a maintenance thread (Cassandra deletion uses timestamps, so K'0..K'n can be reused by another transaction with a newer timestamp).
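The steps above amount to an undo log guarded by row locks. Here is a minimal sketch of that flow, assuming in-memory stand-ins rather than real clients: `FakeStore` plays the role of Cassandra, `FakeCoordinator` plays the role of ZooKeeper, and `run_transaction`, `rollback`, the `backup/<txn>/<key>` naming, and the `fail_after` test hook are all hypothetical names made up for illustration. Locks are taken in sorted key order, which is an added assumption to avoid deadlock between two transactions that touch the same rows.

```python
import threading


class FakeStore:
    """In-memory stand-in for Cassandra (hypothetical): key -> value."""
    def __init__(self):
        self.rows = {}


class FakeCoordinator:
    """In-memory stand-in for ZooKeeper (hypothetical): row locks + undo map."""
    def __init__(self):
        self.locks = {}
        self.undo = {}  # txn_id -> {original key: backup key, or None if the row was absent}
        self._guard = threading.Lock()

    def lock_for(self, key):
        with self._guard:
            return self.locks.setdefault(key, threading.Lock())


def rollback(store, coord, txn_id):
    """Restore every row recorded in the undo map, then drop the undo entry."""
    for key, backup_key in coord.undo[txn_id].items():
        if backup_key is None:
            store.rows.pop(key, None)      # row did not exist before the transaction
        else:
            store.rows[key] = store.rows[backup_key]
    del coord.undo[txn_id]


def run_transaction(store, coord, txn_id, updates, fail_after=None):
    keys = sorted(updates)                 # step 1 (sorted: assumed lock ordering)
    held = []
    try:
        for key in keys:                   # step 2: lock all rows
            lock = coord.lock_for(key)
            lock.acquire()
            held.append(lock)
        mapping = {}
        for key in keys:                   # step 3: back up old values
            if key in store.rows:
                backup_key = f"backup/{txn_id}/{key}"
                store.rows[backup_key] = store.rows[key]
                mapping[key] = backup_key
            else:
                mapping[key] = None
        coord.undo[txn_id] = mapping       # step 4: persist the mapping
        try:
            for i, key in enumerate(keys): # step 5: apply the update
                if fail_after is not None and i == fail_after:
                    raise RuntimeError("simulated failure")
                store.rows[key] = updates[key]
        except Exception:
            rollback(store, coord, txn_id)
            raise
        del coord.undo[txn_id]             # step 6: clear the undo entry
    finally:
        for lock in reversed(held):        # step 7: unlock
            lock.release()
    # step 8 (lazy deletion of backup rows) is left to a maintenance job
```

Calling `run_transaction(store, coord, "t1", {"a": 10, "b": 20})` updates both rows or neither. The sketch says nothing about readers, who can still observe a half-applied update unless they also take the locks.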
Justification:
- If the transaction fails during steps 1-4, no change has been applied. I can abort the transaction and delete whatever was stored in ZooKeeper and backed up in Cassandra, if anything.
- If the transaction fails during step 5, the information saved in step 3 is used to roll back any changes.
- If the server fails, crashes, or is stolen by the cleaning man, then on restart, before serving any request, I check whether any keys from step 4 are still persisted in ZooKeeper. If so, I use those keys to fetch the backup data stored in step 3 and put that data back where it was, thus rolling back any failed transactions.
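The restart check in the last point could look roughly like the sketch below. It assumes plain dictionaries in place of the real systems: `rows` stands in for Cassandra and `undo_log` for the persistent ZooKeeper entries from step 4, with `None` marking a row that did not exist before the transaction. The function name `recover` and the `backup/...` key naming are hypothetical. It also assumes that no other client touches the affected rows until recovery finishes, which the scheme does not guarantee by itself if the locks vanish when the crashed server's session ends.

```python
def recover(rows, undo_log):
    """Roll back every transaction that still has an undo entry.

    rows:     dict standing in for Cassandra (key -> value)
    undo_log: dict standing in for ZooKeeper (txn_id -> {key: backup_key or None})
    Returns the list of transaction ids that were rolled back.
    """
    recovered = []
    for txn_id in sorted(undo_log):
        for key, backup_key in undo_log[txn_id].items():
            if backup_key is None:
                rows.pop(key, None)          # row was created by the failed transaction
            else:
                rows[key] = rows[backup_key] # restore the saved value
        recovered.append(txn_id)
    # Remove undo entries only after the restores are done, so a crash
    # during recovery simply repeats the same (idempotent) restores.
    for txn_id in recovered:
        del undo_log[txn_id]
    return recovered
```

Because each restore just overwrites a row with its saved value, running `recover` twice gives the same result as running it once, which matters if the server crashes again in the middle of recovery.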
One of my concerns is what would happen if some of the servers are partitioned from the cluster. I have no experience in this area. Does my scheme work at all? And does it still work if a partition happens?
1 Solution
#1
5
You should look into Cages: http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
http://code.google.com/p/cages/