I'm trying to get my head around some not quite so trivial promise/asynchronous use-cases. In an example I'm wrestling with at the moment, I have an array of books returned from a knex query (thenable array) I wish to insert into a database:
books.map(function(book) {
  // Insert into DB
});
Each book item looks like:
var book = {
  title: 'Book title',
  author: 'Author name'
};
However, before I insert each book, I need to retrieve the author's ID from a separate table since this data is normalised. The author may or may not exist, so I need to:
- Check if the author is present in the DB
- If it is, use this ID
- Otherwise, insert the author and use the new ID
However, the above operations are also all asynchronous.
I could simply use a promise inside the original map (fetch and/or insert the ID) as a prerequisite of the insert operation. The problem is that, because everything runs asynchronously, the code may well insert duplicate authors: the initial check-if-author-exists step is decoupled from the insert-a-new-author step.
I can think of a few ways to achieve the above but they all involve splitting up the promise chain and generally seem a bit messy. This seems like the kind of problem that must arise quite commonly. I'm sure I'm missing something fundamental here!
Any tips?
2 Answers
#1
8
Let's assume that you can process each book in parallel. Then everything is quite simple (using only ES6 API):
Promise
  .all(books.map(book => {
    return getAuthor(book.author)
      // if the lookup fails, create the author instead
      .catch(createAuthor.bind(null, book.author))
      .then(author => Object.assign(book, { author: author.id }))
      .then(saveBook);
  }))
  .then(() => console.log('All done'));
The problem is that there is a race condition between looking up an author and creating a new one. Consider the following order of events:
- we try to get author A for book B;
- getting author A fails;
- we request creating author A, but it is not created yet;
- we try to get author A for book C;
- getting author A fails;
- we request creating author A (again!);
- first request completes;
- second request completes;
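This sequence can be reproduced without a real database. The following is a minimal sketch, assuming a hypothetical in-memory `authorsTable` and stand-in helpers `findAuthor` and `insertAuthor` (none of these are knex APIs), each delayed by a timer to imitate a round trip:

```javascript
// Hypothetical in-memory stand-in for the authors table (not a knex API).
const authorsTable = [];

// Resolve after a short delay to imitate a database round trip.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Look an author up by name; rejects when no row matches.
function findAuthor(name) {
  return delay(10).then(() => {
    const row = authorsTable.find((a) => a.name === name);
    if (!row) throw new Error('Author not found: ' + name);
    return row;
  });
}

// Insert a new author row and resolve with it.
function insertAuthor(name) {
  return delay(10).then(() => {
    const row = { id: authorsTable.length + 1, name: name };
    authorsTable.push(row);
    return row;
  });
}

// Naive get-or-create: the lookup and the insert are separate steps.
function naiveGetOrCreate(name) {
  return findAuthor(name).catch(() => insertAuthor(name));
}

// Two books by the same author, processed in parallel.
// Resolves with the number of rows that ended up stored for author 'A'.
function runNaive() {
  return Promise.all([naiveGetOrCreate('A'), naiveGetOrCreate('A')])
    .then(() => authorsTable.filter((a) => a.name === 'A').length);
}
```

With this timing both lookups finish before either insert starts, so `runNaive()` resolves with 2: the same author is stored twice.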
Now we have two instances of A in the author table. This is bad! To solve this problem we can use the traditional approach: locking. We need to keep a table of per-author locks. When we send a creation request, we take the appropriate lock; after the request completes, we release it. All other operations involving the same author need to acquire the lock before doing anything.
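For reference, a per-key lock of this kind could be sketched with promise chaining. This is only an illustration: `locks`, `withLock` and `job` are hypothetical names, and the sketch serialises work inside a single Node process only, not across several processes or servers.

```javascript
// Hypothetical per-key lock table: maps a key to the tail of its promise queue.
const locks = new Map();

// Run `task` only after every earlier task queued under the same key has settled.
function withLock(key, task) {
  const previous = locks.get(key) || Promise.resolve();
  const current = previous.then(() => task());
  // The stored tail never rejects, so a failed task does not block later ones.
  const tail = current.catch(() => {}).then(() => {
    // Drop the entry once nothing else is queued behind this task.
    if (locks.get(key) === tail) locks.delete(key);
  });
  locks.set(key, tail);
  return current;
}

// Demo: two jobs under the same key run one after the other.
const events = [];
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

function job(label) {
  return () => {
    events.push(label + ' start');
    return delay(10).then(() => { events.push(label + ' end'); });
  };
}

function runLocked() {
  return Promise.all([withLock('A', job('first')), withLock('A', job('second'))])
    .then(() => events);
}
```

`runLocked()` resolves with the events in the order first start, first end, second start, second end, since the second job waits for the first to release the key.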
This seems hard, but can be simplified a lot in our case, since we can use our request promises instead of locks:
const authorPromises = {};

function getAuthor(authorName) {
  // reuse the in-flight request for this author, if there is one
  if (authorPromises[authorName]) {
    return authorPromises[authorName];
  }
  const promise = getAuthorFromDatabase(authorName)
    .catch(createAuthor.bind(null, authorName))
    .then(author => {
      // the request is done, so later calls can go to the database again
      delete authorPromises[authorName];
      return author;
    });
  authorPromises[authorName] = promise;
  return promise;
}

Promise
  .all(books.map(book => {
    return getAuthor(book.author)
      .then(author => Object.assign(book, { author: author.id }))
      .then(saveBook);
  }))
  .then(() => console.log('All done'));
That's it! Now, if a request for an author is in flight, the same promise will be returned.
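To check the effect, here is a minimal runnable sketch with in-memory stand-ins for `getAuthorFromDatabase` and `createAuthor` (both hypothetical here, not knex APIs). It differs from the version above in two ways that are my own assumptions rather than part of the original answer: it uses a `Map` instead of a plain object, and it clears the cache entry on failure as well as on success, so a rejected promise is not kept around.

```javascript
// Hypothetical in-memory stand-ins for the database helpers (not knex APIs).
const authorsTable = [];
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

function getAuthorFromDatabase(name) {
  return delay(10).then(() => {
    const row = authorsTable.find((a) => a.name === name);
    if (!row) throw new Error('Author not found: ' + name);
    return row;
  });
}

function createAuthor(name) {
  return delay(10).then(() => {
    const row = { id: authorsTable.length + 1, name: name };
    authorsTable.push(row);
    return row;
  });
}

// In-flight requests keyed by author name. A Map avoids clashes with
// inherited object properties such as "constructor".
const authorPromises = new Map();

function getAuthor(name) {
  if (authorPromises.has(name)) return authorPromises.get(name);
  const forget = () => authorPromises.delete(name);
  const promise = getAuthorFromDatabase(name).catch(() => createAuthor(name));
  // Clear the entry on success and on failure alike.
  promise.then(forget, forget);
  authorPromises.set(name, promise);
  return promise;
}

// Two books by the same author, processed in parallel.
function runShared() {
  const books = [
    { title: 'Book B', author: 'A' },
    { title: 'Book C', author: 'A' },
  ];
  return Promise.all(books.map((book) => getAuthor(book.author)))
    .then((authors) => ({
      ids: authors.map((a) => a.id),
      rows: authorsTable.length,
    }));
}
```

Here `runShared()` resolves with `ids` equal to `[1, 1]` and `rows` equal to 1: both books share the one request, so the author is created once.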
#2
3
Here is how I would implement it. I think some important requirements are:
- No duplicate authors are ever created (this should be a constraint in the database itself too).
- If the server stops responding midway, no inconsistent data is inserted.
- Possibility to enter multiple authors.
- Don't make n queries to the database for n things, avoiding the classic "n+1" problem.
I'd use a transaction to make sure that updates are atomic: if the operation is running and the client dies in the middle, no authors are created without books. It's also important that a temporary failure does not cause a memory leak (as in the answer with the authors map, which keeps failed promises).
// Promise is Bluebird here (Promise.coroutine is not part of native promises)
knex.transaction(Promise.coroutine(function* (t) {
  // get the books, then collect the distinct author names
  var authors = Array.from(new Set((yield books).map(x => x.author)));
  // name should be indexed, this is a single query
  var inDb = yield t("authors").select("id", "name").whereIn("name", authors);
  var known = inDb.map(row => row.name);
  var notIn = authors.filter(name => !known.includes(name));
  // now, perform a single multi-row insert on the transaction
  // I'm assuming PostgreSQL here (returning IDs), this is a bit different for SQLite
  // (the column is assumed to be called "name", matching the query above)
  var ids = yield t("authors").insert(notIn.map(name => ({ name: name }))).returning("id");
  // update books _inside the transaction_ now with the IDs array
})).then(() => console.log("All done!"));
This has the advantage of making only a fixed number of queries, and it is likely to be safer and perform better. Moreover, your database is never left in an inconsistent state (although you may have to retry the operation when multiple instances run concurrently).
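The set logic in the middle of that transaction is easy to get wrong, so here it is in isolation as a plain JavaScript sketch with no database involved. `planAuthorInsert` and `attachAuthorIds` are hypothetical helper names, and author rows are assumed to have the shape `{ id, name }`.

```javascript
// Split the author names found on the books into authors that already
// exist and distinct names that still need inserting.
function planAuthorInsert(books, existingRows) {
  const wanted = Array.from(new Set(books.map((book) => book.author)));
  const known = new Map(existingRows.map((row) => [row.name, row.id]));
  const missing = wanted.filter((name) => !known.has(name));
  return { known, missing };
}

// Replace each book's author name with its ID once every author has a row.
function attachAuthorIds(books, authorRows) {
  const idByName = new Map(authorRows.map((row) => [row.name, row.id]));
  return books.map((book) =>
    Object.assign({}, book, { author: idByName.get(book.author) }));
}

// Usage: author 'E' already exists, author 'A' appears on two books.
const books = [
  { title: 'Book B', author: 'A' },
  { title: 'Book C', author: 'A' },
  { title: 'Book D', author: 'E' },
];
const existing = [{ id: 7, name: 'E' }];
const plan = planAuthorInsert(books, existing);

// Pretend the multi-row insert returned these rows (the IDs are made up).
const inserted = plan.missing.map((name, i) => ({ id: 100 + i, name: name }));
const ready = attachAuthorIds(books, existing.concat(inserted));
```

Deduplicating the names first matters: without it, author 'A' would appear twice in the multi-row insert. In this example `plan.missing` contains only 'A', and the books in `ready` carry the author IDs 100, 100 and 7.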