Writing a database with async I/O

Time: 2022-12-31 09:40:51

I recently came across libuv, the low-level library that lets Node.js do its async magic. This got me thinking, and I would like clarification along these lines:

  1. Node.js has async I/O calls. However, if I'm calling APIs on a (remote) database, the actual read/write in the db will be synchronous, but node does not have to wait for it. Is it possible, perhaps, to make the db itself write to disk asynchronously? Are there any databases that use libuv for actual async I/O?

  2. JavaScript is famously single-threaded. I understand that the Node.js runtime need not be - I can fire up 4 instances if I have 4 CPU cores. But if I used libuv to write yet another web framework in a language that supports threads, wouldn't it have all the goodness of async I/O AND multithreading? Does something like that already exist?

1 Answer

#1



You're mixing up two concepts. The fact that you can wait asynchronously (via epoll/kqueue/libuv...) while doing a query to a service doesn't mean that your query is non-blocking on the other side, and vice versa. Nor does it mean that, just because things "feel" async inside your event loop, they truly are.

Let's go back to what an event loop is and how Node.js does its magic. I feel that's a good place to start the story.

The visible part of an event loop is a change in the way code is written - from mostly synchronous to mostly asynchronous. The invisible part is that as much of this asynchronous work as possible is handed off to an event loop which, in the background, checks for things to do - I/O, timers, etc. It isn't a new idea, and it does its job (providing concurrency) really well.
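As a small sketch of that split in Node.js itself: the synchronous part of the script always runs to completion first, and only then does the loop pick up the queued callback.

```javascript
const order = [];

// The callback is only *queued* here; the event loop will run it after
// the current synchronous code has finished.
setTimeout(() => {
  order.push('timer');
  console.log(order.join(' -> ')); // sync -> timer
}, 0);

// This line runs first, even though it appears after the setTimeout call.
order.push('sync');
```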

libuv's documentation is actually very descriptive on this point. It describes the design choices the authors made, and from there comes this flowchart:

[Flowchart: the libuv event loop, from the libuv design overview]

Note that nowhere do they claim to have made anything truly async - because they haven't. The underlying system calls remain synchronous; it just feels like they aren't. That is the key takeaway.

Regarding disk I/O in databases: I gave a talk in The Hague a while back about this, and, quite frankly, most of the crucial I/O is blocking. For instance, you can't go "Hey, I'll update the disk snapshot and the append-only txlog at the same time!" - because if one of them fails, you've got a serious, serious rollback issue and possibly unknown state.

Regarding question 2, I'd give code examples, but I'm not sure which languages you are familiar with. The bottom line is: the moment something crosses a thread boundary, things become hell. A very naive example would be this - suppose your event loop has two timers as follows:

  • Timer 1, firing every 0.5s, increments a given state variable A.
  • Timer 2, firing every time somebody provides user input, multiplies the state variable by 2.

Suppose you're running on a single thread. Even though your event loop feels asynchronous, it is completely sequential - timer 1 will never run while timer 2 is running.

Now add a second thread and make timer 2 run from it. Without a guard in place, there is a fair chance that something, somewhere, will go very wrong.

To multiply a value by 2 the naive way (without taking advantage of the CPU instructions dedicated to this kind of thing), one has to retrieve the variable, multiply it by 2, then store it back in memory.
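Written out in JavaScript (a sketch; the function names are made up), such a three-stage read-modify-write looks like this:

```javascript
// Shared state, as in the two timers above.
let A = 1;

// Naive doubling, with its three stages made explicit.
function double() {
  const local = A;          // 1. retrieve the variable
  const result = local * 2; // 2. compute
  A = result;               // 3. store it back in memory
}

// Incrementing goes through the same three stages.
function increment() {
  const local = A;
  const result = local + 1;
  A = result;
}

increment(); // A: 1 -> 2
double();    // A: 2 -> 4
```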

The same goes for incrementing: it is also a three-stage process (again, taking the naive approach).

Once those two timers collide, you can get some crazy race conditions like the following:

THREAD 1          | THREAD 2
 <- A=1           |
 Local: A=1+1=2   |  <- A=1
                  |  Local: A=1*2=2
 A=2 ->           |  A=2 ->

Thread 2 started running halfway through thread 1's computation, retrieved a stale value of the state variable (thread 1 had not stored its update yet), and multiplied it by 2. Depending on which timer ran first, you should have ended up with 3 or 4; in reality you ended up with 2, and thread 1's increment was simply lost.
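The interleaving in that table can be simulated deterministically in plain JavaScript (single-threaded, so this is only a simulation of what two real threads could do to each other):

```javascript
let A = 1;

// "Thread 1" retrieves A...
const t1local = A;   // t1local = 1
// ...but before it stores its result, "thread 2" also retrieves A,
// seeing the stale value.
const t2local = A;   // t2local = 1
// Thread 1 stores its increment.
A = t1local + 1;     // A = 2
// Thread 2 stores its doubled stale value, clobbering the increment.
A = t2local * 2;     // A = 2

console.log(A); // 2, not the 3 or 4 that any serial ordering would give
```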

To protect against this, there is a whole bunch of methods and tools. Most processor architectures these days have atomic instructions (Intel's, for instance), and developers can leverage those when they know where they need them. On top of these you have a whole toolbox - mutexes, read/write locks, semaphores, etc. - to reduce or remove those issues, at a cost, provided you know where you'll need them.
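In JavaScript-land, the closest equivalent to those atomic instructions is `Atomics` on a `SharedArrayBuffer` (a sketch; in a real program the buffer would be shared with worker threads):

```javascript
// An Int32Array view over shared memory that worker threads could also map.
const shared = new Int32Array(new SharedArrayBuffer(4));
shared[0] = 1;

// Atomic read-modify-write: retrieve, add, and store happen as one
// indivisible step, so no other thread can interleave between the stages.
Atomics.add(shared, 0, 1);

const seen = Atomics.load(shared, 0);
console.log(seen); // 2
```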

Needless to say, it is far from trivial to generalize this.

