In scatter/gather I/O (i.e. readv and writev), Linux reads into multiple buffers and writes from multiple buffers in a single call.
Say I have a vector of 3 buffers. I can use readv, OR I can use a single buffer whose size is the combined size of the 3 buffers and do fread.
Hence, I am confused: in which cases should scatter/gather be used, and when should a single large buffer be used?
1 Answer
The main conveniences offered by readv and writev are:
- They allow working with non-contiguous blocks of data. That is, the buffers need not be part of one array and can be allocated separately.
- The I/O is 'atomic'. That is, if you do a writev, all the elements in the vector are written in one contiguous operation, and writes done by other processes will not occur in between them (pipes are an exception for transfers larger than PIPE_BUF; see pipe(7)).
For example, say your data is naturally segmented and comes from different sources:
struct foo *my_foo;
struct bar *my_bar;
struct baz *my_baz;
my_foo = get_my_foo();
my_bar = get_my_bar();
my_baz = get_my_baz();
Now, the three 'buffers' do not form one big contiguous block. But you want to write them contiguously into a file, for whatever reason (say, they are fields in a file header for some file format).
If you use write, you have to choose between:
- Copying them into one block of memory using, say, memcpy (overhead), followed by a single write call. Then the write will be atomic.
- Making three separate calls to write (overhead). Also, write calls from other processes can be interleaved between these writes (not atomic).
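To make the cost of the first option concrete, here is a minimal sketch of the copy-then-write approach. The helper name write_three_copied and its parameters are made up for illustration and are not part of any library.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy three separate blocks into one temporary
 * buffer, then issue a single write(). Returns write()'s result, or -1
 * if the temporary buffer cannot be allocated. */
ssize_t write_three_copied(int fd,
                           const void *a, size_t a_len,
                           const void *b, size_t b_len,
                           const void *c, size_t c_len)
{
    size_t total = a_len + b_len + c_len;
    char *tmp = malloc(total);
    if (tmp == NULL)
        return -1;

    /* These extra copies are the overhead that writev() avoids. */
    memcpy(tmp, a, a_len);
    memcpy(tmp + a_len, b, b_len);
    memcpy(tmp + a_len + b_len, c, c_len);

    ssize_t n = write(fd, tmp, total);  /* one system call for all three */
    free(tmp);
    return n;
}
```

You pay for an allocation plus one memcpy per block on every call, just to get the data into the shape that write expects.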
If you use writev instead, it's all good:
- You make exactly one system call, and need no memcpy to build a single buffer from the three.
- Also, the three buffers are written atomically, as one block write. That is, if other processes also write, their writes will not land between the writes of the three vectors.
So you would do something like:
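/* One iovec entry per separately allocated block, in output order. */
struct iovec iov[3];
iov[0].iov_base = my_foo;
iov[0].iov_len = sizeof (struct foo);
iov[1].iov_base = my_bar;
iov[1].iov_len = sizeof (struct bar);
iov[2].iov_base = my_baz;
iov[2].iov_len = sizeof (struct baz);
/* A single system call gathers all three blocks; returns -1 on error. */
ssize_t bytes_written = writev (fd, iov, 3);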
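For something you can compile, here is a rough sketch that wraps the same idea in two small helpers, one for the gather side (writev) and one for the scatter side (readv). The names gather_write and scatter_read are invented for this example, and it uses two buffers instead of three to keep it short.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write two separate blocks with one writev() call.
 * Returns the number of bytes written or -1 on error. The count can be
 * smaller than a_len + b_len, so callers should compare it to the total. */
ssize_t gather_write(int fd, const void *a, size_t a_len,
                     const void *b, size_t b_len)
{
    struct iovec iov[2];
    iov[0].iov_base = (void *)a;  /* iov_base is a plain void *, hence the cast */
    iov[0].iov_len = a_len;
    iov[1].iov_base = (void *)b;
    iov[1].iov_len = b_len;
    return writev(fd, iov, 2);
}

/* Hypothetical helper: read into two separate buffers with one readv()
 * call. The first buffer is filled completely before the second is used. */
ssize_t scatter_read(int fd, void *a, size_t a_len, void *b, size_t b_len)
{
    struct iovec iov[2];
    iov[0].iov_base = a;
    iov[0].iov_len = a_len;
    iov[1].iov_base = b;
    iov[1].iov_len = b_len;
    return readv(fd, iov, 2);
}
```

Calling gather_write(fd, hdr, hdr_len, body, body_len) puts the header and body next to each other in the file without ever copying them into a shared buffer, and scatter_read splits them back into separate buffers on the way in.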
Sources:
- http://pubs.opengroup.org/onlinepubs/009604499/functions/writev.html
- http://linux.die.net/man/2/readv