What is the rationale for fread/fwrite taking size and count as arguments?

Time: 2022-03-14 15:30:55

We had a discussion here at work about why fread and fwrite take a size per member and a count, and return the number of members read/written, rather than just taking a buffer and a size. The only use we could come up with is reading/writing an array of structs whose size isn't a multiple of the platform alignment and which have therefore been padded, but that can't be so common as to warrant this design choice.

From FREAD(3):

The function fread() reads nmemb elements of data, each size bytes long, from the stream pointed to by stream, storing them at the location given by ptr.

The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.

fread() and fwrite() return the number of items successfully read or written (i.e., not the number of characters). If an error occurs, or the end-of-file is reached, the return value is a short item count (or zero).
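
To make the "short item count" concrete, here is a minimal sketch of how a caller might tell end-of-file apart from an error after a short read. The helper name read_ints and its out-parameters are my own assumptions for illustration, not part of the man page; feof() and ferror() are the standard way to disambiguate.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `count` ints from `stream` into `out`.
 * Returns the number of complete ints read. On a short count, *hit_eof and
 * *hit_error report why the read stopped. */
size_t read_ints(FILE *stream, int *out, size_t count,
                 int *hit_eof, int *hit_error)
{
    size_t got = fread(out, sizeof out[0], count, stream);

    *hit_eof = 0;
    *hit_error = 0;
    if (got < count) {
        /* fread() itself does not say which one happened. */
        *hit_eof = feof(stream) != 0;
        *hit_error = ferror(stream) != 0;
    }
    return got;
}
```

Asking for 5 ints from a stream that only holds 3 should return 3 with the end-of-file flag set and the error flag clear.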

6 answers

#1


18  

It's based on how fread is implemented.

The Single UNIX Specification says:

For each object, size calls shall be made to the fgetc() function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object.
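
Read literally, that wording suggests a nested loop over elements and bytes. The following is only a sketch of the described behavior, under the assumption that a partial element is simply not counted; sketch_fread is a made-up name, and real libraries copy from their internal buffer in bulk rather than calling fgetc() per byte.

```c
#include <stdio.h>

/* Illustrative model of the quoted wording, not a real library's code:
 * for each of nmemb objects, make `size` calls to fgetc() and store the
 * bytes, in order, in the unsigned char array overlaying the object. */
size_t sketch_fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    unsigned char *dst = ptr;
    size_t items = 0;

    if (size == 0 || nmemb == 0)
        return 0;

    for (; items < nmemb; items++) {
        for (size_t i = 0; i < size; i++) {
            int c = fgetc(stream);
            if (c == EOF)
                return items;   /* the partial element is not counted */
            *dst++ = (unsigned char)c;
        }
    }
    return items;
}
```

Note that the bytes of a partial element are still stored and consumed from the stream even though they do not show up in the return value.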

fgetc also has this note:

Since fgetc() operates on bytes, reading a character consisting of multiple bytes (or "a multi-byte character") may require multiple calls to fgetc().

Of course, this predates fancy variable-byte character encodings like UTF-8.

The SUS notes that this is actually taken from the ISO C documents.

#2


52  

The difference between fread(buf, 1000, 1, stream) and fread(buf, 1, 1000, stream) is that in the first case you get either one whole chunk of 1000 bytes or nothing (if the file is smaller), while in the second case you get everything in the file, up to 1000 bytes.
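
A small experiment can show the two call shapes side by side. This is a sketch under my own assumptions: try_fread is a hypothetical helper that fills a temporary file with file_len bytes, rewinds it, and returns whatever fread() reports for the given size/nmemb pair.

```c
#include <stdio.h>

enum { BUF_SIZE = 1000 };

/* Hypothetical helper: returns fread()'s result for a temp file holding
 * `file_len` bytes, or (size_t)-1 if the experiment could not be set up. */
size_t try_fread(size_t file_len, size_t size, size_t nmemb)
{
    char buf[BUF_SIZE];
    size_t result;
    FILE *f;

    if (size != 0 && nmemb > BUF_SIZE / size)
        return (size_t)-1;      /* request would not fit in buf */

    f = tmpfile();
    if (f == NULL)
        return (size_t)-1;

    for (size_t i = 0; i < file_len; i++) {
        if (fputc('x', f) == EOF) {
            fclose(f);
            return (size_t)-1;
        }
    }
    rewind(f);

    result = fread(buf, size, nmemb, f);
    fclose(f);
    return result;
}
```

With a 300-byte file, try_fread(300, 1000, 1) should report 0 items, while try_fread(300, 1, 1000) should report 300.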

#3


12  

This is pure speculation, but back in the day (and some are still around) many filesystems were not simple byte streams on a hard drive.

Many file systems were record-based, so to serve such filesystems efficiently you had to specify the number of items ("records"), allowing fwrite/fread to operate on the storage as records, not just as byte streams.

#4


9  

Here, let me fix those functions:

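#include <stdio.h>

/* Read up to size bytes; returns the number of bytes actually read. */
size_t fread_buf(void *ptr, size_t size, FILE *stream)
{
    return fread(ptr, 1, size, stream);
}

/* Write size bytes; returns the number of bytes actually written. */
size_t fwrite_buf(void const *ptr, size_t size, FILE *stream)
{
    return fwrite(ptr, 1, size, stream);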
}
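
As a usage sketch for the byte-count wrappers above, here is how a simple stream copy might look. The two wrappers are repeated (as static functions) so the block stands alone; copy_stream and its 256-byte buffer are my own assumptions, not something from the answer.

```c
#include <stdio.h>

static size_t fread_buf(void *ptr, size_t size, FILE *stream)
{
    return fread(ptr, 1, size, stream);
}

static size_t fwrite_buf(void const *ptr, size_t size, FILE *stream)
{
    return fwrite(ptr, 1, size, stream);
}

/* Hypothetical helper: copies everything from `in` to `out`.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[256];
    long total = 0;
    size_t n;

    while ((n = fread_buf(buf, sizeof buf, in)) > 0) {
        if (fwrite_buf(buf, n, out) != n)
            return -1L;         /* short write */
        total += (long)n;
    }
    return ferror(in) ? -1L : total;
}
```

Because size is 1, the return value of each call is an exact byte count, so a short final chunk is handled without any special casing.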

As for a rationale for the parameters to fread()/fwrite(), I lost my copy of K&R long ago, so I can only guess. I think a likely answer is that Kernighan and Ritchie may simply have thought that binary I/O would most naturally be done on arrays of objects. They may also have thought that block I/O would be faster or easier to implement on some architectures.

Even though the C standard specifies the behavior of fread() and fwrite() in terms of fgetc() and fputc(), remember that the standard came into existence long after C was defined by K&R, and that things specified in the standard might not have been in the original designers' ideas. It's even possible that things said in K&R's "The C Programming Language" differ from how things were when the language was first being designed.

Finally, here's what P.J. Plauger has to say about fread() in "The Standard C Library":

If the size (second) argument is greater than one, you cannot determine whether the function also read up to size - 1 additional characters beyond what it reports. As a rule, you are better off calling the function as fread(buf, 1, size * n, stream); instead of fread(buf, size, n, stream);

Basically, he's saying that fread()'s interface is broken. For fwrite() he notes that "Write errors are generally rare, so this is not a major shortcoming", a statement I wouldn't agree with.
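
Plauger's point can be observed by comparing the item count with the stream position after a short read. This is a sketch under my own assumptions (the helper name and the 6-byte test file are made up): it asks for two 4-byte elements from a file holding only 6 bytes.

```c
#include <stdio.h>

/* Hypothetical helper: tries to read two 4-byte elements from a 6-byte
 * temp file. Stores fread()'s item count in *items and returns the stream
 * position afterwards, or -1 if the experiment could not be set up. */
long position_after_short_read(size_t *items)
{
    char buf[8];
    long pos;
    FILE *f = tmpfile();

    if (f == NULL)
        return -1L;
    if (fwrite("abcdef", 1, 6, f) != 6) {
        fclose(f);
        return -1L;
    }
    rewind(f);

    *items = fread(buf, 4, 2, f);   /* only one complete element exists */
    pos = ftell(f);                 /* but the trailing bytes were consumed */
    fclose(f);
    return pos;
}
```

On a conforming implementation I would expect the count to be 1 while the position has advanced to 6: the two trailing bytes were read but are invisible in the return value, which is exactly the ambiguity the quote describes.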

#5


3  

It likely goes back to the way file I/O was implemented. Back in the day, it might have been faster to read/write files in blocks than to write everything at once.

#6


1  

I think it is because C lacks function overloading. If it had overloading, size would be redundant. But in C the function can't determine the size of an array element, so you have to specify it.

Consider this:

int intArray[10];
fwrite(intArray, sizeof(int), 10, fd);

If fwrite accepted a number of bytes, you could write the following:

int intArray[10];
fwrite(intArray, sizeof(int)*10, fd);

But it is just inefficient. You will have sizeof(int) times more system calls.

Another point to take into consideration is that you usually don't want part of an array element to be written to a file. You want the whole integer or nothing. fwrite returns the number of elements successfully written. So if you discovered that only the 2 low bytes of an element had been written, what would you do?

On some systems (due to alignment) you can't access one byte of an integer without creating a copy and shifting.
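
The "whole element or nothing" view of the return value might be used like this. It is only a sketch: write_ints and its out-parameter are hypothetical names of mine, and the stream is assumed to be an already opened FILE *.

```c
#include <stdio.h>

/* Hypothetical helper: writes `count` ints to `stream`.
 * Stores the number of complete elements written in *written and returns
 * 0 if every element was written, -1 otherwise. */
int write_ints(FILE *stream, const int *values, size_t count,
               size_t *written)
{
    *written = fwrite(values, sizeof values[0], count, stream);
    return (*written == count) ? 0 : -1;
}
```

The caller compares element counts only and never has to reason about a half-written int, which seems to be the convenience this answer is describing.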
