How to use the new std::byte type in places where old-style unsigned char is needed?

Date: 2022-10-11 06:51:26

std::byte is a new type in C++17, defined as enum class byte : unsigned char. This makes it impossible to use without an explicit conversion. So I made an alias for a vector of this type to represent a byte array:

using Bytes = std::vector<std::byte>;

However, it cannot be used in old-style code: functions that expect a std::vector<unsigned char> parameter fail to compile, because this type cannot be converted to the old std::vector<unsigned char> type. For example, with the zipper library:

/resourcecache/pakfile.cpp: In member function 'utils::Bytes resourcecache::PakFile::readFile(const string&)':
/resourcecache/pakfile.cpp:48:52: error: no matching function for call to 'zipper::Unzipper::extractEntryToMemory(const string&, utils::Bytes&)'
     unzipper_->extractEntryToMemory(fileName, bytes);
                                                    ^
In file included from /resourcecache/pakfile.hpp:13:0,
                 from /resourcecache/pakfile.cpp:1:
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note: candidate: bool zipper::Unzipper::extractEntryToMemory(const string&, std::vector<unsigned char>&)
     bool extractEntryToMemory(const std::string& name, std::vector<unsigned char>& vec);
          ^~~~~~~~~~~~~~~~~~~~
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note:   no known conversion for argument 2 from 'utils::Bytes {aka std::vector<std::byte>}' to 'std::vector<unsigned char>&'

I have tried naive casts, but they do not help either. So, if std::byte is designed to be useful, is it actually usable in old contexts? The only method I see is to use std::transform to fill the new vector of bytes in these places:

utils::Bytes bytes;
std::vector<unsigned char> rawBytes;
unzipper_->extractEntryToMemory(fileName, rawBytes);
std::transform(rawBytes.cbegin(),
               rawBytes.cend(),
               std::back_inserter(bytes),
               [](const unsigned char c) {
                   return static_cast<std::byte>(c);
               });
return bytes;

Which:

  1. Is ugly.
  2. Takes a lot of useless lines (it can be rewritten, but it still has to be written first :)).
  3. Copies the memory instead of just using the already created chunk of rawBytes.

So, how can I use it in old places?

2 answers

#1


26  

You're missing the point of why std::byte was invented in the first place: to hold a raw byte in memory without the assumption that it's a character. You can see that on cppreference:

Like char and unsigned char, it can be used to access raw memory occupied by other objects (object representation), but unlike those types, it is not a character type and is not an arithmetic type.

Remember that C++ is a strongly typed language in the interest of safety (so implicit conversions are restricted in many cases). That means that if an implicit conversion from byte to char were possible, it would defeat the purpose.

So, to answer your question: to use it, you have to cast whenever you assign to it or read from it:

std::byte x = (std::byte)10;
std::byte y = (std::byte)'a';
std::cout << (int)x << std::endl;
std::cout << (char)y << std::endl;

Anything else will not work, by design! So that transform is ugly, agreed, but if you want to store chars, then use char. Don't use bytes unless you want to store raw memory that should not be interpreted as char by default.

And also the last part of your question is generally incorrect: You don't have to make copies, because you don't have to copy the whole vector. If you temporarily need to read a byte as a char, simply static_cast it at the place where you need to use it as a char. It costs nothing, and is type-safe.
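
As a minimal sketch of casting at the point of use: the helper names countChar and firstValue below are hypothetical and only for illustration. The vector is read in place and never converted or copied; std::to_integer is the standard C++17 way to get a numeric value out of a std::byte.

```cpp
#include <cassert>
#include <cstddef>   // std::byte, std::to_integer, std::size_t
#include <vector>

// Hypothetical helper: counts how often the character `c` occurs,
// casting each byte to char only where the comparison happens.
std::size_t countChar(const std::vector<std::byte>& bytes, char c) {
    std::size_t n = 0;
    for (std::byte b : bytes) {
        if (static_cast<char>(b) == c) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: numeric value of the first byte.
// The caller must pass a non-empty vector.
int firstValue(const std::vector<std::byte>& bytes) {
    assert(!bytes.empty());
    return std::to_integer<int>(bytes.front());
}
```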

As to your question in the comment about casting std::vector<char> to std::vector<std::byte>, you can't do that. But you can use the raw array underneath. So, the following has a type (char*):

std::vector<std::byte> bytes;
//fill it...
char* charBytes = reinterpret_cast<char*>(bytes.data()); // static_cast between unrelated pointer types does not compile

This has type char*, which is a pointer to the first element of your array, and can be dereferenced without copying, as follows:

std::cout << charBytes[5] << std::endl; //6th element of the vector as char

You get the size from bytes.size(). This is valid, since std::vector is contiguous in memory. You generally can't do this with other std containers (deque, list, etc.).

While this is valid, it removes part of the safety from the equation; keep that in mind. If you need char, don't use byte.
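
Here is a rough sketch of handing the vector's storage to old-style code without copying. legacyChecksum is a hypothetical stand-in for a pointer-and-size API (it is not part of zipper), and checksum is an illustrative wrapper name.

```cpp
#include <cstddef>   // std::byte, std::size_t
#include <vector>

// Hypothetical legacy-style function: sums `size` bytes read
// through an unsigned char pointer.
unsigned long legacyChecksum(const unsigned char* data, std::size_t size) {
    unsigned long sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += data[i];
    }
    return sum;
}

// Passes the vector's contiguous storage to the legacy function.
// No element is copied; only the pointer type is reinterpreted.
unsigned long checksum(const std::vector<std::byte>& bytes) {
    // data() is also safe to call on an empty vector, unlike &bytes.front().
    const auto* raw = reinterpret_cast<const unsigned char*>(bytes.data());
    return legacyChecksum(raw, bytes.size());
}
```

This only helps with interfaces that take a pointer and a size. A function that takes std::vector<unsigned char>&, like the one in the question, still needs a vector of exactly that type.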

#2


0  

If you want something that behaves like a byte in the way you'd probably expect, but is named distinctly from unsigned char, use uint8_t from stdint.h. For almost all implementations this will probably be a

typedef unsigned char uint8_t;

and again an unsigned char under the hood - but who cares? You just want to emphasize "This is not a character type". Just don't expect to be able to have two overloads of a function, one for unsigned char and one for uint8_t. But if you try, the compiler will rub your nose in it anyway...
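
A small sketch of what this buys you, assuming an implementation where uint8_t really is a typedef of unsigned char (common, but not guaranteed by the standard). fillLegacy is a hypothetical stand-in for a function taking std::vector<unsigned char>&, not the zipper API, and readAll is an illustrative name.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Fails to compile on an implementation where the assumption does not hold.
static_assert(std::is_same_v<std::uint8_t, unsigned char>,
              "this sketch assumes uint8_t is a typedef of unsigned char");

// Hypothetical legacy function that fills a vector of unsigned char.
bool fillLegacy(std::vector<unsigned char>& out) {
    out.assign({0x50, 0x4B});
    return true;
}

// Bytes is the very same type as std::vector<unsigned char> here,
// so it binds to the legacy parameter with no cast and no copy.
Bytes readAll() {
    Bytes bytes;
    fillLegacy(bytes);
    return bytes;
}
```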
