C: What happens with x = ~x if x is of type char?

Time: 2021-01-26 16:07:21

If we have the following code:

char x = -1;
x = ~x;

On an x86 platform with the MSVC compiler (which only partly supports C99), what happens in detail when this code runs?

To my knowledge, the following happens (please correct me if I am wrong):

  • x is assigned the value -1, which is represented by the bit pattern 0xff since a char is represented by one byte.
  • The ~ operator promotes x to an int, that is, it internally works with the bit pattern 0xffffffff.
  • The ~ operator's result is 0x00000000 (of type int).
  • To perform the assignment, the integer promotions apply (principally). Since in our case the operand on the right hand side is an int, no conversion occurs. The operand on the left hand side is converted to int. The assignment's result is 0x00000000.
  • As a side effect, the left hand side of the assignment is assigned the value 0x00000000. Since x is of type char, there is another implicit conversion, which converts 0x00000000 to 0x00.

So many things happen here that I find it somewhat confusing. In particular: is my understanding of the last implicit conversion (from int to char) correct? What would happen if the assignment's result could not be stored in a char?

3 answers

#1


7  

Indeed, ~x has type int.

The conversion back to char is well-defined if char is unsigned: the value is reduced modulo UCHAR_MAX + 1. It is also well-defined, of course, if the value is in the range supported by char.

If char is signed and the value is out of range, then the conversion of ~x to char is implementation-defined, with the possibility that an implementation-defined signal is raised.

In your case, you have a platform with a 2's complement int and a signed 2's complement char (MSVC's default), so ~x is observed as 0.

Note that MSVC doesn't fully support any C standard, and neither does it claim to.

#2


4  

You are almost correct, but you are overlooking that char has implementation-defined signedness. It can be either signed or unsigned, depending on the compiler.

In either case, the bit pattern for an 8-bit 2's complement char is indeed 0xFF, regardless of its signedness. If char is signed, integer promotion preserves the sign and you still have the value -1, bit pattern 0xFFFFFFFF with a 32-bit int. But if char is unsigned, -1 is converted to 255 upon assignment, and integer promotion gives 255 (0x000000FF). So you'd get a different result.

Regarding integer promotion with ~: it has only one operand, on its right, and that operand is promoted.

Finally, you assign the result back to char, and the outcome again depends on signedness. The assignment implicitly converts the int value to char. If char is unsigned, the value is reduced modulo 256 (for an 8-bit char); if char is signed and the value is out of range, the result is implementation-defined. Most likely you get the least significant byte of the int.


From this we can learn:

  • Never use char for storing integer values or for arithmetic; use it only for storing characters. For small integers, use uint8_t instead.
  • Never perform bitwise arithmetic on operands that are potentially signed, or that were silently made signed through implicit promotion.
  • The ~ operator is particularly dangerous unless the operand is unsigned int or a larger unsigned type.

#3


1  

To my knowledge, the following happens (please correct me if I am wrong):

x is assigned the value -1, which is represented by the bit pattern 0xff since a char is represented by one byte.

1 is an integer constant of type int. The unary - negates it to -1, which is still an int. That -1 is then assigned to the char x. If char is signed, x takes on the value -1. If char is unsigned, x takes on the value CHAR_MAX, which is then also UCHAR_MAX. The "bit pattern 0xff" is not relevant here yet.

The ~ operator promotes x to an int, that is, it internally works with the bit pattern 0xffffffff.

x is promoted to int (or to unsigned on rare machines where CHAR_MAX == UINT_MAX; we will ignore that). An int is at least 16 bits wide. The value -1, when encoded in the overwhelmingly common 2's complement, is a pattern of all 1 bits (other encodings are possible; we will ignore that too). If x has the value UCHAR_MAX, then the promoted x has the bit pattern 00...00 1111 1111, assuming an 8-bit char. Other widths are possible, which is another thing we will ignore.

The ~ operator's result is 0x00000000 (of type int).

Yes, if char is signed. If char is unsigned, x was promoted to 00...00 1111 1111, so the result is 11...11 0000 0000 (still of type int).

To perform the assignment, the integer promotions apply (principally). Since in our case the operand on the right hand side is an int, no conversion occurs. The operand on the left hand side is converted to int. The assignment's result is 0x00000000.

No integer promotions occur here because of the assignment. The promotions already occurred because of ~. A type change does occur, since an int is assigned to a char, but that is a conversion, not a promotion. The result is of type char. As part of the narrowing, the value 0 runs into no range issues and yields the value 0 with type char. The value 11...11 0000 0000 only arises when char is unsigned; converting it to an unsigned type is well-defined (reduction modulo UCHAR_MAX + 1) and also yields the value 0, with type char.

Had the code been (x = ~x) + 0, the char result of (x = ~x) would have been promoted to int before the addition.

As a side effect, the left hand side of the assignment is assigned the value 0x00000000. Since x is of type char, there is another implicit conversion, which converts 0x00000000 to 0x00.

Addressed above.

What would happen if the assignment's result could not be stored in a char?

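If char is signed and the value is out of range, which value is stored is implementation-defined, and the implementation may (rarely) raise an implementation-defined signal instead. If char is unsigned, the value wraps modulo UCHAR_MAX + 1.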


Bit masking and manipulation is best handled using unsigned types and math.
