How do I convert a signed integer to the corresponding unsigned integer in C?

Date: 2021-03-22 07:16:49

I'd like to define a C macro

#define TO_UNSIGNED(x) (...)

which takes a signed integer x (signed char, short, int, long, long long, or anything else, even something longer than a long long) and converts x to the corresponding unsigned integer type of the same size.

It's OK to assume that signed integers use the two's complement representation. So to convert any value (positive or negative), its two's complement binary representation should be taken, and that should be interpreted as an unsigned integer of the same size.
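To make the requested semantics concrete: C defines conversion to an unsigned type modulo 2^N (N being the width of the target type), which on a two's complement machine gives the same result as reinterpreting the bit pattern. The following is only a minimal sketch of that rule. The helper name demo_conversion is made up for illustration and is not part of the question.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from the question): checks that casting a
 * negative value to the unsigned type of the same width wraps modulo 2^N. */
static void demo_conversion(void)
{
    signed char c = -1;
    int i = -4;
    long long ll = -1;

    assert((unsigned char)c == UCHAR_MAX);        /* -1 wraps to the maximum value */
    assert((unsigned int)i == UINT_MAX - 3u);     /* -4 wraps to maximum - 3 */
    assert((unsigned long long)ll == ULLONG_MAX);
}
```

Calling demo_conversion() from main should pass silently. The hard part of the question is therefore not the conversion itself but picking the target type automatically.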

I'm assuming that a reasonably modern, optimizing compiler is used which can eliminate unused branches, e.g. if sizeof(X) < 4 ? f(Y) : g(Z) is executed, then X is not evaluated, and only one of f(Y) or g(Z) is generated and evaluated.

4 Answers

#1


7  

I'll bite, but I have to say it's more in the spirit of macro hacking, not because I think such a macro is useful. Here goes:

#include <stdlib.h>
#include <stdio.h>

#define TO_UNSIGNED(x) (                                            \
    (sizeof(x) == 1)                ? (unsigned char) (x) :         \
    (sizeof(x) == sizeof(short))    ? (unsigned short) (x) :        \
    (sizeof(x) == sizeof(int))      ? (unsigned int) (x) :          \
    (sizeof(x) == sizeof(long))     ? (unsigned long) (x) :         \
                                      (unsigned long long) (x)      \
    )

// Now put the macro to use ...

short minus_one_s()
{
    return -1;
}

long long minus_one_ll()
{
    return -1LL;
}

int main()
{
    signed char c = -1;
    short s = -1;
    int i = -1;
    long int l = -1L;
    long long int ll = -1LL;

    printf("%llx\n", (unsigned long long) TO_UNSIGNED(c));
    printf("%llx\n", (unsigned long long) TO_UNSIGNED(s));
    printf("%llx\n", (unsigned long long) TO_UNSIGNED(i));
    printf("%llx\n", (unsigned long long) TO_UNSIGNED(l));
    printf("%llx\n", (unsigned long long) TO_UNSIGNED(ll));

    printf("%llx\n", (unsigned long long) TO_UNSIGNED(minus_one_s()));
    printf("%llx\n", (unsigned long long) TO_UNSIGNED(minus_one_ll()));

    return 0;
}

The macro uses the conditional operator ?: to emulate a switch statement over all known signed integer sizes. (This should also catch the corresponding unsigned integers and the typedef'd types from <stdint.h>. It works with expressions. It also accepts floats, although not quite as I'd expect.)

The somewhat convoluted printfs show that the negative numbers are expanded to the native size of the source integer.

Edit: The OP is looking for a macro that returns an expression of the unsigned type of the same length as the source type. The above macro doesn't do that: because the second and third operands of each conditional operator are converted to a common type, the result of the macro always has the widest type involved, which is unsigned long long.
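The effect described in this edit can be checked with sizeof. The sketch below repeats the conditional-operator macro under a different, made-up name (TO_UNSIGNED_TERNARY) so that it does not clash with anything else. The helper demo_common_type is likewise only illustrative.

```c
#include <assert.h>
#include <limits.h>

/* Same shape as the macro above, renamed for this illustration. */
#define TO_UNSIGNED_TERNARY(x) (                              \
    (sizeof(x) == 1)             ? (unsigned char)(x)  :      \
    (sizeof(x) == sizeof(short)) ? (unsigned short)(x) :      \
    (sizeof(x) == sizeof(int))   ? (unsigned int)(x)   :      \
    (sizeof(x) == sizeof(long))  ? (unsigned long)(x)  :      \
                                   (unsigned long long)(x))

/* Hypothetical helper: the values are right, but the type is always the widest. */
static void demo_common_type(void)
{
    signed char c = -1;
    short s = -1;

    /* The selected branch produces the expected value ... */
    assert(TO_UNSIGNED_TERNARY(c) == UCHAR_MAX);
    assert(TO_UNSIGNED_TERNARY(s) == USHRT_MAX);

    /* ... but the whole expression has type unsigned long long. */
    assert(sizeof(TO_UNSIGNED_TERNARY(c)) == sizeof(unsigned long long));
    assert(sizeof(TO_UNSIGNED_TERNARY(s)) == sizeof(unsigned long long));
}
```

So the value survives, but the type information the OP asked for is lost once the branches are merged.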

Branches of different types could probably be achieved with a pure preprocessor solution, in which the compiler sees only one type after preprocessing. But the preprocessor knows nothing about types, so sizeof cannot be used there, which rules out such a macro.

But in my (weak) defense, I'll say that if the unsigned long long result of the macro is assigned to the appropriate unsigned type (e.g. unsigned short for short), the value should never be truncated, so the macro might have some use.

Edit II: Now that I've stumbled upon the C11 _Generic keyword in another question (and have installed a compiler that supports it), I can present a working solution: The following macro really returns the correct value with the correct type:

#define TO_UNSIGNED(x) _Generic((x),           \
    char:        (unsigned char) (x),          \
    signed char: (unsigned char) (x),          \
    short:       (unsigned short) (x),         \
    int:         (unsigned int) (x),           \
    long:        (unsigned long) (x),          \
    long long:   (unsigned long long) (x),     \
    default:     (unsigned int) (x)            \
    )

The _Generic selection is resolved at compile time and doesn't have the overhead of producing intermediate results in an oversized integer type. (A real-world macro should probably include the unsigned types themselves, for a no-op cast. Also note that I had to include signed char explicitly; just char didn't work, even though my chars are signed, because char and signed char are distinct types in C.)
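The remark about char can be illustrated directly: _Generic treats char, signed char and unsigned char as three different types, regardless of whether plain char happens to be signed on the platform. A minimal sketch follows. CHAR_KIND and demo_char_kinds are made-up names for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macro: names the exact character type of an expression. */
#define CHAR_KIND(x) _Generic((x),      \
    char:          "char",              \
    signed char:   "signed char",       \
    unsigned char: "unsigned char",     \
    default:       "other")

static void demo_char_kinds(void)
{
    char plain = 'a';
    signed char sc = 'a';
    unsigned char uc = 'a';

    assert(strcmp(CHAR_KIND(plain), "char") == 0);
    assert(strcmp(CHAR_KIND(sc), "signed char") == 0);
    assert(strcmp(CHAR_KIND(uc), "unsigned char") == 0);
    /* A character constant has type int in C, so no char branch matches. */
    assert(strcmp(CHAR_KIND('a'), "other") == 0);
}
```

This is why a _Generic-based TO_UNSIGNED needs separate associations for char and signed char.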

It requires a recent compiler that implements C11, or at least its _Generic keyword, which means this solution is not very portable.

#2


3  

You don't need a macro. The conversion happens automatically. E.g.:

int x = -1;
unsigned int y;

y = x;

EDIT

You seem to want a macro that can infer the type of a variable from its name. That is impossible. Macros are run at a stage of compilation where the compiler doesn't have the type information available. So the macro must emit the same code regardless of the variable's type.

At the stage when type information becomes available, the compiler will insist that every expression has a consistent type. But you're asking for code that is inconsistently typed.

The best you can hope for is to supply the type information yourself. E.g.:

#define TO_UNSIGNED(type, name) ((unsigned type)(name))
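As a rough illustration of how such a type-supplying macro could be used, here is a small sketch. The macro is repeated under a made-up name (TO_UNSIGNED_AS) with the cast written as a parenthesized type, and demo_explicit_type is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical variant for illustration: the caller names the signed type. */
#define TO_UNSIGNED_AS(type, value) ((unsigned type)(value))

static void demo_explicit_type(void)
{
    short s = -1;
    long l = -2;

    assert(TO_UNSIGNED_AS(short, s) == USHRT_MAX);
    assert(TO_UNSIGNED_AS(long, l) == ULONG_MAX - 1ul);
    /* Unlike the conditional-operator macro, the result keeps the narrow type. */
    assert(sizeof(TO_UNSIGNED_AS(short, s)) == sizeof(unsigned short));
}
```

Note the limits of this approach: it only works when the type is spelled with keywords that can follow unsigned (char, short, int, long, long long). A typedef such as int32_t, or the spelling signed char, would not compile.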

#3


2  

Ok, since you intend to use this macro to implicitly convert negative values to their 2's complement counterparts, I think we can address it the following way:

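#include <stdio.h>
#include <stdint.h>

/* Every branch is converted to the common type of the whole chain (uint64_t here). */
#define TO_UNSIGNED(x) ( \
                          (sizeof(x) == 1 ? (uint8_t)(x) : \
                          (sizeof(x) <= 2 ? (uint16_t)(x) : \
                          (sizeof(x) <= 4 ? (uint32_t)(x) : \
                          (sizeof(x) <= 8 ? (uint64_t)(x) : \
                          (x) \
                        )))))

int main (void) {
    char a = -4;
    int b = -4;

    /* The macro yields a 64-bit value, so cast it to match the format specifier. */
    printf ("TO_UNSIGNED(a) = %llu\n", (unsigned long long)TO_UNSIGNED(a));
    printf ("TO_UNSIGNED(b) = %llu\n", (unsigned long long)TO_UNSIGNED(b));
    return 0;
}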

Output:

TO_UNSIGNED(a) = 252
TO_UNSIGNED(b) = 4294967292

Of course, support for larger sizes may be required; for now, anything wider than 64 bits just returns x itself.

#4


0  

It looks like there is no generic solution which supports integers of all possible sizes.

For a hardcoded list of types, I was able to make it work using __builtin_choose_expr in C and overloaded functions in C++. Here is the solution: https://github.com/pts/to-unsigned/blob/master/to_unsigned.h

The relevant C code looks like this:

#define TO_UNSIGNED(x) ( \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), unsigned char), (unsigned char)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), char), (unsigned char)(x), \
    __builtin_choose_expr(sizeof(x) == sizeof(char), (unsigned char)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), unsigned short), (unsigned short)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), short), (unsigned short)(x), \
    __builtin_choose_expr(sizeof(x) == sizeof(short), (unsigned short)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), unsigned), (unsigned)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), int), (unsigned)(x), \
    __builtin_choose_expr(sizeof(x) == sizeof(int), (unsigned)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), unsigned long), (unsigned long)(x), \
    __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), long), (unsigned long)(x), \
    __builtin_choose_expr(sizeof(x) == sizeof(long), (unsigned long)(x), \
    __extension__ __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), unsigned long long), (unsigned long long)(x), \
    __extension__ __builtin_choose_expr(__builtin_types_compatible_p(__typeof(x), long long), (unsigned long long)(x), \
    __extension__ __builtin_choose_expr(sizeof(x) == sizeof(long long), (unsigned long long)(x), \
    (void)0)))))))))))))))) 

Instead of __builtin_choose_expr + __builtin_types_compatible_p, the equivalent _Generic construct can also be used with compilers that support it, starting from C11.
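For comparison, a _Generic counterpart might look roughly like the sketch below. It is an approximation rather than an exact equivalent: it only handles exact type matches and has no counterpart to the sizeof-based fallbacks in the chain above. The names TO_UNSIGNED_GENERIC and demo_generic are made up for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Sketch of a _Generic counterpart (C11). There is deliberately no default
 * association, so an unsupported argument type fails to compile. */
#define TO_UNSIGNED_GENERIC(x) _Generic((x),             \
    char:               (unsigned char)(x),              \
    signed char:        (unsigned char)(x),              \
    unsigned char:      (unsigned char)(x),              \
    short:              (unsigned short)(x),             \
    unsigned short:     (unsigned short)(x),             \
    int:                (unsigned int)(x),               \
    unsigned int:       (unsigned int)(x),               \
    long:               (unsigned long)(x),              \
    unsigned long:      (unsigned long)(x),              \
    long long:          (unsigned long long)(x),         \
    unsigned long long: (unsigned long long)(x))

static void demo_generic(void)
{
    short s = -1;
    long long ll = -1;

    assert(TO_UNSIGNED_GENERIC(s) == USHRT_MAX);
    assert(sizeof(TO_UNSIGNED_GENERIC(s)) == sizeof(unsigned short));
    assert(TO_UNSIGNED_GENERIC(ll) == ULLONG_MAX);
    assert(TO_UNSIGNED_GENERIC(7u) == 7u);   /* unsigned input passes through */
}
```

Omitting the default association plays a role similar to the final (void)0 in the builtin chain: passing a double, for example, is rejected at compile time instead of being silently converted.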

C++11 has std::make_unsigned, and its implementation in libstdc++ explicitly enumerates the integer types it knows about, similar to how my C++ implementation of TO_UNSIGNED does.
