Strict aliasing, and casting through a union

Do you have any horror stories to tell? The GCC Manual recently added a warning regarding -fstrict-aliasing and casting a pointer through a union:

[...] Taking the address, casting the resulting pointer and dereferencing the result has undefined behavior [emphasis added], even if the cast uses a union type, e.g.:

    union a_union {
        int i;
        double d;
    };

    int f() {
        double d = 3.0;
        return ((union a_union *)&d)->i;
    }

Does anyone have an example to illustrate this undefined behavior?

Note this question is not about what the C99 standard says, or does not say. It is about the actual functioning of gcc, and other existing compilers, today.

I am only guessing, but one potential problem may lie in the setting of d to 3.0. Because d is a temporary variable which is never directly read, and which is never read via a 'somewhat-compatible' pointer, the compiler may not bother to set it. And then f() will return some garbage from the stack.

My simple, naive, attempt fails. For example:

#include <stdio.h>

union a_union {
    int i;
    double d;
};

int f1(void) {
    union a_union t;
    t.d = 3333333.0;
    return t.i; // gcc manual: 'type-punning is allowed, provided...' (C90 6.3.2.3)
}

int f2(void) {
    double d = 3333333.0;
    return ((union a_union *)&d)->i; // gcc manual: 'undefined behavior' 
}

int main(void) {
    printf("%d\n", f1());
    printf("%d\n", f2());
    return 0;
}

works fine, giving on CYGWIN:

-2147483648
-2147483648

Looking at the assembler, we see that gcc completely optimizes t away: f1() simply stores the pre-calculated answer:

movl    $-2147483648, %eax

while f2() pushes 3333333.0 onto the floating-point stack, and then extracts the return value:

flds   LC0                 # LC0: 1246458708 (= 3333333.0) (--> 80 bits)
fstpl  -8(%ebp)            # save in d (64 bits)
movl   -8(%ebp), %eax      # return value (32 bits)

And the functions are also inlined (which seems to be the cause of some subtle strict-aliasing bugs) but that is not relevant here. (And this assembler is not that relevant, but it adds corroborative detail.)

Also note that taking addresses is obviously wrong (or right, if you are trying to illustrate undefined behavior). For example, just as we know this is wrong:

extern void foo(int *, double *);
union a_union t;
t.d = 3.0;
foo(&t.i, &t.d); // undefined behavior

we likewise know this is wrong:

extern void foo(int *, double *);
double d = 3.0;
foo(&((union a_union *)&d)->i, &d); // undefined behavior

For background discussion about this, see for example:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1422.pdf
http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
http://davmac.wordpress.com/2010/02/26/c99-revisited/
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
( = search page on Google then view cached page )

What is the strict aliasing rule?
C99 strict aliasing rules in C++ (GCC)

In the first link, draft minutes of an ISO meeting seven months ago, one participant notes in section 4.16:

Is there anybody that thinks the rules are clear enough? No one is really able to interpret them.

Other notes: My test was with gcc 4.3.4, with -O2; options -O2 and -O3 imply -fstrict-aliasing. The example from the GCC Manual assumes sizeof(double) >= sizeof(int); it doesn't matter if they are unequal.

Also, as noted by Mike Acton in the cellperformance link, -Wstrict-aliasing=2, but not =3, produces the warning "dereferencing type-punned pointer might break strict-aliasing rules" for the example here.

7 answers

#1

The fact that GCC is warning about unions doesn't necessarily mean that unions don't currently work. But here's a slightly less simple example than yours:

#include <stdio.h>

struct B {
    int i1;
    int i2;
};

union A {
    struct B b;
    double d;
};

int main() {
    double d = 3.0;
    #ifdef USE_UNION
        ((union A*)&d)->b.i2 += 0x80000000;
    #else
        ((int*)&d)[1] += 0x80000000;
    #endif
    printf("%g\n", d);
}

Output:

$ gcc --version
gcc (GCC) 4.3.4 20090804 (release) 1
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -oalias alias.c -O1 -std=c99 && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 && ./alias
3

$ gcc -oalias alias.c -O1 -std=c99 -DUSE_UNION && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 -DUSE_UNION && ./alias
-3

So on GCC 4.3.4, the union "saves the day" (assuming I want the output "-3"). It disables the optimisation that relies on strict aliasing and that results in the output "3" in the second case (only). With -Wall, USE_UNION also disables the type-pun warning.

I don't have gcc 4.4 to test, but please give this code a go. Your code in effect tests whether the memory for d is initialised before reading back through a union: mine tests whether it is modified.

Btw, the safe way to read half of a double as an int is:

double d = 3;
int i;
memcpy(&i, &d, sizeof i);
return i;

With optimisation on GCC, this results in:

    int thing() {
401130:       55                      push   %ebp
401131:       89 e5                   mov    %esp,%ebp
401133:       83 ec 10                sub    $0x10,%esp
        double d = 3;
401136:       d9 05 a8 20 40 00       flds   0x4020a8
40113c:       dd 5d f0                fstpl  -0x10(%ebp)
        int i;
        memcpy(&i, &d, sizeof i);
40113f:       8b 45 f0                mov    -0x10(%ebp),%eax
        return i;
    }
401142:       c9                      leave
401143:       c3                      ret

So there's no actual call to memcpy. If you aren't doing this, you deserve what you get if union casts stop working in GCC ;-)

#2

Your assertion that the following code is "wrong":

extern void foo(int *, double *);
union a_union t;
t.d = 3.0;
foo(&t.i, &t.d); // undefined behavior

... is wrong. Just taking the address of the two union members and passing them to an external function doesn't result in undefined behaviour; you only get that from dereferencing one of those pointers in an invalid way. For instance if the function foo returns immediately without dereferencing the pointers you passed it, then the behaviour is not undefined. With a strict reading of the C99 standard, there are even some cases where the pointers can be dereferenced without invoking undefined behaviour; for instance, it could read the value referenced by the second pointer, and then store a value through the first pointer, as long as they both point to a dynamically allocated object (i.e. one without a "declared type").

#3

Aliasing occurs when the compiler has two different pointers to the same piece of memory. By typecasting a pointer, you're generating a new temporary pointer. If the optimizer reorders the assembly instructions for example, accessing the two pointers might give two totally different results - it might reorder a read before a write to the same address. This is why it is undefined behavior.

You are unlikely to see the problem in very simple test code, but it will appear when there's a lot going on.

I think the warning is to make clear that unions are not a special case, even though you might expect them to be.

See this Wikipedia article for more information about aliasing: http://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts_with_optimization

#4

Well it's a bit of necro-posting, but here is a horror story. I'm porting a program that was written with the assumption that the native byte order is big endian. Now I need it to work on little endian too. Unfortunately, I can't just use native byte order everywhere, as data could be accessed in many ways. For example, a 64-bit integer could be treated as two 32-bit integers or as 4 16-bit integers, or even as 16 4-bit integers. To make things worse, there is no way to figure out what exactly is stored in memory, because the software is an interpreter for some sort of byte code, and the data is formed by that byte code. For example, the byte code may contain instructions to write an array of 16-bit integers, and then access a pair of them as a 32-bit float. And there is no way to predict it or alter the byte code.

Therefore, I had to create a set of wrapper classes to work with values stored in the big endian order regardless of the native endianness. Worked perfectly in Visual Studio and in GCC on Linux with no optimizations. But with gcc -O2, hell broke loose. After a lot of debugging I figured out that the reason was here:

double D;
float F; 
Ul *pF=(Ul*)&F; // Ul is unsigned long
*pF=pop0->lu.r(); // r() returns Ul
D=(double)F;

This code was used to convert a 32-bit representation of a float stored in a 32-bit integer to double. It seems that the compiler decided to do the assignment to *pF after the assignment to D. The result was that the first time the code was executed, the value of D was garbage, and the subsequent values were "late" by 1 iteration.

Miraculously, there were no other problems at that point. So I decided to move on and test my new code on the original platform, HP-UX on a RISC processor with native big endian order. Now it broke again, this time in my new class:

typedef unsigned long long Ur; // 64-bit uint
typedef unsigned char Uc;
class BEDoubleRef {
        double *p;
public:
        inline BEDoubleRef(double *p): p(p) {}
        inline operator double() {
                Uc *pu = reinterpret_cast<Uc*>(p);
                Ur n = (pu[7] & 0xFFULL) | ((pu[6] & 0xFFULL) << 8)
                        | ((pu[5] & 0xFFULL) << 16) | ((pu[4] & 0xFFULL) << 24)
                        | ((pu[3] & 0xFFULL) << 32) | ((pu[2] & 0xFFULL) << 40)
                        | ((pu[1] & 0xFFULL) << 48) | ((pu[0] & 0xFFULL) << 56);
                return *reinterpret_cast<double*>(&n);
        }
        inline BEDoubleRef &operator=(const double &d) {
                Uc *pc = reinterpret_cast<Uc*>(p);
                const Ur *pu = reinterpret_cast<const Ur*>(&d);
                pc[0] = (*pu >> 56) & 0xFFu;
                pc[1] = (*pu >> 48) & 0xFFu;
                pc[2] = (*pu >> 40) & 0xFFu;
                pc[3] = (*pu >> 32) & 0xFFu;
                pc[4] = (*pu >> 24) & 0xFFu;
                pc[5] = (*pu >> 16) & 0xFFu;
                pc[6] = (*pu >> 8) & 0xFFu;
                pc[7] = *pu & 0xFFu;
                return *this;
        }
        inline BEDoubleRef &operator=(const BEDoubleRef &d) {
                *p = *d.p;
                return *this;
        }
};

For some really weird reason, the first assignment operator only correctly assigned bytes 1 through 7. Byte 0 always had some nonsense in it, which broke everything, since that byte holds the sign bit and part of the exponent.

I have tried to use unions as a workaround:

union {
    double d;
    Uc c[8];
} un;
Uc *pc = un.c;
const Ur *pu = reinterpret_cast<const Ur*>(&d);
pc[0] = (*pu >> 56) & 0xFFu;
pc[1] = (*pu >> 48) & 0xFFu;
pc[2] = (*pu >> 40) & 0xFFu;
pc[3] = (*pu >> 32) & 0xFFu;
pc[4] = (*pu >> 24) & 0xFFu;
pc[5] = (*pu >> 16) & 0xFFu;
pc[6] = (*pu >> 8) & 0xFFu;
pc[7] = *pu & 0xFFu;
*p = un.d;

but it didn't work either. In fact, it was a bit better - it only failed for negative numbers.

At this point I'm thinking about adding a simple test for native endianness, then doing everything via char* pointers with if (LITTLE_ENDIAN) checks around. To make things worse, the program makes heavy use of unions all around, which seems to work ok for now, but after all this mess I won't be surprised if it suddenly breaks for no apparent reason.

#5

Have you seen this? What is the strict aliasing rule?

The link contains a secondary link to this article with gcc examples. http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

Trying a union like this would be closer to the problem.

union a_union {
    int i;
    double *d;
};

That way you have 2 types, an int and a double* pointing to the same memory. In this case using the double (*(double*)&i) could cause the problem.

#6

Here is mine: I think this is a bug in all GCC v5.x and later

#include <iostream>
#include <complex>
#include <pmmintrin.h>

template <class Scalar_type, class Vector_type>
class simd {
 public:
  typedef Vector_type vector_type;
  typedef Scalar_type scalar_type;
  typedef union conv_t_union {
    Vector_type v;
    Scalar_type s[sizeof(Vector_type) / sizeof(Scalar_type)];
    conv_t_union(){};
  } conv_t;

  static inline constexpr int Nsimd(void) {
    return sizeof(Vector_type) / sizeof(Scalar_type);
  }

  Vector_type v;

  template <class functor>
  friend inline simd SimdApply(const functor &func, const simd &v) {
    simd ret;
    simd::conv_t conv;

    conv.v = v.v;
    for (int i = 0; i < simd::Nsimd(); i++) {
      conv.s[i] = func(conv.s[i]);
    }
    ret.v = conv.v;
    return ret;
  }

};

template <class scalar>
struct RealFunctor {
  scalar operator()(const scalar &a) const {
    return std::real(a);
  }
};

template <class S, class V>
inline simd<S, V> real(const simd<S, V> &r) {
  return SimdApply(RealFunctor<S>(), r);
}



typedef simd<std::complex<double>, __m128d> vcomplexd;

int main(int argc, char **argv)
{
  vcomplexd a,b;
  a.v=_mm_set_pd(2.0,1.0);
  b = real(a);

  vcomplexd::conv_t conv;
  conv.v = b.v;
  for(int i=0;i<vcomplexd::Nsimd();i++){
    std::cout << conv.s[i]<<" ";
  }
  std::cout << std::endl;
}

Should give

c010200:~ peterboyle$ g++-mp-5 Gcc-test.cc -std=c++11 
c010200:~ peterboyle$ ./a.out 
(1,0)

But under -O3: I THINK THIS IS WRONG AND A COMPILER ERROR

c010200:~ peterboyle$ g++-mp-5 Gcc-test.cc -std=c++11 -O3 
c010200:~ peterboyle$ ./a.out 
(0,0)

Under g++4.9

c010200:~ peterboyle$ g++-4.9 Gcc-test.cc -std=c++11 -O3 
c010200:~ peterboyle$ ./a.out 
(1,0)

Under llvm xcode

c010200:~ peterboyle$ g++ Gcc-test.cc -std=c++11 -O3 
c010200:~ peterboyle$ ./a.out 
(1,0)

#7

I don't really understand your problem. The compiler did exactly what it was supposed to do in your example. The union conversion is what you did in f1. In f2 it's a normal pointer typecast; the fact that you cast it to a union is irrelevant, it's still a pointer cast.
