How does the compiler handle misalignment?

Date: 2022-11-02 23:05:53

The SO question "Does GCC's __attribute__((__packed__))…?" mentions that __attribute__((__packed__)) does "packing which introduces alignment issues when accessing the fields of a packed structure. The compiler will account for that when the fields are accessed directly, but not when they are accessed via pointers".

How does the compiler make sure that the fields are read correctly when they are accessed directly? I suppose it internally adds some padding or does some pointer magic. In the case below, how does the compiler make sure that y is accessed correctly through p->y, as opposed to the access through the pointer py?

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <arpa/inet.h>

struct packet {
    uint8_t x;
    uint32_t y;
} __attribute__((packed));

int main ()
{
    uint8_t bytes[5] = {1, 0, 0, 0, 2};
    struct packet *p = (struct packet *)bytes;

    // compiler handles misalignment because it knows that
    // "struct packet" is packed
    printf("y=%"PRIX32", ", ntohl(p->y));

    // compiler does not handle misalignment - py does not inherit
    // the packed attribute
    uint32_t *py = &p->y;
    printf("*py=%"PRIX32"\n", ntohl(*py));
    return 0;
}

1 answer

#1


2  

When the compiler sees the notation p->y, it knows you're accessing a structure member, and that the structure is packed, because of the declaration of p. It translates this into code that reads byte by byte, and performs the necessary bit shifting to combine them into a uint32_t variable. Essentially, it treats the expression p->y as if it were something like:

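((uint32_t)*((uint8_t*)p+4) << 24) | ((uint32_t)*((uint8_t*)p+3) << 16) | ((uint32_t)*((uint8_t*)p+2) << 8) | (uint32_t)*((uint8_t*)p+1)

(This assumes a little-endian target. y starts at byte offset 1 because x occupies offset 0.)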
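The byte-wise access described above can also be written out by hand. The sketch below is illustrative only: the helper name `load_packet_y` is hypothetical, little-endian byte order is assumed, and a real compiler is free to emit whatever unaligned-load sequence suits the target (on x86 that is typically a single ordinary load).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct packet {
    uint8_t x;
    uint32_t y;
} __attribute__((packed));

/* Hypothetical helper: read p->y one byte at a time and assemble the
   value in little-endian order. It never issues a 32-bit load, so the
   alignment of y does not matter. */
static uint32_t load_packet_y(const struct packet *p)
{
    assert(p != NULL);
    /* y sits at offset 1 in the packed layout; offsetof computes that. */
    const uint8_t *b = (const uint8_t *)p + offsetof(struct packet, y);
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

On a little-endian machine this should return the same value as the expression p->y for the same object.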

But when you indirect through *py, the compiler doesn't know where the value of that pointer came from. It doesn't know that it points into a packed structure, so it has no reason to perform the byte-wise access. py is declared to point to uint32_t, which can normally be accessed using an instruction that reads an entire 32-bit word at once. On many architectures that instruction expects the pointer to be aligned to a 4-byte boundary, so the misaligned access can raise a bus error there. Targets such as x86 usually tolerate it, but dereferencing a misaligned pointer is still undefined behavior in C.
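When the value has to be fetched through a plain address rather than through the packed struct, one common workaround is to copy the bytes out with memcpy instead of dereferencing a uint32_t pointer. This is a minimal sketch, not something taken from the question; the helper name `read_u32_unaligned` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch a uint32_t from an address that may not be
   4-byte aligned. memcpy makes no alignment assumption about src, so
   this is safe where *(const uint32_t *)src might fault. */
static uint32_t read_u32_unaligned(const void *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    /* value has the same byte representation as the memory at src;
       apply ntohl() afterwards if the data is in network byte order. */
    return value;
}
```

In the question's program, `ntohl(read_u32_unaligned(bytes + 1))` could stand in for `ntohl(*py)`. Compilers commonly reduce a fixed-size memcpy like this to a single load on targets that permit unaligned access.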