If a trivial program is compiled with the following command:
arm-none-eabi-gcc -shared -fpic -pie --specs=nosys.specs simple.c -o simple.exe
and the relocation entries are printed with the command:
arm-none-eabi-readelf simple.exe -r
then the output lists a number of relocation entries (see below).
Since -fpic / -pie flags cause the compiler to generate a position independent executable, my naive (and clearly incorrect) assumption is that there is no need for a relocation table because the loader can place the executable image anywhere without issue. So why is there a relocation table there at all, and does this indicate that the code isn't actually position independent?
Relocation section '.rel.dyn' at offset 0x82d4 contains 37 entries:
Offset Info Type Sym.Value Sym. Name
000084a8 00000017 R_ARM_RELATIVE
000084d0 00000017 R_ARM_RELATIVE
00008508 00000017 R_ARM_RELATIVE
00008510 00000017 R_ARM_RELATIVE
0000855c 00000017 R_ARM_RELATIVE
00008560 00000017 R_ARM_RELATIVE
00008564 00000017 R_ARM_RELATIVE
00008678 00000017 R_ARM_RELATIVE
0000867c 00000017 R_ARM_RELATIVE
0000870c 00000017 R_ARM_RELATIVE
00008710 00000017 R_ARM_RELATIVE
00008714 00000017 R_ARM_RELATIVE
00008718 00000017 R_ARM_RELATIVE
00008978 00000017 R_ARM_RELATIVE
000089dc 00000017 R_ARM_RELATIVE
000089e0 00000017 R_ARM_RELATIVE
00008abc 00000017 R_ARM_RELATIVE
00008ae4 00000017 R_ARM_RELATIVE
00018af4 00000017 R_ARM_RELATIVE
00018af8 00000017 R_ARM_RELATIVE
00018afc 00000017 R_ARM_RELATIVE
00018c04 00000017 R_ARM_RELATIVE
00018c08 00000017 R_ARM_RELATIVE
00018c0c 00000017 R_ARM_RELATIVE
00018c34 00000017 R_ARM_RELATIVE
00019028 00000017 R_ARM_RELATIVE
000084cc 00000c02 R_ARM_ABS32 00000000 __libc_fini
0000850c 00000602 R_ARM_ABS32 00000000 __deregister_frame_inf
00008558 00001302 R_ARM_ABS32 00000000 __register_frame_info
00008568 00001202 R_ARM_ABS32 00000000 _Jv_RegisterClasses
00008664 00000d02 R_ARM_ABS32 00000000 __stack
00008668 00000a02 R_ARM_ABS32 00000000 hardware_init_hook
0000866c 00000802 R_ARM_ABS32 00000000 software_init_hook
00008670 00000502 R_ARM_ABS32 0001902c __bss_start__
00008674 00000702 R_ARM_ABS32 00019048 __bss_end__
0000897c 00001402 R_ARM_ABS32 00000000 free
00008ac0 00000402 R_ARM_ABS32 00000000 malloc
Relocation section '.rel.plt' at offset 0x83fc contains 4 entries:
Offset Info Type Sym.Value Sym. Name
00018be8 00000416 R_ARM_JUMP_SLOT 00000000 malloc
00018bec 00000616 R_ARM_JUMP_SLOT 00000000 __deregister_frame_inf
00018bf0 00001316 R_ARM_JUMP_SLOT 00000000 __register_frame_info
00018bf4 00001416 R_ARM_JUMP_SLOT 00000000 free
1 Answer
#1
An executable consists of several sections. While actual implementation details differ, these can be roughly categorized in four groups:
- Read-Only Executable Code, also known as "Text"
- Read-Only Constant Data (global constants)
- (Initialized) Read-Write Data (global variables with initializers)
- Uninitialized Read-Write Data (other global variables, which start out as 0)
Non-position-independent code contains a lot of references to the addresses of functions, global variables and global constants.
Read-Only Data and Initialized Read-Write Data sometimes contain references to the addresses of functions, global variables and global constants as well:
int x;
int *y = &x; // y needs a relocation.
The loader can patch code at load time using its relocation entries, but that has two problems:
1. Relocations take time on program startup / library loading.
2. If we relocate, we now have a modified in-RAM copy of the text segment that is different for every process that loads our library, so we waste RAM.
Now for the real answer: PIC was intended to solve the above problems by getting rid of text relocations, not to get rid of all relocations.
There are comparatively few relocations in read-only data and initialized data, so neither (1.) nor (2.) is usually an issue there. We don't even care about (2.) for read-write data, as we need a separate copy of that for each process anyway. And in fact, there is no way for the compiler to make data position-independent: if you ask for a global int *y = &x; then the compiler has no choice but to put the pointer there.
Now, how is code made position-independent? That depends on the platform. It often involves a few relatively inefficient operations, or the processor limits the maximum offsets usable in the more efficient instructions for accessing data and code in a position-independent way. Also, with dynamic linking, the address of some functions isn't even known as a relative offset. So compilers tend to use tables that contain the actual addresses, and the code looks up the addresses it needs in those tables. The tables, variously known as GOT, TOC, PLT and probably a few other names on different platforms, are usually data that the code itself only reads, with lots of relocations attached.
If relocations can't be avoided, the idea is to put them all into one place to minimize problems (1.) and (2.).