I built an executable for a simple program, statically linking libc, on x86. The relocation table for that executable is empty, as expected:
$ readelf -r test

There are no relocations in this file.
$
However, when I built an executable for the same program, again statically linking libc, on x86_64, the relocation table is not empty:
$ readelf -r test

Relocation section '.rela.plt' at offset 0x1d8 contains 12 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006c2058  000000000025 R_X86_64_IRELATIV                    000000000042de70
0000006c2050  000000000025 R_X86_64_IRELATIV                    00000000004829d0
0000006c2048  000000000025 R_X86_64_IRELATIV                    000000000042dfe0
0000006c2040  000000000025 R_X86_64_IRELATIV                    000000000040a330
0000006c2038  000000000025 R_X86_64_IRELATIV                    0000000000432520
0000006c2030  000000000025 R_X86_64_IRELATIV                    0000000000409ef0
0000006c2028  000000000025 R_X86_64_IRELATIV                    0000000000445ca0
0000006c2020  000000000025 R_X86_64_IRELATIV                    0000000000437f40
0000006c2018  000000000025 R_X86_64_IRELATIV                    00000000004323b0
0000006c2010  000000000025 R_X86_64_IRELATIV                    0000000000430540
0000006c2008  000000000025 R_X86_64_IRELATIV                    0000000000430210
0000006c2000  000000000025 R_X86_64_IRELATIV                    0000000000432400
$
I googled the relocation type "R_X86_64_IRELATIV" but I couldn't find any info about it. So can someone please tell me what it means?
I thought that if I debugged the executable with gdb I might find an answer, but it actually raised a lot more questions :) Here is my bit of analysis:
The "Sym. Name + Addend" column in the table above lists the virtual addresses of some libc functions. When I ran objdump on the executable 'test', I found that virtual address 0x430210 contains the strcpy function. On loading, however, the corresponding PLT entry at location 0x6c2008 gets changed from 0x400326 (the virtual address of the next instruction, i.e. the one setting up the resolver) to 0x443cc0 (the virtual address of a libc function named __strcpy_sse2_unaligned). I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.
Having done this analysis, I realized I had missed the basic question up front: "How can the dynamic linker come into the picture when loading a static executable?" I can't find an .interp section, so the dynamic linker is certainly not involved. Then I observed that a libc function, "__libc_csu_irel()", modifies the PLT entries, NOT the dynamic linker.
If my analysis makes sense to anyone, please let me know what it's all about. I would be happy to know the reasons behind it.
Thanks a lot!!!
2 answers
#1
1
You can take a look at the "System V Application Binary Interface AMD64 Architecture Processor Supplement" - I found it under https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
If you go to the relocation section (4.4) you'll find the documentation for this relocation type and also an explanation of the calculation method:
R_X86_64_IRELATIVE 37 wordclass indirect (B + A)
where
- wordclass specifies word64 for LP64 and specifies word32 for ILP32.
- A Represents the addend used to compute the value of the relocatable field.
- B Represents the base address at which a shared object has been loaded into memory during execution. Generally, a shared object is built with a 0 base virtual address, but the execution address will be different.
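To make the "indirect (B + A)" calculation concrete, here is a minimal sketch in C that decodes one entry from the question's readelf output (offset 0x6c2008, info 0x25, addend 0x430210). The helper names is_irelative and irelative_resolver_address are made up for illustration; the types and macros come from glibc's <elf.h>. It assumes a non-PIE static executable, so the base address B is 0.

```c
#include <elf.h>      /* Elf64_Rela, ELF64_R_TYPE, R_X86_64_IRELATIVE */
#include <stdint.h>

/* Hypothetical helper: returns 1 if the entry is an R_X86_64_IRELATIVE
   relocation. The low 32 bits of r_info hold the relocation type. */
static int is_irelative(const Elf64_Rela *rela)
{
    return ELF64_R_TYPE(rela->r_info) == R_X86_64_IRELATIVE;
}

/* Hypothetical helper: computes B + A, the address of the resolver function
   that must be called to obtain the final value for the relocated field. */
static uint64_t irelative_resolver_address(uint64_t base, const Elf64_Rela *rela)
{
    return base + (uint64_t)rela->r_addend;
}
```

For the entry above, the info value 0x25 is 37 decimal, which is R_X86_64_IRELATIVE, and with B = 0 the resolver address is simply the addend 0x430210. The value finally stored is not that address itself but whatever the function at that address returns, which is what "indirect" refers to.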
Good luck. BTW, thank you for the great post at sploitfun ;-)
#2
0
TL;DR
You are right. Those relocations are just a way of selecting which implementation of certain functions (not only libc functions) should be used. They are resolved before main is executed, by the function __libc_start_main, which is inserted into the binary at link time.
I will try to explain how this relocation type works.
The example
I am using this code as a reference:
//test.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[10];
    char target[10];

    fgets(tmp, 10, stdin);
    strcpy(target, tmp);
}
Compiled with GCC 7.3.1:
gcc -O0 -g -no-pie -fno-pie -o test -static test.c
The shortened output of the relocation table (readelf -r test):
Relocation section '.rela.plt' at offset 0x1d8 contains 21 entries:
Offset Info Type Sym. Value Sym. Name + Addend
...
00000069bfd8 000000000025 R_X86_64_IRELATIV 415fe0
00000069c018 000000000025 R_X86_64_IRELATIV 416060
The shortened output of the section headers (readelf -S test):
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[19] .got.plt PROGBITS 000000000069c000 0009c000
0000000000000020 0000000000000008 WA 0 0 8
...
It says that the .got.plt section is at address 0x69c000.
How the R_X86_64_IRELATIVE relocation is resolved
Every record in the relocation table contains two important pieces of information: the offset and the addend. Put simply, the addend is a pointer to a function (the resolver of an indirect function, or ifunc) which takes no arguments and returns a pointer to a function. The returned pointer is stored at the offset given in the relocation record.
A simple relocation resolver implementation:
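#include <stdint.h>

void resolve_reloc(uintptr_t *offset, void *(*addend)(void))
{
    /* addend is a pointer to the resolver function;
       store the pointer it returns at the relocated location */
    *offset = (uintptr_t)addend();
}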
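The one-entry resolver above can be extended to walk a whole table. The following is only a sketch of the idea, not glibc's actual code: apply_irelative, demo_impl and demo_resolver are hypothetical names, and base is 0 for a non-PIE static executable like the one in this answer.

```c
#include <elf.h>      /* Elf64_Rela, ELF64_R_TYPE, R_X86_64_IRELATIVE */
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t (*resolver_fn)(void);

/* Hypothetical helper: applies every R_X86_64_IRELATIVE entry in
   [begin, end) and returns how many entries were applied. */
static size_t apply_irelative(const Elf64_Rela *begin, const Elf64_Rela *end,
                              uintptr_t base)
{
    size_t applied = 0;

    for (const Elf64_Rela *r = begin; r < end; ++r) {
        if (ELF64_R_TYPE(r->r_info) != R_X86_64_IRELATIVE)
            continue;                       /* not an indirect relocation */

        /* The slot to patch lives at B + offset ... */
        uintptr_t *slot = (uintptr_t *)(base + (uintptr_t)r->r_offset);
        /* ... and the resolver to call lives at B + A. */
        resolver_fn resolve = (resolver_fn)(base + (uintptr_t)r->r_addend);

        *slot = resolve();                  /* store the chosen implementation */
        ++applied;
    }
    return applied;
}

/* Hypothetical implementation and resolver, only for trying the loop out. */
static int demo_impl(void) { return 42; }
static uintptr_t demo_resolver(void) { return (uintptr_t)&demo_impl; }
```

To try it, build a one-entry table whose r_offset is the address of a uintptr_t variable and whose r_addend is the address of demo_resolver, then call apply_irelative(table, table + 1, 0). Afterwards the variable should hold the address of demo_impl.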
Take the example from the start of this answer. The last addend in the relocation table points to address 0x416060, which is the function strcpy_ifunc. See the disassembly:
0000000000416060 <strcpy_ifunc>:
416060: f6 05 05 8d 28 00 10 testb $0x10,0x288d05(%rip) # 69ed6c <_dl_x86_cpu_features+0x4c>
416067: 75 27 jne 416090 <strcpy_ifunc+0x30>
416069: f6 05 c1 8c 28 00 02 testb $0x2,0x288cc1(%rip) # 69ed31 <_dl_x86_cpu_features+0x11>
416070: 75 0e jne 416080 <strcpy_ifunc+0x20>
416072: 48 c7 c0 70 dd 42 00 mov $0x42dd70,%rax
416079: c3 retq
41607a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
416080: 48 c7 c0 30 df 42 00 mov $0x42df30,%rax
416087: c3 retq
416088: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41608f: 00
416090: 48 c7 c0 f0 0e 43 00 mov $0x430ef0,%rax
416097: c3 retq
416098: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41609f: 00
strcpy_ifunc picks the best of all the strcpy implementations and returns a pointer to it. In my case it returns the address 0x430ef0, which is __strcpy_sse2_unaligned. This address is then stored at 0x69c018, which is .got.plt + 0x18.
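For comparison, this is roughly how such a resolver can be written by hand with GCC's ifunc function attribute. It is a sketch under assumptions: count_bytes, its two implementations and resolve_count_bytes are made-up names, both implementations are deliberately identical so the result does not depend on the CPU, and it needs GCC (or a compatible compiler) targeting ELF/Linux.

```c
#include <stddef.h>

/* Two interchangeable implementations (hypothetical names). In a real
   library the second one would use SSE2 instructions. */
static size_t count_bytes_generic(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

static size_t count_bytes_sse2(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

typedef size_t (*count_bytes_fn)(const char *);

/* The resolver: takes no arguments and returns a pointer to the
   implementation to use. It runs when the relocation is applied. */
static count_bytes_fn resolve_count_bytes(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();                   /* needed before cpu_supports in a resolver */
    if (__builtin_cpu_supports("sse2"))
        return count_bytes_sse2;
#endif
    return count_bytes_generic;
}

/* Calls to count_bytes go through a slot filled in by the resolver. */
size_t count_bytes(const char *s) __attribute__((ifunc("resolve_count_bytes")));
```

Calling count_bytes("hello") from main returns 5 whichever implementation was selected. If the program is linked with -static, readelf -r on the result should show an additional IRELATIVE entry whose addend is the address of resolve_count_bytes.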
Who resolves it, and when
Usually the first thought with relocations is that all this stuff is handled by the dynamic interpreter (ld.so). But in this case the program is statically linked and there is no .interp section. Instead, the relocations are resolved in the function __libc_start_main, which is part of glibc. Besides resolving relocations, this function also takes care of passing the command-line arguments to your main and does some other setup.
Access to the relocation table
Once I had figured that out, I had one last question: how does __libc_start_main access the relocation table stored in the ELF file? My first thought was that it somehow opens the running binary for reading and processes it. Of course this is totally wrong. If you look at the program headers of the executable you will see something like this (readelf -l test):
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000098451 0x0000000000098451 R E 0x200000
...
The offset in this header is measured from the first byte of the executable file. So the first entry in the program headers says: copy the first 0x98451 bytes of the test file into memory at 0x400000. But the ELF header sits at offset 0x0, so along with the code segment the loader also maps the ELF headers into memory. The .rela.plt section at file offset 0x1d8 lies in the same range, so __libc_start_main can access the relocation table directly.
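A running program can check this for itself without opening its own file. The sketch below is an illustration only (elf_header_is_mapped is a made-up name): it asks the kernel for the in-memory program headers via getauxval and looks for a PT_LOAD segment that starts at file offset 0. It assumes Linux with glibc 2.16 or later.

```c
#include <link.h>      /* ElfW() and the <elf.h> types */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval(), AT_PHDR, AT_PHNUM */

/* Hypothetical helper: returns 1 if some PT_LOAD segment starts at file
   offset 0 and is large enough to contain the ELF header, 0 otherwise. */
int elf_header_is_mapped(void)
{
    /* The kernel passes the address and count of the program headers
       to the process in the auxiliary vector. */
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);

    for (size_t i = 0; phdr != NULL && i < phnum; ++i) {
        if (phdr[i].p_type == PT_LOAD &&
            phdr[i].p_offset == 0 &&
            phdr[i].p_filesz >= sizeof(ElfW(Ehdr)))
            return 1;
    }
    return 0;
}
```

For the test binary from this answer, the LOAD entry shown above has offset 0 and file size 0x98451, so the function would be expected to return 1.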