What does R_X86_64_IRELATIV mean?

Date: 2021-01-07 04:50:54

I built an executable for a simple program, statically linking libc, on x86. The relocation table for that executable is empty, as expected:

$ readelf -r test
There are no relocations in this file.
$ 

But when I built an executable for the same program, again statically linking libc, on x86_64, the relocation table is not empty:

$ readelf -r test

Relocation section '.rela.plt' at offset 0x1d8 contains 12 entries:

  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006c2058  000000000025 R_X86_64_IRELATIV                    000000000042de70
0000006c2050  000000000025 R_X86_64_IRELATIV                    00000000004829d0
0000006c2048  000000000025 R_X86_64_IRELATIV                    000000000042dfe0
0000006c2040  000000000025 R_X86_64_IRELATIV                    000000000040a330
0000006c2038  000000000025 R_X86_64_IRELATIV                    0000000000432520
0000006c2030  000000000025 R_X86_64_IRELATIV                    0000000000409ef0
0000006c2028  000000000025 R_X86_64_IRELATIV                    0000000000445ca0
0000006c2020  000000000025 R_X86_64_IRELATIV                    0000000000437f40
0000006c2018  000000000025 R_X86_64_IRELATIV                    00000000004323b0
0000006c2010  000000000025 R_X86_64_IRELATIV                    0000000000430540
0000006c2008  000000000025 R_X86_64_IRELATIV                    0000000000430210
0000006c2000  000000000025 R_X86_64_IRELATIV                    0000000000432400
$

I googled the relocation type "R_X86_64_IRELATIV" but couldn't find any information about it. Can someone please tell me what it means?

I thought that debugging the executable with gdb might give me an answer, but it actually raised a lot more questions :) Here is my bit of analysis:

The "Sym. Name + Addend" field in the table above lists the virtual addresses of some libc functions. When I ran objdump on the executable 'test', I found that virtual address 0x430210 contains the strcpy function. On loading, the corresponding PLT entry at location 0x6c2008 gets changed from 0x400326 (the virtual address of the next instruction, i.e. setting up the resolver) to 0x443cc0 (the virtual address of a libc function named __strcpy_sse2_unaligned). I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.

Having done this analysis, I realized I had missed the basic point up front: "How can the dynamic linker come into the picture when loading a static executable?" I can't find a .interp section, so the dynamic linker is surely not involved. Then I observed that a libc function, "__libc_csu_irel()", modifies the PLT entries, NOT the dynamic linker.

If my analysis makes sense to anyone, please let me know what it's all about. I would be happy to know the reasons behind it.

Thanks a lot!!!

2 Answers

#1


1  

You can take a look at the "System V Application Binary Interface AMD64 Architecture Processor Supplement" - I found it under https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

If you go to the relocation section (4.4), you'll find the documentation for this relocation type and also an explanation of the calculation method:

R_X86_64_IRELATIVE 37 wordclass indirect (B + A)

where

  • wordclass specifies word64 for LP64 and word32 for ILP32.
  • A represents the addend used to compute the value of the relocatable field.
  • B represents the base address at which a shared object has been loaded into memory during execution. Generally, a shared object is built with a 0 base virtual address, but the execution address will be different.
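
As a small worked example of the indirect (B + A) calculation: B + A is the address of a resolver function, and the value stored in the relocated field is whatever that function returns when called. The sketch below only does the address arithmetic. It assumes a non-PIE static executable, where the load base B is 0; the helper name irelative_resolver_addr and the second, non-zero base are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: address of the resolver function for one
 * R_X86_64_IRELATIVE record, i.e. the "B + A" part of indirect (B + A). */
static uint64_t irelative_resolver_addr(uint64_t base, uint64_t addend)
{
    return base + addend;
}

/* Checks the arithmetic for two cases. */
static void check_examples(void)
{
    /* Non-PIE static executable: assumed load base B = 0, so the resolver
     * sits exactly at the addend printed by readelf (0x430210 in the question). */
    assert(irelative_resolver_addr(0x0, 0x430210) == 0x430210);

    /* Made-up position-independent case: B is wherever the object was mapped. */
    assert(irelative_resolver_addr(0x7f0000000000, 0x1234) == 0x7f0000001234);
}
```

With B = 0 the addend column of readelf -r can be read directly as the resolver's address, which is why the question's objdump lookup at 0x430210 lands on a function.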

Good luck. BTW, thank you for the great post at sploitfun ;-)

#2


0  

TL;DR

You are right. Those relocations are just trying to find out which implementation of a function (libc functions, but not only those) should be used. They are resolved before main is executed, by the function __libc_start_main, which is inserted into the binary at link time.


I will try to explain how this relocation type works.

The example

I am using this code as a reference:

//test.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[10];
    char target[10];
    fgets(tmp, 10, stdin);
    strcpy(target, tmp);
}

Compiled with GCC 7.3.1:

gcc -O0 -g -no-pie -fno-pie -o test -static test.c

The shortened output of the relocation table (readelf -r test):

Relocation section '.rela.plt' at offset 0x1d8 contains 21 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
...
00000069bfd8  000000000025 R_X86_64_IRELATIV                    415fe0
00000069c018  000000000025 R_X86_64_IRELATIV                    416060

The shortened output of the section headers (readelf -S test):

[Nr] Name              Type             Address           Offset
     Size              EntSize          Flags  Link  Info  Align
...
[19] .got.plt          PROGBITS         000000000069c000  0009c000
     0000000000000020  0000000000000008  WA       0     0     8
...

It says that the .got.plt section is at address 0x69c000.

How the R_X86_64_IRELATIV relocation is resolved

Every record in the relocation table contains two important pieces of information: the offset and the addend. For this relocation type, the addend is a pointer to a function (also called an indirect function) which takes no arguments and returns a pointer to a function. The returned pointer is stored at the offset given in the relocation record.

A simple relocation resolver implementation:

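#include <stdint.h>

void resolve_reloc(uintptr_t *offset, void *(*addend)(void))
{
    // addend is a pointer to the indirect function; calling it
    // returns the address of the implementation that was chosen
    *offset = (uintptr_t)addend();
}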
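
To show the resolver idea end to end, here is a minimal sketch with one GOT-like slot. Everything in it is invented for illustration (impl_generic, impl_fast, pick_impl, the pretend_cpu_is_fast flag); it uses typed function pointers rather than void * so that it compiles cleanly, and it is not how glibc itself is written.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*impl_fn)(void);          /* type of the real implementations */
typedef impl_fn (*resolver_fn)(void);  /* indirect function: returns an implementation */

/* Two toy implementations standing in for variants of one libc function. */
static int impl_generic(void) { return 1; }
static int impl_fast(void)    { return 2; }

/* Assumption: a flag that stands in for a real CPU feature check. */
static int pretend_cpu_is_fast = 1;

/* Toy indirect function: no arguments, returns the chosen implementation. */
static impl_fn pick_impl(void)
{
    return pretend_cpu_is_fast ? impl_fast : impl_generic;
}

/* Same idea as the resolver above: call the addend, store the result at the offset. */
static void apply_one(uintptr_t *offset, resolver_fn addend)
{
    *offset = (uintptr_t)addend();
}

/* One GOT-like slot; callers jump through it once it has been filled in. */
static uintptr_t slot;

static int call_through_slot(void)
{
    assert(slot != 0);                 /* the slot must be resolved first */
    return ((impl_fn)slot)();
}
```

After apply_one(&slot, pick_impl), every call_through_slot() reaches the implementation that pick_impl selected, without repeating the selection on each call.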

In the example at the start of this answer, the last addend in the relocation table points to address 0x416060, which is the function strcpy_ifunc. See the disassembly output:

0000000000416060 <strcpy_ifunc>:
  416060:       f6 05 05 8d 28 00 10    testb  $0x10,0x288d05(%rip)        # 69ed6c <_dl_x86_cpu_features+0x4c>
  416067:       75 27                   jne    416090 <strcpy_ifunc+0x30>
  416069:       f6 05 c1 8c 28 00 02    testb  $0x2,0x288cc1(%rip)        # 69ed31 <_dl_x86_cpu_features+0x11>
  416070:       75 0e                   jne    416080 <strcpy_ifunc+0x20>
  416072:       48 c7 c0 70 dd 42 00    mov    $0x42dd70,%rax
  416079:       c3                      retq   
  41607a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  416080:       48 c7 c0 30 df 42 00    mov    $0x42df30,%rax
  416087:       c3                      retq   
  416088:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  41608f:       00 
  416090:       48 c7 c0 f0 0e 43 00    mov    $0x430ef0,%rax
  416097:       c3                      retq   
  416098:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  41609f:       00 

strcpy_ifunc picks the best of all the strcpy implementations and returns a pointer to it. In my case it returns the address 0x430ef0, which is __strcpy_sse2_unaligned. This address is then stored at 0x69c018, which is .got.plt + 0x18.
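
The branch structure of the disassembly can be read as the following C sketch. It is a hand translation under stated assumptions: cpu_features is a plain byte array standing in for _dl_x86_cpu_features (the real structure layout is not reproduced), the function name strcpy_ifunc_sketch is invented, and the meaning of the two tested bits is not given in the listing, so it is left unnamed here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for _dl_x86_cpu_features; only the two bytes that the
 * disassembly tests (offsets 0x4c and 0x11) matter here. */
static uint8_t cpu_features[0x50];

/* Mirrors the branches of strcpy_ifunc in the listing above and
 * returns the same three addresses. */
static uintptr_t strcpy_ifunc_sketch(void)
{
    assert(sizeof cpu_features > 0x4c);  /* both tested offsets are in range */

    if (cpu_features[0x4c] & 0x10)   /* testb $0x10, ...+0x4c ; jne 416090 */
        return 0x430ef0;             /* __strcpy_sse2_unaligned in this binary */
    if (cpu_features[0x11] & 0x02)   /* testb $0x2, ...+0x11 ; jne 416080 */
        return 0x42df30;
    return 0x42dd70;                 /* fall-through case */
}
```

On the answerer's machine the first test succeeds, which is why 0x430ef0 ends up in the .got.plt slot.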

Who resolves it, and when

Usually the first thought with relocations is that all this stuff is handled by the dynamic linker (ld.so). But in this case the program is statically linked and there is no .interp section. Here the relocations are resolved in the function __libc_start_main, which is part of glibc. Besides resolving relocations, this function also takes care of passing the command-line arguments to your main and does some other setup.
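
The ordering described here (relocations first, then main with its arguments) can be sketched as below. This is a simplified model, not glibc source: the record layout, the table, and the names apply_irel_sketch, start_main_sketch and user_main are all made up, and the resolvers return fixed numbers instead of real function addresses.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t (*resolver_fn)(void);

/* Simplified relocation record: where to store, and which resolver to call. */
struct irel_record {
    uintptr_t  *offset;
    resolver_fn addend;
};

static uintptr_t resolve_a(void) { return 0x1000; }  /* made-up addresses */
static uintptr_t resolve_b(void) { return 0x2000; }

static uintptr_t got[2];  /* stand-in for .got.plt slots */

static const struct irel_record table[] = {
    { &got[0], resolve_a },
    { &got[1], resolve_b },
};

/* Walk every record between the two bounds and fill in its slot. */
static void apply_irel_sketch(const struct irel_record *start,
                              const struct irel_record *end)
{
    for (const struct irel_record *r = start; r < end; ++r)
        *r->offset = r->addend();
}

/* Hypothetical stand-in for the startup routine: relocations first, then main. */
static int start_main_sketch(int (*main_fn)(int, char **), int argc, char **argv)
{
    apply_irel_sketch(table, table + sizeof table / sizeof table[0]);
    return main_fn(argc, argv);
}

/* A "main" that relies on the slots being filled in before it runs. */
static int user_main(int argc, char **argv)
{
    (void)argv;
    assert(got[0] == 0x1000 && got[1] == 0x2000);
    return argc;
}
```

Calling start_main_sketch(user_main, argc, argv) runs the table walk exactly once, so by the time user_main executes every slot already holds its final value.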

Access to the relocation table

When I figured that out, I had one last question: how does __libc_start_main access the relocation table saved in the ELF file? My first thought was that it somehow opens the running binary for reading and processes it. Of course, this is totally wrong. If you look at the program headers of the executable, you will see something like this (readelf -l test):

Type           Offset             VirtAddr           PhysAddr
               FileSiz            MemSiz              Flags  Align
LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
               0x0000000000098451 0x0000000000098451  R E    0x200000
...

The offset in this header is the offset from the first byte of the executable file. So what the first entry in the program header table says is: copy the first 0x98451 bytes of the test file into memory. But the ELF header sits at offset 0x0, and the .rela.plt section at offset 0x1d8 lies within that range too. So along with the code segment, the ELF headers and the relocation table are also loaded into memory, and __libc_start_main can easily access them.
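
To make the mapping concrete, the sketch below converts a file offset into its in-memory address using the values of the first LOAD entry shown above. The struct and function names are invented for illustration; only the three numbers come from the readelf -l output, and 0x1d8 is the .rela.plt offset from the readelf -r output.

```c
#include <assert.h>
#include <stdint.h>

/* The fields of one LOAD program header that matter for the mapping. */
struct load_segment {
    uint64_t p_offset;  /* offset in the file */
    uint64_t p_vaddr;   /* address in memory */
    uint64_t p_filesz;  /* number of bytes taken from the file */
};

/* Returns the in-memory address of a file offset, or 0 if the offset
 * is not covered by this segment. */
static uint64_t file_offset_to_vaddr(const struct load_segment *seg, uint64_t off)
{
    assert(seg);
    if (off < seg->p_offset || off - seg->p_offset >= seg->p_filesz)
        return 0;
    return seg->p_vaddr + (off - seg->p_offset);
}

/* First LOAD entry from the readelf -l output above. */
static const struct load_segment first_load = { 0x0, 0x400000, 0x98451 };
```

For this binary, file_offset_to_vaddr(&first_load, 0x0) gives 0x400000 for the ELF header and file_offset_to_vaddr(&first_load, 0x1d8) gives 0x4001d8 for .rela.plt, so both are readable by ordinary code running inside the process.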
