scanf throws Segmentation fault (core dumped) when using nasm + gcc on 64-bit Linux

Date: 2020-12-01 02:52:43

When compiling the code below:

global main
extern printf, scanf

section .data
   msg: db "Enter a number: ",10,0
   format:db "%d",0

section .bss
   number resb 4

section .text
main:
   mov rdi, msg
   mov al, 0
   call printf

   mov rsi, number
   mov rdi, format
   mov al, 0
   call scanf

   mov rdi,format
   mov rsi,[number]
   inc rsi
   mov rax,0
   call printf 

   ret

using:

nasm -f elf64 example.asm -o example.o
gcc -no-pie -m64 example.o -o example

and then run

./example

it runs and prints "Enter a number:", but then crashes and prints: Segmentation fault (core dumped)

So printf works fine but scanf does not. What am I doing wrong with scanf?

1 Answer

#1


3  

Use sub rsp, 8 / add rsp, 8 at the start/end of your function to re-align the stack to 16 bytes before your function makes a call.

Or better, push/pop a dummy register, e.g. push rdx / pop rcx, or save/restore a call-preserved register like RBP.
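As a minimal sketch of that fix applied to the program from the question (assuming the same nasm -f elf64 and gcc -no-pie build commands), the only required change is the push at the top of main and the matching pop before ret. The 4-byte load into esi and the explicit return value are optional tidy-ups and not part of the alignment fix.

```nasm
global main
extern printf, scanf

section .data
   msg:    db "Enter a number: ",10,0
   format: db "%d",0

section .bss
   number: resb 4

section .text
main:
   push rdx             ; dummy push: RSP is now 16-byte aligned
                        ; (sub rsp, 8 would work as well)

   mov  rdi, msg
   xor  eax, eax        ; AL = 0: no vector register args
   call printf

   mov  rsi, number     ; address where scanf stores the int
   mov  rdi, format
   xor  eax, eax
   call scanf

   mov  rdi, format
   mov  esi, [number]   ; 4-byte load, matching resb 4 and %d
   inc  esi
   xor  eax, eax
   call printf

   pop  rcx             ; undo the dummy push (add rsp, 8 for the sub variant)
   xor  eax, eax        ; return 0 from main
   ret
```

Any register can take the dummy push and pop as long as the pop target is not a value you still need; RCX and RDX are call-clobbered, so nothing is lost here.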

On function entry, RSP is 8 bytes away from 16-byte alignment because the call pushed an 8-byte return address. See "Printing floating point numbers from x86-64 seems to require %rbp to be saved", "main and stack alignment", and "Calling printf in x86_64 using GNU assembler". This is an ABI requirement which you used to be able to get away with violating when there weren't any FP args for printf. But not any more.
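To see the entry-time misalignment directly, here is a small sketch (the label fmt and the message text are made up for illustration) that prints RSP mod 16 as seen on entry to main. With a caller that follows the ABI, the value should be 8.

```nasm
global main
extern printf

section .data
   fmt: db "rsp mod 16 on entry to main = %d",10,0

section .text
main:
   mov  rsi, rsp        ; capture RSP before touching the stack
   and  esi, 15         ; esi = RSP mod 16 (2nd printf arg)

   push rbp             ; re-align the stack before the call
   mov  rdi, fmt
   xor  eax, eax        ; AL = 0: no vector register args
   call printf
   pop  rbp

   xor  eax, eax        ; return 0 from main
   ret
```

It builds the same way as the program in the question: nasm -f elf64, then gcc -no-pie.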


GCC's code-gen for glibc's scanf now depends on 16-byte stack alignment even when AL == 0.

It seems to have auto-vectorized copying 16 bytes somewhere in __GI__IO_vfscanf, which regular scanf calls after spilling its register args to the stack (see Footnote 1). (The many similar ways to call scanf share one big implementation as a back end to the various libc entry points like scanf, fscanf, etc.)

I downloaded Ubuntu 18.04's libc6 binary package from https://packages.ubuntu.com/bionic/amd64/libc6/download and extracted the files (with 7z x blah.deb and tar xf data.tar, because 7z knows how to extract a lot of file formats).

I can repro your bug with LD_LIBRARY_PATH=/tmp/bionic-libc/lib/x86_64-linux-gnu ./bad-printf, and, as it turns out, also with the system glibc 2.27-3 on my Arch Linux desktop.

I ran GDB on your program, did set env LD_LIBRARY_PATH /tmp/bionic-libc/lib/x86_64-linux-gnu, then run. With layout reg, the disassembly window looks like this at the point where it received SIGSEGV:

    0x7ffff786b49a <_IO_vfscanf+602>        cmp    r12b,0x25
    0x7ffff786b49e <_IO_vfscanf+606>        jne    0x7ffff786b3ff <_IO_vfscanf+447>
    0x7ffff786b4a4 <_IO_vfscanf+612>        mov    rax,QWORD PTR [rbp-0x460]
    0x7ffff786b4ab <_IO_vfscanf+619>        add    rax,QWORD PTR [rbp-0x458]
    0x7ffff786b4b2 <_IO_vfscanf+626>        movq   xmm0,QWORD PTR [rbp-0x460]
    0x7ffff786b4ba <_IO_vfscanf+634>        mov    DWORD PTR [rbp-0x678],0x0
    0x7ffff786b4c4 <_IO_vfscanf+644>        mov    QWORD PTR [rbp-0x608],rax
    0x7ffff786b4cb <_IO_vfscanf+651>        movzx  eax,BYTE PTR [rbx+0x1]
    0x7ffff786b4cf <_IO_vfscanf+655>        movhps xmm0,QWORD PTR [rbp-0x608]
  > 0x7ffff786b4d6 <_IO_vfscanf+662>        movaps XMMWORD PTR [rbp-0x470],xmm0

So it copied two 8-byte objects to the stack, using movq + movhps to load and movaps to store. But with the stack misaligned, movaps [rbp-0x470],xmm0 faults.
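The same kind of fault can be provoked on purpose. The sketch below is a deliberately crashing demo, not part of the original answer: it assumes RSP mod 16 == 8 on entry to main (an ABI-compliant caller) and stores into the red zone below RSP, which is safe only because main makes no calls here. The first movaps hits a 16-byte-aligned address and succeeds; the second hits a misaligned one and should raise SIGSEGV.

```nasm
global main

section .text
main:
   ; Assumed on entry: RSP mod 16 == 8 (return address just pushed).
   pxor   xmm0, xmm0          ; some value to store

   movaps [rsp-24], xmm0      ; (8 - 24) mod 16 == 0: aligned, store succeeds
   movups [rsp-32], xmm0      ; unaligned-safe store: fine at any address
   movaps [rsp-32], xmm0      ; (8 - 32) mod 16 == 8: misaligned, faults here

   xor    eax, eax            ; reached only if the assumption above is wrong
   ret
```

movups is the variant that tolerates any address, which is why misaligned stacks used to go unnoticed before the compiler started emitting aligned stores here.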

I didn't grab a debug build to find out exactly which part of the C source turned into this, but the function is written in C and compiled by GCC with optimization enabled. GCC has always been allowed to do this, but only recently did it get smart enough to take advantage of SSE2 this way.


Footnote 1: printf / scanf with AL != 0 has always required 16-byte alignment, because gcc's code-gen for variadic functions uses test al,al / je to decide whether to spill the full 16-byte XMM regs xmm0..7, and in that case it spills them with aligned stores. __m128i can be an argument to a variadic function, not just double, and gcc doesn't check whether the function ever actually reads any 16-byte FP args.
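For the AL != 0 case the footnote describes, a minimal sketch of passing a double to printf looks like this (the labels fmt and val and the value 3.14 are illustrative assumptions). AL carries the number of vector registers used for arguments, and the stack is aligned before the call.

```nasm
default rel                   ; RIP-relative addressing for [val]

global main
extern printf

section .data
   fmt: db "%f",10,0
   val: dq 3.14

section .text
main:
   sub   rsp, 8               ; re-align RSP to 16 bytes

   mov   rdi, fmt             ; 1st integer arg: format string
   movsd xmm0, [val]          ; 1st FP arg goes in xmm0
   mov   eax, 1               ; AL = 1: one vector register holds an arg
   call  printf

   add   rsp, 8
   xor   eax, eax             ; return 0 from main
   ret
```

Dropping the sub rsp, 8 / add rsp, 8 pair here would be expected to crash inside printf even with older glibc builds, since the XMM spill uses aligned stores whenever AL is non-zero.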
