ptrace如何在Linux中工作?

时间:2020-11-26 18:56:20

The ptrace system call allows the parent process to inspect the attached child. For example, in Linux, strace (which is implemented with the ptrace system call) can inspect the system calls invoked by the child process.

ptrace系统调用允许父进程检查附加的子进程。例如,在Linux中,strace(通过ptrace系统调用实现)可以检查子进程调用的系统调用。

When the attached child process invokes a system call, the ptracing parent process can be notified. But how exactly does that happen? I want to know the technical details behind this mechanism.

当附加的子进程调用系统调用时,可以通知ptrace父进程。但这究竟是怎么发生的呢?我想知道这个机制背后的技术细节。

Thank you in advance.

提前谢谢你。

1 个解决方案

#1


28  

When the attached child process invokes a system call, the ptracing parent process can be notified. But how exactly does that happen?

当附加的子进程调用系统调用时,可以通知ptrace父进程。但这究竟是怎么发生的呢?

Parent process calls ptrace with PTRACE_ATTACH, and his child calls ptrace with PTRACE_TRACEME option. This pair will connect two processes by filling some fields inside their task_struct (kernel/ptrace.c: sys_ptrace, child will have PT_PTRACED flag in ptrace field of struct task_struct, and pid of ptracer process as parent and in ptrace_entry list - __ptrace_link; parent will record child's pid in ptraced list).

父进程使用PTRACE_ATTACH调用ptrace,他的子进程使用PTRACE_TRACEME选项调用ptrace。这一对将通过填充task_struct(内核/ptrace)中的一些字段来连接两个进程。c: sys_ptrace, child将在struct task_struct的ptrace字段中有pt_ptrace标志,而ptracer进程的pid为父进程,在ptrace_entry列表中是__ptrace_link;父进程将在ptrace列表中记录子进程的pid)。

Then strace will call ptrace with PTRACE_SYSCALL flag to register itself as syscall debugger, setting thread_flag TIF_SYSCALL_TRACE in child process's struct thread_info (by something like set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);). arch/x86/include/asm/thread_info.h:

然后strace将使用PTRACE_SYSCALL标志调用ptrace来注册为syscall调试器,在子进程的struct thread_info(通过set_tsk_thread_thread_flag (child, TIF_SYSCALL_TRACE)中设置thread_flag tif_flag TIF_SYSCALL_TRACE;arch / x86 / include / asm / thread_info.h:

 67 /*
 68  * thread information flags
 69  * - these are process state flags that various assembly files
 70  *   may need to access   ...*/

 75 #define TIF_SYSCALL_TRACE       0       /* syscall trace active */
 99 #define _TIF_SYSCALL_TRACE      (1 << TIF_SYSCALL_TRACE)

On every syscall entry or exit, architecture-specific syscall entry code will check this _TIF_SYSCALL_TRACE flag (directly in assembler implementation of syscall, for example x86 arch/x86/kernel/entry_32.S: jnz syscall_trace_entry in ENTRY(system_call) and similar code in syscall_exit_work), and if it is set, ptracer will be notified with signal (SIGTRAP) and child will be temporary stopped. This is done usually in syscall_trace_enter and syscall_trace_leave :

在每个syscall条目或退出中,特定于体系结构的syscall条目代码将检查这个_TIF_SYSCALL_TRACE标志(直接在syscall的汇编实现中,例如x86 arch/x86/内核/entry_32。S:输入的jnz syscall_trace_entry (system_call)和syscall_exit_work中的类似代码),如果设置了,ptracer将被通知信号(SIGTRAP),子程序将被临时停止。这通常在syscall_trace_enter和syscall_trace_leave中完成:

1457 long syscall_trace_enter(struct pt_regs *regs)

1483         if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
1484             tracehook_report_syscall_entry(regs))
1485                 ret = -1L;

1507 void syscall_trace_leave(struct pt_regs *regs)

1531         if (step || test_thread_flag(TIF_SYSCALL_TRACE))
1532                 tracehook_report_syscall_exit(regs, step);

The tracehook_report_syscall_* are actual workers here, they will call ptrace_report_syscall. include/linux/tracehook.h:

tracehook_report_syscall_*是这里的实际工作人员,他们将调用ptrace_report_syscall。包括/ linux / tracehook.h:

 80 /**
 81  * tracehook_report_syscall_entry - task is about to attempt a system call
 82  * @regs:               user register state of current task
 83  *
 84  * This will be called if %TIF_SYSCALL_TRACE has been set, when the
 85  * current task has just entered the kernel for a system call.
 86  * Full user register state is available here.  Changing the values
 87  * in @regs can affect the system call number and arguments to be tried.
 88  * It is safe to block here, preventing the system call from beginning.
 89  *
 90  * Returns zero normally, or nonzero if the calling arch code should abort
 91  * the system call.  That must prevent normal entry so no system call is
 92  * made.  If @task ever returns to user mode after this, its register state
 93  * is unspecified, but should be something harmless like an %ENOSYS error
 94  * return.  It should preserve enough information so that syscall_rollback()
 95  * can work (see asm-generic/syscall.h).
 96  *
 97  * Called without locks, just after entering kernel mode.
 98  */
 99 static inline __must_check int tracehook_report_syscall_entry(
100         struct pt_regs *regs)
101 {
102         return ptrace_report_syscall(regs);
103 }
104 
105 /**
106  * tracehook_report_syscall_exit - task has just finished a system call
107  * @regs:               user register state of current task
108  * @step:               nonzero if simulating single-step or block-step
109  *
110  * This will be called if %TIF_SYSCALL_TRACE has been set, when the
111  * current task has just finished an attempted system call.  Full
112  * user register state is available here.  It is safe to block here,
113  * preventing signals from being processed.
114  *
115  * If @step is nonzero, this report is also in lieu of the normal
116  * trap that would follow the system call instruction because
117  * user_enable_block_step() or user_enable_single_step() was used.
118  * In this case, %TIF_SYSCALL_TRACE might not be set.
119  *
120  * Called without locks, just before checking for pending signals.
121  */
122 static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
123 {
...
130 
131         ptrace_report_syscall(regs);
132 }

And ptrace_report_syscall generates SIGTRAP for debugger or strace via ptrace_notify/ptrace_do_notify:

ptrace_report_syscall通过ptrace_notify/ptrace_do_notify为调试器或strace生成SIGTRAP:

 55 /*
 56  * ptrace report for syscall entry and exit looks identical.
 57  */
 58 static inline int ptrace_report_syscall(struct pt_regs *regs)
 59 {
 60         int ptrace = current->ptrace;
 61 
 62         if (!(ptrace & PT_PTRACED))
 63                 return 0;
 64 
 65         ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
 66 
 67         /*
 68          * this isn't the same as continuing with a signal, but it will do
 69          * for normal use.  strace only continues with a signal if the
 70          * stopping signal is not SIGTRAP.  -brl
 71          */
 72         if (current->exit_code) {
 73                 send_sig(current->exit_code, current, 1);
 74                 current->exit_code = 0;
 75         }
 76 
 77         return fatal_signal_pending(current);
 78 }

ptrace_notify is implemented in kernel/signal.c, it stops the child and pass sig_info to ptracer:

ptrace_notify是在内核/信号中实现的。c,停止子进程,将sig_info传递给ptracer:

1961 static void ptrace_do_notify(int signr, int exit_code, int why)
1962 {
1963         siginfo_t info;
1964 
1965         memset(&info, 0, sizeof info);
1966         info.si_signo = signr;
1967         info.si_code = exit_code;
1968         info.si_pid = task_pid_vnr(current);
1969         info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
1970 
1971         /* Let the debugger run.  */
1972         ptrace_stop(exit_code, why, 1, &info);
1973 }
1974 
1975 void ptrace_notify(int exit_code)
1976 {
1977         BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
1978         if (unlikely(current->task_works))
1979                 task_work_run();
1980 
1981         spin_lock_irq(&current->sighand->siglock);
1982         ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
1983         spin_unlock_irq(&current->sighand->siglock);
1984 }

ptrace_stop is in the same signal.c file, line 1839 for 3.13.

ptrace_stop在同一个信号中。c文件,第1839行,3.13。

#1


28  

When the attached child process invokes a system call, the ptracing parent process can be notified. But how exactly does that happen?

当附加的子进程调用系统调用时,可以通知ptrace父进程。但这究竟是怎么发生的呢?

Parent process calls ptrace with PTRACE_ATTACH, and his child calls ptrace with PTRACE_TRACEME option. This pair will connect two processes by filling some fields inside their task_struct (kernel/ptrace.c: sys_ptrace, child will have PT_PTRACED flag in ptrace field of struct task_struct, and pid of ptracer process as parent and in ptrace_entry list - __ptrace_link; parent will record child's pid in ptraced list).

父进程使用PTRACE_ATTACH调用ptrace,他的子进程使用PTRACE_TRACEME选项调用ptrace。这一对将通过填充task_struct(内核/ptrace)中的一些字段来连接两个进程。c: sys_ptrace, child将在struct task_struct的ptrace字段中有pt_ptrace标志,而ptracer进程的pid为父进程,在ptrace_entry列表中是__ptrace_link;父进程将在ptrace列表中记录子进程的pid)。

Then strace will call ptrace with PTRACE_SYSCALL flag to register itself as syscall debugger, setting thread_flag TIF_SYSCALL_TRACE in child process's struct thread_info (by something like set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);). arch/x86/include/asm/thread_info.h:

然后strace将使用PTRACE_SYSCALL标志调用ptrace来注册为syscall调试器,在子进程的struct thread_info(通过set_tsk_thread_thread_flag (child, TIF_SYSCALL_TRACE)中设置thread_flag tif_flag TIF_SYSCALL_TRACE;arch / x86 / include / asm / thread_info.h:

 67 /*
 68  * thread information flags
 69  * - these are process state flags that various assembly files
 70  *   may need to access   ...*/

 75 #define TIF_SYSCALL_TRACE       0       /* syscall trace active */
 99 #define _TIF_SYSCALL_TRACE      (1 << TIF_SYSCALL_TRACE)

On every syscall entry or exit, architecture-specific syscall entry code will check this _TIF_SYSCALL_TRACE flag (directly in assembler implementation of syscall, for example x86 arch/x86/kernel/entry_32.S: jnz syscall_trace_entry in ENTRY(system_call) and similar code in syscall_exit_work), and if it is set, ptracer will be notified with signal (SIGTRAP) and child will be temporary stopped. This is done usually in syscall_trace_enter and syscall_trace_leave :

在每个syscall条目或退出中,特定于体系结构的syscall条目代码将检查这个_TIF_SYSCALL_TRACE标志(直接在syscall的汇编实现中,例如x86 arch/x86/内核/entry_32。S:输入的jnz syscall_trace_entry (system_call)和syscall_exit_work中的类似代码),如果设置了,ptracer将被通知信号(SIGTRAP),子程序将被临时停止。这通常在syscall_trace_enter和syscall_trace_leave中完成:

1457 long syscall_trace_enter(struct pt_regs *regs)

1483         if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
1484             tracehook_report_syscall_entry(regs))
1485                 ret = -1L;

1507 void syscall_trace_leave(struct pt_regs *regs)

1531         if (step || test_thread_flag(TIF_SYSCALL_TRACE))
1532                 tracehook_report_syscall_exit(regs, step);

The tracehook_report_syscall_* are actual workers here, they will call ptrace_report_syscall. include/linux/tracehook.h:

tracehook_report_syscall_*是这里的实际工作人员,他们将调用ptrace_report_syscall。包括/ linux / tracehook.h:

 80 /**
 81  * tracehook_report_syscall_entry - task is about to attempt a system call
 82  * @regs:               user register state of current task
 83  *
 84  * This will be called if %TIF_SYSCALL_TRACE has been set, when the
 85  * current task has just entered the kernel for a system call.
 86  * Full user register state is available here.  Changing the values
 87  * in @regs can affect the system call number and arguments to be tried.
 88  * It is safe to block here, preventing the system call from beginning.
 89  *
 90  * Returns zero normally, or nonzero if the calling arch code should abort
 91  * the system call.  That must prevent normal entry so no system call is
 92  * made.  If @task ever returns to user mode after this, its register state
 93  * is unspecified, but should be something harmless like an %ENOSYS error
 94  * return.  It should preserve enough information so that syscall_rollback()
 95  * can work (see asm-generic/syscall.h).
 96  *
 97  * Called without locks, just after entering kernel mode.
 98  */
 99 static inline __must_check int tracehook_report_syscall_entry(
100         struct pt_regs *regs)
101 {
102         return ptrace_report_syscall(regs);
103 }
104 
105 /**
106  * tracehook_report_syscall_exit - task has just finished a system call
107  * @regs:               user register state of current task
108  * @step:               nonzero if simulating single-step or block-step
109  *
110  * This will be called if %TIF_SYSCALL_TRACE has been set, when the
111  * current task has just finished an attempted system call.  Full
112  * user register state is available here.  It is safe to block here,
113  * preventing signals from being processed.
114  *
115  * If @step is nonzero, this report is also in lieu of the normal
116  * trap that would follow the system call instruction because
117  * user_enable_block_step() or user_enable_single_step() was used.
118  * In this case, %TIF_SYSCALL_TRACE might not be set.
119  *
120  * Called without locks, just before checking for pending signals.
121  */
122 static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
123 {
...
130 
131         ptrace_report_syscall(regs);
132 }

And ptrace_report_syscall generates SIGTRAP for debugger or strace via ptrace_notify/ptrace_do_notify:

ptrace_report_syscall通过ptrace_notify/ptrace_do_notify为调试器或strace生成SIGTRAP:

 55 /*
 56  * ptrace report for syscall entry and exit looks identical.
 57  */
 58 static inline int ptrace_report_syscall(struct pt_regs *regs)
 59 {
 60         int ptrace = current->ptrace;
 61 
 62         if (!(ptrace & PT_PTRACED))
 63                 return 0;
 64 
 65         ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
 66 
 67         /*
 68          * this isn't the same as continuing with a signal, but it will do
 69          * for normal use.  strace only continues with a signal if the
 70          * stopping signal is not SIGTRAP.  -brl
 71          */
 72         if (current->exit_code) {
 73                 send_sig(current->exit_code, current, 1);
 74                 current->exit_code = 0;
 75         }
 76 
 77         return fatal_signal_pending(current);
 78 }

ptrace_notify is implemented in kernel/signal.c, it stops the child and pass sig_info to ptracer:

ptrace_notify是在内核/信号中实现的。c,停止子进程,将sig_info传递给ptracer:

1961 static void ptrace_do_notify(int signr, int exit_code, int why)
1962 {
1963         siginfo_t info;
1964 
1965         memset(&info, 0, sizeof info);
1966         info.si_signo = signr;
1967         info.si_code = exit_code;
1968         info.si_pid = task_pid_vnr(current);
1969         info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
1970 
1971         /* Let the debugger run.  */
1972         ptrace_stop(exit_code, why, 1, &info);
1973 }
1974 
1975 void ptrace_notify(int exit_code)
1976 {
1977         BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
1978         if (unlikely(current->task_works))
1979                 task_work_run();
1980 
1981         spin_lock_irq(&current->sighand->siglock);
1982         ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
1983         spin_unlock_irq(&current->sighand->siglock);
1984 }

ptrace_stop is in the same signal.c file, line 1839 for 3.13.

ptrace_stop在同一个信号中。c文件,第1839行,3.13。