在gdb中使用coredump时,我怎么知道是哪个线程导致了SIGSEGV?(复制)

时间:2021-07-01 07:15:49

This question already has an answer here:

这个问题已经有了答案:

My application uses more than 8 threads. When I run info threads in gdb I see the threads and the last function they were executing. It does not seem obvious to me exactly which thread caused the SIGSEGV. Is it possible to tell it? Is it thread 1? How are the threads numbered?

我的应用程序使用了超过8个线程。当我在gdb中运行info线程时,我会看到它们执行的线程和最后一个函数。在我看来,究竟是哪个线程导致了SIGSEGV似乎并不明显。能告诉我吗?线程1吗?线程是如何编号的?

1 个解决方案

#1


3  

When you use gdb to analyze the core dump file, the gdb will stop at the function which causes program core dump. And the current thread will be the murder. Take the following program as an example:

当您使用gdb来分析核心转储文件时,gdb将停止在导致程序核心转储的函数上。而当前的线索将是谋杀。以以下程序为例:

#include <stdio.h>
#include <pthread.h>
void *thread_func(void *p_arg)
{
        while (1)
        {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, NULL);

        sleep(1000);
        return;
}

The t2 thread will cause program down because it refers a NULL pointer. After the program down, use gdb to analyze the core dump file:

t2线程会导致程序下降,因为它引用一个空指针。程序结束后,使用gdb分析核心转储文件:

[root@localhost nan]# gdb -q a core.32794
Reading symbols from a...done.
[New LWP 32796]
[New LWP 32795]
[New LWP 32794]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
(gdb)

The gdb stops at __strlen_sse2 function, this means this function causes the program down. Then use bt command to see it is called by which thread:

gdb在__strlen_sse2函数处停止,这意味着该函数导致程序关闭。然后使用bt命令查看它被调用的线程:

(gdb) bt
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00000034e4268cdb in puts () from /lib64/libc.so.6
#2  0x00000000004005cc in thread_func (p_arg=0x0) at a.c:7
#3  0x00000034e4a079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000034e42e8b6d in clone () from /lib64/libc.so.6
(gdb) i threads
  Id   Target Id         Frame
  3    Thread 0x7ff6104c1700 (LWP 32794) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
  2    Thread 0x7ff6104bf700 (LWP 32795) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
* 1    Thread 0x7ff60fabe700 (LWP 32796) 0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6

The bt command shows the stack frame of the current thread(which is the murder). "i threads" commands shows all the threads, the thread number which begins with * is the current thread.

bt命令显示当前线程的堆栈框架(这是谋杀)。“i threads”命令显示所有线程,以*开头的线程数是当前线程。

As for "How are the threads numbered?", it depends on the OS. you can refer the gdb manual for more information.

至于“线程是如何编号的?”,这取决于操作系统。有关更多信息,请参阅gdb手册。

#1


3  

When you use gdb to analyze the core dump file, the gdb will stop at the function which causes program core dump. And the current thread will be the murder. Take the following program as an example:

当您使用gdb来分析核心转储文件时,gdb将停止在导致程序核心转储的函数上。而当前的线索将是谋杀。以以下程序为例:

#include <stdio.h>
#include <pthread.h>
void *thread_func(void *p_arg)
{
        while (1)
        {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, NULL);

        sleep(1000);
        return;
}

The t2 thread will cause program down because it refers a NULL pointer. After the program down, use gdb to analyze the core dump file:

t2线程会导致程序下降,因为它引用一个空指针。程序结束后,使用gdb分析核心转储文件:

[root@localhost nan]# gdb -q a core.32794
Reading symbols from a...done.
[New LWP 32796]
[New LWP 32795]
[New LWP 32794]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
(gdb)

The gdb stops at __strlen_sse2 function, this means this function causes the program down. Then use bt command to see it is called by which thread:

gdb在__strlen_sse2函数处停止,这意味着该函数导致程序关闭。然后使用bt命令查看它被调用的线程:

(gdb) bt
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00000034e4268cdb in puts () from /lib64/libc.so.6
#2  0x00000000004005cc in thread_func (p_arg=0x0) at a.c:7
#3  0x00000034e4a079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000034e42e8b6d in clone () from /lib64/libc.so.6
(gdb) i threads
  Id   Target Id         Frame
  3    Thread 0x7ff6104c1700 (LWP 32794) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
  2    Thread 0x7ff6104bf700 (LWP 32795) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
* 1    Thread 0x7ff60fabe700 (LWP 32796) 0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6

The bt command shows the stack frame of the current thread(which is the murder). "i threads" commands shows all the threads, the thread number which begins with * is the current thread.

bt命令显示当前线程的堆栈框架(这是谋杀)。“i threads”命令显示所有线程,以*开头的线程数是当前线程。

As for "How are the threads numbered?", it depends on the OS. you can refer the gdb manual for more information.

至于“线程是如何编号的?”,这取决于操作系统。有关更多信息,请参阅gdb手册。