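First, download three debuginfo packages.
The kernel version in the package names must match the output of uname -r on the crashed machine. The packages can be downloaded from http://debuginfo.centos.org/7/x86_64/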
kernel-debug-debuginfo-3.10.0-862.14.4.el7.x86_64.rpm
kernel-debuginfo-3.10.0-862.14.4.el7.x86_64.rpm
kernel-debuginfo-common-x86_64-3.10.0-862.14.4.el7.x86_64.rpm
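A minimal sketch of installing the packages and opening the dump, assuming the three RPMs sit in the current directory and the commands run as root. The vmlinux and vmcore paths are the ones reported by the crash session in this post; the helper names `debug_vmlinux` and `open_vmcore` are made up for illustration.

```shell
# Hypothetical helpers; the names are illustrative only.

# Path where kernel-debuginfo places the debug vmlinux for a kernel release.
# $1 = kernel release, as printed by `uname -r` on the crashed machine
debug_vmlinux() {
    echo "/usr/lib/debug/lib/modules/$1/vmlinux"
}

# Install the three debuginfo RPMs and open a vmcore with crash.
# $1 = kernel release, $2 = path to the vmcore file
open_vmcore() {
    rpm -ivh "kernel-debuginfo-common-x86_64-$1.rpm" \
             "kernel-debuginfo-$1.rpm" \
             "kernel-debug-debuginfo-$1.rpm"
    crash "$(debug_vmlinux "$1")" "$2"
}
```

For the dump analysed here, the call would be `open_vmcore 3.10.0-862.14.4.el7.x86_64 /var/crash/127.0.0.1-2018-11-03-22:12:29/vmcore`.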
crash 7.2.0-6.el7
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [858MB]: patching 82671 gdb minimal_symbol values

      KERNEL: /usr/lib/debug/lib/modules/3.10.0-862.14.4.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2018-11-03-22:12:29/vmcore [PARTIAL DUMP]
        CPUS: 48
        DATE: Sat Nov 3 22:12:19 2018
      UPTIME: 1 days, 23:16:57
LOAD AVERAGE: 0.00, 0.07, 0.26
       TASKS: 1158
    NODENAME: localhost.localdomain
     RELEASE: 3.10.0-862.14.4.el7.x86_64
     VERSION: #1 SMP Wed Sep 26 15:12:11 UTC 2018
     MACHINE: x86_64 (2300 Mhz)
      MEMORY: 191.6 GB
       PANIC: "SysRq : Trigger a crash"
         PID: 230493
     COMMAND: "bash"
        TASK: ffff9fbee1f56eb0 [THREAD_INFO: ffff9fbf07700000]
         CPU: 5
       STATE: TASK_RUNNING (SYSRQ)

crash> bt
PID: 230493 TASK: ffff9fbee1f56eb0 CPU: 5 COMMAND: "bash"
 #0 [ffff9fbf07703ae8] machine_kexec at ffffffffb6a62a0a
 #1 [ffff9fbf07703b48] __crash_kexec at ffffffffb6b166c2
 #2 [ffff9fbf07703c18] crash_kexec at ffffffffb6b167b0
 #3 [ffff9fbf07703c30] oops_end at ffffffffb711d728
 #4 [ffff9fbf07703c58] no_context at ffffffffb710c84d
 #5 [ffff9fbf07703ca8] __bad_area_nosemaphore at ffffffffb710c8e4
 #6 [ffff9fbf07703cf8] bad_area_nosemaphore at ffffffffb710ca55
 #7 [ffff9fbf07703d08] __do_page_fault at ffffffffb71206e0
 #8 [ffff9fbf07703d70] do_page_fault at ffffffffb71208d5
 #9 [ffff9fbf07703da0] page_fault at ffffffffb711c758
    [exception RIP: sysrq_handle_crash+22]
    RIP: ffffffffb6e33b56 RSP: ffff9fbf07703e58 RFLAGS: 00010246
    RAX: ffffffffb6e33b40 RBX: ffffffffb76d7b20 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff9fbf18b53978 RDI: 0000000000000063
    RBP: ffff9fbf07703e58 R8: ffffffffb79c28bc R9: ffffffffb79ff607
    R10: 0000000000000b37 R11: 0000000000000b36 R12: 0000000000000063
    R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff9fbf07703e60] __handle_sysrq at ffffffffb6e3430f
#11 [ffff9fbf07703e90] write_sysrq_trigger at ffffffffb6e347e8
#12 [ffff9fbf07703ea8] proc_reg_write at ffffffffb6c94e00
#13 [ffff9fbf07703ec8] vfs_write at ffffffffb6c1f240
#14 [ffff9fbf07703f08] sys_write at ffffffffb6c2006f
#15 [ffff9fbf07703f50] system_call_fastpath at ffffffffb712579b
    RIP: 00007f6eea301cd0 RSP: 00007ffceabd5e10 RFLAGS: 00010246
    RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000000
    RDX: 0000000000000002 RSI: 00007f6eeac2c000 RDI: 0000000000000001
    RBP: 00007f6eeac2c000 R8: 000000000000000a R9: 00007f6eeac12740
    R10: 00007f6eeac12740 R11: 0000000000000246 R12: 00007f6eea5d9400
    R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
    ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b

crash> ps
   PID   PPID  CPU  TASK              ST  %MEM  VSZ  RSS  COMM
>    0      0    0  ffffffffb7616480  RU   0.0    0    0  [swapper/0]
>    0      0    1  ffff9fa85ac33f40  RU   0.0    0    0  [swapper/1]
>    0      0    2  ffff9fa85ac34f10  RU   0.0    0    0  [swapper/2]
>    0      0    3  ffff9fa85ac35ee0  RU   0.0    0    0  [swapper/3]
>    0      0    4  ffff9fa85ac36eb0  RU   0.0    0    0  [swapper/4]
     0      0    5  ffff9fa85ac60000  RU   0.0    0    0  [swapper/5]
>    0      0    6  ffff9fa85ac60fd0  RU   0.0    0    0  [swapper/6]
>    0      0    7  ffff9fa85ac61fa0  RU   0.0    0    0  [swapper/7]
>    0      0    8  ffff9fa85ac62f70  RU   0.0    0    0  [swapper/8]
>    0      0    9  ffff9fa85ac63f40  RU   0.0    0    0  [swapper/9]
>    0      0   10  ffff9fa85ac64f10  RU   0.0    0    0  [swapper/10]
>    0      0   11  ffff9fa85ac65ee0  RU   0.0    0    0  [swapper/11]
>    0      0   12  ffff9fbfdb3c0000  RU   0.0    0    0  [swapper/12]
>    0      0   13  ffff9fbfdb3c0fd0  RU   0.0    0    0  [swapper/13]
>    0      0   14  ffff9fbfdb3c1fa0  RU   0.0    0    0  [swapper/14]
>    0      0   15  ffff9fbfdb3c2f70  RU   0.0    0    0  [swapper/15]
>    0      0   16  ffff9fbfdb3c3f40  RU   0.0    0    0  [swapper/16]
>    0      0   17  ffff9fbfdb3c4f10  RU   0.0    0    0  [swapper/17]
>    0      0   18  ffff9fbfdb3c5ee0  RU   0.0    0    0  [swapper/18]
>    0      0   19  ffff9fbfdb3c6eb0  RU   0.0    0    0  [swapper/19]
>    0      0   20  ffff9fbfdb3f0000  RU   0.0    0    0  [swapper/20]
>    0      0   21  ffff9fbfdb3f0fd0  RU   0.0    0    0  [swapper/21]
>    0      0   22  ffff9fbfdb3f1fa0  RU   0.0    0    0  [swapper/22]
>    0      0   23  ffff9fbfdb3f2f70  RU   0.0    0    0  [swapper/23]
>    0      0   24  ffff9fa85ac66eb0  RU   0.0    0    0  [swapper/24]
>    0      0   25  ffff9fa85ac98000  RU   0.0    0    0  [swapper/25]
>    0      0   26  ffff9fa85ac98fd0  RU   0.0    0    0  [swapper/26]
>    0      0   27  ffff9fa85ac99fa0  RU   0.0    0    0  [swapper/27]
>    0      0   28  ffff9fa85ac9af70  RU   0.0    0    0  [swapper/28]
>    0      0   29  ffff9fa85ac9bf40  RU   0.0    0    0  [swapper/29]
>    0      0   30  ffff9fa85ac9cf10  RU   0.0    0    0  [swapper/30]
>    0      0   31  ffff9fa85ac9dee0  RU   0.0    0    0  [swapper/31]
>    0      0   32  ffff9fa85ac9eeb0  RU   0.0    0    0  [swapper/32]
>    0      0   33  ffff9fa85acc0000  RU   0.0    0    0  [swapper/33]
>    0      0   34  ffff9fa85acc0fd0  RU   0.0    0    0  [swapper/34]
>    0      0   35  ffff9fa85acc1fa0  RU   0.0    0    0  [swapper/35]
>    0      0   36  ffff9fbfdb3f6eb0  RU   0.0    0    0  [swapper/36]
>    0      0   37  ffff9fbfdb3f5ee0  RU   0.0    0    0  [swapper/37]
>    0      0   38  ffff9fbfdb3f4f10  RU   0.0    0    0  [swapper/38]
>    0      0   39  ffff9fbfdb3f3f40  RU   0.0    0    0  [swapper/39]
>    0      0   40  ffff9fbfdac18000  RU   0.0    0    0  [swapper/40]
>    0      0   41  ffff9fbfdac18fd0  RU   0.0    0    0  [swapper/41]
>    0      0   42  ffff9fbfdac19fa0  RU   0.0    0    0  [swapper/42]
>    0      0   43  ffff9fbfdac1af70  RU   0.0    0    0  [swapper/43]
>    0      0   44  ffff9fbfdac1bf40  RU   0.0    0    0  [swapper/44]
>    0      0   45  ffff9fbfdac1cf10  RU   0.0    0    0  [swapper/45]
crash>
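The log command is important: many faults leave traces in the dmesg messages. After a crash, generally only the dmesg messages from the most recent crash are available.
crash> log
This prints the dmesg messages recorded before the crash.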
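Because the kernel log can be long, it can help to save it and search it outside the interactive session. The sketch below relies on crash's `-s` (silent) and `-i <file>` (read commands from a file) options. The function names and the keyword list are assumptions chosen for illustration, not part of the original procedure.

```shell
# Write a crash command file that dumps the kernel log and exits.
# $1 = path of the command file to create
write_log_cmds() {
    printf 'log\nexit\n' > "$1"
}

# Run crash non-interactively and save the log to a file.
# $1 = vmlinux, $2 = vmcore, $3 = output file
dump_log() {
    cmds=$(mktemp)
    write_log_cmds "$cmds"
    crash -s -i "$cmds" "$1" "$2" > "$3"
    rm -f "$cmds"
}

# Print log lines that often accompany a crash, with their line numbers.
# $1 = saved log file
suspicious_lines() {
    grep -n -i -E 'panic|oops|bug:|call trace|sysrq' "$1"
}
```

For the dump in this post, a search like this should surface the "SysRq : Trigger a crash" message that also appears in the PANIC field.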
Check memory usage at the time of the crash
crash> kmem -i
                 PAGES      TOTAL     PERCENTAGE
   TOTAL MEM  49362433   188.3 GB     ----
        FREE  48239978     184 GB     97% of TOTAL MEM
        USED   1122455     4.3 GB      2% of TOTAL MEM
      SHARED    569792     2.2 GB      1% of TOTAL MEM
     BUFFERS         0          0      0% of TOTAL MEM
      CACHED    580597     2.2 GB      1% of TOTAL MEM
        SLAB     66681   260.5 MB      0% of TOTAL MEM

  TOTAL HUGE         0          0     ----
   HUGE FREE         0          0      0% of TOTAL HUGE

  TOTAL SWAP   1048575       4 GB     ----
   SWAP USED    416721     1.6 GB     39% of TOTAL SWAP
   SWAP FREE    631854     2.4 GB     60% of TOTAL SWAP

COMMIT LIMIT  25729791    98.2 GB     ----
   COMMITTED   2367291       9 GB      9% of TOTAL LIMIT
crash>
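The TOTAL column follows from the PAGES column: assuming the standard 4 KiB page size on x86_64, size = pages × 4096 bytes. The percentages in this output appear to be truncated rather than rounded (for example 39% and 60% for swap). A small sketch with made-up helper names to check the figures:

```shell
# Convert a page count to GiB, assuming 4 KiB pages.
pages_to_gib() {
    awk -v pages="$1" 'BEGIN { printf "%.1f\n", pages * 4096 / (1024 * 1024 * 1024) }'
}

# Share of a total as a whole-number percentage, truncated.
percent_of() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%d\n", part * 100 / total }'
}

pages_to_gib 49362433          # TOTAL MEM pages -> 188.3
percent_of 48239978 49362433   # FREE as a share of TOTAL MEM -> 97
```

Note that crash labels these values "GB" although the arithmetic uses powers of two.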
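ps shows the processes that existed at the time of the crash. It can be combined with grep to filter the output.
net shows network information at the time of the crash.
bt shows the backtrace of the crashing task. Read it from the bottom up.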
PID: 230493 TASK: ffff9fbee1f56eb0 CPU: 5 COMMAND: "bash"
 #0 [ffff9fbf07703ae8] machine_kexec at ffffffffb6a62a0a
 #1 [ffff9fbf07703b48] __crash_kexec at ffffffffb6b166c2
 #2 [ffff9fbf07703c18] crash_kexec at ffffffffb6b167b0
 #3 [ffff9fbf07703c30] oops_end at ffffffffb711d728
 #4 [ffff9fbf07703c58] no_context at ffffffffb710c84d
 #5 [ffff9fbf07703ca8] __bad_area_nosemaphore at ffffffffb710c8e4
 #6 [ffff9fbf07703cf8] bad_area_nosemaphore at ffffffffb710ca55
 #7 [ffff9fbf07703d08] __do_page_fault at ffffffffb71206e0
 #8 [ffff9fbf07703d70] do_page_fault at ffffffffb71208d5
 #9 [ffff9fbf07703da0] page_fault at ffffffffb711c758
    [exception RIP: sysrq_handle_crash+22]
    RIP: ffffffffb6e33b56 RSP: ffff9fbf07703e58 RFLAGS: 00010246
    RAX: ffffffffb6e33b40 RBX: ffffffffb76d7b20 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff9fbf18b53978 RDI: 0000000000000063
    RBP: ffff9fbf07703e58 R8: ffffffffb79c28bc R9: ffffffffb79ff607
    R10: 0000000000000b37 R11: 0000000000000b36 R12: 0000000000000063
    R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff9fbf07703e60] __handle_sysrq at ffffffffb6e3430f
#11 [ffff9fbf07703e90] write_sysrq_trigger at ffffffffb6e347e8
#12 [ffff9fbf07703ea8] proc_reg_write at ffffffffb6c94e00
#13 [ffff9fbf07703ec8] vfs_write at ffffffffb6c1f240
#14 [ffff9fbf07703f08] sys_write at ffffffffb6c2006f
#15 [ffff9fbf07703f50] system_call_fastpath at ffffffffb712579b
    RIP: 00007f6eea301cd0 RSP: 00007ffceabd5e10 RFLAGS: 00010246
    RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000000
    RDX: 0000000000000002 RSI: 00007f6eeac2c000 RDI: 0000000000000001
    RBP: 00007f6eeac2c000 R8: 000000000000000a R9: 00007f6eeac12740
    R10: 00007f6eeac12740 R11: 0000000000000246 R12: 00007f6eea5d9400
    R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
    ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
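To follow the trace in execution order, the numbered frames can be reversed so the oldest call comes first. This is a sketch that assumes the bt output has been saved to a file (for example with crash's output redirection, `bt > bt.txt`). The function name is illustrative.

```shell
# Print the function names from a saved bt listing, oldest call first.
# Frame lines look like: "#12 [ffff9fbf07703ea8] proc_reg_write at ffffffffb6c94e00"
# $1 = file containing the bt output
call_chain() {
    awk '$1 ~ /^#[0-9]+$/ && $4 == "at" { frame[n++] = $3 }
         END { for (i = n - 1; i >= 0; i--) print frame[i] }' "$1"
}
```

For the trace above this starts with system_call_fastpath, sys_write, vfs_write and ends with machine_kexec. The faulting function, sysrq_handle_crash, is not a numbered frame: it appears on the [exception RIP: ...] line between frames #9 and #10.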
The sys command shows the time of the crash (the DATE field)
crash> sys
      KERNEL: /usr/lib/debug/lib/modules/3.10.0-862.14.4.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2018-11-03-22:12:29/vmcore [PARTIAL DUMP]
        CPUS: 48
        DATE: Sat Nov 3 22:12:19 2018
      UPTIME: 1 days, 23:16:57
LOAD AVERAGE: 0.00, 0.07, 0.26
       TASKS: 1158
    NODENAME: localhost.localdomain
     RELEASE: 3.10.0-862.14.4.el7.x86_64
     VERSION: #1 SMP Wed Sep 26 15:12:11 UTC 2018
     MACHINE: x86_64 (2300 Mhz)
      MEMORY: 191.6 GB
       PANIC: "SysRq : Trigger a crash"
crash>
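List the tasks that were running on a CPU at the time of the crash (in ps output, a leading > marks the task active on each CPU)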
crash> ps | grep '>'
>      0      0    0  ffffffffb7616480  RU   0.0       0     0  [swapper/0]
>      0      0    1  ffff9fa85ac33f40  RU   0.0       0     0  [swapper/1]
>      0      0    2  ffff9fa85ac34f10  RU   0.0       0     0  [swapper/2]
>      0      0    3  ffff9fa85ac35ee0  RU   0.0       0     0  [swapper/3]
>      0      0    4  ffff9fa85ac36eb0  RU   0.0       0     0  [swapper/4]
>      0      0    6  ffff9fa85ac60fd0  RU   0.0       0     0  [swapper/6]
>      0      0    7  ffff9fa85ac61fa0  RU   0.0       0     0  [swapper/7]
>      0      0    8  ffff9fa85ac62f70  RU   0.0       0     0  [swapper/8]
>      0      0    9  ffff9fa85ac63f40  RU   0.0       0     0  [swapper/9]
>      0      0   10  ffff9fa85ac64f10  RU   0.0       0     0  [swapper/10]
>      0      0   11  ffff9fa85ac65ee0  RU   0.0       0     0  [swapper/11]
>      0      0   12  ffff9fbfdb3c0000  RU   0.0       0     0  [swapper/12]
>      0      0   13  ffff9fbfdb3c0fd0  RU   0.0       0     0  [swapper/13]
>      0      0   14  ffff9fbfdb3c1fa0  RU   0.0       0     0  [swapper/14]
>      0      0   15  ffff9fbfdb3c2f70  RU   0.0       0     0  [swapper/15]
>      0      0   16  ffff9fbfdb3c3f40  RU   0.0       0     0  [swapper/16]
>      0      0   17  ffff9fbfdb3c4f10  RU   0.0       0     0  [swapper/17]
>      0      0   18  ffff9fbfdb3c5ee0  RU   0.0       0     0  [swapper/18]
>      0      0   19  ffff9fbfdb3c6eb0  RU   0.0       0     0  [swapper/19]
>      0      0   20  ffff9fbfdb3f0000  RU   0.0       0     0  [swapper/20]
>      0      0   21  ffff9fbfdb3f0fd0  RU   0.0       0     0  [swapper/21]
>      0      0   22  ffff9fbfdb3f1fa0  RU   0.0       0     0  [swapper/22]
>      0      0   23  ffff9fbfdb3f2f70  RU   0.0       0     0  [swapper/23]
>      0      0   24  ffff9fa85ac66eb0  RU   0.0       0     0  [swapper/24]
>      0      0   25  ffff9fa85ac98000  RU   0.0       0     0  [swapper/25]
>      0      0   26  ffff9fa85ac98fd0  RU   0.0       0     0  [swapper/26]
>      0      0   27  ffff9fa85ac99fa0  RU   0.0       0     0  [swapper/27]
>      0      0   28  ffff9fa85ac9af70  RU   0.0       0     0  [swapper/28]
>      0      0   29  ffff9fa85ac9bf40  RU   0.0       0     0  [swapper/29]
>      0      0   30  ffff9fa85ac9cf10  RU   0.0       0     0  [swapper/30]
>      0      0   31  ffff9fa85ac9dee0  RU   0.0       0     0  [swapper/31]
>      0      0   32  ffff9fa85ac9eeb0  RU   0.0       0     0  [swapper/32]
>      0      0   33  ffff9fa85acc0000  RU   0.0       0     0  [swapper/33]
>      0      0   34  ffff9fa85acc0fd0  RU   0.0       0     0  [swapper/34]
>      0      0   35  ffff9fa85acc1fa0  RU   0.0       0     0  [swapper/35]
>      0      0   36  ffff9fbfdb3f6eb0  RU   0.0       0     0  [swapper/36]
>      0      0   37  ffff9fbfdb3f5ee0  RU   0.0       0     0  [swapper/37]
>      0      0   38  ffff9fbfdb3f4f10  RU   0.0       0     0  [swapper/38]
>      0      0   39  ffff9fbfdb3f3f40  RU   0.0       0     0  [swapper/39]
>      0      0   40  ffff9fbfdac18000  RU   0.0       0     0  [swapper/40]
>      0      0   41  ffff9fbfdac18fd0  RU   0.0       0     0  [swapper/41]
>      0      0   42  ffff9fbfdac19fa0  RU   0.0       0     0  [swapper/42]
>      0      0   43  ffff9fbfdac1af70  RU   0.0       0     0  [swapper/43]
>      0      0   44  ffff9fbfdac1bf40  RU   0.0       0     0  [swapper/44]
>      0      0   45  ffff9fbfdac1cf10  RU   0.0       0     0  [swapper/45]
>      0      0   46  ffff9fbfdac1dee0  RU   0.0       0     0  [swapper/46]
>      0      0   47  ffff9fbfdac1eeb0  RU   0.0       0     0  [swapper/47]
> 230493 230480    5  ffff9fbee1f56eb0  RU   0.0  116760  3568  bash
crash>
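On a mostly idle machine nearly every CPU is running its idle task, so it is often quicker to drop the swapper entries and look only at what remains. A sketch that works on a saved ps listing; the file and the function name are assumptions for illustration.

```shell
# From a saved crash ps listing, print the tasks that were active on a CPU
# (lines starting with ">") and are not the per-CPU idle task.
# $1 = file containing the ps output
active_non_idle() {
    grep '^>' "$1" | grep -v '\[swapper/[0-9]*\]'
}
```

Applied to the listing above, this leaves a single line: the bash process (PID 230493) on CPU 5, which is the task that triggered the crash.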
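Find the programs using the most memory at the time of the crash. Here ps.txt holds the saved ps output. Once the leading > is stripped, field 7 is VSZ (virtual size), so the command below ranks by virtual size; use -k 8 to rank by RSS instead.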
[root@localhost home]# cat ps.txt | sed "s/^>//" | sort -n -k 7 |tail -n 20
 8641  7507  20  ffff9fbefa629fa0  IN  0.1  6954692  161904  JS Helper
 8642  7507  16  ffff9fbefa62af70  IN  0.1  6954692  161904  JS Helper
 8643  7507  21  ffff9fbefa62bf40  IN  0.1  6954692  161904  JS Helper
 8644  7507  37  ffff9fbefa62cf10  IN  0.1  6954692  161904  JS Helper
 8733  7507  37  ffff9fbefb6c4f10  IN  0.1  6954692  161904  llvmpipe-0
 8734  7507  37  ffff9fbefb6c3f40  IN  0.1  6954692  161904  llvmpipe-1
 8736  7507  32  ffff9fbefb6c5ee0  IN  0.1  6954692  161904  llvmpipe-2
 8737  7507  33  ffff9fbefe250000  IN  0.1  6954692  161904  llvmpipe-3
 8738  7507  38  ffff9fbefe250fd0  IN  0.1  6954692  161904  llvmpipe-4
 8739  7507  32  ffff9fbefe251fa0  IN  0.1  6954692  161904  llvmpipe-5
 8740  7507  32  ffff9fbefe252f70  IN  0.1  6954692  161904  llvmpipe-6
 8741  7507  32  ffff9fbefe253f40  IN  0.1  6954692  161904  llvmpipe-7
 8742  7507  32  ffff9fbefe254f10  IN  0.1  6954692  161904  llvmpipe-8
 8743  7507  32  ffff9fbefe255ee0  IN  0.1  6954692  161904  llvmpipe-9
 8744  7507  32  ffff9fbefe256eb0  IN  0.1  6954692  161904  llvmpipe-10
 8745  7507  32  ffff9fa8591f0000  IN  0.1  6954692  161904  llvmpipe-11
 8746  7507  32  ffff9fa8591f0fd0  IN  0.1  6954692  161904  llvmpipe-12
 8747  7507  32  ffff9fa8591f1fa0  IN  0.1  6954692  161904  llvmpipe-13
 8748  7507  32  ffff9fa8591f2f70  IN  0.1  6954692  161904  llvmpipe-14
 8749  7507  32  ffff9fa8591f3f40  IN  0.1  6954692  161904  llvmpipe-15
[root@localhost home]#
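The pipeline above sorts on field 7, which is VSZ in crash's ps layout (PID PPID CPU TASK ST %MEM VSZ RSS COMM). As a hedged variant, the sketch below ranks by RSS (field 8) instead and skips the header line. The function name is made up, and the input is assumed to be a saved ps listing such as ps.txt.

```shell
# Print the n entries with the largest RSS from a saved crash ps listing.
# $1 = file containing the ps output, $2 = number of entries to print
top_by_rss() {
    sed 's/^>/ /' "$1" |          # drop the active-task marker
        awk '$1 ~ /^[0-9]+$/' |   # keep only rows that start with a PID
        sort -k8,8nr |            # numeric sort on RSS, largest first
        head -n "$2"
}
```

Usage would be `top_by_rss ps.txt 20`. Threads of one process report the same VSZ and RSS, so a multi-threaded program can fill the whole list, as the JS Helper and llvmpipe entries do above.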