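Previous example: http://www.cnblogs.com/weishengzhong/p/7429840.html
The previous example has a bug at run time: after the driver module is loaded into the kernel, leaving the system idle for a while, with no input at all, triggers a kernel error. The full output is: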
Unable to handle kernel NULL pointer dereference at virtual address 00000004
pgd = c0104000
[00000004] *pgd=00000000
Internal error: Oops: 17 [#1] ARM
Modules linked in: buttondev(O) buttondrv(O)
CPU: 0 Tainted: G O (3.4.2 #13)
PC is at buttons_timer_function+0xc/0x68 [buttondrv]
LR is at run_timer_softirq+0x10c/0x244
pc : [<bf0000a0>] lr : [<c01322a4>] psr: 80000013
sp : c0589ec8 ip : bf000494 fp : c05d072c
r10: c05d032c r9 : c05d052c r8 : c0588000
r7 : bf000094 r6 : c0589ee8 r5 : bf000494 r4 : 00000000
r3 : 80000013 r2 : 00000000 r1 : 60000093 r0 : 00000000
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: c000717f Table: 33ab8000 DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc0588270)
Stack: (0xc0589ec8 to 0xc058a000)
9ec0: c05cfb20 00000100 c0589ee8 c01322a4 c05a2a9c c059cc08
9ee0: c05a896c c05d092c c0589ee8 c0589ee8 00000001 00000004 00000001 00000001
9f00: 00000100 c05cf9c0 0000000a c0588000 c05a1754 c012d524 c05a896c 00000000
9f20: c0589f94 0000001e c05a896c 00000000 c0589f94 c0580520 41129200 c0590020
9f40: 00000000 c012d748 0000001e c01163a0 c01164d4 60000013 f6000000 c01150a4
9f60: f6100000 00000032 f6100000 60000013 c0588000 c05beb68 c05932ac c05beac0
9f80: c0580520 41129200 c0590020 00000000 00000000 c0589fa8 c01164c8 c01164d4
9fa0: 60000013 ffffffff c0588000 c0116b68 c0590170 c05beb40 c058137c c055e868
9fc0: 00000000 00000000 c055e3d4 00000000 00000000 c058137c 00000000 c0007175
9fe0: c0590094 c0581378 c05932a4 30104000 3057f878 30108040 00000000 00000000
[<bf0000a0>] (buttons_timer_function+0xc/0x68 [buttondrv]) from [<c01322a4>] (run_timer_softirq+0x10c/0x244)
[<c01322a4>] (run_timer_softirq+0x10c/0x244) from [<c012d524>] (__do_softirq+0x88/0x148)
[<c012d524>] (__do_softirq+0x88/0x148) from [<c012d748>] (irq_exit+0x48/0x50)
[<c012d748>] (irq_exit+0x48/0x50) from [<c01163a0>] (handle_IRQ+0x34/0x84)
[<c01163a0>] (handle_IRQ+0x34/0x84) from [<c01150a4>] (__irq_svc+0x24/0xa0)
[<c01150a4>] (__irq_svc+0x24/0xa0) from [<c01164d4>] (default_idle+0x28/0x50)
[<c01164d4>] (default_idle+0x28/0x50) from [<c0116b68>] (cpu_idle+0x94/0xbc)
[<c0116b68>] (cpu_idle+0x94/0xbc) from [<c055e868>] (start_kernel+0x260/0x2f4)
Code: bf000494 e92d4070 e59f5058 e5954020 (e5940004)
---[ end trace c5ecb8c491baf7b3 ]---
Kernel panic - not syncing: Fatal exception in interrupt
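A quick look at the output shows that the error occurred at "PC is at buttons_timer_function+0xc/0x68 [buttondrv]", that is, 0xc bytes into buttons_timer_function, a function 0x68 bytes long in the buttondrv module. The output of cat /proc/kallsyms lists the addresses of both the loaded module symbols and the kernel's own functions; it shows that buttons_timer_function is loaded into kernel space at bf000094.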
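To make the lookup step concrete, here is a minimal user-space sketch that parses one line in the /proc/kallsyms layout (hex address, type letter, symbol name, optional module name). The helper name parse_kallsyms_line and the struct ksym type are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed entry: "<hex address> <type> <name> [module]". */
struct ksym {
    unsigned long addr;  /* load address of the symbol */
    char type;           /* symbol type letter, e.g. 't' or 'T' for code */
    char name[128];      /* symbol name, without the module suffix */
};

/*
 * Parse a single kallsyms-style line into *out.
 * Returns 0 on success, -1 if the three leading fields are not all present.
 */
static int parse_kallsyms_line(const char *line, struct ksym *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%lx %c %127s", &out->addr, &out->type, out->name) != 3)
        return -1;
    return 0;
}
```

Feeding it a line such as "bf000094 t buttons_timer_function [buttondrv]" fills addr with 0xbf000094. That sample line is reconstructed from the values quoted in this post, not copied from the original system, and the type letter t is an assumption.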
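Now look at the register values at the time of the fault: pc : [<bf0000a0>] lr : [<c01322a4>]. The PC points at bf0000a0, inside the loaded module, and LR holds c01322a4, the return address inside the kernel function run_timer_softirq, which is the function that called the timer callback. So why does the code at bf0000a0 fail?
Disassemble buttondrv.ko with arm-linux-objdump -D buttondrv.ko > buttondrv.dis and look at the following part of the listing: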
00000094 <buttons_timer_function>:
94: e92d4070 push {r4, r5, r6, lr}
98: e59f5058 ldr r5, [pc, #88] ; f8 <buttons_timer_function+0x64>
9c: e5954020 ldr r4, [r5, #32]
a0: e5940004 ldr r0, [r4, #4]
a4: ebfffffe bl 0 <s3c2410_gpio_getpin>
a8: e2506000 subs r6, r0, #0 ; 0x0
ac: 1a00000a bne dc <buttons_timer_function+0x48>
b0: e3a01001 mov r1, #1 ; 0x1
b4: e1a03001 mov r3, r1
b8: e5942000 ldr r2, [r4]
bc: e595001c ldr r0, [r5, #28]
c0: ebfffffe bl 0 <input_event>
c4: e1a01006 mov r1, r6
c8: e595001c ldr r0, [r5, #28]
cc: e1a02001 mov r2, r1
d0: e1a03001 mov r3, r1
d4: e8bd4070 pop {r4, r5, r6, lr}
d8: eafffffe b 0 <input_event>
dc: e3a01001 mov r1, #1 ; 0x1
e0: e5942000 ldr r2, [r4]
e4: e595001c ldr r0, [r5, #28]
e8: e3a03000 mov r3, #0 ; 0x0
ec: ebfffffe bl 0 <input_event>
f0: e3a01000 mov r1, #0 ; 0x0
f4: eafffff3 b c8 <buttons_timer_function+0x34>
f8: 00000000 .word 0x00000000
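buttons_timer_function is loaded at bf000094 and starts at offset 0x94 in the disassembly, so the faulting address bf0000a0 (bf000094 + 0xc) corresponds to offset 0xa0 in the listing.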
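The address arithmetic above can be written out as a small helper. This is only a sketch: the function name oops_pc_to_objdump_offset is invented here, and it assumes the faulting PC lies in the same function whose load address and listing offset are passed in.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate a faulting PC inside a loaded module into the matching offset
 * in the objdump listing of the .ko file.
 *
 * fault_pc      : pc value from the oops (0xbf0000a0 in this post)
 * sym_load_addr : address of the function from /proc/kallsyms (0xbf000094)
 * sym_obj_off   : offset of the same function in the disassembly (0x94)
 */
static uint32_t oops_pc_to_objdump_offset(uint32_t fault_pc,
                                          uint32_t sym_load_addr,
                                          uint32_t sym_obj_off)
{
    /* The fault must lie at or after the start of the symbol. */
    assert(fault_pc >= sym_load_addr);
    return sym_obj_off + (fault_pc - sym_load_addr);
}
```

With the values from this oops, the difference between the PC and the symbol address is 0xc, the same number the kernel prints as the +0xc in buttons_timer_function+0xc/0x68.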
The next instruction, at a4, is the call to s3c2410_gpio_getpin, so the faulting instruction is setting up that call. Here is the C source of buttons_timer_function:
static void buttons_timer_function ( unsigned long data )
{
    struct gpio_keys_button *buttonkey = ( struct gpio_keys_button * ) irq_pd;
    u32 pinval;

    pinval = s3c2410_gpio_getpin ( buttonkey->gpio );
    if ( pinval ) {
        input_event ( iputdev, EV_KEY, buttonkey->code, 0 );
        input_sync ( iputdev );
    } else {
        input_event ( iputdev, EV_KEY, buttonkey->code, 1 );
        input_sync ( iputdev );
    }
}
The call to s3c2410_gpio_getpin is this statement:
pinval = s3c2410_gpio_getpin ( buttonkey->gpio );
It takes a single argument, buttonkey->gpio, which under the ARM calling convention is passed in register r0. Now look at how the assembly sets r0:
a0: e5940004 ldr r0, [r4, #4]
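This instruction sets r0: it reads the word stored at address r4 + 4 and places it in r0. Here r4 holds the buttonkey pointer (loaded by the instruction at 9c), and offset 4 is its gpio member. The register dump shows that r4 was 00000000 at the time of the fault, so the load tried to read from address 0x00000004, which is exactly what the system reported: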
Unable to handle kernel NULL pointer dereference at virtual address 00000004
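Why does this happen? The argument must be at fault. It is the gpio member read through buttonkey, a pointer to struct gpio_keys_button copied from irq_pd, and the register dump shows that this pointer is NULL. Presumably irq_pd had not been assigned yet when the timer fired, which would fit the crash appearing without any key being pressed. Dereferencing a NULL pointer causes the fault, and the way to avoid it is to check the pointer before using it.
So before calling s3c2410_gpio_getpin, test whether the pointer is NULL by adding the following statement: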
if ( !buttonkey ) return;
Recompile and reload the module; the problem no longer occurs.
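The effect of the guard can be tried outside the kernel with a small user-space mock. Everything prefixed with mock_ below is a stand-in invented for this sketch and is not a kernel API: struct mock_button copies only the two fields the callback reads, mock_gpio_getpin replaces s3c2410_gpio_getpin, and mock_report_key replaces the input_event and input_sync pair.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the two fields of struct gpio_keys_button used by the callback. */
struct mock_button {
    unsigned int code;  /* key code reported to the input layer */
    int gpio;           /* pin number passed to the pin-read function */
};

static struct mock_button *irq_pd;  /* stays NULL until something assigns it */
static int mock_pin_level;          /* level returned by the fake pin read */
static int reported_events;         /* number of key events reported so far */
static int last_reported_value;     /* 1 = pressed, 0 = released */

/* Fake pin read; this mock only accepts non-negative pin numbers. */
static unsigned int mock_gpio_getpin(int gpio)
{
    assert(gpio >= 0);
    return (unsigned int)mock_pin_level;
}

/* Fake event report: just records what would have been sent. */
static void mock_report_key(unsigned int code, int value)
{
    (void)code;
    last_reported_value = value;
    reported_events++;
}

/* Timer callback with the NULL check added before the first dereference. */
static void buttons_timer_function(unsigned long data)
{
    struct mock_button *buttonkey = irq_pd;

    (void)data;
    if (!buttonkey)
        return;  /* nothing to read yet: return without touching the pointer */

    if (mock_gpio_getpin(buttonkey->gpio))
        mock_report_key(buttonkey->code, 0);  /* pin high: key released */
    else
        mock_report_key(buttonkey->code, 1);  /* pin low: key pressed */
}
```

Calling buttons_timer_function(0) while irq_pd is still NULL returns without reporting anything. Once irq_pd points at a button and mock_pin_level is 0, the same call reports one key-press event, mirroring the else branch of the original function.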