2016-10-14 06:25:16

by Abdul Haleem

Subject: [PowerPC] Kernel panic while running CPU toggle test on 4.8.0 kernel

Hi,

A kernel Oops followed by a kernel panic is seen while running the rcutorture test on a 4.8.0 kernel.
Machine Type : PowerPC Bare Metal

RCU torture test steps:
1. modprobe rcutorture
2. Start toggling CPUs offline and online while the torture test is running
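
For reference, step 2 can be approximated through the CPU hotplug files in sysfs. The following is a minimal sketch, not the autotest harness that produced the log below: the function name set_cpus_online and its sysfs_root parameter are illustrative, and it assumes the standard /sys/devices/system/cpu/cpuN/online control files and root privileges.

```python
import glob
import os
import re


def set_cpus_online(online, sysfs_root="/sys/devices/system/cpu"):
    """Write 1 or 0 to every cpuN/online file; return the CPU numbers changed."""
    touched = []
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        control = os.path.join(path, "online")
        if match is None or not os.path.exists(control):
            continue  # not a CPU directory, or no hotplug control for this CPU
        try:
            with open(control, "w") as f:
                f.write("1" if online else "0")
        except OSError:
            continue  # e.g. the kernel refuses to offline the last online CPU
        touched.append(int(match.group(1)))
    return sorted(touched)
```

Calling set_cpus_online(False) and then set_cpus_online(True) in a loop, with a short sleep in between, while rcutorture is loaded gives roughly the "Offline all cpus" / "Online all cpus" pattern seen in the log.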

dmesg was flooded with Oops messages, followed by a kernel panic that resulted in a machine reboot.

trace messages:
22:51:27 rcu-torture: rcu_torture_cbflood task started
22:51:27 13:23:02 INFO | Online all cpus 32
22:51:37 13:23:12 INFO | Offline all cpus 0 - 31
22:51:37
22:51:37 Unable to handle kernel paging request for data at address 0x00000078
22:51:37 Faulting instruction address: 0xc0000000000f6b78
22:51:37 Oops: Kernel access of bad area, sig: 11 [#1]
22:51:37 SMP NR_CPUS=32 NUMA
22:51:37 PowerNV
22:51:37 Modules linked in: rcutorture torture powernv_op_panel leds_powernv led_class powernv_rng rng_core autofs4 [last unloaded: rcutorture]
22:51:37 CPU: 0 PID: 12 Comm: cpuhp/0 Not tainted 4.8.0-autotest #1
22:51:37 task: c0000007de2f2300 task.stack: c0000007de2f8000
22:51:37 NIP: c0000000000f6b78 LR: c0000000000f69a0 CTR: c0000000000f60b0
22:51:37 REGS: c0000007de2fb860 TRAP: 0300 Not tainted (4.8.0-autotest)
22:51:37 MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 84400482 XER: 20000000
22:51:37 CFAR: c0000000000089e0 DAR: 0000000000000078 DSISR: 40000000
22:51:37 SOFTE: 0
22:51:37 GPR00: c0000000000e8d30 c0000007de2fbae0 c000000000f95a00 0000000000000000
22:51:37 GPR04: c000000000cb5a00 0000000000000000 0000000000000000 c000000000cb5a00
22:51:37 GPR08: 00000007ff051000 c0000007ffb13300 00000000000f4240 0000000000160064
22:51:37 GPR12: 0000000024000484 c00000000fff8000 c0000000000d9418 c000000fde134d00
22:51:37 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000080
22:51:37 GPR20: 0000000000000010 c00000000298d000 c0000007da5f8000 0000000000000000
22:51:37 GPR24: 0000000000000010 0000000000000001 c0000007da5f8360 c000000000cc2300
22:51:37 GPR28: 0000000000000000 c000000000f9dae0 0000000000000010 0000000000000010
22:51:37 NIP [c0000000000f6b78] select_task_rq_fair+0xac8/0xca0
22:51:37 LR [c0000000000f69a0] select_task_rq_fair+0x8f0/0xca0
22:51:37 Call Trace:
22:51:37 [c0000007de2fbae0] [c0000007de2fbb90] 0xc0000007de2fbb90 (unreliable)
22:51:37 [c0000007de2fbbc0] [c0000000000e8d30] try_to_wake_up+0x140/0x520
22:51:37 [c0000007de2fbc40] [c000000000102334] __wake_up_common+0x84/0xf0
22:51:37 [c0000007de2fbca0] [c0000000001032b4] complete+0x54/0x90
22:51:37 [c0000007de2fbce0] [c0000000000b02f0] cpuhp_thread_fun+0x80/0x190
22:51:37 [c0000007de2fbd20] [c0000000000dee70] smpboot_thread_fn+0x290/0x2a0
22:51:37 [c0000007de2fbd80] [c0000000000d9518] kthread+0x108/0x130
22:51:37 [c0000007de2fbe30] [c00000000000c260] ret_from_kernel_thread+0x5c/0x7c
22:51:37 Instruction dump:
22:51:37 39243ea0 7d29502a 2fa90000 419e01ec 39400000 91490008 e92d0030 3ce2ffd2
22:51:37 39473e90 7f8a482a 7d3b4a14 e9490900 <e93c0078> 794aba42 7fa95040 419d0110
22:51:37 ---[ end trace 2b5fc0bd24058a84 ]---
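
As a reading aid for the trace above: the word marked <e93c0078> in the instruction dump is the faulting instruction, and it can be decoded by hand. The sketch below assumes it is a PowerPC DS-form load (primary opcode 58, which covers ld, ldu and lwa); the helper name decode_ds_form is illustrative and not part of any kernel tool.

```python
def decode_ds_form(word):
    """Split a 32-bit PowerPC DS-form instruction word into its fields."""
    disp = word & 0xFFFC              # DS field with the two low bits cleared
    if disp & 0x8000:                 # sign-extend the 16-bit displacement
        disp -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode (58 for ld/ldu/lwa)
        "rt": (word >> 21) & 0x1F,      # destination register
        "ra": (word >> 16) & 0x1F,      # base address register
        "disp": disp,                   # byte displacement added to the base
        "xo": word & 0x3,               # 0 = ld, 1 = ldu, 2 = lwa
    }


fields = decode_ds_form(0xE93C0078)
gpr28 = 0x0000000000000000            # GPR28 from the register dump above
effective_address = gpr28 + fields["disp"]
```

Under that assumption the word decodes to opcode 58, RT 9, RA 28, displacement 0x78, i.e. ld r9,0x78(r28). With GPR28 equal to zero the effective address is 0x78, which matches the DAR value and suggests a NULL pointer dereference at offset 0x78 inside select_task_rq_fair. Which structure that pointer belongs to cannot be determined from the dump alone.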

Detailed logs and the config file are attached.


Attachments:
rcu_config (87.32 kB)
build-testlogs.txt (207.18 kB)

2016-10-14 11:05:26

by Balbir Singh

Subject: Re: [PowerPC] Kernel panic while running CPU toggle test on 4.8.0 kernel



On 14/10/16 17:24, Abdul Haleem wrote:
> Hi,
>
> A kernel Oops followed by a kernel panic is seen while running the rcutorture test on a 4.8.0 kernel.
> Machine Type : PowerPC Bare Metal
>
> RCU torture test steps:
> 1. modprobe rcutorture
> 2. Start toggling CPUs offline and online while the torture test is running
>
> dmesg was flooded with Oops messages, followed by a kernel panic that resulted in a machine reboot.

Good bug. The report looks good.


Any chance you could add ftrace_dump_on_oops to the kernel command line and start this test
with tracing enabled? Otherwise, resolving select_task_rq_fair+0xac8/0xca0 to a source line
with addr2line would be helpful.

It would also be nice to resolve the CFAR value (CFAR: c0000000000089e0) with
addr2line.

Balbir Singh.
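
The addr2line lookups requested above could be scripted along the following lines. This is a sketch under assumptions: the vmlinux path is hypothetical, the image must match the running 4.8.0-autotest build and contain debug info, and the helper names are illustrative. The addresses are the NIP and CFAR values from the oops; -e, -f and -i are standard binutils addr2line options (input file, function names, inlined frames).

```python
import subprocess


def addr2line_cmd(vmlinux, addresses):
    """Build the addr2line command line for a list of kernel addresses."""
    return ["addr2line", "-e", vmlinux, "-f", "-i"] + [hex(a) for a in addresses]


def resolve(vmlinux, addresses):
    """Run addr2line and return its raw text output (needs binutils installed)."""
    result = subprocess.run(
        addr2line_cmd(vmlinux, addresses),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


NIP = 0xC0000000000F6B78   # select_task_rq_fair+0xac8/0xca0
CFAR = 0xC0000000000089E0  # CFAR value from the oops
```

For example, print(resolve("./vmlinux", [NIP, CFAR])) would print the function name and file:line for each address, assuming ./vmlinux is the matching unstripped image.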