2017-03-13 09:20:37

by Abdul Haleem

Subject: [BUG] [PowerPC] Kernel Oops when booting Linux mainline

Hi,

Mainline boot is broken on PowerPC bare metal, with the traces below:
Machine type: POWER8 bare metal

[ OK ] Mounted Debug File System.
[ OK ] Started Nameserver information manager.
[ OK ] Started LVM2 metadata daemon.
Unable to handle kernel paging request for data at address 0x300000079
Faulting instruction address: 0xc000000000811cac
Oops: Kernel access of bad area, sig: 11 [#1]
Unable to handle kernel paging request for instruction fetch
Unable to handle kernel paging request for data at address 0xc000007c77881e28
Faulting instruction address: 0xc000003c7acd7b80
Faulting instruction address: 0xc00000000029fd88
Thread overran stack, or stack corrupted
SMP NR_CPUS=2048
NUMA
PowerNV
Modules linked in: ip_table�KK~<(E) x_tables�\8|<(E) autofs4(E) raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq multipath bnx2x(E) mdio(E) libcrc32�^B�w<(E)
CPU: 21 PID: 137 Comm: kworker/21:0 Not tainted 4.11.0-rc1-00335-g56b24d1-dirty #1
Workqueue: events dbs_work_handler
task: c000003c8e86b380 task.stack: c000003c8eae0000
NIP: c000000000811cac LR: c000000000813014 CTR: c000000000811c50
REGS: c000003c8eae3980 TRAP: 0300 Not tainted (4.11.0-rc1-00335-g56b24d1-dirty)
MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
CR: 24002422 XER: 00000000
CFAR: c000000000008860 DAR: 0000000300000079 DSISR: 40000000 SOFTE: 1
GPR00: c000000000813014 c000003c8eae3c00 c000000001019800 c000003c780d0800
GPR04: 0000000000000540 0000000000000000 0000000000000000 c000003c8e86b380
GPR08: 0000000000000000 000000000000003c 000000000000003c 0000000000008397
GPR12: c000000000811c50 c00000000fe05400 c0000000000f2e08 c000003ffb073d80
GPR16: c000003ffc564328 c000003ffc5640f8 c000003ffc5640a0 0000000000000001
GPR20: 0000000000000000 0000000000000000 c000000000f50bc0 fffffffffffffef7
GPR24: 0000000000000000 c000003ffc564400 0000000000000000 0000000300000001
GPR28: c000003c780d0b60 0000000300000001 c000003c780d0800 c000003c780d0b60
NIP [c000000000811cac] od_dbs_update+0x5c/0x260
LR [c000000000813014] dbs_work_handler+0x54/0xa0
Call Trace:
[c000003c8eae3c00] [c000003ffc564880] 0xc000003ffc564880 (unreliable)
[c000003c8eae3c50] [c000000000813014] dbs_work_handler+0x54/0xa0
[c000003c8eae3c90] [c0000000000ea600] process_one_work+0x2a0/0x590
[c000003c8eae3d20] [c0000000000ea998] worker_thread+0xa8/0x660
[c000003c8eae3dc0] [c0000000000f2f4c] kthread+0x14c/0x190
[c000003c8eae3e30] [c00000000000b4e8] ret_from_kernel_thread+0x5c/0x74
Instruction dump:
f821ffb1 ebe30340 813f00ac 895f00ac ebbf0078 71280001 5549003c 993f00ac
40820164 eb9e0340 7fc3f378 eb7c0078 <eb5b0078> 48001539 60000000 39200000
---[ end trace 587e7f5a13c0f2ad ]---
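
The faulting address can be cross-checked against the register dump above. What follows is only a rough sketch in Python, assuming the bracketed word in the instruction dump (eb5b0078) is the faulting instruction and that it is a DS-form load (primary opcode 58). The helper name decode_ds_form is made up for illustration and is not a kernel or tool API.

```python
# Sketch: decode the bracketed instruction word from the "Instruction dump"
# and check that base register + displacement reproduces the reported DAR.
# Assumption: the word is a DS-form load (primary opcode 58: ld/ldu/lwa).

def decode_ds_form(word):
    """Split a 32-bit DS-form instruction into its fields (hypothetical helper)."""
    opcode = (word >> 26) & 0x3F   # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base register
    disp = word & 0xFFFC           # displacement with the low two bits cleared
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode: 0=ld, 1=ldu, 2=lwa
    return opcode, rt, ra, disp, xo

FAULTING_WORD = 0xEB5B0078         # the <...> entry in the instruction dump
GPR27 = 0x0000000300000001         # fourth value on the GPR24 line
DAR = 0x0000000300000079           # from the CFAR/DAR line

opcode, rt, ra, disp, xo = decode_ds_form(FAULTING_WORD)
mnemonic = {0: "ld", 1: "ldu", 2: "lwa"}[xo]
effective_address = (GPR27 + disp) & 0xFFFFFFFFFFFFFFFF

print(f"{mnemonic} r{rt},{disp}(r{ra})")
print(hex(effective_address), effective_address == DAR)
```

Under those assumptions the word decodes to a load from offset 0x78 of r27, and r27 + 0x78 matches the DAR. That would suggest r27 held a bogus value (0x300000001) rather than a valid kernel pointer when od_dbs_update dereferenced it.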


Detailed logs and config are attached.

FYI, the last good commit (last successful boot) is:

commit ea6200e84182989a3cce9687cf79a23ac44ec4db
Merge: b4fb8f6 fc69910
Author: Linus Torvalds <[email protected]>
Date: Wed Mar 8 14:45:31 2017 -0800

Merge branch 'core-urgent-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip



--
Regards,

Abdul Haleem
IBM Linux Technology Centre



Attachments:
bootlogs.txt (46.96 kB)
Hab-NV-config (83.40 kB)

2017-03-13 16:54:27

by Abdul Haleem

Subject: Re: [BUG] [PowerPC] Kernel Oops when booting Linux mainline

On Mon, 2017-03-13 at 14:48 +0530, Abdul Haleem wrote:
> Hi,
>
> Mainline boot is broken on PowerPC bare metal with below traces:
> Machine Type : Power 8 Bare metal
>
> [ OK ] Mounted Debug File System.
> [ OK ] Started Nameserver information manager.
> [ OK ] Started LVM2 metadata daemon.
> Unable to handle kernel paging request for data at address 0x300000079
> Faulting instruction address: 0xc000000000811cac
> Oops: Kernel access of bad area, sig: 11 [#1]
> Unable to handle kernel paging request for instruction fetch
> Unable to handle kernel paging request for data at address 0xc000007c77881e28
> Faulting instruction address: 0xc000003c7acd7b80
> Faulting instruction address: 0xc00000000029fd88
> Thread overran stack, or stack corrupted
> SMP NR_CPUS=2048
> NUMA
> PowerNV
> Modules linked in: ip_table�KK~<(E) x_tables�\8|<(E) autofs4(E) raid10 raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq multipath bnx2x(E) mdio(E) libcrc32�^B�w<(E)
> CPU: 21 PID: 137 Comm: kworker/21:0 Not tainted 4.11.0-rc1-00335-g56b24d1-dirty #1
> Workqueue: events dbs_work_handler
> task: c000003c8e86b380 task.stack: c000003c8eae0000
> NIP: c000000000811cac LR: c000000000813014 CTR: c000000000811c50
> REGS: c000003c8eae3980 TRAP: 0300 Not tainted (4.11.0-rc1-00335-g56b24d1-dirty)
> MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
> CR: 24002422 XER: 00000000
> CFAR: c000000000008860 DAR: 0000000300000079 DSISR: 40000000 SOFTE: 1
> GPR00: c000000000813014 c000003c8eae3c00 c000000001019800 c000003c780d0800
> GPR04: 0000000000000540 0000000000000000 0000000000000000 c000003c8e86b380
> GPR08: 0000000000000000 000000000000003c 000000000000003c 0000000000008397
> GPR12: c000000000811c50 c00000000fe05400 c0000000000f2e08 c000003ffb073d80
> GPR16: c000003ffc564328 c000003ffc5640f8 c000003ffc5640a0 0000000000000001
> GPR20: 0000000000000000 0000000000000000 c000000000f50bc0 fffffffffffffef7
> GPR24: 0000000000000000 c000003ffc564400 0000000000000000 0000000300000001
> GPR28: c000003c780d0b60 0000000300000001 c000003c780d0800 c000003c780d0b60
> NIP [c000000000811cac] od_dbs_update+0x5c/0x260
> LR [c000000000813014] dbs_work_handler+0x54/0xa0
> Call Trace:
> [c000003c8eae3c00] [c000003ffc564880] 0xc000003ffc564880 (unreliable)
> [c000003c8eae3c50] [c000000000813014] dbs_work_handler+0x54/0xa0
> [c000003c8eae3c90] [c0000000000ea600] process_one_work+0x2a0/0x590
> [c000003c8eae3d20] [c0000000000ea998] worker_thread+0xa8/0x660
> [c000003c8eae3dc0] [c0000000000f2f4c] kthread+0x14c/0x190
> [c000003c8eae3e30] [c00000000000b4e8] ret_from_kernel_thread+0x5c/0x74
> Instruction dump:
> f821ffb1 ebe30340 813f00ac 895f00ac ebbf0078 71280001 5549003c 993f00ac
> 40820164 eb9e0340 7fc3f378 eb7c0078 <eb5b0078> 48001539 60000000 39200000
> ---[ end trace 587e7f5a13c0f2ad ]---
>
>
> Detailed logs and config is attached.
>
> FYI, Good commit of last successful boot is :
>
> commit ea6200e84182989a3cce9687cf79a23ac44ec4db
> Merge: b4fb8f6 fc69910
> Author: Linus Torvalds <[email protected]>
> Date: Wed Mar 8 14:45:31 2017 -0800
>
> Merge branch 'core-urgent-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>
>
>

With the patch linked below applied, PowerPC boots fine. Thanks for the fix.

http://lkml.kernel.org/r/[email protected]

Tested-by: Abdul Haleem <[email protected]>

--
Regards,

Abdul Haleem
IBM Linux Technology Centre