2017-10-18 15:11:19

by Holger Kiehl

Subject: 3.16.49 Oops, does not boot on two socket server

Hello,

I just tried to boot 3.16.49 on a two-socket server and it fails with the
following error:

smpboot: Total of 24 processors activated (95818.36 BogoMIPS)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/sched/core.c:5811 init_overlap_sched_group+0x114/0x120()
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.16.49-1.el6.x86_64 #1
Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
0000000000000000 ffff880bfd6d3da8 ffffffff81542f1c 0000000000000000
00000000000016b3 ffff880bfd6d3de8 ffffffff8104cd72 ffff880c0f803c00
ffff880bfcc69650 ffff8817fd695ca8 ffff880bfd6e2300 0000000000000000
Call Trace:
[<ffffffff81542f1c>] dump_stack+0x4e/0x6a
[<ffffffff8104cd72>] warn_slowpath_common+0x82/0xb0
[<ffffffff8104cdb5>] warn_slowpath_null+0x15/0x20
[<ffffffff81079834>] init_overlap_sched_group+0x114/0x120
[<ffffffff81079974>] build_overlap_sched_groups+0x134/0x1e0
[<ffffffff8107a169>] build_sched_domains+0x159/0x330
[<ffffffff817c2b3c>] sched_init_smp+0x65/0xf8
[<ffffffff817abb12>] kernel_init_freeable+0xb2/0x12d
[<ffffffff81541400>] ? rest_init+0x80/0x80
[<ffffffff81541409>] kernel_init+0x9/0xf0
[<ffffffff81547248>] ret_from_fork+0x58/0x90
[<ffffffff81541400>] ? rest_init+0x80/0x80
---[ end trace a491a27c866dd06e ]---
BUG: unable to handle kernel paging request at 00000100000247bf
IP: [<ffffffff810797ce>] init_overlap_sched_group+0xae/0x120
PGD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.16.49-1.el6.x86_64 #1
Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
task: ffff8817fd6a8000 ti: ffff880bfd6d0000 task.ti: ffff880bfd6d0000
RIP: 0010:[<ffffffff810797ce>] [<ffffffff810797ce>] init_overlap_sched_group+0xae/0x120
RSP: 0000:ffff880bfd6d3e08 EFLAGS: 00010246
RAX: 000001000000ffff RBX: ffff880bfcc69650 RCX: 0000000000000020
RDX: 00000000000147c0 RSI: 0000000000000020 RDI: 0000000000000020
RBP: ffff880bfd6d3e28 R08: ffff880bfd6e2318 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000001 R12: ffff8817fd695ca8
R13: ffff880bfd6e2300 R14: 0000000000000000 R15: ffff8817fd695ca8
FS: 0000000000000000(0000) GS:ffff880c0fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000100000247bf CR3: 0000001714000 CR4: 00000000000407f0
Stack:
0000000000000000 0000000000000000 0000000000000000 ffff880bfcc69650
ffff880bfd6d3ea8 ffffffff81079974 0000000000000011 ffff880bfd6e2300
0000000000000000 0000000000000000 000000000000cac8 0000000000000000
Call Trace:
[<ffffffff81079974>] build_overlap_sched_groups+0x134/0x1e0
[<ffffffff8107a169>] build_sched_domains+0x159/0x330
[<ffffffff817c2b3c>] sched_init_smp+0x65/0xf8
[<ffffffff817abb12>] kernel_init_freeable+0xb2/0x12d
[<ffffffff81541400>] ? rest_init+0x80/0x80
[<ffffffff81541409>] kernel_init+0x9/0xf0
[<ffffffff81547248>] ret_from_fork+0x58/0x90
[<ffffffff81541400>] ? rest_init+0x80/0x80
Code: 61 83 00 85 c0 74 70 49 8d 75 18 48 c7 c2 38 f9 8a 81 bf ff ff ff ff e8 51 fa 1f 00 49 8b 54 24 10 48 98 48 8b 04 c5 a0 fc 78 81 <48> 8b 14 10 b8 01 00 00 00 49 89 55 10 f0 0f c1 02 85 c0 75 0f
RIP [<ffffffff810797ce>] init_overlap_sched_group+0xae/0x120
RSP <ffff880bfd6d3e08>
CR2: 00000100000247bf
---[ end trace a491a27c866dd06f ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

Rebooting in 5 seconds..

This happened on three different systems. A similar system with only one
CPU socket populated boots fine. The last kernel of this series I tried
was 3.16.48, and that worked fine.

Any idea what is wrong? In case it is useful I have attached my kernel
config.
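
In case it helps with triage, here is a small cross-check of the register
dump above. It is only a sketch based on my own reading of the "Code:" line:
the bytes at the marked position (<48> 8b 14 10) appear to decode to
mov (%rax,%rdx,1),%rdx, so the faulting address should be RAX + RDX. The
helper name faulting_address is made up for illustration; the values are
copied from the oops.

```python
# Register values copied verbatim from the oops above.
RAX = 0x000001000000FFFF
RDX = 0x00000000000147C0
CR2 = 0x00000100000247BF  # faulting address reported by the kernel

MASK_64 = 0xFFFFFFFFFFFFFFFF


def faulting_address(base: int, index: int) -> int:
    """Effective address of a base+index load, wrapped to 64 bits.

    Assumes the faulting instruction is `mov (%rax,%rdx,1),%rdx`
    (scale 1, no displacement), which is my decoding of the Code: bytes.
    """
    return (base + index) & MASK_64


if __name__ == "__main__":
    addr = faulting_address(RAX, RDX)
    print(f"RAX + RDX = {addr:#018x}")
    print(f"CR2       = {CR2:#018x}")
    print("match" if addr == CR2 else "mismatch")
```

If the decoding is right, the sum equals CR2, which would mean the base in
RAX (0x000001000000ffff) is the bad value rather than the offset in RDX.
That is a guess from the dump, not something I have confirmed in the source.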

Regards,
Holger


Attachments:
.config (84.86 kB)