LinuxLists.cc - [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation

2012-05-09 08:40:05

Subject: [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation

If one CPU fails to boot and the boot CPU gives up waiting for it, and
another CPU is then booted, the kernel might crash with the following oops:

[ 723.865765] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 723.866616] IP: [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
[ 723.866616] PGD 7ba91067 PUD 7a205067 PMD 0
[ 723.866616] Oops: 0000 [#1] SMP
[ 723.898527] CPU 1
...
[ 723.898527] Pid: 1221, comm: offV2.sh Tainted: G W 3.4.0-rc4+ #213 Red Hat KVM
[ 723.898527] RIP: 0010:[<ffffffff812c3630>] [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
[ 723.898527] RSP: 0018:ffff88007ab9dc18 EFLAGS: 00010246
[ 723.898527] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000
[ 723.898527] RDX: 0000000000000018 RSI: 0000000000000100 RDI: 0000000000000018
[ 723.898527] RBP: ffff88007ab9dc18 R08: 0000000000000000 R09: 0000000000000020
[ 723.898527] R10: 0000000000000004 R11: 0000000000000000 R12: ffff88007c06ed60
[ 723.898527] R13: ffff880037a94000 R14: 0000000000000003 R15: ffff88007c06ed60
[ 723.898527] FS: 00007f1d6a7d8700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
[ 723.898527] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 723.898527] CR2: 0000000000000018 CR3: 000000007bb7f000 CR4: 00000000000007e0
[ 723.898527] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 723.898527] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 723.898527] Process offV2.sh (pid: 1221, threadinfo ffff88007ab9c000, task ffff88007b358000)
[ 723.898527] Stack:
[ 723.898527] ffff88007ab9dcc8 ffffffff8108b9b6 ffff88007ab9dc58 ffff88007b4f2a00
[ 723.898527] ffff88007c06ed60 0000000000000003 000000037ab9dc58 0000000000010008
[ 723.898527] ffffffff81a308e8 0000000000000003 ffff88007b489cc0 ffff880037b6bd20
[ 723.898527] Call Trace:
[ 723.898527] [<ffffffff8108b9b6>] build_sched_domains+0x7b6/0xa50
[ 723.898527] [<ffffffff8108bea9>] partition_sched_domains+0x259/0x3f0
[ 723.898527] [<ffffffff810c4485>] cpuset_update_active_cpus+0x85/0x90
[ 723.898527] [<ffffffff81084f65>] cpuset_cpu_active+0x25/0x30
[ 723.898527] [<ffffffff81545b45>] notifier_call_chain+0x55/0x80
[ 723.898527] [<ffffffff8107e59e>] __raw_notifier_call_chain+0xe/0x10
[ 723.898527] [<ffffffff81058be0>] __cpu_notify+0x20/0x40
[ 723.898527] [<ffffffff8153af08>] _cpu_up+0xc7/0x10e
[ 723.898527] [<ffffffff8153af9b>] cpu_up+0x4c/0x5c

The crash happens in init_sched_groups_power(), which expects sched_groups to
be a circular linked list. However, that is not always true: the sched_groups
preallocated in __sdt_alloc() are initialized in build_sched_groups(), which
may exit early

	if (cpu != cpumask_first(sched_domain_span(sd)))
		return 0;

without initializing the sd->groups->next field.

Fix the bug by initializing the next field right after the sched_group is
allocated.

Signed-off-by: Igor Mammedov <[email protected]>
---
kernel/sched/core.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0533a68..e5212ae 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6382,6 +6382,8 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
 			if (!sg)
 				return -ENOMEM;
 
+			sg->next = sg;
+
 			*per_cpu_ptr(sdd->sg, j) = sg;
 
 			sgp = kzalloc_node(sizeof(struct sched_group_power),
--
1.7.1
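
The invariant the patch restores can be sketched in userspace C. This is a minimal illustration, not kernel code: struct group, group_alloc(), group_link_after() and ring_total_weight() are hypothetical names, calloc() stands in for the zeroing kzalloc_node() call, and the do/while walk is assumed to mirror the loop shape in init_sched_groups_power().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_group; only the link is modeled. */
struct group {
	struct group *next;	/* must always form a circular list */
	int weight;
};

/* Allocate a zeroed group (like kzalloc_node) and self-link it at once. */
static struct group *group_alloc(int weight)
{
	struct group *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;
	g->next = g;		/* a lone group is already a valid ring */
	g->weight = weight;
	return g;
}

/* Insert n right after pos, keeping the ring closed. */
static void group_link_after(struct group *pos, struct group *n)
{
	assert(pos && n);
	n->next = pos->next;
	pos->next = n;
}

/* Walk the ring until we are back at the first element. */
static int ring_total_weight(const struct group *first)
{
	const struct group *g = first;
	int total = 0;

	do {
		total += g->weight;
		g = g->next;	/* NULL here if the group was never linked */
	} while (g != first);

	return total;
}
```

With the self-link in place, ring_total_weight() on a freshly allocated group visits exactly one element and stops. Without it, the zeroed next pointer would be followed on the first iteration, which is the kind of NULL dereference the oops above reports.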

2012-05-09 10:21:57

by Jiang Liu


Subject: Re: [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation

Hi Igor,
Thanks for fixing this bug! We encountered the same issue on an
IA64 system too. That system could boot with 2.6.32, but can't boot with
any 3.x.x kernel. We have just found the root cause today.
--gerry


2012-05-09 10:35:46

by Igor Mammedov


Subject: [tip:sched/urgent] sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption

Commit-ID: 30b4e9eb783d94e9f5d503b15eb31720679ae1c7
Gitweb: http://git.kernel.org/tip/30b4e9eb783d94e9f5d503b15eb31720679ae1c7
Author: Igor Mammedov <[email protected]>
AuthorDate: Wed, 9 May 2012 12:38:28 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 9 May 2012 12:27:35 +0200

sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption

If we have one cpu that failed to boot and boot cpu gave up on
waiting for it and then another cpu is being booted, kernel
might crash with following OOPS:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
Call Trace:
[<ffffffff8108b9b6>] build_sched_domains+0x7b6/0xa50

The crash happens in init_sched_groups_power() that expects
sched_groups to be circular linked list. However it is not
always true, since sched_groups preallocated in __sdt_alloc are
initialized in build_sched_groups and it may exit early

	if (cpu != cpumask_first(sched_domain_span(sd)))
		return 0;

without initializing sd->groups->next field.

Fix bug by initializing next field right after sched_group was
allocated.

Also-Reported-by: Jiang Liu <[email protected]>
Signed-off-by: Igor Mammedov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/core.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0533a68..e5212ae 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6382,6 +6382,8 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
 			if (!sg)
 				return -ENOMEM;
 
+			sg->next = sg;
+
 			*per_cpu_ptr(sdd->sg, j) = sg;
 
 			sgp = kzalloc_node(sizeof(struct sched_group_power),
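
The faulting address 0x18 in the oops is consistent with taking the cpumask of a NULL sched_group, which is where an uninitialized next pointer would lead. The sketch below checks that arithmetic with a mock structure. The field layout is an assumption based on the 3.4-era struct sched_group (it does not appear in this thread), and mock_sched_group and cpumask_addr() are hypothetical names.

```c
#include <assert.h>	/* used by the check shown in the note below */
#include <stddef.h>

/* Mock of the assumed 3.4-era layout; field types are simplified. */
struct mock_sched_group {
	struct mock_sched_group *next;	/* circular list link */
	int ref;			/* atomic_t in the kernel: a wrapped int */
	unsigned int group_weight;
	void *sgp;			/* struct sched_group_power * */
	unsigned long cpumask[];	/* trailing CPU mask */
};

/* Address the cpumask would have for a group located at `base`. */
static size_t cpumask_addr(size_t base)
{
	return base + offsetof(struct mock_sched_group, cpumask);
}
```

On an LP64 target such as x86-64, assert(cpumask_addr(0) == 0x18) should hold: 8 bytes for next, 4 + 4 for ref and group_weight, and 8 for sgp. That matches the value in RDI and CR2 when __bitmap_weight() faults, assuming the layout above is accurate for the kernel in question.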
