2008-02-21 20:49:51

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 0/2] onlining cpus can break lockdep

Hi,

While testing with hotplugging cpus today, I've came across a stack
trace generated by lockdep. The reason for that is that
do_boot_cpu() in smpboot_64.c ends up initializing a struct work_struct
variable in the stack.

These patches fix this by making it static. a per-cpu variable is choosen,
since, to the best of my knowledge, nothing prevents two cpus going up
at the same time, and so a single static c_idle won't do.

I separated it in two patches to better isolate the change. The first patch
just paves the way, by turning access to c_idle into pointer dereferences.
The second one does the real work.

Thanks


2008-02-21 20:49:18

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 2/2] make work have a static address in do_boot_cpu()

This patch makes the work field in create_idle have
a static address. Otherwise, being a stack variable, it can
boild down the lockdep system.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/kernel/smpboot_64.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
index f15b774..d79b7a8 100644
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -545,6 +545,7 @@ static void __cpuinit do_fork_idle(struc
complete(&c_idle->done);
}

+static DEFINE_PER_CPU(struct create_idle, cpu_idle);
/*
* Boot one CPU.
*/
@@ -553,12 +554,12 @@ static int __cpuinit do_boot_cpu(int cpu
unsigned long boot_error;
int timeout;
unsigned long start_rip;
- struct create_idle create_idle = {
- .work = __WORK_INITIALIZER(create_idle.work, do_fork_idle),
- .cpu = cpu,
- .done = COMPLETION_INITIALIZER_ONSTACK(create_idle.done),
- };
- struct create_idle *c_idle = &create_idle;
+ struct create_idle *c_idle = &per_cpu(cpu_idle, cpu);
+ struct work_struct *c_idle_work = &c_idle->work;
+
+ INIT_WORK(c_idle_work, do_fork_idle);
+ c_idle->cpu = cpu;
+ init_completion(&c_idle->done);

/* allocate memory for gdts of secondary cpus. Hotplug is considered */
if (!cpu_gdt_descr[cpu].address &&
--
1.4.2

2008-02-21 20:51:28

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 1/2] turn c_idle into a pointer

This is not useful by itself, but part of making the
underlying work_struct have a static address.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/kernel/smpboot_64.c | 37 +++++++++++++++++++------------------
1 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
index d53bd6f..f15b774 100644
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -553,11 +553,12 @@ static int __cpuinit do_boot_cpu(int cpu
unsigned long boot_error;
int timeout;
unsigned long start_rip;
- struct create_idle c_idle = {
- .work = __WORK_INITIALIZER(c_idle.work, do_fork_idle),
+ struct create_idle create_idle = {
+ .work = __WORK_INITIALIZER(create_idle.work, do_fork_idle),
.cpu = cpu,
- .done = COMPLETION_INITIALIZER_ONSTACK(c_idle.done),
+ .done = COMPLETION_INITIALIZER_ONSTACK(create_idle.done),
};
+ struct create_idle *c_idle = &create_idle;

/* allocate memory for gdts of secondary cpus. Hotplug is considered */
if (!cpu_gdt_descr[cpu].address &&
@@ -584,12 +585,12 @@ static int __cpuinit do_boot_cpu(int cpu

alternatives_smp_switch(1);

- c_idle.idle = get_idle_for_cpu(cpu);
+ c_idle->idle = get_idle_for_cpu(cpu);

- if (c_idle.idle) {
- c_idle.idle->thread.sp = (unsigned long) (((struct pt_regs *)
- (THREAD_SIZE + task_stack_page(c_idle.idle))) - 1);
- init_idle(c_idle.idle, cpu);
+ if (c_idle->idle) {
+ c_idle->idle->thread.sp = (unsigned long) (((struct pt_regs *)
+ (THREAD_SIZE + task_stack_page(c_idle->idle))) - 1);
+ init_idle(c_idle->idle, cpu);
goto do_rest;
}

@@ -604,29 +605,29 @@ static int __cpuinit do_boot_cpu(int cpu
* thread.
*/
if (!keventd_up() || current_is_keventd())
- c_idle.work.func(&c_idle.work);
+ c_idle->work.func(&c_idle->work);
else {
- schedule_work(&c_idle.work);
- wait_for_completion(&c_idle.done);
+ schedule_work(&c_idle->work);
+ wait_for_completion(&c_idle->done);
}

- if (IS_ERR(c_idle.idle)) {
+ if (IS_ERR(c_idle->idle)) {
printk("failed fork for CPU %d\n", cpu);
- return PTR_ERR(c_idle.idle);
+ return PTR_ERR(c_idle->idle);
}

- set_idle_for_cpu(cpu, c_idle.idle);
+ set_idle_for_cpu(cpu, c_idle->idle);

do_rest:

- cpu_pda(cpu)->pcurrent = c_idle.idle;
+ cpu_pda(cpu)->pcurrent = c_idle->idle;

start_rip = setup_trampoline();

- init_rsp = c_idle.idle->thread.sp;
- load_sp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
+ init_rsp = c_idle->idle->thread.sp;
+ load_sp0(&per_cpu(init_tss, cpu), &c_idle->idle->thread);
initial_code = start_secondary;
- clear_tsk_thread_flag(c_idle.idle, TIF_FORK);
+ clear_tsk_thread_flag(c_idle->idle, TIF_FORK);

printk(KERN_INFO "Booting processor %d/%d APIC 0x%x\n", cpu,
cpus_weight(cpu_present_map),
--
1.4.2

2008-02-21 21:16:54

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/2] onlining cpus can break lockdep


* Glauber Costa <[email protected]> wrote:

> Hi,
>
> While testing with hotplugging cpus today, I've came across a stack
> trace generated by lockdep. The reason for that is that do_boot_cpu()
> in smpboot_64.c ends up initializing a struct work_struct variable in
> the stack.

hm, could you post that trace?

Ingo

2008-02-21 21:26:16

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 0/2] onlining cpus can break lockdep

Ingo Molnar wrote:
> * Glauber Costa <[email protected]> wrote:
>
>> Hi,
>>
>> While testing with hotplugging cpus today, I've came across a stack
>> trace generated by lockdep. The reason for that is that do_boot_cpu()
>> in smpboot_64.c ends up initializing a struct work_struct variable in
>> the stack.
>
> hm, could you post that trace?

Sure. To make it clearer where the problem is, I also added the attached
patch to my testing.

Trace is: (note the call to do_fork_idle)

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
key: ffff81003bdf9d18, name: c_idle.work
Pid: 11, comm: events/1 Not tainted 2.6.25-rc2 #129

Call Trace:
[<ffffffff8104f757>] static_obj+0x5d/0x74
[<ffffffff81052218>] __lock_acquire+0x8b5/0xc3e
[<ffffffff81042e7f>] run_workqueue+0x84/0x1df
[<ffffffff810529ef>] lock_acquire+0x91/0xbc
[<ffffffff81042e90>] run_workqueue+0x95/0x1df
[<ffffffff81291746>] do_fork_idle+0x0/0x20
[<ffffffff81042ed4>] run_workqueue+0xd9/0x1df
[<ffffffff81043951>] worker_thread+0x90/0x9b
[<ffffffff8104632a>] autoremove_wake_function+0x0/0x2e
[<ffffffff810438c1>] worker_thread+0x0/0x9b
[<ffffffff8104620c>] kthread+0x47/0x73
[<ffffffff812986c9>] trace_hardirqs_on_thunk+0x35/0x3a
[<ffffffff8100d0e8>] child_rip+0xa/0x12
[<ffffffff8100c67c>] restore_args+0x0/0x34
[<ffffffff81046095>] kthreadd+0x14a/0x16f
[<ffffffff81046095>] kthreadd+0x14a/0x16f
[<ffffffff810461c5>] kthread+0x0/0x73
[<ffffffff8100d0de>] child_rip+0x0/0x12


Attachments:
lockdep.patch (461.00 B)

2008-02-21 22:15:29

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2] make work have a static address in do_boot_cpu()


On Thu, 2008-02-21 at 17:40 -0300, Glauber Costa wrote:
> This patch makes the work field in create_idle have
> a static address. Otherwise, being a stack variable, it can
> boild down the lockdep system.

Looks way overkill. Doesn't something like:

> Signed-off-by: Glauber Costa <[email protected]>
> ---
> arch/x86/kernel/smpboot_64.c | 13 +++++++------
> 1 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
> index f15b774..d79b7a8 100644
> --- a/arch/x86/kernel/smpboot_64.c
> +++ b/arch/x86/kernel/smpboot_64.c

> @@ -553,12 +554,12 @@ static int __cpuinit do_boot_cpu(int cpu
> unsigned long boot_error;
> int timeout;
> unsigned long start_rip;
> struct create_idle create_idle = {
> - .work = __WORK_INITIALIZER(create_idle.work, do_fork_idle),
> .cpu = cpu,
> .done = COMPLETION_INITIALIZER_ONSTACK(create_idle.done),
> };
> + INIT_WORK(&create_idle.work, do_fork_idle);

work?

2008-02-21 22:57:53

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 2/2] make work have a static address in do_boot_cpu()

On Thu, Feb 21, 2008 at 7:12 PM, Peter Zijlstra <[email protected]> wrote:
>
> On Thu, 2008-02-21 at 17:40 -0300, Glauber Costa wrote:
> > This patch makes the work field in create_idle have
> > a static address. Otherwise, being a stack variable, it can
> > boild down the lockdep system.
>
> Looks way overkill. Doesn't something like:
>
>
> > Signed-off-by: Glauber Costa <[email protected]>
> > ---
> > arch/x86/kernel/smpboot_64.c | 13 +++++++------
> > 1 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
> > index f15b774..d79b7a8 100644
> > --- a/arch/x86/kernel/smpboot_64.c
> > +++ b/arch/x86/kernel/smpboot_64.c
>
>
> > @@ -553,12 +554,12 @@ static int __cpuinit do_boot_cpu(int cpu
> > unsigned long boot_error;
> > int timeout;
> > unsigned long start_rip;
> > struct create_idle create_idle = {
> > - .work = __WORK_INITIALIZER(create_idle.work, do_fork_idle),
> > .cpu = cpu,
> > .done = COMPLETION_INITIALIZER_ONSTACK(create_idle.done),
> > };
> > + INIT_WORK(&create_idle.work, do_fork_idle);
>
> work?

Indeed.

I missed the static lock_class_key definition inside INIT_WORK macro.
It does the job neatly.


--
Glauber Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

2008-02-22 09:47:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] make work have a static address in do_boot_cpu()


* Glauber Costa <[email protected]> wrote:

> > Looks way overkill. Doesn't something like:

> > > struct create_idle create_idle = {
> > > - .work = __WORK_INITIALIZER(create_idle.work, do_fork_idle),
> > > .cpu = cpu,
> > > .done = COMPLETION_INITIALIZER_ONSTACK(create_idle.done),
> > > };
> > > + INIT_WORK(&create_idle.work, do_fork_idle);
> >
> > work?
>
> Indeed.
>
> I missed the static lock_class_key definition inside INIT_WORK macro.
> It does the job neatly.

could you resend the updated patch?

Ingo