2008-06-25 19:46:51

by Steven Rostedt

Subject: Re: hrtimers: simplify lockdep handling


Hi Oleg,

I'm currently porting -rt to 26-rc7 and I came across this change:

Commit: 8e60e05fdc7344415fa69a3883b11f65db967b47

With the

- double_spin_lock(&new_base->lock, &old_base->lock,
- smp_processor_id() < cpu);
+ spin_lock(&new_base->lock);
+ spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);


What's the reason that this is possible? Is it because the migration
happens only on CPU hotplugging and that the CPU hotplugging code has
locks that would prevent a reversal of the lock taking?

I'm not arguing that the code is incorrect, but this looks like a subtlety
that can bite us later.

In other words, we really need comments around this code to explain to
casual viewers why this code is not deadlock prone. The change log here
and for 0d180406f2914aea3a78ddb880e2fe9ac78a9372 does not explain why the
straightforward taking of the locks is OK.

Thanks,

-- Steve


2008-06-26 11:04:35

by Oleg Nesterov

Subject: Re: hrtimers: simplify lockdep handling

Hi Steven,

On 06/25, Steven Rostedt wrote:
>
> I'm currently porting -rt to 26-rc7 and I came across this change:
>
> Commit: 8e60e05fdc7344415fa69a3883b11f65db967b47
>
> With the
>
> - double_spin_lock(&new_base->lock, &old_base->lock,
> - smp_processor_id() < cpu);
> + spin_lock(&new_base->lock);
> + spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
>
>
> What's the reason that this is possible? Is it because the migration
> happens only on CPU hotplugging and that the CPU hotplugging code has
> locks that would prevent a reversal of the lock taking?

Yes. Even if we ignore the CPU hotplug locks, two migrate_timers() calls
cannot take these locks in opposite order: that would mean both CPUs are
dead and we are doing something meaningless.
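
To make the ordering argument concrete, here is a minimal userspace sketch, with pthread mutexes standing in for spinlocks. It is not the kernel code: struct timer_base, double_lock() and migrate_timers_model() are hypothetical names used only for illustration. double_lock() models the old flag-driven ordering; migrate_timers_model() models the fixed "live base first, dead base second" ordering, which assumes the dead CPU's base is only ever the second lock taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU timer base. */
struct timer_base {
	pthread_mutex_t lock;
	int ntimers;
};

/*
 * Old pattern: the caller passes a flag (e.g. a CPU-id comparison) so
 * that every caller agrees on one global lock order.
 */
static inline void double_lock(pthread_mutex_t *l1, pthread_mutex_t *l2,
			       bool l1_first)
{
	if (l1_first) {
		pthread_mutex_lock(l1);
		pthread_mutex_lock(l2);
	} else {
		pthread_mutex_lock(l2);
		pthread_mutex_lock(l1);
	}
}

/*
 * New pattern: always lock the live CPU's base first, then the dead
 * CPU's base. Assumption: a base is never "new" in one migration and
 * "old" in another at the same time, so no opposite order can occur.
 * Returns the number of timers moved.
 */
static inline int migrate_timers_model(struct timer_base *new_base,
				       struct timer_base *old_base)
{
	int moved;

	assert(new_base != old_base);

	pthread_mutex_lock(&new_base->lock);
	pthread_mutex_lock(&old_base->lock);

	moved = old_base->ntimers;
	new_base->ntimers += moved;
	old_base->ntimers = 0;

	pthread_mutex_unlock(&old_base->lock);
	pthread_mutex_unlock(&new_base->lock);

	return moved;
}
```

A deadlock would need a second caller running migrate_timers_model(old_base, new_base) concurrently, i.e. each CPU treating the other as dead.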

> I'm not arguing that the code is incorrect, but this looks like a subtlety
> that can bite us later.
>
> In other words, we really need comments around this code to explain to
> casual viewers why this code is not deadlock prone. The change log here
> and for 0d180406f2914aea3a78ddb880e2fe9ac78a9372 does not explain why the
> straightforward taking of the locks is OK.

OK, I agree, I'll try to make the trivial doc patch.

But please note that the old code was confusing too, imho. It looked as
if we really had to avoid a deadlock, and the casual viewer (me) was
very confused when I noticed the changes which added base_lock_keys and
double_spin_lock ;)

Oleg.