2008-02-28 00:21:50

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH] Fix hibernate/resume in presence of PREEMPT_RCU and hotplug

Hello!

This fixes a oops encountered when doing hibernate/resume in presence
of PREEMPT_RCU. The problem was that the code failed to disable preemption
when accessing a per-CPU variable. This is OK when called from code
that already has preemption disabled, but such is not the case from
the suspend/resume code path.

Reported-by: Dave Young <[email protected]>
Tested-by: Dave Young <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---

rcupreempt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff -urpNa -X dontdiff linux-2.6.25-rc3/kernel/rcupreempt.c linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c
--- linux-2.6.25-rc3/kernel/rcupreempt.c 2008-02-26 16:58:43.000000000 -0800
+++ linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c 2008-02-26 17:04:17.000000000 -0800
@@ -702,8 +702,9 @@ void rcu_offline_cpu(int cpu)
* fix.
*/

+ local_irq_save(flags);
rdp = RCU_DATA_ME();
- spin_lock_irqsave(&rdp->lock, flags);
+ spin_lock(&rdp->lock);
*rdp->nexttail = list;
if (list)
rdp->nexttail = tail;


Subject: Re: [PATCH] Fix hibernate/resume in presence of PREEMPT_RCU and hotplug

On Wed, Feb 27, 2008 at 04:21:10PM -0800, Paul E. McKenney wrote:
> Hello!
>
> This fixes a oops encountered when doing hibernate/resume in presence
> of PREEMPT_RCU. The problem was that the code failed to disable preemption
> when accessing a per-CPU variable. This is OK when called from code
> that already has preemption disabled, but such is not the case from
> the suspend/resume code path.

Well, rcu_offline_cpu() is called from the cpu-offline callback path, and
AFAICS, it doesn't disable preemption.

I wonder how we're seeing this only now! Indeed a good catch!

Reviewed-by: Gautham R Shenoy <[email protected]>

>
> Reported-by: Dave Young <[email protected]>
> Tested-by: Dave Young <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> ---
>
> rcupreempt.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff -urpNa -X dontdiff linux-2.6.25-rc3/kernel/rcupreempt.c linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c
> --- linux-2.6.25-rc3/kernel/rcupreempt.c 2008-02-26 16:58:43.000000000 -0800
> +++ linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c 2008-02-26 17:04:17.000000000 -0800
> @@ -702,8 +702,9 @@ void rcu_offline_cpu(int cpu)
> * fix.
> */
>
> + local_irq_save(flags);
> rdp = RCU_DATA_ME();
> - spin_lock_irqsave(&rdp->lock, flags);
> + spin_lock(&rdp->lock);
> *rdp->nexttail = list;
> if (list)
> rdp->nexttail = tail;

--
Thanks and Regards
gautham

2008-02-28 19:52:42

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] Fix hibernate/resume in presence of PREEMPT_RCU and hotplug

On Thu, Feb 28, 2008 at 10:35:08AM +0530, Gautham R Shenoy wrote:
> On Wed, Feb 27, 2008 at 04:21:10PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > This fixes a oops encountered when doing hibernate/resume in presence
> > of PREEMPT_RCU. The problem was that the code failed to disable preemption
> > when accessing a per-CPU variable. This is OK when called from code
> > that already has preemption disabled, but such is not the case from
> > the suspend/resume code path.
>
> Well, rcu_offline_cpu() is called from the cpu-offline callback path, and
> AFAICS, it doesn't disable preemption.
>
> I wonder how we're seeing this only now! Indeed a good catch!

I wonder as well. In fact there is (at least) one other such bug,
submitting patch separately. :-/

Thanx, Paul

> Reviewed-by: Gautham R Shenoy <[email protected]>
>
> >
> > Reported-by: Dave Young <[email protected]>
> > Tested-by: Dave Young <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > ---
> >
> > rcupreempt.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff -urpNa -X dontdiff linux-2.6.25-rc3/kernel/rcupreempt.c linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c
> > --- linux-2.6.25-rc3/kernel/rcupreempt.c 2008-02-26 16:58:43.000000000 -0800
> > +++ linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c 2008-02-26 17:04:17.000000000 -0800
> > @@ -702,8 +702,9 @@ void rcu_offline_cpu(int cpu)
> > * fix.
> > */
> >
> > + local_irq_save(flags);
> > rdp = RCU_DATA_ME();
> > - spin_lock_irqsave(&rdp->lock, flags);
> > + spin_lock(&rdp->lock);
> > *rdp->nexttail = list;
> > if (list)
> > rdp->nexttail = tail;
>
> --
> Thanks and Regards
> gautham

2008-02-28 19:56:17

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH] Remove never-migrates assumption from rcu_process_callbacks()

Hello again!

This patch fixes a potentially invalid access to a per-CPU variable in
rcu_process_callbacks(). This per-CPU access needs to be done in such
a way as to guarantee that the code using it cannot move to some other
CPU before all uses of the value accessed have completed. Even though
this code is currently only invoked from softirq context, which currrently
cannot migrate to some other CPU, life would be better if this
code did not silently make such an assumption.

Signed-off-by: Paul E. McKenney <[email protected]>
---

rcupreempt.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff -urpNa -X dontdiff linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c linux-2.6.25-rc3-rcuoffline-fp/kernel/rcupreempt.c
--- linux-2.6.25-rc3-rcuoffline/kernel/rcupreempt.c 2008-02-26 17:04:17.000000000 -0800
+++ linux-2.6.25-rc3-rcuoffline-fp/kernel/rcupreempt.c 2008-02-27 23:04:18.000000000 -0800
@@ -736,9 +736,11 @@ static void rcu_process_callbacks(struct
{
unsigned long flags;
struct rcu_head *next, *list;
- struct rcu_data *rdp = RCU_DATA_ME();
+ struct rcu_data *rdp;

- spin_lock_irqsave(&rdp->lock, flags);
+ local_irq_save(flags);
+ rdp = RCU_DATA_ME();
+ spin_lock(&rdp->lock);
list = rdp->donelist;
if (list == NULL) {
spin_unlock_irqrestore(&rdp->lock, flags);

2008-02-28 19:59:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] Remove never-migrates assumption from rcu_process_callbacks()


* Paul E. McKenney <[email protected]> wrote:

> Hello again!
>
> This patch fixes a potentially invalid access to a per-CPU variable in
> rcu_process_callbacks(). This per-CPU access needs to be done in such
> a way as to guarantee that the code using it cannot move to some other
> CPU before all uses of the value accessed have completed. Even though
> this code is currently only invoked from softirq context, which
> currrently cannot migrate to some other CPU, life would be better if
> this code did not silently make such an assumption.

i've got the patch below queued up already - same thing, right?

Ingo

----------->
Subject: rcupreempt: fix hibernate/resume in presence of PREEMPT_RCU and hotplug
From: "Paul E. McKenney" <[email protected]>
Date: Wed, 27 Feb 2008 16:21:10 -0800

This fixes a oops encountered when doing hibernate/resume in presence
of PREEMPT_RCU. The problem was that the code failed to disable preemption
when accessing a per-CPU variable. This is OK when called from code
that already has preemption disabled, but such is not the case from
the suspend/resume code path.

Reported-by: Dave Young <[email protected]>
Tested-by: Dave Young <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/rcupreempt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux/kernel/rcupreempt.c
===================================================================
--- linux.orig/kernel/rcupreempt.c
+++ linux/kernel/rcupreempt.c
@@ -918,8 +918,9 @@ void rcu_offline_cpu(int cpu)
* fix.
*/

+ local_irq_save(flags);
rdp = RCU_DATA_ME();
- spin_lock_irqsave(&rdp->lock, flags);
+ spin_lock(&rdp->lock);
*rdp->nexttail = list;
if (list)
rdp->nexttail = tail;

2008-02-28 20:00:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] Remove never-migrates assumption from rcu_process_callbacks()


* Ingo Molnar <[email protected]> wrote:

> i've got the patch below queued up already - same thing, right?

oh, not the same thing - just looking very similar.

queued up the second patch too :-)

Ingo

2008-02-28 23:33:32

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] Remove never-migrates assumption from rcu_process_callbacks()

On Thu, Feb 28, 2008 at 08:54:54PM +0100, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > i've got the patch below queued up already - same thing, right?
>
> oh, not the same thing - just looking very similar.
>
> queued up the second patch too :-)

Thank you, Ingo!

/me goes back to analyzing the rcu-preempt code...

Thanx, Paul