2013-06-26 19:31:10

by Steven Rostedt

Subject: [RFC][PATCH RT 1/6] rt,rcu: Have rcu_read_lock_sched() use locks for PREEMPT_RT

There are several critical sections that require synchronization with
synchronize_sched(). Usually these sections disable preemption, and
synchronize_sched() simply waits for the kernel to schedule on each
of the CPUs.

rcu_read_lock_sched() is the preferred API to use, but some areas
still use preempt_disable() and local_irq_*() to prevent preemption.
Our main concern is with the users of rcu_read_lock_sched() that may
also take a spin_lock(), which turns into a mutex on PREEMPT_RT. For
these cases, we need to allow rcu_read_lock_sched() sections to
schedule out.
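
For illustration, a minimal sketch of the problematic pattern (the
names here are hypothetical, not taken from the patch):

	#include <linux/rcupdate.h>
	#include <linux/spinlock.h>

	/*
	 * On PREEMPT_RT, spinlock_t becomes a sleeping rt_mutex, so
	 * taking it inside a preempt-disabled rcu_read_lock_sched()
	 * section is a might-sleep bug.
	 */
	static DEFINE_SPINLOCK(example_lock);	/* sleeps on PREEMPT_RT */

	static void example_reader(void)
	{
		rcu_read_lock_sched();		/* preempt_disable() on mainline */
		spin_lock(&example_lock);	/* may sleep on RT: bug */
		/* ... read-side work ... */
		spin_unlock(&example_lock);
		rcu_read_unlock_sched();
	}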

To allow rcu_read_lock_sched() sections to be preempted when PREEMPT_RT
is enabled, rcu_read_lock_sched() grabs a local_lock() instead of
disabling preemption. synchronize_sched() then grabs each CPU's
local_lock() and releases it. After that, it still does the normal
synchronize_sched(), as there may be places that disable preemption or
irqs that it needs to synchronize with. By grabbing all the locks and
releasing them, it properly synchronizes with the readers that use the
locks instead of disabling preemption or interrupts.

Note: The rcu_read_lock_sched_notrace() version still only disables
preemption, because it is used by lockdep and tracing, which require
real preemption disabling and not mutexes.

Signed-off-by: Steven Rostedt <[email protected]>

Index: linux-rt.git/include/linux/rcupdate.h
===================================================================
--- linux-rt.git.orig/include/linux/rcupdate.h
+++ linux-rt.git/include/linux/rcupdate.h
@@ -36,6 +36,7 @@
#include <linux/types.h>
#include <linux/cache.h>
#include <linux/spinlock.h>
+#include <linux/locallock.h>
#include <linux/threads.h>
#include <linux/cpumask.h>
#include <linux/seqlock.h>
@@ -870,6 +871,28 @@ static inline void rcu_read_unlock_bh(vo
local_bh_enable();
}

+/* asm-offsets.c gets confused with local_lock here */
+#if defined(CONFIG_PREEMPT_RT_FULL)
+DECLARE_LOCAL_IRQ_LOCK(rcu_sched_lock);
+static inline void rcu_read_lock_sched_disable(void)
+{
+ local_lock(rcu_sched_lock);
+}
+static inline void rcu_read_lock_sched_enable(void)
+{
+ local_unlock(rcu_sched_lock);
+}
+#else
+static inline void rcu_read_lock_sched_disable(void)
+{
+ preempt_disable();
+}
+static inline void rcu_read_lock_sched_enable(void)
+{
+ preempt_enable();
+}
+#endif
+
/**
* rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
*
@@ -885,7 +908,7 @@ static inline void rcu_read_unlock_bh(vo
*/
static inline void rcu_read_lock_sched(void)
{
- preempt_disable();
+ rcu_read_lock_sched_disable();
__acquire(RCU_SCHED);
rcu_lock_acquire(&rcu_sched_lock_map);
rcu_lockdep_assert(!rcu_is_cpu_idle(),
@@ -910,7 +933,7 @@ static inline void rcu_read_unlock_sched
"rcu_read_unlock_sched() used illegally while idle");
rcu_lock_release(&rcu_sched_lock_map);
__release(RCU_SCHED);
- preempt_enable();
+ rcu_read_lock_sched_enable();
}

/* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
Index: linux-rt.git/kernel/rcutree.c
===================================================================
--- linux-rt.git.orig/kernel/rcutree.c
+++ linux-rt.git/kernel/rcutree.c
@@ -2491,6 +2491,31 @@ static inline int rcu_blocking_is_gp(voi
return ret;
}

+#ifdef CONFIG_PREEMPT_RT_FULL
+DEFINE_LOCAL_IRQ_LOCK(rcu_sched_lock);
+/*
+ * Real-time allows rcu_read_lock_sched() sections to sleep but not migrate.
+ * This is done via the local locks. When calling synchronize_sched(),
+ * all the local locks need to be taken and released. This will ensure
+ * that all users of rcu_read_lock_sched() will be out of their critical
+ * sections at the completion of this function. synchronize_sched() will
+ * still perform the normal RCU sched operations to synchronize with
+ * locations that use disabling preemption or interrupts.
+ */
+static void rcu_synchronize_sched_rt(void)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
+ spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
+ }
+}
+#else
+static inline void rcu_synchronize_sched_rt(void)
+{
+}
+#endif
/**
* synchronize_sched - wait until an rcu-sched grace period has elapsed.
*
@@ -2538,6 +2563,9 @@ void synchronize_sched(void)
!lock_is_held(&rcu_lock_map) &&
!lock_is_held(&rcu_sched_lock_map),
"Illegal synchronize_sched() in RCU-sched read-side critical section");
+
+ rcu_synchronize_sched_rt();
+
if (rcu_blocking_is_gp())
return;
if (rcu_expedited)


2013-06-26 20:53:39

by Paul E. McKenney

Subject: Re: [RFC][PATCH RT 1/6] rt,rcu: Have rcu_read_lock_sched() use locks for PREEMPT_RT

On Wed, Jun 26, 2013 at 03:28:07PM -0400, Steven Rostedt wrote:
> There are several critical sections that require synchronization with
> synchronize_sched(). Usually these sections disable preemption, and
> synchronize_sched() simply waits for the kernel to schedule on each
> of the CPUs.
>
> rcu_read_lock_sched() is the preferred API to use, but some areas
> still use preempt_disable() and local_irq_*() to prevent preemption.
> Our main concern is with the users of rcu_read_lock_sched() that may
> also take a spin_lock(), which turns into a mutex on PREEMPT_RT. For
> these cases, we need to allow rcu_read_lock_sched() sections to
> schedule out.
>
> To allow rcu_read_lock_sched() sections to be preempted when PREEMPT_RT
> is enabled, rcu_read_lock_sched() grabs a local_lock() instead of
> disabling preemption. synchronize_sched() then grabs each CPU's
> local_lock() and releases it. After that, it still does the normal
> synchronize_sched(), as there may be places that disable preemption or
> irqs that it needs to synchronize with. By grabbing all the locks and
> releasing them, it properly synchronizes with the readers that use the
> locks instead of disabling preemption or interrupts.
>
> Note: The rcu_read_lock_sched_notrace() version still only disables
> preemption, because it is used by lockdep and tracing, which require
> real preemption disabling and not mutexes.

This looks much better!

A few more questions and comments below.

Thanx, Paul

> Signed-off-by: Steven Rostedt <[email protected]>
>
> Index: linux-rt.git/include/linux/rcupdate.h
> ===================================================================
> --- linux-rt.git.orig/include/linux/rcupdate.h
> +++ linux-rt.git/include/linux/rcupdate.h
> @@ -36,6 +36,7 @@
> #include <linux/types.h>
> #include <linux/cache.h>
> #include <linux/spinlock.h>
> +#include <linux/locallock.h>
> #include <linux/threads.h>
> #include <linux/cpumask.h>
> #include <linux/seqlock.h>
> @@ -870,6 +871,28 @@ static inline void rcu_read_unlock_bh(vo
> local_bh_enable();
> }
>
> +/* asm-offsets.c gets confused with local_lock here */
> +#if defined(CONFIG_PREEMPT_RT_FULL)
> +DECLARE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> +static inline void rcu_read_lock_sched_disable(void)
> +{
> + local_lock(rcu_sched_lock);
> +}
> +static inline void rcu_read_lock_sched_enable(void)
> +{
> + local_unlock(rcu_sched_lock);
> +}
> +#else
> +static inline void rcu_read_lock_sched_disable(void)
> +{
> + preempt_disable();
> +}
> +static inline void rcu_read_lock_sched_enable(void)
> +{
> + preempt_enable();
> +}
> +#endif
> +
> /**
> * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
> *
> @@ -885,7 +908,7 @@ static inline void rcu_read_unlock_bh(vo
> */

How about having an rcu_read_lock_sched_rt() and rcu_read_unlock_sched_rt()?
Leave rcu_read_lock_sched() and rcu_read_unlock_sched() with their prior
semantics and deadlock immunity, with a header comment for the _rt()
variants that gives their properties and where they should be used.
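
One possible shape for that split, purely as a sketch (the _rt() names
come from the suggestion above; the bodies are an assumption built from
the patch's new helpers, with the lockdep assertion omitted for
brevity):

	/*
	 * Sketch: rcu_read_lock_sched() keeps its preempt_disable()
	 * semantics; only the _rt() variants opt in to the sleepable
	 * local_lock on PREEMPT_RT_FULL.
	 */
	static inline void rcu_read_lock_sched_rt(void)
	{
		rcu_read_lock_sched_disable();	/* local_lock() on RT_FULL */
		__acquire(RCU_SCHED);
		rcu_lock_acquire(&rcu_sched_lock_map);
	}

	static inline void rcu_read_unlock_sched_rt(void)
	{
		rcu_lock_release(&rcu_sched_lock_map);
		__release(RCU_SCHED);
		rcu_read_lock_sched_enable();	/* local_unlock() on RT_FULL */
	}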

> static inline void rcu_read_lock_sched(void)
> {
> - preempt_disable();
> + rcu_read_lock_sched_disable();
> __acquire(RCU_SCHED);
> rcu_lock_acquire(&rcu_sched_lock_map);
> rcu_lockdep_assert(!rcu_is_cpu_idle(),
> @@ -910,7 +933,7 @@ static inline void rcu_read_unlock_sched
> "rcu_read_unlock_sched() used illegally while idle");
> rcu_lock_release(&rcu_sched_lock_map);
> __release(RCU_SCHED);
> - preempt_enable();
> + rcu_read_lock_sched_enable();
> }
>
> /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
> Index: linux-rt.git/kernel/rcutree.c
> ===================================================================
> --- linux-rt.git.orig/kernel/rcutree.c
> +++ linux-rt.git/kernel/rcutree.c
> @@ -2491,6 +2491,31 @@ static inline int rcu_blocking_is_gp(voi
> return ret;
> }
>
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +DEFINE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> +/*
> + * Real-time allows rcu_read_lock_sched() sections to sleep but not migrate.
> + * This is done via the local locks. When calling synchronize_sched(),
> + * all the local locks need to be taken and released. This will ensure
> + * that all users of rcu_read_lock_sched() will be out of their critical
> + * sections at the completion of this function. synchronize_sched() will
> + * still perform the normal RCU sched operations to synchronize with
> + * locations that use disabling preemption or interrupts.
> + */
> +static void rcu_synchronize_sched_rt(void)

The name synchronize_sched_rt() would fit better with the companion
synchronize_sched() function.

> +{
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
> + spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
> + }
> +}
> +#else
> +static inline void rcu_synchronize_sched_rt(void)
> +{

I bet you want a synchronize_sched() here. ;-)

But looking below...

> +}
> +#endif
> /**
> * synchronize_sched - wait until an rcu-sched grace period has elapsed.
> *
> @@ -2538,6 +2563,9 @@ void synchronize_sched(void)
> !lock_is_held(&rcu_lock_map) &&
> !lock_is_held(&rcu_sched_lock_map),
> "Illegal synchronize_sched() in RCU-sched read-side critical section");
> +
> + rcu_synchronize_sched_rt();
> +

Are you sure you want a single primitive to wait on both types of
read-side critical sections? I can see arguments on either side...

For completeness, another approach would be to use SRCU instead of locking
for the preemptible RCU-sched read-side critical sections. One benefit
of doing this is that SRCU avoids introducing the potential deadlocks
that involve locks acquired both within and across read-side critical
sections.
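
For reference, a sketch of what the SRCU approach could look like (the
domain name and functions are hypothetical; the calls themselves are
the stock SRCU API):

	#include <linux/srcu.h>

	DEFINE_STATIC_SRCU(example_srcu);	/* hypothetical domain */

	static void example_reader(void)
	{
		int idx;

		idx = srcu_read_lock(&example_srcu);	/* preemptible */
		/* ... read-side section; spinlocks are fine on RT ... */
		srcu_read_unlock(&example_srcu, idx);
	}

	static void example_updater(void)
	{
		/*
		 * Waits only for example_srcu readers, so it carries no
		 * lock-ordering ties to locks used inside the sections.
		 */
		synchronize_srcu(&example_srcu);
	}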

> if (rcu_blocking_is_gp())
> return;
> if (rcu_expedited)
>

2013-06-26 21:33:05

by Steven Rostedt

Subject: Re: [RFC][PATCH RT 1/6] rt,rcu: Have rcu_read_lock_sched() use locks for PREEMPT_RT

On Wed, 2013-06-26 at 13:53 -0700, Paul E. McKenney wrote:
> On Wed, Jun 26, 2013 at 03:28:07PM -0400, Steven Rostedt wrote:
> > There are several critical sections that require synchronization with
> > synchronize_sched(). Usually these sections disable preemption, and
> > synchronize_sched() simply waits for the kernel to schedule on each
> > of the CPUs.
> >
> > rcu_read_lock_sched() is the preferred API to use, but some areas
> > still use preempt_disable() and local_irq_*() to prevent preemption.
> > Our main concern is with the users of rcu_read_lock_sched() that may
> > also take a spin_lock(), which turns into a mutex on PREEMPT_RT. For
> > these cases, we need to allow rcu_read_lock_sched() sections to
> > schedule out.
> >
> > To allow rcu_read_lock_sched() sections to be preempted when PREEMPT_RT
> > is enabled, rcu_read_lock_sched() grabs a local_lock() instead of
> > disabling preemption. synchronize_sched() then grabs each CPU's
> > local_lock() and releases it. After that, it still does the normal
> > synchronize_sched(), as there may be places that disable preemption or
> > irqs that it needs to synchronize with. By grabbing all the locks and
> > releasing them, it properly synchronizes with the readers that use the
> > locks instead of disabling preemption or interrupts.
> >
> > Note: The rcu_read_lock_sched_notrace() version still only disables
> > preemption, because it is used by lockdep and tracing, which require
> > real preemption disabling and not mutexes.
>
> This looks much better!

Really, I didn't think I changed it at all ;-)

>
> A few more questions and comments below.
>
> Thanx, Paul
>
> > Signed-off-by: Steven Rostedt <[email protected]>
> >
> > Index: linux-rt.git/include/linux/rcupdate.h
> > ===================================================================
> > --- linux-rt.git.orig/include/linux/rcupdate.h
> > +++ linux-rt.git/include/linux/rcupdate.h
> > @@ -36,6 +36,7 @@
> > #include <linux/types.h>
> > #include <linux/cache.h>
> > #include <linux/spinlock.h>
> > +#include <linux/locallock.h>
> > #include <linux/threads.h>
> > #include <linux/cpumask.h>
> > #include <linux/seqlock.h>
> > @@ -870,6 +871,28 @@ static inline void rcu_read_unlock_bh(vo
> > local_bh_enable();
> > }
> >
> > +/* asm-offsets.c gets confused with local_lock here */
> > +#if defined(CONFIG_PREEMPT_RT_FULL)
> > +DECLARE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> > +static inline void rcu_read_lock_sched_disable(void)
> > +{
> > + local_lock(rcu_sched_lock);
> > +}
> > +static inline void rcu_read_lock_sched_enable(void)
> > +{
> > + local_unlock(rcu_sched_lock);
> > +}
> > +#else
> > +static inline void rcu_read_lock_sched_disable(void)
> > +{
> > + preempt_disable();
> > +}
> > +static inline void rcu_read_lock_sched_enable(void)
> > +{
> > + preempt_enable();
> > +}
> > +#endif
> > +
> > /**
> > * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
> > *
> > @@ -885,7 +908,7 @@ static inline void rcu_read_unlock_bh(vo
> > */
>
> How about having an rcu_read_lock_sched_rt() and rcu_read_unlock_sched_rt()?
> Leave rcu_read_lock_sched() and rcu_read_unlock_sched() with their prior
> semantics and deadlock immunity, with a header comment for the _rt()
> variants that gives their properties and where they should be used.

I can do that instead, just so that we don't introduce a deadlock
somewhere unknowingly.


>
> > static inline void rcu_read_lock_sched(void)
> > {
> > - preempt_disable();
> > + rcu_read_lock_sched_disable();
> > __acquire(RCU_SCHED);
> > rcu_lock_acquire(&rcu_sched_lock_map);
> > rcu_lockdep_assert(!rcu_is_cpu_idle(),
> > @@ -910,7 +933,7 @@ static inline void rcu_read_unlock_sched
> > "rcu_read_unlock_sched() used illegally while idle");
> > rcu_lock_release(&rcu_sched_lock_map);
> > __release(RCU_SCHED);
> > - preempt_enable();
> > + rcu_read_lock_sched_enable();
> > }
> >
> > /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
> > Index: linux-rt.git/kernel/rcutree.c
> > ===================================================================
> > --- linux-rt.git.orig/kernel/rcutree.c
> > +++ linux-rt.git/kernel/rcutree.c
> > @@ -2491,6 +2491,31 @@ static inline int rcu_blocking_is_gp(voi
> > return ret;
> > }
> >
> > +#ifdef CONFIG_PREEMPT_RT_FULL
> > +DEFINE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> > +/*
> > + * Real-time allows rcu_read_lock_sched() sections to sleep but not migrate.
> > + * This is done via the local locks. When calling synchronize_sched(),
> > + * all the local locks need to be taken and released. This will ensure
> > + * that all users of rcu_read_lock_sched() will be out of their critical
> > + * sections at the completion of this function. synchronize_sched() will
> > + * still perform the normal RCU sched operations to synchronize with
> > + * locations that use disabling preemption or interrupts.
> > + */
> > +static void rcu_synchronize_sched_rt(void)
>
> The name synchronize_sched_rt() would fit better with the companion
> synchronize_sched() function.

See below.

>
> > +{
> > + int cpu;
> > +
> > + for_each_possible_cpu(cpu) {
> > + spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
> > + spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
> > + }
> > +}
> > +#else
> > +static inline void rcu_synchronize_sched_rt(void)
> > +{
>
> I bet you want a synchronize_sched() here. ;-)
>
> But looking below...
>
> > +}
> > +#endif
> > /**
> > * synchronize_sched - wait until an rcu-sched grace period has elapsed.
> > *
> > @@ -2538,6 +2563,9 @@ void synchronize_sched(void)
> > !lock_is_held(&rcu_lock_map) &&
> > !lock_is_held(&rcu_sched_lock_map),
> > "Illegal synchronize_sched() in RCU-sched read-side critical section");
> > +
> > + rcu_synchronize_sched_rt();
> > +
>
> Are you sure you want a single primitive to wait on both types of
> read-side critical sections? I can see arguments on either side...

Actually, the problem is that the place I need this to work in is
generic infrastructure (workqueues). I don't even see a call to
synchronize_sched() there anyway; I'm assuming that it's the workqueue
users that are doing this.

I'm handling the change from commit fa1b54e69bc ("workqueue: update
synchronization rules on worker_pool_idr"), which states that it's
protecting the reads with synchronize_sched() (by disabling
interrupts).

>
> For completeness, another approach would be to use SRCU instead of locking
> for the preemptible RCU-sched read-side critical sections. One benefit
> of doing this is that SRCU avoids introducing the potential deadlocks
> that involve locks acquired both within and across read-side critical
> sections.
>
> > if (rcu_blocking_is_gp())
> > return;
> > if (rcu_expedited)
> >

I never fully understood SRCU. Perhaps it would work; I'd just have
to look into it. But for now, I think I'll go with your original idea,
with the locks, and use rcu_read_lock_sched_rt().

-- Steve

2013-06-27 03:53:01

by Mike Galbraith

Subject: Re: [RFC][PATCH RT 1/6] rt,rcu: Have rcu_read_lock_sched() use locks for PREEMPT_RT

On Wed, 2013-06-26 at 15:28 -0400, Steven Rostedt wrote:

> @@ -2491,6 +2491,31 @@ static inline int rcu_blocking_is_gp(voi
> return ret;
> }
>
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +DEFINE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> +/*
> + * Real-time allows rcu_read_lock_sched() sections to sleep but not migrate.
> + * This is done via the local locks. When calling synchronize_sched(),
> + * all the local locks need to be taken and released. This will ensure
> + * that all users of rcu_read_lock_sched() will be out of their critical
> + * sections at the completion of this function. synchronize_sched() will
> + * still perform the normal RCU sched operations to synchronize with
> + * locations that use disabling preemption or interrupts.
> + */
> +static void rcu_synchronize_sched_rt(void)
> +{
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
> + spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
> + }
> +}

Does that have to be possible vs online?

-Mike

2013-06-27 11:28:13

by Steven Rostedt

Subject: Re: [RFC][PATCH RT 1/6] rt,rcu: Have rcu_read_lock_sched() use locks for PREEMPT_RT

On Thu, 2013-06-27 at 05:52 +0200, Mike Galbraith wrote:
> On Wed, 2013-06-26 at 15:28 -0400, Steven Rostedt wrote:
>
> > @@ -2491,6 +2491,31 @@ static inline int rcu_blocking_is_gp(voi
> > return ret;
> > }
> >
> > +#ifdef CONFIG_PREEMPT_RT_FULL
> > +DEFINE_LOCAL_IRQ_LOCK(rcu_sched_lock);
> > +/*
> > + * Real-time allows for synchronize sched to sleep but not migrate.
> > + * This is done via the local locks. When calling synchronize_sched(),
> > + * all the local locks need to be taken and released. This will ensure
> > + * that all users of rcu_read_lock_sched() will be out of their critical
> > + * sections at the completion of this function. synchronize_sched() will
> > + * still perform the normal RCU sched operations to synchronize with
> > + * locations that use disabling preemption or interrupts.
> > + */
> > +static void rcu_synchronize_sched_rt(void)
> > +{
> > + int cpu;
> > +
> > + for_each_possible_cpu(cpu) {
> > + spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
> > + spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
> > + }
> > +}
>
> Does that have to be possible vs online?

No, I was doing the "paranoid" approach, as I didn't know how much
hotplug may be using synchronize_sched() and wanted to make sure I hit
all CPUs. But I think online would be sufficient.

-- Steve
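
A sketch of the online-only variant discussed in this exchange (the
get_online_cpus() bracketing is an assumption added to keep the online
mask stable against hotplug; it is not from the thread):

	static void rcu_synchronize_sched_rt(void)
	{
		int cpu;

		get_online_cpus();	/* hold off CPU hotplug */
		for_each_online_cpu(cpu) {
			spin_lock(&per_cpu(rcu_sched_lock, cpu).lock);
			spin_unlock(&per_cpu(rcu_sched_lock, cpu).lock);
		}
		put_online_cpus();
	}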