2013-09-05 19:52:46

by Paul E. McKenney

Subject: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

There is currently no way for kernel code to determine whether it
is safe to enter an RCU read-side critical section, in other words,
whether or not RCU is paying attention to the currently running CPU.
Given the large and increasing quantity of code shared by the idle loop
and non-idle code, the this shortcoming is becoming increasingly painful.

This commit therefore adds rcu_watching_this_cpu(), which returns true
if it is safe to enter an RCU read-side critical section on the currently
running CPU. This function is quite fast, using only a __this_cpu_read().
However, the caller must disable preemption.

Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

include/linux/rcupdate.h | 1 +
kernel/rcutree.c | 12 ++++++++++++
2 files changed, 13 insertions(+)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 15d33d9..1c7112c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
extern void rcu_idle_exit(void);
extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
+extern bool rcu_watching_this_cpu(void);

#ifdef CONFIG_RCU_USER_QS
extern void rcu_user_enter(void);
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a06d172..7b8fcee 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
#endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */

/**
+ * rcu_watching_this_cpu - are RCU read-side critical sections safe?
+ *
+ * Return true if RCU is watching the running CPU, which means that this
+ * CPU can safely enter RCU read-side critical sections. The caller must
+ * have at least disabled preemption.
+ */
+bool rcu_watching_this_cpu(void)
+{
+ return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
+}
+
+/**
* rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
*
* If the current CPU is idle or running at a first-level (not nested)
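
To illustrate the intended usage, here is a hedged caller sketch (not part of
the patch; trace_probe() is a made-up example of a preemption-disabled context
such as a function-tracer callback):

static void trace_probe(void)
{
        /* Function-trace callbacks run with preemption disabled. */
        if (!rcu_watching_this_cpu())
                return;         /* RCU is ignoring this CPU (e.g. idle): bail. */

        rcu_read_lock();
        /* ... safely dereference RCU-protected data here ... */
        rcu_read_unlock();
}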


2013-09-05 20:26:04

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, 5 Sep 2013 12:52:34 -0700
"Paul E. McKenney" <[email protected]> wrote:

> There is currently no way for kernel code to determine whether it
> is safe to enter an RCU read-side critical section, in other words,
> whether or not RCU is paying attention to the currently running CPU.
> Given the large and increasing quantity of code shared by the idle loop
> and non-idle code, the this shortcoming is becoming increasingly painful.
>
> This commit therefore adds rcu_watching_this_cpu(), which returns true
> if it is safe to enter an RCU read-side critical section on the currently
> running CPU. This function is quite fast, using only a __this_cpu_read().
> However, the caller must disable preemption.
>
> Reported-by: Steven Rostedt <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Thanks Paul!

>
> include/linux/rcupdate.h | 1 +
> kernel/rcutree.c | 12 ++++++++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 15d33d9..1c7112c 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> extern void rcu_idle_exit(void);
> extern void rcu_irq_enter(void);
> extern void rcu_irq_exit(void);
> +extern bool rcu_watching_this_cpu(void);
>
> #ifdef CONFIG_RCU_USER_QS
> extern void rcu_user_enter(void);
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a06d172..7b8fcee 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c

Don't we also need an rcutiny version? Otherwise I'm guessing that
it won't compile if I use it with rcutiny (even if it always returns
true).

-- Steve

> @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
>
> /**
> + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> + *
> + * Return true if RCU is watching the running CPU, which means that this
> + * CPU can safely enter RCU read-side critical sections. The caller must
> + * have at least disabled preemption.
> + */
> +bool rcu_watching_this_cpu(void)
> +{
> + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> +}
> +
> +/**
> * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
> *
> * If the current CPU is idle or running at a first-level (not nested)

2013-09-05 21:06:47

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, Sep 05, 2013 at 01:59:59PM -0700, Paul E. McKenney wrote:
> On Thu, Sep 05, 2013 at 04:25:58PM -0400, Steven Rostedt wrote:
> > On Thu, 5 Sep 2013 12:52:34 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > There is currently no way for kernel code to determine whether it
> > > is safe to enter an RCU read-side critical section, in other words,
> > > whether or not RCU is paying attention to the currently running CPU.
> > > Given the large and increasing quantity of code shared by the idle loop
> > > and non-idle code, the this shortcoming is becoming increasingly painful.
> > >
> > > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > > if it is safe to enter an RCU read-side critical section on the currently
> > > running CPU. This function is quite fast, using only a __this_cpu_read().
> > > However, the caller must disable preemption.
> > >
> > > Reported-by: Steven Rostedt <[email protected]>
> > > Signed-off-by: Paul E. McKenney <[email protected]>
> >
> > Thanks Paul!
> >
> > >
> > > include/linux/rcupdate.h | 1 +
> > > kernel/rcutree.c | 12 ++++++++++++
> > > 2 files changed, 13 insertions(+)
> > >
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 15d33d9..1c7112c 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> > > extern void rcu_idle_exit(void);
> > > extern void rcu_irq_enter(void);
> > > extern void rcu_irq_exit(void);
> > > +extern bool rcu_watching_this_cpu(void);
> > >
> > > #ifdef CONFIG_RCU_USER_QS
> > > extern void rcu_user_enter(void);
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index a06d172..7b8fcee 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> >
> > Don't we also need an rcutiny version? Otherwise I'm guessing that
> > it won't compile if I use it with rcutiny (even if it always returns
> > true).
>
> Yep, there needs to be an rcutiny version, and it must work in the
> same way. So what I currently have is that the rcutiny version is under
> CONFIG_RCU_TRACE. This will require some Kconfig monkeying somewhere
> to select CONFIG_RCU_TRACE if !CONFIG_SMP and tracing is enabled.
>
> Just trying to keep rcutiny tiny...

And here is the updated patch, for whatever it is worth.

Thanx, Paul
------------------------------------------------------------------------

rcu: Is it safe to enter an RCU read-side critical section?

There is currently no way for kernel code to determine whether it
is safe to enter an RCU read-side critical section, in other words,
whether or not RCU is paying attention to the currently running CPU.
Given the large and increasing quantity of code shared by the idle loop
and non-idle code, the this shortcoming is becoming increasingly painful.

This commit therefore adds rcu_watching_this_cpu(), which returns true
if it is safe to enter an RCU read-side critical section on the currently
running CPU. This function is quite fast, using only a __this_cpu_read().
However, the caller must disable preemption.

Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 15d33d9..7c024fd 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -225,6 +225,9 @@ extern void rcu_idle_enter(void);
extern void rcu_idle_exit(void);
extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
+#if defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE)
+extern bool rcu_watching_this_cpu(void);
+#endif /* #if defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE) */

#ifdef CONFIG_RCU_USER_QS
extern void rcu_user_enter(void);
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 7e3b0d6..fce820f 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -189,6 +189,17 @@ EXPORT_SYMBOL(rcu_is_cpu_idle);

#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */

+#ifdef CONFIG_RCU_TRACE
+/*
+ * Test whether the current CPU can safely enter RCU read-side critical
+ * sections. The caller must at least have disabled interrupts.
+ */
+bool rcu_watching_this_cpu(void)
+{
+ return !!rcu_dynticks_nesting;
+}
+#endif /* #ifdef CONFIG_RCU_TRACE */
+
/*
* Test whether the current CPU was interrupted from idle. Nested
* interrupts don't count, we must be running at the first interrupt
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a06d172..7b8fcee 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
#endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */

/**
+ * rcu_watching_this_cpu - are RCU read-side critical sections safe?
+ *
+ * Return true if RCU is watching the running CPU, which means that this
+ * CPU can safely enter RCU read-side critical sections. The caller must
+ * have at least disabled preemption.
+ */
+bool rcu_watching_this_cpu(void)
+{
+ return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
+}
+
+/**
* rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
*
* If the current CPU is idle or running at a first-level (not nested)
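
For reference, why the rcutiny test works (a sketch of the reasoning, based
on how rcutiny maintains its single rcu_dynticks_nesting counter):

/*
 * In rcutiny, rcu_idle_enter() drops rcu_dynticks_nesting to zero and
 * rcu_idle_exit()/rcu_irq_enter() raise it again, so:
 *
 *      rcu_dynticks_nesting == 0  ->  extended quiescent state, RCU is
 *                                     not watching the (only) CPU
 *      rcu_dynticks_nesting != 0  ->  RCU is watching
 *
 * Hence the !!rcu_dynticks_nesting test in the rcutiny hunk above.
 */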

2013-09-05 21:37:09

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, Sep 05, 2013 at 04:25:58PM -0400, Steven Rostedt wrote:
> On Thu, 5 Sep 2013 12:52:34 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > There is currently no way for kernel code to determine whether it
> > is safe to enter an RCU read-side critical section, in other words,
> > whether or not RCU is paying attention to the currently running CPU.
> > Given the large and increasing quantity of code shared by the idle loop
> > and non-idle code, the this shortcoming is becoming increasingly painful.
> >
> > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > if it is safe to enter an RCU read-side critical section on the currently
> > running CPU. This function is quite fast, using only a __this_cpu_read().
> > However, the caller must disable preemption.
> >
> > Reported-by: Steven Rostedt <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
>
> Thanks Paul!
>
> >
> > include/linux/rcupdate.h | 1 +
> > kernel/rcutree.c | 12 ++++++++++++
> > 2 files changed, 13 insertions(+)
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 15d33d9..1c7112c 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> > extern void rcu_idle_exit(void);
> > extern void rcu_irq_enter(void);
> > extern void rcu_irq_exit(void);
> > +extern bool rcu_watching_this_cpu(void);
> >
> > #ifdef CONFIG_RCU_USER_QS
> > extern void rcu_user_enter(void);
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index a06d172..7b8fcee 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
>
> Don't we also need an rcutiny version? Otherwise I'm guessing that
> it won't compile if I use it with rcutiny (even if it always returns
> true).

Yep, there needs to be an rcutiny version, and it must work in the
same way. So what I currently have is that the rcutiny version is under
CONFIG_RCU_TRACE. This will require some Kconfig monkeying somewhere
to select CONFIG_RCU_TRACE if !CONFIG_SMP and tracing is enabled.

Just trying to keep rcutiny tiny...
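
The Kconfig change might look something like this (a hypothetical sketch;
which tracing option should do the select is still to be decided):

config FUNCTION_TRACER
        ...
        # rcutree always provides rcu_watching_this_cpu(); on rcutiny
        # (!SMP) builds the tracer must pull in RCU_TRACE to get it.
        select RCU_TRACE if !SMP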

Thanx, Paul

> -- Steve
>
> > @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> > #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
> >
> > /**
> > + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> > + *
> > + * Return true if RCU is watching the running CPU, which means that this
> > + * CPU can safely enter RCU read-side critical sections. The caller must
> > + * have at least disabled preemption.
> > + */
> > +bool rcu_watching_this_cpu(void)
> > +{
> > + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> > +}
> > +
> > +/**
> > * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
> > *
> > * If the current CPU is idle or running at a first-level (not nested)
>

2013-09-05 23:40:12

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, 5 Sep 2013 14:05:37 -0700
"Paul E. McKenney" <[email protected]> wrote:


>
> rcu: Is it safe to enter an RCU read-side critical section?
>
> There is currently no way for kernel code to determine whether it
> is safe to enter an RCU read-side critical section, in other words,

Shouldn't that be a semi-colon?

"read-side critical section; in other words,"

> whether or not RCU is paying attention to the currently running CPU.
> Given the large and increasing quantity of code shared by the idle loop
> and non-idle code, the this shortcoming is becoming increasingly painful.

s/the//

>
> This commit therefore adds rcu_watching_this_cpu(), which returns true
> if it is safe to enter an RCU read-side critical section on the currently
> running CPU. This function is quite fast, using only a __this_cpu_read().
> However, the caller must disable preemption.
>
> Reported-by: Steven Rostedt <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 15d33d9..7c024fd 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -225,6 +225,9 @@ extern void rcu_idle_enter(void);
> extern void rcu_idle_exit(void);
> extern void rcu_irq_enter(void);
> extern void rcu_irq_exit(void);
> +#if defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE)
> +extern bool rcu_watching_this_cpu(void);

I'm assuming that rcu is always watching a CPU if "!SMP" and
"!RCU_TRACE". Thus you still need to have:

#else
static inline bool rcu_watching_this_cpu(void)
{
        return true;
}

Otherwise we still fail to compile.
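
Spelled out in full, the header arrangement would be something like this
(a sketch, not the committed code):

#if defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE)
extern bool rcu_watching_this_cpu(void);
#else
/* !SMP && !RCU_TRACE: a single CPU that RCU is always watching. */
static inline bool rcu_watching_this_cpu(void)
{
        return true;
}
#endif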

-- Steve

> +#endif /* #if defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE) */
>
> #ifdef CONFIG_RCU_USER_QS
> extern void rcu_user_enter(void);
> diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
> index 7e3b0d6..fce820f 100644
> --- a/kernel/rcutiny.c
> +++ b/kernel/rcutiny.c
> @@ -189,6 +189,17 @@ EXPORT_SYMBOL(rcu_is_cpu_idle);
>
> #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>
> +#ifdef CONFIG_RCU_TRACE
> +/*
> + * Test whether the current CPU can safely enter RCU read-side critical
> + * sections. The caller must at least have disabled interrupts.
> + */
> +bool rcu_watching_this_cpu(void)
> +{
> + return !!rcu_dynticks_nesting;
> +}
> +#endif /* #ifdef CONFIG_RCU_TRACE */
> +
> /*
> * Test whether the current CPU was interrupted from idle. Nested
> * interrupts don't count, we must be running at the first interrupt
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a06d172..7b8fcee 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
>
> /**
> + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> + *
> + * Return true if RCU is watching the running CPU, which means that this
> + * CPU can safely enter RCU read-side critical sections. The caller must
> + * have at least disabled preemption.
> + */
> +bool rcu_watching_this_cpu(void)
> +{
> + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> +}
> +
> +/**
> * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
> *
> * If the current CPU is idle or running at a first-level (not nested)

2013-09-06 10:59:48

by Frederic Weisbecker

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, Sep 05, 2013 at 12:52:34PM -0700, Paul E. McKenney wrote:
> There is currently no way for kernel code to determine whether it
> is safe to enter an RCU read-side critical section, in other words,
> whether or not RCU is paying attention to the currently running CPU.
> Given the large and increasing quantity of code shared by the idle loop
> and non-idle code, the this shortcoming is becoming increasingly painful.
>
> This commit therefore adds rcu_watching_this_cpu(), which returns true
> if it is safe to enter an RCU read-side critical section on the currently
> running CPU. This function is quite fast, using only a __this_cpu_read().
> However, the caller must disable preemption.
>
> Reported-by: Steven Rostedt <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> include/linux/rcupdate.h | 1 +
> kernel/rcutree.c | 12 ++++++++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 15d33d9..1c7112c 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> extern void rcu_idle_exit(void);
> extern void rcu_irq_enter(void);
> extern void rcu_irq_exit(void);
> +extern bool rcu_watching_this_cpu(void);
>
> #ifdef CONFIG_RCU_USER_QS
> extern void rcu_user_enter(void);
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a06d172..7b8fcee 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
>
> /**
> + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> + *
> + * Return true if RCU is watching the running CPU, which means that this
> + * CPU can safely enter RCU read-side critical sections. The caller must
> + * have at least disabled preemption.
> + */
> +bool rcu_watching_this_cpu(void)
> +{
> + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> +}

There is also rcu_is_cpu_idle().

Thanks.

> +
> +/**
> * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
> *
> * If the current CPU is idle or running at a first-level (not nested)
>

2013-09-06 15:19:13

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 12:59:41PM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 05, 2013 at 12:52:34PM -0700, Paul E. McKenney wrote:
> > There is currently no way for kernel code to determine whether it
> > is safe to enter an RCU read-side critical section, in other words,
> > whether or not RCU is paying attention to the currently running CPU.
> > Given the large and increasing quantity of code shared by the idle loop
> > and non-idle code, the this shortcoming is becoming increasingly painful.
> >
> > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > if it is safe to enter an RCU read-side critical section on the currently
> > running CPU. This function is quite fast, using only a __this_cpu_read().
> > However, the caller must disable preemption.
> >
> > Reported-by: Steven Rostedt <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> >
> > include/linux/rcupdate.h | 1 +
> > kernel/rcutree.c | 12 ++++++++++++
> > 2 files changed, 13 insertions(+)
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 15d33d9..1c7112c 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> > extern void rcu_idle_exit(void);
> > extern void rcu_irq_enter(void);
> > extern void rcu_irq_exit(void);
> > +extern bool rcu_watching_this_cpu(void);
> >
> > #ifdef CONFIG_RCU_USER_QS
> > extern void rcu_user_enter(void);
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index a06d172..7b8fcee 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> > #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
> >
> > /**
> > + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> > + *
> > + * Return true if RCU is watching the running CPU, which means that this
> > + * CPU can safely enter RCU read-side critical sections. The caller must
> > + * have at least disabled preemption.
> > + */
> > +bool rcu_watching_this_cpu(void)
> > +{
> > + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> > +}
>
> There is also rcu_is_cpu_idle().

Good point, thank you! I was clearly in autonomic-reflex mode yesterday. :-/

Here is the rcutree version:

int rcu_is_cpu_idle(void)
{
        int ret;

        preempt_disable();
        ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
        preempt_enable();
        return ret;
}

And here is the rcutiny version:

int rcu_is_cpu_idle(void)
{
        return !rcu_dynticks_nesting;
}

Steve, could you please use rcu_is_cpu_idle()? I will revert yesterday's
redundancy.

Thanx, Paul

> Thanks.
>
> > +
> > +/**
> > * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle
> > *
> > * If the current CPU is idle or running at a first-level (not nested)
> >
>

2013-09-06 15:33:25

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, 6 Sep 2013 08:18:52 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Fri, Sep 06, 2013 at 12:59:41PM +0200, Frederic Weisbecker wrote:
> > On Thu, Sep 05, 2013 at 12:52:34PM -0700, Paul E. McKenney wrote:
> > > There is currently no way for kernel code to determine whether it
> > > is safe to enter an RCU read-side critical section, in other words,
> > > whether or not RCU is paying attention to the currently running CPU.
> > > Given the large and increasing quantity of code shared by the idle loop
> > > and non-idle code, the this shortcoming is becoming increasingly painful.
> > >
> > > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > > if it is safe to enter an RCU read-side critical section on the currently
> > > running CPU. This function is quite fast, using only a __this_cpu_read().
> > > However, the caller must disable preemption.
> > >
> > > Reported-by: Steven Rostedt <[email protected]>
> > > Signed-off-by: Paul E. McKenney <[email protected]>
> > >
> > > include/linux/rcupdate.h | 1 +
> > > kernel/rcutree.c | 12 ++++++++++++
> > > 2 files changed, 13 insertions(+)
> > >
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 15d33d9..1c7112c 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> > > extern void rcu_idle_exit(void);
> > > extern void rcu_irq_enter(void);
> > > extern void rcu_irq_exit(void);
> > > +extern bool rcu_watching_this_cpu(void);
> > >
> > > #ifdef CONFIG_RCU_USER_QS
> > > extern void rcu_user_enter(void);
> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > index a06d172..7b8fcee 100644
> > > --- a/kernel/rcutree.c
> > > +++ b/kernel/rcutree.c
> > > @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> > > #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
> > >
> > > /**
> > > + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> > > + *
> > > + * Return true if RCU is watching the running CPU, which means that this
> > > + * CPU can safely enter RCU read-side critical sections. The caller must
> > > + * have at least disabled preemption.
> > > + */
> > > +bool rcu_watching_this_cpu(void)
> > > +{
> > > + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> > > +}
> >
> > There is also rcu_is_cpu_idle().
>
> Good point, thank you! I was clearly in autonomic-reflex mode yesterday. :-/
>
> Here is the rcutree version:
>
> int rcu_is_cpu_idle(void)
> {
>         int ret;
>
>         preempt_disable();
>         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
>         preempt_enable();
>         return ret;
> }
>
> And here is the rcutiny version:
>
> int rcu_is_cpu_idle(void)
> {
>         return !rcu_dynticks_nesting;
> }
>
> Steve, could you please use rcu_is_cpu_idle()? I will revert yesterday's
> redundancy.
>

I can't use plain preempt_disable() in function tracing.

Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
we are coming from userspace, can we rename that?

Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
the "__" appended that does not do the preempt disable.

-- Steve

2013-09-06 16:40:26

by Frederic Weisbecker

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 11:33:20AM -0400, Steven Rostedt wrote:
> On Fri, 6 Sep 2013 08:18:52 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Fri, Sep 06, 2013 at 12:59:41PM +0200, Frederic Weisbecker wrote:
> > > On Thu, Sep 05, 2013 at 12:52:34PM -0700, Paul E. McKenney wrote:
> > > > There is currently no way for kernel code to determine whether it
> > > > is safe to enter an RCU read-side critical section, in other words,
> > > > whether or not RCU is paying attention to the currently running CPU.
> > > > Given the large and increasing quantity of code shared by the idle loop
> > > > and non-idle code, the this shortcoming is becoming increasingly painful.
> > > >
> > > > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > > > if it is safe to enter an RCU read-side critical section on the currently
> > > > running CPU. This function is quite fast, using only a __this_cpu_read().
> > > > However, the caller must disable preemption.
> > > >
> > > > Reported-by: Steven Rostedt <[email protected]>
> > > > Signed-off-by: Paul E. McKenney <[email protected]>
> > > >
> > > > include/linux/rcupdate.h | 1 +
> > > > kernel/rcutree.c | 12 ++++++++++++
> > > > 2 files changed, 13 insertions(+)
> > > >
> > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > index 15d33d9..1c7112c 100644
> > > > --- a/include/linux/rcupdate.h
> > > > +++ b/include/linux/rcupdate.h
> > > > @@ -225,6 +225,7 @@ extern void rcu_idle_enter(void);
> > > > extern void rcu_idle_exit(void);
> > > > extern void rcu_irq_enter(void);
> > > > extern void rcu_irq_exit(void);
> > > > +extern bool rcu_watching_this_cpu(void);
> > > >
> > > > #ifdef CONFIG_RCU_USER_QS
> > > > extern void rcu_user_enter(void);
> > > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > > > index a06d172..7b8fcee 100644
> > > > --- a/kernel/rcutree.c
> > > > +++ b/kernel/rcutree.c
> > > > @@ -710,6 +710,18 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
> > > > #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
> > > >
> > > > /**
> > > > + * rcu_watching_this_cpu - are RCU read-side critical sections safe?
> > > > + *
> > > > + * Return true if RCU is watching the running CPU, which means that this
> > > > + * CPU can safely enter RCU read-side critical sections. The caller must
> > > > + * have at least disabled preemption.
> > > > + */
> > > > +bool rcu_watching_this_cpu(void)
> > > > +{
> > > > + return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
> > > > +}
> > >
> > > There is also rcu_is_cpu_idle().
> >
> > Good point, thank you! I was clearly in autonomic-reflex mode yesterday. :-/
> >
> > Here is the rcutree version:
> >
> > int rcu_is_cpu_idle(void)
> > {
> >         int ret;
> >
> >         preempt_disable();
> >         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> >         preempt_enable();
> >         return ret;
> > }
> >
> > And here is the rcutiny version:
> >
> > int rcu_is_cpu_idle(void)
> > {
> >         return !rcu_dynticks_nesting;
> > }
> >
> > Steve, could you please use rcu_is_cpu_idle()? I will revert yesterday's
> > redundancy.
> >
>
> I can't use plain preempt_disable() in function tracing.
>
> Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> we are coming from userspace, can we rename that?
>
> Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> the "__" appended that does not do the preempt disable.

rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
in RCU APIs.

> -- Steve

2013-09-06 16:52:41

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, 6 Sep 2013 18:40:18 +0200
Frederic Weisbecker <[email protected]> wrote:

> > I can't use plain preempt_disable() in function tracing.
> >
> > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > we are coming from userspace, can we rename that?
> >
> > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > the "__" appended that does not do the preempt disable.
>
> rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> in RCU APIs.

But that will just confuse the heck out of people. When I see "eqs" I
equate that with "equals". What does the rcu cpu equal?

-- Steve

2013-09-06 16:58:39

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 12:52:38PM -0400, Steven Rostedt wrote:
> On Fri, 6 Sep 2013 18:40:18 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > > I can't use plain preempt_disable() in function tracing.
> > >
> > > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > > we are coming from userspace, can we rename that?
> > >
> > > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > > the "__" appended that does not do the preempt disable.
> >
> > rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> > in RCU APIs.
>
> But that will just confuse the heck out of people. When I see "eqs" I
> equate that with "equals". What does the rcu cpu equal?

The acronym eqs "equals" "extended quiescent state". ;-)

Thanx, Paul

2013-09-06 17:00:13

by Frederic Weisbecker

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 12:52:38PM -0400, Steven Rostedt wrote:
> On Fri, 6 Sep 2013 18:40:18 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > > I can't use plain preempt_disable() in function tracing.
> > >
> > > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > > we are coming from userspace, can we rename that?
> > >
> > > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > > the "__" appended that does not do the preempt disable.
> >
> > rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> > in RCU APIs.
>
> But that will just confuse the heck out of people. When I see "eqs" I
> equate that with "equals". What does the rcu cpu equal?

It's "extended quiescent state". There is already rcu_eqs_enter() and rcu_eqs_exit().
You're right, may be we can rename that to avoid confusion with "equals". I don't mind much.
I'm happy as long as the reader rcu_is_cpu_foo() and the writers rcu_foo_enter() and
rcu_foo_exit() have consistant naming.

>
> -- Steve

2013-09-06 17:16:34

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, 6 Sep 2013 19:00:08 +0200
Frederic Weisbecker <[email protected]> wrote:

> On Fri, Sep 06, 2013 at 12:52:38PM -0400, Steven Rostedt wrote:
> > On Fri, 6 Sep 2013 18:40:18 +0200
> > Frederic Weisbecker <[email protected]> wrote:
> >
> > > > I can't use plain preempt_disable() in function tracing.
> > > >
> > > > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > > > we are coming from userspace, can we rename that?
> > > >
> > > > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > > > the "__" appended that does not do the preempt disable.
> > >
> > > rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> > > in RCU APIs.
> >
> > But that will just confuse the heck out of people. When I see "eqs" I
> > equate that with "equals". What does the rcu cpu equal?
>
> It's "extended quiescent state". There is already rcu_eqs_enter() and rcu_eqs_exit().
> You're right, maybe we can rename that to avoid confusion with "equals". I don't mind much.
> I'm happy as long as the reader rcu_is_cpu_foo() and the writers rcu_foo_enter() and
> rcu_foo_exit() have consistent naming.
>

What exactly does "extended quiescent state" mean? (Note, that's a
rhetorical question)


I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
straightforward to understand than trying to wrap your head around what
a quiescent state is, and why we are entering it or exiting it.

It also flat out explains to people that rcu is not processing that
current CPU, and things like rcu_read_lock() should not be used.

Then we can say "rcu_cpu_is_ignored()" for things like
"rcu_is_cpu_eqs()".

-- Steve

2013-09-06 17:21:32

by Eric Dumazet

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, 2013-09-06 at 08:18 -0700, Paul E. McKenney wrote:

> int rcu_is_cpu_idle(void)
> {
>         int ret;
>
>         preempt_disable();
>         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
>         preempt_enable();
>         return ret;
> }

Paul, I find this very confusing.

If the caller doesn't have preemption disabled, what could be the meaning of
this rcu_is_cpu_idle() call?

Its result is meaningless if the thread is suddenly preempted, so what is
the point?

Sorry if this is obvious to you.

2013-09-06 17:41:26

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 10:21:28AM -0700, Eric Dumazet wrote:
> On Fri, 2013-09-06 at 08:18 -0700, Paul E. McKenney wrote:
>
> > int rcu_is_cpu_idle(void)
> > {
> >         int ret;
> >
> >         preempt_disable();
> >         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> >         preempt_enable();
> >         return ret;
> > }
>
> Paul, I find this very confusing.
>
> If the caller doesn't have preemption disabled, what could be the meaning of
> this rcu_is_cpu_idle() call?
>
> Its result is meaningless if the thread is suddenly preempted, so what is
> the point?
>
> Sorry if this is obvious to you.

It is a completely fair question. In fact, this might well now be
pointing to a bug given NO_HZ_FULL.

The assumption is that if you don't have preemption disabled, you had
better be running on a CPU that RCU is paying attention to. The rationale
involves preemptible RCU.

Suppose that you just did rcu_read_lock() on a CPU that RCU is paying
attention to. All is well, and rcu_is_cpu_idle() will return false, as
expected. Suppose now that it is possible to be preempted and suddenly
find yourself running on a CPU that RCU is not paying attention to.
This would have the effect of making your RCU read-side critical section
be ignored. Therefore, it had better not be possible to be preempted
from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.

So if rcu_is_cpu_idle() returns false, you had better be guaranteed
that whatever CPU you are running on (which might well be a different
one than the one rcu_is_cpu_idle() ran on) is being watched by RCU.

So, Frederic, does this still work with NO_HZ_FULL? If not, I believe
we have a bigger problem than the preempt_disable() in rcu_is_cpu_idle()!
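
In code form, the guarantee is (an illustrative sketch, not actual kernel
code):

rcu_read_lock();
WARN_ON_ONCE(rcu_is_cpu_idle());        /* must be false here... */
/*
 * ...and even if we are preempted and migrated right here, the new CPU
 * must also be one that RCU is watching, or this read-side critical
 * section would be silently ignored.
 */
rcu_read_unlock();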

Thanx, Paul

2013-09-06 17:52:45

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 01:16:31PM -0400, Steven Rostedt wrote:
> On Fri, 6 Sep 2013 19:00:08 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > On Fri, Sep 06, 2013 at 12:52:38PM -0400, Steven Rostedt wrote:
> > > On Fri, 6 Sep 2013 18:40:18 +0200
> > > Frederic Weisbecker <[email protected]> wrote:
> > >
> > > > > I can't use plain preempt_disable() in function tracing.
> > > > >
> > > > > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > > > > we are coming from userspace, can we rename that?
> > > > >
> > > > > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > > > > the "__" appended that does not do the preempt disable.
> > > >
> > > > rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> > > > in RCU APIs.
> > >
> > > But that will just confuse the heck out of people. When I see "eqs" I
> > > equate that with "equals". What does the rcu cpu equal?
> >
> > It's "extended quiescent state". There is already rcu_eqs_enter() and rcu_eqs_exit().
> > You're right, maybe we can rename that to avoid confusion with "equals". I don't mind much.
> > I'm happy as long as the reader rcu_is_cpu_foo() and the writers rcu_foo_enter() and
> > rcu_foo_exit() have consistent naming.
> >
>
> What exactly does "extended quiescent state" mean? (Note, that's a
> rhetorical question)

In which case my rhetorical (and therefore useless) answer has to be
"it is a quiescent state that is extended". ;-)

Sorry, couldn't resist...

> I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> straightforward to understand than trying to wrap your head around what
> a quiescent state is, and why we are entering it or exiting it.
>
> It also flat out explains to people that rcu is not processing that
> current CPU, and things like rcu_read_lock() should not be used.
>
> Then we can say "rcu_cpu_is_ignored()" for things like
> "rcu_is_cpu_eqs()".

Currently, none of RCU's _eqs functions are exported, so they have
the potential to confuse only people working on the RCU implementation
itself, who had better understand what "eqs" means.

But I do count your vote against "eqs" appearing in the name of any
function exported by RCU.

How about if I made rcu_is_cpu_idle() be as follows?

int rcu_is_cpu_idle(void)
{
        int ret;

        ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
                                    raw_smp_processor_id())) & 0x1) == 0;
        return ret;
}

This should allow existing uses to function properly and should allow
you to use it as well.

Thanx, Paul

2013-09-06 17:56:16

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 10:52:38AM -0700, Paul E. McKenney wrote:
> On Fri, Sep 06, 2013 at 01:16:31PM -0400, Steven Rostedt wrote:
> > On Fri, 6 Sep 2013 19:00:08 +0200
> > Frederic Weisbecker <[email protected]> wrote:
> >
> > > On Fri, Sep 06, 2013 at 12:52:38PM -0400, Steven Rostedt wrote:
> > > > On Fri, 6 Sep 2013 18:40:18 +0200
> > > > Frederic Weisbecker <[email protected]> wrote:
> > > >
> > > > > > I can't use plain preempt_disable() in function tracing.
> > > > > >
> > > > > > Also, since it's a misnomer to say the cpu is idle in NO_HZ_FULL when
> > > > > > we are coming from userspace, can we rename that?
> > > > > >
> > > > > > Perhaps we can also have a __rcu_is_cpu_tracking() (or whatever), with
> > > > > > the "__" appended that does not do the preempt disable.
> > > > >
> > > > > rcu_is_cpu_eqs() is probably better. It refers to other related "eqs" naming
> > > > > in RCU APIs.
> > > >
> > > > But that will just confuse the heck out of people. When I see "eqs" I
> > > > equate that with "equals". What does the rcu cpu equal?
> > >
> > > It's "extended quiescent state". There is already rcu_eqs_enter() and rcu_eqs_exit().
> > > You're right, maybe we can rename that to avoid confusion with "equals". I don't mind much.
> > > I'm happy as long as the reader rcu_is_cpu_foo() and the writers rcu_foo_enter() and
> > > rcu_foo_exit() have consistent naming.
> > >
> >
> > What exactly does "extended quiescent state" mean? (Note, that's a
> > rhetorical question)
>
> In which case my rhetorical (and therefore useless) answer has to be
> "it is a quiescent state that is extended". ;-)
>
> Sorry, couldn't resist...
>
> > I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> > and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> > straightforward to understand than trying to wrap your head around what
> > a quiescent state is, and why we are entering it or exiting it.
> >
> > It also flat out explains to people that rcu is not processing that
> > current CPU, and things like rcu_read_lock() should not be used.
> >
> > Then we can say "rcu_cpu_is_ignored()" for things like
> > "rcu_is_cpu_eqs()".
>
> Currently, none of RCU's _eqs functions are exported, so they have
> the potential to confuse only people working on the RCU implementation
> itself, who had better understand what "eqs" means.
>
> But I do count your vote against "eqs" appearing in the name of any
> function exported by RCU.
>
> How about if I made rcu_is_cpu_idle() be as follows?
>
> int rcu_is_cpu_idle(void)
> {
>         int ret;
>
>         ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
>                                     raw_smp_processor_id())) & 0x1) == 0;
>         return ret;
> }
>
> This should allow existing uses to function properly and should allow
> you to use it as well.

Never mind, I remember why this doesn't work... The problem is that the
offset is calculated on one CPU, the task is preempted, the original CPU
goes idle, the task starts running on some other CPU, and then lockdep
complains about the CPU entering an RCU read-side critical section on
an idle CPU.

Blech.
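
Spelled out, the failing sequence is (a sketch):

/*
 * 1. The task computes the per-CPU address via raw_smp_processor_id()
 *    on CPU 0, which is currently non-idle.
 * 2. The task is preempted and migrates to CPU 1.
 * 3. CPU 0 goes idle, so its rcu_dynticks.dynticks value becomes even.
 * 4. The stale read now reports "idle" even though the task is running
 *    on CPU 1, which RCU is watching -- so lockdep issues a false
 *    warning about a read-side critical section on an idle CPU.
 */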

But I -can- make them the same function on rcutiny, courtesy of the
ever-present C preprocessor. ;-)

Thanx, Paul

2013-09-06 18:21:39

by Steven Rostedt

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, 6 Sep 2013 10:52:38 -0700
"Paul E. McKenney" <[email protected]> wrote:

> > What exactly does "extended quiescent state" mean? (Note, that's a
> > rhetorical question)
>
> In which case my rhetorical (and therefore useless) answer has to be
> "it is a quiescent state that is extended". ;-)
>
> Sorry, couldn't resist...

Of course you couldn't ;)

>
> > I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> > and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> > straightforward to understand than trying to wrap your head around what
> > a quiescent state is, and why we are entering it or exiting it.
> >
> > It also flat out explains to people that rcu is not processing that
> > current CPU, and things like rcu_read_lock() should not be used.
> >
> > Then we can say "rcu_cpu_is_ignored()" for things like
> > "rcu_is_cpu_eqs()".
>
> Currently, none of RCU's _eqs functions are exported, so they have
> the potential to confuse only people working on the RCU implementation
> itself, who had better understand what "eqs" means.

Yeah, that's what I thought, and never cared about the "eqs" meaning.

>
> But I do count your vote against "eqs" appearing in the name of any
> function exported by RCU.

Right, there shouldn't be any "eqs" functions that are global to users
outside of the RCU infrastructure.

>
> How about if I made rcu_is_cpu_idle() be as follows?
>
> int rcu_is_cpu_idle(void)
> {
>         int ret;
>
>         ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
>                                     raw_smp_processor_id())) & 0x1) == 0;
>         return ret;
> }
>
> This should allow existing uses to function properly and should allow
> you to use it as well.
>

You already said it won't work, but I still would have been against
using it, because I wouldn't be checking if rcu thinks the CPU is idle,
as NO_HZ_FULL has nothing to do with idle.

-- Steve

2013-09-06 18:59:34

by Frederic Weisbecker

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 10:41:17AM -0700, Paul E. McKenney wrote:
> On Fri, Sep 06, 2013 at 10:21:28AM -0700, Eric Dumazet wrote:
> > On Fri, 2013-09-06 at 08:18 -0700, Paul E. McKenney wrote:
> >
> > > int rcu_is_cpu_idle(void)
> > > {
> > >         int ret;
> > >
> > >         preempt_disable();
> > >         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> > >         preempt_enable();
> > >         return ret;
> > > }
> >
> > Paul, I find this very confusing.
> >
> > If the caller doesn't have preemption disabled, what could be the meaning of
> > this rcu_is_cpu_idle() call?
> >
> > Its result is meaningless if the thread is suddenly preempted, so what is
> > the point?
> >
> > Sorry if this is obvious to you.
>
> It is a completely fair question. In fact, this might well now be
> pointing to a bug given NO_HZ_FULL.
>
> The assumption is that if you don't have preemption disabled, you had
> better be running on a CPU that RCU is paying attention to. The rationale
> involves preemptible RCU.
>
> Suppose that you just did rcu_read_lock() on a CPU that RCU is paying
> attention to. All is well, and rcu_is_cpu_idle() will return false, as
> expected. Suppose now that it is possible to be preempted and suddenly
> find yourself running on a CPU that RCU is not paying attention to.
> This would have the effect of making your RCU read-side critical section
> be ignored. Therefore, it had better not be possible to be preempted
> from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
>
> So if rcu_is_cpu_idle() returns false, you had better be guaranteed
> that whatever CPU you are running on (which might well be a different
> one than the one rcu_is_cpu_idle() ran on) is being watched by RCU.
>
> So, Frederic, does this still work with NO_HZ_FULL? If not, I believe
> we have a bigger problem than the preempt_disable() in rcu_is_cpu_idle()!

Sure it works well, because the scheduler task entrypoints exit those RCU
extended quiescent states.

Imagine that you're running in an RCU read-side critical section on CPU 0, which
is not in an extended quiescent state. Now you get preempted in the middle of your
RCU read-side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).

Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in an extended
quiescent state because it runs in userspace, it receives a scheduler IPI,
then schedule_user() is called at the end of the interrupt and in turn calls rcu_user_exit()
before the task is resumed to the code it was running on CPU 0, in the middle of
the RCU read-side critical section.

See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
There are other possible scheduler entrypoints when a CPU runs in a user extended quiescent
state: exception and syscall entries, or even preempt_schedule_irq() in case we receive an irq
in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
be covered, otherwise you bet RCU would be quick to warn.

That's why, when we call rcu_is_cpu_idle() from an RCU read-side critical section, it's legit even
if we can be preempted anytime around it.
And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
relies on non-preemptibility for the correctness of its address calculation.
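
As a timeline (a sketch of the scenario just described):

/*
 * CPU 0 (kernel, RCU watching)        CPU 1 (userspace, in EQS)
 * ----------------------------        -------------------------
 * rcu_read_lock();
 * ...preempted, migrated...  ----->   scheduler IPI arrives
 *                                     schedule_user()
 *                                       -> rcu_user_exit()
 *                                          (CPU 1 now watched)
 *                                     task resumes its critical section
 *                                     rcu_read_unlock();
 */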

Thanks.

2013-09-06 20:38:31

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 08:59:29PM +0200, Frederic Weisbecker wrote:
> On Fri, Sep 06, 2013 at 10:41:17AM -0700, Paul E. McKenney wrote:
> > On Fri, Sep 06, 2013 at 10:21:28AM -0700, Eric Dumazet wrote:
> > > On Fri, 2013-09-06 at 08:18 -0700, Paul E. McKenney wrote:
> > >
> > > > int rcu_is_cpu_idle(void)
> > > > {
> > > >         int ret;
> > > >
> > > >         preempt_disable();
> > > >         ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> > > >         preempt_enable();
> > > >         return ret;
> > > > }
> > >
> > > Paul, I find this very confusing.
> > >
> > > If the caller doesn't have preemption disabled, what could be the meaning of
> > > this rcu_is_cpu_idle() call?
> > >
> > > Its result is meaningless if the thread is suddenly preempted, so what is
> > > the point?
> > >
> > > Sorry if this is obvious to you.
> >
> > It is a completely fair question. In fact, this might well now be
> > pointing to a bug given NO_HZ_FULL.
> >
> > The assumption is that if you don't have preemption disabled, you had
> > better be running on a CPU that RCU is paying attention to. The rationale
> > involves preemptible RCU.
> >
> > Suppose that you just did rcu_read_lock() on a CPU that RCU is paying
> > attention to. All is well, and rcu_is_cpu_idle() will return false, as
> > expected. Suppose now that it is possible to be preempted and suddenly
> > find yourself running on a CPU that RCU is not paying attention to.
> > This would have the effect of making your RCU read-side critical section
> > be ignored. Therefore, it had better not be possible to be preempted
> > from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
> >
> > So if rcu_is_cpu_idle() returns false, you had better be guaranteed
> > that whatever CPU you are running on (which might well be a different
> > one than the one rcu_is_cpu_idle() ran on) is being watched by RCU.
> >
> > So, Frederic, does this still work with NO_HZ_FULL? If not, I believe
> > we have a bigger problem than the preempt_disable() in rcu_is_cpu_idle()!
>
> Sure it works well, because the scheduler task entrypoints exit those RCU
> extended quiescent states.
>
> Imagine that you're running in an RCU read-side critical section on CPU 0, which
> is not in an extended quiescent state. Now you get preempted in the middle of your
> RCU read-side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).
>
> Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in an extended
> quiescent state because it runs in userspace, it receives a scheduler IPI,
> then schedule_user() is called at the end of the interrupt and in turn calls rcu_user_exit()
> before the task is resumed to the code it was running on CPU 0, in the middle of
> the RCU read-side critical section.
>
> See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
> There are other possible scheduler entrypoints when a CPU runs in a user extended quiescent
> state: exception and syscall entries, or even preempt_schedule_irq() in case we receive an irq
> in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
> be covered, otherwise you bet RCU would be quick to warn.
>
> That's why, when we call rcu_is_cpu_idle() from an RCU read-side critical section, it's legit even
> if we can be preempted anytime around it.
> And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
> relies on non-preemptibility for the correctness of its address calculation.

Whew!!! ;-)

But the problem for rcu_is_cpu_idle() was not the calls from the scheduler,
but rather those from lockdep. If the overhead is a concern, you could
switch to the primitives I will be supplying for Steven.
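
For reference, the lockdep-side caller looks roughly like this (paraphrased
from kernel/rcupdate.c of that era; details may differ):

int rcu_read_lock_held(void)
{
        if (!debug_lockdep_rcu_enabled())
                return 1;
        if (rcu_is_cpu_idle())
                return 0;       /* RCU is ignoring this CPU: not "held". */
        return lock_is_held(&rcu_lock_map);
}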

Thanx, Paul

2013-09-07 00:49:33

by Paul E. McKenney

Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 02:21:35PM -0400, Steven Rostedt wrote:
> On Fri, 6 Sep 2013 10:52:38 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > > What exactly does "extended quiescent state" mean? (Note, that's a
> > > rhetorical question)
> >
> > In which case my rhetorical (and therefore useless) answer has to be
> > "it is a quiescent state that is extended". ;-)
> >
> > Sorry, couldn't resist...
>
> Of course you couldn't ;)
>
> >
> > > I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> > > and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> > > straightforward to understand than trying to wrap your head around what
> > > a quiescent state is, and why we are entering it or exiting it.
> > >
> > > It also flat out explains to people that rcu is not processing that
> > > current CPU, and things like rcu_read_lock() should not be used.
> > >
> > > Then we can say "rcu_cpu_is_ignored()" for things like
> > > "rcu_is_cpu_eqs()".
> >
> > Currently, none of RCU's _eqs functions are exported, so they have
> > the potential to confuse only people working on the RCU implementation
> > itself, who had better understand what "eqs" means.
>
> Yeah, that's what I thought, and never cared about the "eqs" meaning.
>
> >
> > But I do count your vote against "eqs" appearing in the name of any
> > function exported by RCU.
>
> Right, there shouldn't be any "eqs" functions that are global to users
> outside of the RCU infrastructure.
>
> >
> > How about if I made rcu_is_cpu_idle() be as follows?
> >
> > int rcu_is_cpu_idle(void)
> > {
> >         int ret;
> >
> >         ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
> >                                     raw_smp_processor_id())) & 0x1) == 0;
> >         return ret;
> > }
> >
> > This should allow existing uses to function properly and should allow
> > you to use it as well.
> >
>
> You already said it won't work, but I still would have been against
> using it, because I wouldn't be checking if rcu thinks the CPU is idle,
> as NO_HZ_FULL has nothing to do with idle.

OK then, how about the following?

Thanx, Paul

------------------------------------------------------------------------

rcu: Is it safe to enter an RCU read-side critical section?

There is currently no way for kernel code to determine whether it
is safe to enter an RCU read-side critical section, in other words,
whether or not RCU is paying attention to the currently running CPU.
Given the large and increasing quantity of code shared by the idle loop
and non-idle code, the this shortcoming is becoming increasingly painful.

This commit therefore adds rcu_watching_this_cpu(), which returns true
if it is safe to enter an RCU read-side critical section on the currently
running CPU. This function is quite fast, using only a __this_cpu_read().
However, the caller must disable preemption.

Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 5b444e0..a41eb35 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -261,6 +261,10 @@ static inline void rcu_user_hooks_switch(struct task_struct *prev,
rcu_irq_exit(); \
} while (0)

+#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP)
+extern int rcu_is_cpu_idle(void);
+#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
+
/*
* Infrastructure to implement the synchronize_() primitives in
* TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
@@ -297,10 +301,6 @@ static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
}
#endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */

-#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SMP)
-extern int rcu_is_cpu_idle(void);
-#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SMP) */
-
#if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU)
bool rcu_lockdep_current_cpu_online(void);
#else /* #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index e31005e..67fe672 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -132,4 +132,13 @@ static inline void rcu_scheduler_starting(void)
}
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

+#ifdef CONFIG_RCU_TRACE
+
+static inline bool rcu_watching_this_cpu(void)
+{
+ return !rcu_is_cpu_idle();
+}
+
+#endif /* #ifdef CONFIG_RCU_TRACE */
+
#endif /* __LINUX_RCUTINY_H */
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 226169d..c605b41 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -90,4 +90,6 @@ extern void exit_rcu(void);
extern void rcu_scheduler_starting(void);
extern int rcu_scheduler_active __read_mostly;

+extern bool rcu_watching_this_cpu(void);
+
#endif /* __LINUX_RCUTREE_H */
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 7e3b0d6..b14701f 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -176,7 +176,7 @@ void rcu_irq_enter(void)
}
EXPORT_SYMBOL_GPL(rcu_irq_enter);

-#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE)

/*
* Test whether RCU thinks that the current CPU is idle.
@@ -187,7 +187,7 @@ int rcu_is_cpu_idle(void)
}
EXPORT_SYMBOL(rcu_is_cpu_idle);

-#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+#endif /* defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) */

/*
* Test whether the current CPU was interrupted from idle. Nested
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a06d172..38c6883 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -666,6 +666,19 @@ int rcu_is_cpu_idle(void)
}
EXPORT_SYMBOL(rcu_is_cpu_idle);

+/**
+ * rcu_watching_this_cpu - are RCU read-side critical sections safe?
+ *
+ * Return true if RCU is watching the running CPU, which means that
+ * this CPU can safely enter RCU read-side critical sections. Unlike
+ * rcu_is_cpu_idle(), the caller of rcu_watching_this_cpu() must have at
+ * least disabled preemption.
+ */
+bool rcu_watching_this_cpu(void)
+{
+ return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
+}
+
#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)

/*

2013-09-07 01:19:37

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

* Paul E. McKenney ([email protected]) wrote:
> On Fri, Sep 06, 2013 at 02:21:35PM -0400, Steven Rostedt wrote:
> > On Fri, 6 Sep 2013 10:52:38 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > > What exactly does "extended quiescent state" mean? (Note, that's a
> > > > rhetorical question)
> > >
> > > In which case my rhetorical (and therefore useless) answer has to be
> > > "it is a quiescent state that is extended". ;-)
> > >
> > > Sorry, couldn't resist...
> >
> > Of course you couldn't ;)
> >
> > >
> > > > I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> > > > and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> > > > straightforward to understand than trying to wrap your head around what
> > > > a quiescent state is, and why we are entering it or exiting it.
> > > >
> > > > It also flat out explains to people that rcu is not processing that
> > > > current CPU, and things like rcu_read_lock() should not be used.
> > > >
> > > > Then we can say "rcu_cpu_is_ignored()" for things like
> > > > "rcu_is_cpu_eqs()".
> > >
> > > Currently, none of RCU's _eqs functions are exported, so they have
> > > the potential to confuse only people working on the RCU implementation
> > > itself, who had better understand what "eqs" means.
> >
> > Yeah, that's what I thought, and never cared about the "eqs" meaning.
> >
> > >
> > > But I do count your vote against "eqs" appearing in the name of any
> > > function exported by RCU.
> >
> > Right, there shouldn't be any "eqs" functions that are global to users
> > outside of the RCU infrastructure.
> >
> > >
> > > How about if I made rcu_is_cpu_idle() be as follows?
> > >
> > > int rcu_is_cpu_idle(void)
> > > {
> > > int ret;
> > >
> > > ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
> > > raw_smp_processor_id())) & 0x1) == 0;
> > > return ret;
> > > }
> > >
> > > This should allow existing uses to function properly and should allow
> > > you to use it as well.
> > >
> >
> > You already said it won't work, but I still would have been against
> > using it, because I wouldn't be checking if rcu thinks the CPU is idle,
> > as NO_HZ_FULL has nothing to do with idle.
>
> OK then, how about the following?
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Is it safe to enter an RCU read-side critical section?
>
> There is currently no way for kernel code to determine whether it
> is safe to enter an RCU read-side critical section, in other words,
> whether or not RCU is paying attention to the currently running CPU.
> Given the large and increasing quantity of code shared by the idle loop
> and non-idle code, this shortcoming is becoming increasingly painful.
>
> This commit therefore adds rcu_watching_this_cpu(), which returns true
> if it is safe to enter an RCU read-side critical section on the currently
> running CPU. This function is quite fast, using only a __this_cpu_read().
> However, the caller must disable preemption.

Hi Paul,

Hopefully I won't be redundant with other prior comments, but how about
the following:

static inline __rcu_read_check(void):
> - checks if it is safe to enter an RCU read-side critical section in
the current context.
- requires that the caller disable preemption.

static inline rcu_read_check(void):
- disables preemption and inlines __rcu_read_check().
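
Concretely, a minimal sketch of the pair I have in mind (the bodies
are illustrative only; the underlying test is assumed to be whatever
rcu_is_cpu_idle() already does):

static inline bool __rcu_read_check(void)
{
	/*
	 * Caller must have preemption disabled, so the per-CPU state
	 * read by rcu_is_cpu_idle() really is the running CPU's.
	 */
	return !rcu_is_cpu_idle();
}

static inline bool rcu_read_check(void)
{
	bool ret;

	/* Generic version: no "special knowledge" required. */
	preempt_disable();
	ret = __rcu_read_check();
	preempt_enable();

	return ret;
}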

I don't think it is semantically a good thing to bury the
implementation-specific detail (whether RCU is watching this
particular CPU) into the API naming. Also, I think the generic version
of this check should require no "special knowledge" from the user, hence
my double-underscores proposal for the optimized version.

Thoughts ?

Thanks,

Mathieu

>
> [ patch quoted in full snipped ]

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2013-09-08 01:55:21

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 09:19:30PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney ([email protected]) wrote:
> > On Fri, Sep 06, 2013 at 02:21:35PM -0400, Steven Rostedt wrote:
> > > On Fri, 6 Sep 2013 10:52:38 -0700
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > > What exactly does "extended quiescent state" mean? (Note, that's a
> > > > > rhetorical question)
> > > >
> > > > In which case my rhetorical (and therefore useless) answer has to be
> > > > "it is a quiescent state that is extended". ;-)
> > > >
> > > > Sorry, couldn't resist...
> > >
> > > Of course you couldn't ;)
> > >
> > > >
> > > > > I wonder if we should change "rcu_cpu_ignore()" for "rcu_eqs_enter()"
> > > > > and "rcu_cpu_heed()" for "rcu_eqs_exit()", as IMHO that's much more
> > > > > straightforward to understand than trying to wrap your head around what
> > > > > a quiescent state is, and why we are entering it or exiting it.
> > > > >
> > > > > It also flat out explains to people that rcu is not processing that
> > > > > current CPU, and things like rcu_read_lock() should not be used.
> > > > >
> > > > > Then we can say "rcu_cpu_is_ignored()" for things like
> > > > > "rcu_is_cpu_eqs()".
> > > >
> > > > Currently, none of RCU's _eqs functions are exported, so they have
> > > > the potential to confuse only people working on the RCU implementation
> > > > itself, who had better understand what "eqs" means.
> > >
> > > Yeah, that's what I thought, and never cared about the "eqs" meaning.
> > >
> > > >
> > > > But I do count your vote against "eqs" appearing in the name of any
> > > > function exported by RCU.
> > >
> > > Right, there shouldn't be any "eqs" functions that are global to users
> > > outside of the RCU infrastructure.
> > >
> > > >
> > > > How about if I made rcu_is_cpu_idle() be as follows?
> > > >
> > > > int rcu_is_cpu_idle(void)
> > > > {
> > > > int ret;
> > > >
> > > > ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
> > > > raw_smp_processor_id())) & 0x1) == 0;
> > > > return ret;
> > > > }
> > > >
> > > > This should allow existing uses to function properly and should allow
> > > > you to use it as well.
> > > >
> > >
> > > You already said it won't work, but I still would have been against
> > > using it, because I wouldn't be checking if rcu thinks the CPU is idle,
> > > as NO_HZ_FULL has nothing to do with idle.
> >
> > OK then, how about the following?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Is it safe to enter an RCU read-side critical section?
> >
> > There is currently no way for kernel code to determine whether it
> > is safe to enter an RCU read-side critical section, in other words,
> > whether or not RCU is paying attention to the currently running CPU.
> > Given the large and increasing quantity of code shared by the idle loop
> > and non-idle code, this shortcoming is becoming increasingly painful.
> >
> > This commit therefore adds rcu_watching_this_cpu(), which returns true
> > if it is safe to enter an RCU read-side critical section on the currently
> > running CPU. This function is quite fast, using only a __this_cpu_read().
> > However, the caller must disable preemption.
>
> Hi Paul,
>
> Hopefully I won't be redundant with other prior comments, but how about
> the following:

The naming discussion has been a bit fraught, so I was focusing initially
on the functionality. Which still isn't quite right.

> static inline __rcu_read_check(void):
> - checks if it is safe to enter an RCU read-side critical section in
> the current context.
> - requires that the caller disable preemption.
>
> static inline rcu_read_check(void):
> - disables preemption and inlines __rcu_read_check().

My fear, rightly or wrongly, is that this would get conflated with
rcu_dereference_check().

> I don't think it is semantically a good thing to bury the
> implementation-specific detail (whether RCU is watching this
> particular CPU) into the API naming. Also, I think the generic version
> of this check should require no "special knowledge" from the user, hence
> my double-underscores proposal for the optimized version.

I do agree that the "__" vs. no-"__" distinction does make a lot of sense.
Especially now that I have them checking the same condition. Which they
need to be, or Steve might get false positives when tracing some of
RCU's functions on the path to/from idle.

That said, I would say that "RCU watching this CPU" is not so much
implementation-specific as it is the semantic meaning. I could imagine
other variations on this theme, such as "CPU under RCU control",
"CPU safe for RCU readers", and so on, but "RCU watching this CPU"
seems a bit more direct.

But please tell me more about what you have in mind here.

Thanx, Paul

> Thoughts ?
>
> Thanks,
>
> Mathieu
>
> >
> > [ patch quoted in full snipped ]

2013-09-09 10:54:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 08:59:29PM +0200, Frederic Weisbecker wrote:
> Imagine that you're running in an rcu read side critical section on CPU 0, which
> is not in extended quiescent state. Now you get preempted in the middle of your
> RCU read side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).
>
> Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in extended
> quiescent state because it runs in userspace, it receives a scheduler IPI,
> then schedule_user() is called by the end of the interrupt and in turn calls rcu_user_exit()
> before the task is resumed to the code it was running on CPU 0, in the middle of
> the rcu read side critical section.
>
> See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
> There are other possible scheduler entrypoints when a CPU runs in user extended quiescent
> state: exception and syscall entries or even preempt_schedule_irq() in case we receive an irq
> in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
> be covered, otherwise you bet RCU would be quick to warn.
>
> That's why, when we call rcu_is_cpu_idle() from an RCU read side critical section, it's legit even
> if we can be preempted anytime around it.
> And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
> relies on non-preemptibility for its own correctness on the address calculation.

I've tried reading that thrice now, still not making much sense.

In any case rcu_is_cpu_idle() is complete bollocks: either use
__raw_get_cpu_var() and add a _coherent_ explanation for why it's right,
or it's broken.

In any case the preempt_disable/enable pair there is just plain wrong as
Eric pointed out.

2013-09-09 10:56:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Fri, Sep 06, 2013 at 10:52:38AM -0700, Paul E. McKenney wrote:
> How about if I made rcu_is_cpu_idle() be as follows?
>
> int rcu_is_cpu_idle(void)
> {
> int ret;
>
> ret = (atomic_read(&per_cpu(rcu_dynticks.dynticks,
> raw_smp_processor_id())) & 0x1) == 0;
> return ret;
> }


No, please use __raw_get_cpu_var(), that generates far better code. Also
any such trickery needs a coherent comment.

2013-09-09 12:13:37

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:53:47PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 06, 2013 at 08:59:29PM +0200, Frederic Weisbecker wrote:
> > Imagine that you're running in an rcu read side critical section on CPU 0, which
> > is not in extended quiescent state. Now you get preempted in the middle of your
> > RCU read side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).
> >
> > Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in extended
> > quiescent state because it runs in userspace, it receives a scheduler IPI,
> > then schedule_user() is called by the end of the interrupt and in turn calls rcu_user_exit()
> > before the task is resumed to the code it was running on CPU 0, in the middle of
> > the rcu read side critical section.
> >
> > See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
> > There are other possible scheduler entrypoints when a CPU runs in user extended quiescent
> > state: exception and syscall entries or even preempt_schedule_irq() in case we receive an irq
> > in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
> > be covered, otherwise you bet RCU would be quick to warn.
> >
> > That's why, when we call rcu_is_cpu_idle() from an RCU read side critical section, it's legit even
> > if we can be preempted anytime around it.
> > And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
> > relies on non-preemptibility for its own correctness on the address calculation.
>
> I've tried reading that thrice now, still not making much sense.
>
> In any case rcu_is_cpu_idle() is complete bollocks: either use
> __raw_get_cpu_var() and add a _coherent_ explanation for why it's right,
> or it's broken.
>
> In any case the preempt_disable/enable pair there is just plain wrong as
> Eric pointed out.

Check this:

34240697d619c439c55f21989680024dcb604aab "rcu: Disable preemption in rcu_is_cpu_idle()"

2013-09-09 12:39:31

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 14:13:31 +0200
Frederic Weisbecker <[email protected]> wrote:


> > In any case the preempt_disable/enable pair there is just plain wrong as
> > Eric pointed out.
>
> Check this:
>
> 34240697d619c439c55f21989680024dcb604aab "rcu: Disable preemption in rcu_is_cpu_idle()"


Ug, and that patch does nothing to fix the bug that it reported!

1. Task A on CPU 1 enters rcu_is_cpu_idle() and picks up the
pointer to CPU 1's per-CPU variables.

2. Task B preempts Task A and starts running on CPU 1.

Let's say that B preempts Task A here:

preempt_disable();
ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
preempt_enable();
<preempt>
return ret;


3. Task A migrates to CPU 2.

4. Task B blocks, leaving CPU 1 idle.

5. Task A continues execution on CPU 2, accessing CPU 1's
dyntick-idle information using the pointer fetched in step 1 above,
and finds that CPU 1 is idle.

Yeah, and Task A is using the "ret" from CPU 1!

6. Task A therefore incorrectly concludes that it is executing in
an extended quiescent state, possibly issuing a spurious splat.

Therefore, this commit disables preemption within the
rcu_is_cpu_idle() function.

Where this commit is totally bogus. Sorry, but it is.

This just proves that the caller of rcu_is_cpu_idle() must disable
preemption itself for the entire time that it needs to use the result
of rcu_is_cpu_idle().
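
In other words, the caller would need to look something like this
(a sketch; the two helpers are hypothetical stand-ins for whatever
the caller actually does with the result):

	preempt_disable();
	if (rcu_is_cpu_idle())
		trace_without_rcu();	/* hypothetical: skip rcu_read_lock() */
	else
		trace_with_rcu();	/* hypothetical: rcu_read_lock() is safe */
	preempt_enable();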

-- Steve

2013-09-09 12:45:53

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 08:39:26AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 14:13:31 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
>
> > > In any case the preempt_disable/enable pair there is just plain wrong as
> > > Eric pointed out.
> >
> > Check this:
> >
> > 34240697d619c439c55f21989680024dcb604aab "rcu: Disable preemption in rcu_is_cpu_idle()"
>
>
> Ug, and that patch does nothing to fix the bug that it reported!
>
> 1. Task A on CPU 1 enters rcu_is_cpu_idle() and picks up the
> pointer to CPU 1's per-CPU variables.
>
> 2. Task B preempts Task A and starts running on CPU 1.
>
> Let's say that B preempts Task A here:
>
> preempt_disable();
> ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> preempt_enable();
> <preempt>
> return ret;
>
>
> 3. Task A migrates to CPU 2.
>
> 4. Task B blocks, leaving CPU 1 idle.
>
> 5. Task A continues execution on CPU 2, accessing CPU 1's
> dyntick-idle information using the pointer fetched in step 1 above,
> and finds that CPU 1 is idle.
>
> Yeah, and Task A is using the "ret" from CPU 1!
>
> 6. Task A therefore incorrectly concludes that it is executing in
> an extended quiescent state, possibly issuing a spurious splat.
>
> Therefore, this commit disables preemption within the
> rcu_is_cpu_idle() function.
>
> Where this commit is totally bogus. Sorry, but it is.
>
> This just proves that the caller of rcu_is_cpu_idle() must disable
> preemption itself for the entire time that it needs to use the result
> of rcu_is_cpu_idle().

Sorry, I don't understand your point here. What's wrong with checking the
ret from another CPU?

2013-09-09 12:55:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 14:45:49 +0200
Frederic Weisbecker <[email protected]> wrote:


> > This just proves that the caller of rcu_is_cpu_idle() must disable
> > preemption itself for the entire time that it needs to use the result
> > of rcu_is_cpu_idle().
>
> Sorry, I don't understand your point here. What's wrong with checking the
> ret from another CPU?

Hmm, OK, this is why that code is in desperate need of a comment.

From reading the context a bit more, it seems that the per cpu value is
more a "per task" value that happens to be using per cpu variables, and
changes on context switches. Is that correct?

Anyway, it requires a comment to explain that we are not checking the
CPU state, but really the current task state, otherwise that 'ret'
value wouldn't travel with the task, but would stick with the CPU.

-- Steve

2013-09-09 13:08:58

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 14:45:49 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
>
> > > This just proves that the caller of rcu_is_cpu_idle() must disable
> > > preemption itself for the entire time that it needs to use the result
> > > of rcu_is_cpu_idle().
> >
> > Sorry, I don't understand your point here. What's wrong with checking the
> > ret from another CPU?
>
> Hmm, OK, this is why that code is in desperate need of a comment.
>
> From reading the context a bit more, it seems that the per cpu value is
> more a "per task" value that happens to be using per cpu variables, and
> changes on context switches. Is that correct?

Yeah that's probably what confuses so many people. It's indeed at the same
time a task state and a per cpu state.

Pretty much like tsk->ti->preempt_count that people now try to implement
through a per cpu value.

>
> Anyway, it requires a comment to explain that we are not checking the
> CPU state, but really the current task state, otherwise that 'ret'
> value wouldn't travel with the task, but would stick with the CPU.

Yeah that desperately needs some comment. I'll try to output something
once I'm done with the perf comm stuff.

> -- Steve
>

2013-09-09 13:17:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 14:45:49 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
>
> > > This just proves that the caller of rcu_is_cpu_idle() must disable
> > > preemption itself for the entire time that it needs to use the result
> > > of rcu_is_cpu_idle().
> >
> > Sorry, I don't understand your point here. What's wrong with checking the
> > ret from another CPU?
>
> Hmm, OK, this is why that code is in desperate need of a comment.
>
> From reading the context a bit more, it seems that the per cpu value is
> more a "per task" value that happens to be using per cpu variables, and
> changes on context switches. Is that correct?
>
> Anyway, it requires a comment to explain that we are not checking the
> CPU state, but really the current task state, otherwise that 'ret'
> value wouldn't travel with the task, but would stick with the CPU.

Egads.. and the only reason we couldn't do the immediate load is because
of that atomic mess.

Also, if it's per-task, why don't we have this in the task struct? The
current scheme makes the context switch more expensive -- is this the
right trade-off?

So maybe something like:

int rcu_is_cpu_idle(void)
{
/*
* Comment explaining that rcu_dynticks.dynticks really is a
* per-task something and we need preemption-safe loading.
*/
atomic_t dynticks = this_cpu_read(rcu_dynticks.dynticks);
return !(__atomic_read(&dynticks) & 0x01);
}

Where __atomic_read() would be like atomic_read() but without the
volatile crap since that's entirely redundant here I think.
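
For instance, something like the following sketch (no such helper
exists at this point, so the name and body are only illustrative):

static inline int __atomic_read(const atomic_t *v)
{
	/*
	 * Plain load, no volatile semantics: we only ever read a
	 * private copy of the per-CPU value, so nothing can change
	 * it underneath us.
	 */
	return v->counter;
}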

The this_cpu_read() should ensure we get a preemption-safe copy of the
value.

Once that this_cpu stuff grows preemption checks we'd need something
like __raw_this_cpu_read() or whatever the variant without preemption
checks will be called.

2013-09-09 13:21:46

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 15:08:53 +0200
Frederic Weisbecker <[email protected]> wrote:

> On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:

> > From reading the context a bit more, it seems that the per cpu value is
> > more a "per task" value that happens to be using per cpu variables, and
> > changes on context switches. Is that correct?
>
> Yeah that's probably what confuses so many people. It's indeed at the same
> time a task state and a per cpu state.

Especially since the function name itself is "rcu_is_cpu_idle()" which
tells you it's a cpu state, and not a task state. Why would you care in
RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.

>
> Pretty much like tsk->ti->preempt_count that people now try to implement
> through a per cpu value.

Actually, preempt_count is more a CPU state than a task state. When
preempt_count is set, that CPU can not schedule out the current task.
It really has nothing to do with the task itself. It's the CPU that
can't do something. Preempt count should never traverse with a task
from one CPU to another.

-- Steve

2013-09-09 13:24:11

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:53:47PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 06, 2013 at 08:59:29PM +0200, Frederic Weisbecker wrote:
> > Imagine that you're running in an rcu read side critical section on CPU 0, which
> > is not in extended quiescent state. Now you get preempted in the middle of your
> > RCU read side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).
> >
> > Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in extended
> > quiescent state because it runs in userspace, it receives a scheduler IPI,
> > then schedule_user() is called by the end of the interrupt and in turn calls rcu_user_exit()
> > before the task is resumed to the code it was running on CPU 0, in the middle of
> > the rcu read side critical section.
> >
> > See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
> > There are other possible scheduler entrypoints when a CPU runs in user extended quiescent
> > state: exception and syscall entries or even preempt_schedule_irq() in case we receive an irq
> > in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
> > be covered, otherwise you bet RCU would be quick to warn.
> >
> > That's why, when we call rcu_is_cpu_idle() from an RCU read side critical section, it's legit even
> > if we can be preempted anytime around it.
> > And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
> > relies on non-preemptibility for its own correctness on the address calculation.
>
> I've tried reading that thrice now, still not making much sense.

Sorry, Frederic really is describing what is going on here.

And it really does work.

> In any case rcu_is_cpu_idle() is complete bollocks: either use
> __raw_get_cpu_var() and add a _coherent_ explanation for why it's right,
> or it's broken.

Hmmmm... Adding Christoph Lameter on CC, since he was the one pushing
for the current formulation of that line of rcu_is_cpu_idle().

And guys, I have to say that the advice on which per-CPU primitive to use
varies wildly and randomly. For all I know, each of you individually
might well be sticking to the same story, but taken together, your
collective advice is strongly resembling white noise.

It is not that the primitives themselves are changing that quickly:
__raw_get_cpu_var() has been around for three years.

> In any case the preempt_disable/enable pair there is just plain wrong as
> Eric pointed out.

Peter, in the general case, you are quite correct. But this is a special
case where it really does work.

The key point here is that preemption and migration cannot move a task
from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
So yes, by the time the task sees the return value from rcu_is_cpu_idle(),
that task might be running on some other CPU. But that is OK, because
if RCU was paying attention to the old CPU, then RCU must also be paying
attention to the new CPU.

Frederic's description gives the details of how this is enforced.

Here is an example of how this works:

1. Some task running on CPU 0 (which RCU is paying attention to)
calls rcu_is_cpu_idle(), which disables preemption, checks the
per-CPU variable, sets ret to zero, then enables preemption.

At this point, the task is preempted by some high-priority task.

2. CPU 1 is currently idle, so RCU is -not- paying attention to it.
However, it is decided that our low-priority task should migrate
to CPU 1.

3. CPU 1 is sent an IPI, which forces this CPU out of idle. This
causes rcu_idle_exit() to be called, which causes RCU to start
paying attention to CPU 1.

4. CPU 1 switches to the low-priority task, which now sees the
return value of rcu_is_cpu_idle(). Now, this return value did
in fact reflect the old state of CPU 0, and the state of CPU 0
might have changed. (For example, the high-priority task might
have blocked, so that CPU 0 is now idle, which in turn would
mean that RCU is no longer paying attention to it, so that
if rcu_is_cpu_idle() was called right now, it would return
true rather than the false return computed in step 1 above.)

5. But that is OK. Because of the way RCU and idle interact,
if a call from a given task to rcu_is_cpu_idle() returned false
some time in the past, a call from that same task will also
return false right now.

So yes, in general it is wrong to disable preemption, grab the value
of a per-CPU variable, re-enable preemption, and then return the result.
But there are a number of special cases where it is OK, and this is
one of them.
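
For reference, the pattern in question is the one quoted earlier in
this thread from rcu_is_cpu_idle(), which by the reasoning above stays
meaningful even if the task migrates right afterward:

	preempt_disable();
	ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
	preempt_enable();
	return ret;	/* still valid for this task after migration */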

Thanx, Paul

2013-09-09 13:29:08

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:14:52PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 14:45:49 +0200
> > Frederic Weisbecker <[email protected]> wrote:
> >
> >
> > > > This just proves that the caller of rcu_is_cpu_idle() must disable
> > > > preemption itself for the entire time that it needs to use the result
> > > > of rcu_is_cpu_idle().
> > >
> > > Sorry, I don't understand your point here. What's wrong with checking the
> > > ret from another CPU?
> >
> > Hmm, OK, this is why that code is in desperate need of a comment.
> >
> > From reading the context a bit more, it seems that the per cpu value is
> > more a "per task" value that happens to be using per cpu variables, and
> > changes on context switches. Is that correct?
> >
> > Anyway, it requires a comment to explain that we are not checking the
> > CPU state, but really the current task state, otherwise that 'ret'
> > value wouldn't travel with the task, but would stick with the CPU.
>
> Egads.. and the only reason we couldn't do the immediate load is because
> of that atomic mess.
>
> Also, if it's per-task, why don't we have this in the task struct? The
> current scheme makes the context switch more expensive -- is this the
> right trade-off?

No, putting that on the task_struct won't help much in this regard I think.
Regular schedule() calls don't change that per cpu state.

Only preempt_schedule_irq() and schedule_user() are concerned with rcu eqs state
exit/restore. But still storing that on task struct won't help.

>
> So maybe something like:
>
> int rcu_is_cpu_idle(void)
> {
> /*
> * Comment explaining that rcu_dynticks.dynticks really is a
> * per-task something and we need preemption-safe loading.
> */
> atomic_t dynticks = this_cpu_read(rcu_dynticks.dynticks);
> return !(__atomic_read(&dynticks) & 0x01);
> }
>
> Where __atomic_read() would be like atomic_read() but without the
> volatile crap since that's entirely redundant here I think.
>
> The this_cpu_read() should ensure we get a preemption-safe copy of the
> value.
>
> Once that this_cpu stuff grows preemption checks we'd need something
> like __raw_this_cpu_read() or whatever the variant without preemption
> checks will be called.

Yeah I thought about using this_cpu_read() too, let's wait for the preemption
checks to get in.

2013-09-09 13:29:20

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:21:42AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 15:08:53 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
>
> > > From reading the context a bit more, it seems that the per cpu value is
> > > more a "per task" value that happens to be using per cpu variables, and
> > > changes on context switches. Is that correct?
> >
> > Yeah that's probably what confuses so many people. It's indeed at the same
> > time a task state and a per cpu state.
>
> Especially since the function name itself is "rcu_is_cpu_idle()" which
> tells you it's a cpu state, and not a task state. Why would you care in
> RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.

I could call it rcu_watching_this_cpu(), and rename the current
rcu_watching_this_cpu() to __rcu_watching_this_cpu().

It should be possible to make a straightforward comment that helps.
I will let Frederic take first crack at it.
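
Roughly like this, as a sketch of the proposed split rather than the
final patch (the double-underscore version keeps the current fast,
caller-disables-preemption semantics):

/* Caller must have at least disabled preemption. */
bool __rcu_watching_this_cpu(void)
{
	return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
}

bool rcu_watching_this_cpu(void)
{
	bool ret;

	preempt_disable();
	ret = __rcu_watching_this_cpu();
	preempt_enable();

	return ret;
}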

> > Pretty much like tsk->ti->preempt_count that people now try to implement
> > through a per cpu value.
>
> Actually, preempt_count is more a CPU state than a task state. When
> preempt_count is set, that CPU can not schedule out the current task.
> It really has nothing to do with the task itself. It's the CPU that
> can't do something. Preempt count should never traverse with a task
> from one CPU to another.

And this is similar to what is happening with rcu_is_cpu_idle(). The
rcu_dynticks.dynticks per-CPU variable cannot transition between zero
and non-zero while a given non-idle task is running. So what about
the idle tasks? Well, they run with preemption disabled, so they can
safely access per-CPU variables.

Thanx, Paul

2013-09-09 13:29:39

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:21:42 -0400
Steven Rostedt <[email protected]> wrote:


> Especially since the function name itself is "rcu_is_cpu_idle()" which
> tells you it's a cpu state, and not a task state. Why would you care in
> RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.

> Actually, preempt_count is more a CPU state than a task state. When
> preempt_count is set, that CPU can not schedule out the current task.
> It really has nothing to do with the task itself. It's the CPU that
> can't do something. Preempt count should never traverse with a task
> from one CPU to another.

I'll take this a step further. Here's a simple rule to determine if
something is a task state or a CPU state.

If the state migrates with a task from one CPU to another, it's a task
state.

If the state never leaves a CPU with a task, then it's a CPU state.

According to the above rules, rcu_is_cpu_idle() is a task state, and
really should be in task_struct, and preempt_count is a CPU state, and
should be a per_cpu variable.

I think the reason preempt_count never was a per cpu variable is that
having it in the stack (thread info) made it easier to test in assembly
than having to grab the per cpu info. But I believe it's easier to grab
per cpu info in assembly today than it once was, which is why there is
a push to move preempt_count to per_cpu where it truly belongs.

-- Steve

2013-09-09 13:37:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> And guys, I have to say that the advice on which per-CPU primitive to use
> varies wildly and randomly. For all I know, each of you individually
> might well be sticking to the same story, but taken together, your
> collective advice is strongly resembling white noise.

It's partly because cl and I disagree on things. He doesn't seem to care
much about validation and believes that people will not make mistakes
with this stuff.

And partly because I didn't get what was going on. While Frederic's
explanation might be correct, it was incomprehensible to me.

2013-09-09 13:37:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 06:23:43 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Sep 09, 2013 at 12:53:47PM +0200, Peter Zijlstra wrote:
> > On Fri, Sep 06, 2013 at 08:59:29PM +0200, Frederic Weisbecker wrote:
> > > Imagine that you're running in an rcu read side critical section on CPU 0, which
> > > is not in extended quiescent state. Now you get preempted in the middle of your
> > > RCU read side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).
> > >
> > > Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in extended
> > > quiescent state because it runs in userspace, it receives a scheduler IPI,
> > > then schedule_user() is called by the end of the interrupt and in turn calls rcu_user_exit()
> > > before the task is resumed to the code it was running on CPU 0, in the middle of
> > > the rcu read side critical section.
> > >
> > > See, the key here is the rcu_user_exit() that restores the CPU in RCU's state machine.
> > > There are other possible scheduler entrypoints when a CPU runs in user extended quiescent
> > > state: exception and syscall entries or even preempt_schedule_irq() in case we receive an irq
> > > in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
> > > be covered, otherwise you bet RCU would be quick to warn.
> > >
> > > That's why, when we call rcu_is_cpu_idle() from an RCU read side critical section, it's legit even
> > > if we can be preempted anytime around it.
> > > And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
> > > relies on non-preemptibility for its own correctness on the address calculation.
> >
> > I've tried reading that thrice now, still not making much sense.
>
> Sorry, Frederic really is describing what is going on here.
>
> And it really does work.
>
> > In any case rcu_is_cpu_idle() is complete bollocks: either use
> > __raw_get_cpu_var() and add a _coherent_ explanation for why it's right,
> > or it's broken.
>
> Hmmmm... Adding Christoph Lameter on CC, since he was the one pushing
> for the current formulation of that line of rcu_is_cpu_idle().
>
> And guys, I have to say that the advice on which per-CPU primitive to use
> varies wildly and randomly. For all I know, each of you individually
> might well be sticking to the same story, but taken together, your
> collective advice is strongly resembling white noise.
>
> It is not that the primitives themselves are changing that quickly:
> __raw_get_cpu_var() has been around for three years.
>
> > In any case the preempt_disable/enable pair there is just plain wrong as
> > Eric pointed out.
>
> Peter, in the general case, you are quite correct. But this is a special
> case where it really does work.
>
> The key point here is that preemption and migration cannot move a task
> from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
> So yes, by the time the task sees the return value from rcu_is_cpu_idle(),
> that task might be running on some other CPU. But that is OK, because
> if RCU was paying attention to the old CPU, then RCU must also be paying
> attention to the new CPU.
>
> Frederic's description gives the details of how this is enforced.
>
> Here is an example of how this works:
>
> 1. Some task running on CPU 0 (which RCU is paying attention to)
> calls rcu_is_cpu_idle(), which disables preemption, checks the
> per-CPU variable, sets ret to zero, then enables preemption.
>
> At this point, the task is preempted by some high-priority task.
>
> 2. CPU 1 is currently idle, so RCU is -not- paying attention to it.
> However, it is decided that our low-priority task should migrate
> to CPU 1.
>
> 3. CPU 1 is sent an IPI, which forces this CPU out of idle. This
> causes rcu_idle_exit() to be called, which causes RCU to start
> paying attention to CPU 1.
>
> 4. CPU 1 switches to the low-priority task, which now sees the
> return value of rcu_is_cpu_idle(). Now, this return value did
> in fact reflect the old state of CPU 0, and the state of CPU 0
> might have changed. (For example, the high-priority task might
> have blocked, so that CPU 0 is now idle, which in turn would
> mean that RCU is no longer paying attention to it, so that
> if rcu_is_cpu_idle() was called right now, it would return
> true rather than the false return computed in step 1 above.)
>
> 5. But that is OK. Because of the way RCU and idle interact,
> if a call from a given task to rcu_is_cpu_idle() returned false
> some time in the past, a call from that same task will also
> return false right now.
>
> So yes, in general it is wrong to disable preemption, grab the value
> of a per-CPU variable, re-enable preemption, and then return the result.
> But there are a number of special cases where it is OK, and this is
> one of them.

It should have been called rcu_is_task_idle(), not rcu_is_cpu_idle(),
because as you just pointed out, we don't care about the state of the
CPU we are running on, we care about the state of the task.

A comment that said something to the fact of:

/*
* We store the task idle state in a per cpu variable
* of the CPU the task is on.
*/

preempt_disable();
ret = yada_yada();
preempt_enable();

return ret;

-- Steve

2013-09-09 13:37:50

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:29:17AM -0400, Steven Rostedt wrote:
> But I believe it's easier to grab
> per cpu info in assembly today than it once was, which is why there is
> a push to move preempt_count to per_cpu where it truly belongs.

On some architectures at least. But yes, I should get back to that
stuff.

2013-09-09 13:41:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 15:29:02 +0200
Frederic Weisbecker <[email protected]> wrote:


> No, putting that on the task_struct won't help much in this regard I think.
> Regular schedule() calls don't change that per cpu state.

But is there a place that it would need to?

I mean, if RCU is not tracking a CPU, is it safe to call schedule()?
And then how would the new task know that RCU is ignoring that CPU?

-- Steve

2013-09-09 13:45:12

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:21:42AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 15:08:53 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
>
> > > From reading the context a bit more, it seems that the per cpu value is
> > > more a "per task" value that happens to be using per cpu variables, and
> > > changes on context switches. Is that correct?
> >
> > Yeah that's probably what confuses so many people. It's indeed at the same
> > time a task state and a per cpu state.
>
> Especially since the function name itself is "rcu_is_cpu_idle()" which
> tells you it's a cpu state, and not a task state. Why would you care in
> RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.

OTOH it's referring to RCU extended quiescent states that are really per
CPU from the RCU point of view, implementation- and design-wise.

Perhaps a name that has a meaning close to task_is_in_cpu_extqs().

I'm not sure it makes things clearer though.

>
> >
> > Pretty much like tsk->ti->preempt_count that people now try to implement
> > through a per cpu value.
>
> Actually, preempt_count is more a CPU state than a task state. When
> preempt_count is set, that CPU can not schedule out the current task.
> It really has nothing to do with the task itself. It's the CPU that
> can't do something. Preempt count should never traverse with a task
> from one CPU to another.

Yep.

2013-09-09 13:46:17

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:14:52PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 14:45:49 +0200
> > Frederic Weisbecker <[email protected]> wrote:
> >
> >
> > > > This just proves that the caller of rcu_is_cpu_idle() must disable
> > > > preemption itself for the entire time that it needs to use the result
> > > > of rcu_is_cpu_idle().
> > >
> > > Sorry, I don't understand your point here. What's wrong with checking the
> > > ret from another CPU?
> >
> > Hmm, OK, this is why that code is in desperate need of a comment.
> >
> > From reading the context a bit more, it seems that the per cpu value is
> > more a "per task" value that happens to be using per cpu variables, and
> > changes on context switches. Is that correct?
> >
> > Anyway, it requires a comment to explain that we are not checking the
> > CPU state, but really the current task state, otherwise that 'ret'
> > value wouldn't travel with the task, but would stick with the CPU.
>
> Egads.. and the only reason we couldn't do the immediate load is because
> of that atomic mess.
>
> Also, if it's per-task, why don't we have this in the task struct? The
> current scheme makes the context switch more expensive -- is this the
> right trade-off?

There are constraints based on the task, but RCU really is
paying attention to CPUs, not to tasks. (With the exception of
TREE_PREEMPT_RCU, which does keep lists of tasks that it has to pay
attention to, namely those that have been preempted within their current
RCU read-side critical section.)

I suppose that we could move it to the task structure, but that would
make for some "interesting" interactions between context switch and RCU.
Given previous experience with this sort of thing, I do not believe that
this would be a win.

Now, it might well be possible to make rcu_dynticks.dynticks be
manipulated by non-atomic operations, and that is on my list, but it
will require quite a bit of care and testing.

> So maybe something like:
>
> int rcu_is_cpu_idle(void)
> {
> /*
> * Comment explaining that rcu_dynticks.dynticks really is a
> * per-task something and we need preemption-safe loading.
> */
> atomic_t dynticks = this_cpu_read(rcu_dynticks.dynticks);
> return !(__atomic_read(&dynticks) & 0x01);
> }
>
> Where __atomic_read() would be like atomic_read() but without the
> volatile crap since that's entirely redundant here I think.
>
> The this_cpu_read() should ensure we get a preemption-safe copy of the
> value.
>
> Once that this_cpu stuff grows preemption checks we'd need something
> like __raw_this_cpu_read() or whatever the variant without preemption
> checks will be called.

Yay! Yet another formulation of the per-CPU code for this function! ;-)

Thanx, Paul

2013-09-09 13:49:25

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:41:32AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 15:29:02 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
>
> > No, putting that on the task_struct won't help much in this regard I think.
> > Regular schedule() calls don't change that per cpu state.
>
> But is there a place that it would need to?

I don't have any in mind.

>
> I mean, if RCU is not tracking a CPU, is it safe to call schedule()?

Nope, a CPU is not allowed to call schedule() if RCU is not tracking it.

> And then how would the new task know that RCU is ignoring that CPU?

On return from schedule(), it's (supposed to be) guaranteed that the CPU
is always tracked by RCU.

>
> -- Steve
>

2013-09-09 13:49:41

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:29:17AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:21:42 -0400
> Steven Rostedt <[email protected]> wrote:
>
>
> > Especially since the function name itself is "rcu_is_cpu_idle()" which
> > tells you it's a cpu state, and not a task state. Why would you care in
> > RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.
>
> > Actually, preempt_count is more a CPU state than a task state. When
> > preempt_count is set, that CPU can not schedule out the current task.
> > It really has nothing to do with the task itself. It's the CPU that
> > can't do something. Preempt count should never traverse with a task
> > from one CPU to another.
>
> I'll take this a step further. Here's a simple rule to determine if
> something is a task state or a CPU state.
>
> If the state migrates with a task from one CPU to another, it's a task
> state.
>
> If the state never leaves a CPU with a task, then it's a CPU state.
>
> According to the above rules, rcu_is_cpu_idle() is a task state, and
> really should be in task_struct, and preempt_count is a CPU state, and
> should be a per_cpu variable.

Ahem. The rcu_dynticks.dynticks field really is per-CPU state: it is
tracking whether or not RCU is paying attention to the corresponding
-CPU-, not to any particular task. When RCU wants to track tasks, it
does so with the blkd_tasks field of the rcu_node structure.

Thanx, Paul

> I think the reason preempt_count never was a per cpu variable is that
> having it in the stack (thread info) made it easier to test in assembly
> than having to grab the per cpu info. But I believe it's easier to grab
> per cpu info in assembly today than it once was, which is why there is
> a push to move preempt_count to per_cpu where it truly belongs.
>
> -- Steve
>

2013-09-09 13:50:49

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:41:32AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 15:29:02 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
>
> > No, putting that on the task_struct won't help much in this regard I think.
> > Regular schedule() calls don't change that per cpu state.
>
> But is there a place that it would need to?
>
> I mean, if RCU is not tracking a CPU, is it safe to call schedule()?
> And then how would the new task know that RCU is ignoring that CPU?

Guys, RCU really is tracking per-CPU state with the rcu_dynticks.dynticks
field.

Thanx, Paul

2013-09-09 13:53:43

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:36:04PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> > And guys, I have to say that the advice on which per-CPU primitive to use
> > varies wildly and randomly. For all I know, each of you individually
> > might well be sticking to the same story, but taken together, your
> > collective advice is strongly resembling white noise.
>
> Its partly because cl and I disagree on things. He doesn't seem to care
> much about validation and believes that people will not make mistakes
> with this stuff.

At some point, my only recourse would be to require an Acked-by from one
of you for any per-CPU changes proposed by the other. I really hope that
it doesn't come to that, but this situation is getting a bit annoying.

> And partly because I didn't get what was going on. While Frederic's
> explanation might be correct it was incomprehensible for me.

And I freely admit that the comments on rcu_is_cpu_idle() are completely
inadequate, and I do apologize for that. Seemed clear at the time.
But then it always does, doesn't it? ;-)

Thanx, Paul

2013-09-09 13:55:18

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 06:46:05 -0700
"Paul E. McKenney" <[email protected]> wrote:


> > Also, if its per-task, why don't we have this in the task struct? The
> > current scheme makes the context switch more expensive -- is this the
> > right trade-off?
>
> There are constraints based on the task, but RCU really is
> paying attention to CPUs, rather than tasks. (With the exception of
> TREE_PREEMPT_RCU, which does keep lists of tasks that it has to pay
> attention to, namely those that have been preempted within their current
> RCU read-side critical section.)

Conceptually, RCU keeps track of task state, not CPU state. In all
your diagrams in your presentations, where you talk about grace periods
and quiescent states, you show tasks, not CPUs.

RCU's implementation is based on CPUs, and that only works when
rcu_read_lock() prevents preemption. As you stated above,
TREE_PREEMPT_RCU needs to keep track of tasks.

I think you are so deep into the implementation that you are
forgetting the concept that you created :-)

-- Steve

2013-09-09 13:57:09

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:45:06PM +0200, Frederic Weisbecker wrote:
> On Mon, Sep 09, 2013 at 09:21:42AM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 15:08:53 +0200
> > Frederic Weisbecker <[email protected]> wrote:
> >
> > > On Mon, Sep 09, 2013 at 08:55:04AM -0400, Steven Rostedt wrote:
> >
> > > > From reading the context a bit more, it seems that the per cpu value is
> > > > more a "per task" value that happens to be using per cpu variables, and
> > > > changes on context switches. Is that correct?
> > >
> > > Yeah that's probably what confuses so many people. It's indeed at the same
> > > time a task state and a per cpu state.
> >
> > Especially since the function name itself is "rcu_is_cpu_idle()" which
> > tells you it's a cpu state, and not a task state. Why would you care in
> > RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.
>
> OTOH it's referring to RCU extended quiescent states that are really per
> CPU from the RCU point of view, implementation and design wise.
>
> Perhaps a name that has a meaning close to task_is_in_cpu_extqs().
>
> I'm not sure it makes things clearer though.

Indeed, there is an ongoing naming debate as well. About the only point
of agreement thus far is that the current names are inadequate. ;-)

My current feeling is that rcu_is_cpu_idle() should be called
rcu_watching_this_cpu() and what is called rcu_watching_this_cpu()
in my local tree should be called __rcu_watching_this_cpu().
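
That is, roughly the following split (illustrative only, since the
exact per-CPU incantation is still under debate):

	/* Caller must have preemption disabled. */
	static bool __rcu_watching_this_cpu(void)
	{
		return !!__this_cpu_read(rcu_dynticks.dynticks_nesting);
	}

	/* May be invoked with preemption enabled. */
	bool rcu_watching_this_cpu(void)
	{
		bool ret;

		preempt_disable();
		ret = __rcu_watching_this_cpu();
		preempt_enable();
		return ret;
	}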


Thanx, Paul

> > > Pretty much like tsk->ti->preempt_count that people now try to implement
> > > through a per cpu value.
> >
> > Actually, preempt_count is more a CPU state than a task state. When
> > preempt_count is set, that CPU can not schedule out the current task.
> > It really has nothing to do with the task itself. It's the CPU that
> > can't do something. Preempt count should never traverse with a task
> > from one CPU to another.
>
> Yep.
>

2013-09-09 14:16:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 06:56:56 -0700
"Paul E. McKenney" <[email protected]> wrote:


> Indeed, there is an ongoing naming debate as well. About the only point
> of agreement thus far is that the current names are inadequate. ;-)
>
> My current feeling is that rcu_is_cpu_idle() should be called
> rcu_watching_this_cpu() and what is called rcu_watching_this_cpu()
> in my local tree should be called __rcu_watching_this_cpu().

I disagree. Then it would not make sense if we take a return value of
"__rcu_watching_this_cpu()" and use it on another CPU to make other
decisions for that other CPU.

I still think we are confusing concepts with implementation. Yes, the
RCU implementation tracks CPU state, but the concept is still based on
the task.

But you are right, with dynamic ticks, things get a little more
complex, as dynamic ticks is a CPU state, not a task state, as it can
be something other than the running task that changes the state
(another task gets scheduled on that CPU).

But I think we are coupling RCU a bit too much with dynamic ticks here.
Maybe we need to take a step back to visualize concepts again.

The state of being in dynamic tick mode is determined by what a task or
tasks are doing on the CPU. One of those things is if the task needs to
be tracked by RCU. And here is where I think we are getting our
confusion from. The dynamic tick state needs to check if the running
task is requiring RCU or not, and thus we ask for "is rcu needed on
this CPU?" when the real question is "is the task running on this CPU
requiring RCU?"

Again, if we keep things in a conceptual mode, and not look too much at
the implementation details, I think more people will understand what's
going on. Especially those that don't know why something was
implemented the way it was.

-- Steve

2013-09-09 14:22:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> Peter, in the general case, you are quite correct. But this is a special
> case where it really does work.
>
> The key point here is that preemption and migration cannot move a task
> from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.

But there's no constraint placed on the migration mask (aka
task_struct::cpus_allowed) and therefore a task can be moved exactly that way.

What you're trying to say is that by the time the task is running on
another cpu, that cpu's state will match the state of the previous cpu,
no?

> So yes, by the time the task sees the return value from rcu_is_cpu_idle(),
> that task might be running on some other CPU. But that is OK, because
> if RCU was paying attention to the old CPU, then RCU must also be paying
> attention to the new CPU.

OK, fair enough.

> Here is an example of how this works:
>
> 1. Some task running on a CPU 0 (which RCU is paying attention to)
> calls rcu_is_cpu_idle(), which disables preemption, checks the
> per-CPU variable, sets ret to zero, then enables preemption.
>
> At this point, the task is preempted by some high-priority task.
>
> 2. CPU 1 is currently idle, so RCU is -not- paying attention to it.
> However, it is decided that our low-priority task should migrate
> to CPU 1.
>
> 3. CPU 1 is sent an IPI, which forces this CPU out of idle. This
> causes rcu_idle_exit() to be called, which causes RCU to start
> paying attention to CPU 1.
>

Just a nit, we typically try to avoid using IPIs to wake idle CPUs,
doesn't change the story much though.

> 4. CPU 1 switches to the low-priority task, which now sees the
> return value of rcu_is_cpu_idle(). Now, this return value did
> in fact reflect the old state of CPU 0, and the state of CPU 0
> might have changed. (For example, the high-priority task might
> have blocked, so that CPU 0 is now idle, which in turn would
> mean that RCU is no longer paying attention to it, so that
> if rcu_is_cpu_idle() was called right now, it would return
> true rather than the false return computed in step 1 above.)
>
> 5. But that is OK. Because of the way RCU and idle interact,
> if a call from a given task to rcu_is_cpu_idle() returned false
> some time in the past, a call from that same task will also
> return false right now.
>
> So yes, in general it is wrong to disable preemption, grab the value
> of a per-CPU variable, re-enable preemption, and then return the result.
> But there are a number of special cases where it is OK, and this is
> one of them.

Right, worthy of comments though :-)
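
(For the archive, the pattern under discussion looks roughly like the
following -- a simplified sketch of the then-current rcu_is_cpu_idle():)

	int rcu_is_cpu_idle(void)
	{
		int ret;

		/*
		 * Generally a suspect pattern, since the task may be on a
		 * different CPU by the time the caller uses the result; safe
		 * here only because a task is never migrated from a CPU that
		 * RCU is watching to one that RCU is ignoring.
		 */
		preempt_disable();
		ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
		preempt_enable();
		return ret;
	}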

2013-09-09 14:40:44

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:29:17AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:21:42 -0400
> Steven Rostedt <[email protected]> wrote:
>
>
> > Especially since the function name itself is "rcu_is_cpu_idle()" which
> > tells you it's a cpu state, and not a task state. Why would you care in
> > RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.
>
> > Actually, preempt_count is more a CPU state than a task state. When
> > preempt_count is set, that CPU can not schedule out the current task.
> > It really has nothing to do with the task itself. It's the CPU that
> > can't do something. Preempt count should never traverse with a task
> > from one CPU to another.
>
> I'll take this a step further. Here's a simple rule to determine if
> something is a task state or a CPU state.
>
> If the state migrates with a task from one CPU to another, it's a task
> state.

This applies to preempt_count as well.

>
> If the state never leaves a CPU with a task, then it's a CPU state.

Per CPU and per task properties aren't mutually exclusive.
Per task obviously implies per cpu. And per cpu doesn't imply per task.

Taking that into account, when you're dealing with a per task property,
the rest depends on the POV: you can zoom to the lower level and look at
things from the CPU POV. Or you can zoom out and look at things from the
task POV.

RCU extended quiescent states can be viewed from both POVs. RCU just
cares about the CPU POV.

We don't even need a task to set/unset a CPU to/from extended quiescent state.
It's only a matter of CPU running RCU code. The kernel just happens to use
tasks to run code.

It's a bit the same with spinlocks. Spinlocks aren't a task synchronization
but a CPU synchronization. It's low level. Of course a task can't sleep
with a spinlock held (not talking about -rt) so it could be defined as a per
task property. But it's just not relevant.

Mutexes, OTOH, really are task synchronisation. They could be considered from
the CPU view. That's just not relevant either.


>
> According to the above rules, rcu_is_cpu_idle() is a task state, and
> really should be in task_struct, and preempt_count is a CPU state, and
> should be a per_cpu variable.

No, according to your description, preempt count is a task state.
Eventually it just happens to be both. But more relevant from the CPU POV.

And really the fact that a state can be viewed per task doesn't mean it's
always right to store it in the task struct. The semantics also
play a big role here.



>
> I think the reason preempt_count never was a per cpu variable, is that
> having it in the stack (thread info) made it easier to test in assembly
> than having to grab the per cpu info. But I believe it's easier to grab
> per cpu info in assembly today than it once was, which is why there is
> a push to move preempt_count to per_cpu where it truly belongs.
>
> -- Steve

2013-09-09 14:49:42

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013, Peter Zijlstra wrote:

> On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> > And guys, I have to say that the advice on which per-CPU primitive to use
> > varies wildly and randomly. For all I know, each of you individually
> > might well be sticking to the same story, but taken together, your
> > collective advice is strongly resembling white noise.
>
> Its partly because cl and I disagree on things. He doesn't seem to care
> much about validation and believes that people will not make mistakes
> with this stuff.

Slander. Certainly validation is good. Its just that PREEMPT kernels are
not in use and AFAICT the full preempt stuff requires significant developer
support and complicates the code without much benefit.

Having said that I am trying to support preempt checks. The newest
__get_cpu_var patchset introduces the preempt checks that you want.

2013-09-09 15:12:29

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 02:49:42PM +0000, Christoph Lameter wrote:
> On Mon, 9 Sep 2013, Peter Zijlstra wrote:
>
> > On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> > > And guys, I have to say that the advice on which per-CPU primitive to use
> > > varies wildly and randomly. For all I know, each of you individually
> > > might well be sticking to the same story, but taken together, your
> > > collective advice is strongly resembling white noise.
> >
> > Its partly because cl and I disagree on things. He doesn't seem to care
> > much about validation and believes that people will not make mistakes
> > with this stuff.
>
> Slander. Certainly validation is good. Its just that PREEMPT kernels are
> not in use

Complete bullshit, its part of the mainline kernel, lots of people run
them -- including me, and any patch is supposed to keep it working.

> and AFAICT the full preempt stuff requires significant developer
> support and complicates the code without much benefit.

More bullshit, each and every patch submitted must fully support all
preemption modes supported by the kernel. CONFIG_PREEMPT is in, no two
ways about it. Breaking it is a regression and reason to revert stuff.

Therefore every Linux developer supports it per definition. And clearly
the complication is worth it for enough people, otherwise it wouldn't be
there.

The absolute worst part about you is your inability/unwillingness to
look beyond your immediate concern. Please pull head from arse.

> Having said that I am trying to support preempt checks. The newest
> __get_cpu_var patchset introduces the preempt checks that you want.

Good, I did see some patches from you but haven't come around to looking
at them.

2013-09-09 15:22:03

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 16:40:38 +0200
Frederic Weisbecker <[email protected]> wrote:

> On Mon, Sep 09, 2013 at 09:29:17AM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 09:21:42 -0400
> > Steven Rostedt <[email protected]> wrote:
> >
> >
> > > Especially since the function name itself is "rcu_is_cpu_idle()" which
> > > tells you it's a cpu state, and not a task state. Why would you care in
> > > RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.
> >
> > > Actually, preempt_count is more a CPU state than a task state. When
> > > preempt_count is set, that CPU can not schedule out the current task.
> > > It really has nothing to do with the task itself. It's the CPU that
> > > can't do something. Preempt count should never traverse with a task
> > > from one CPU to another.
> >
> > I'll take this a step further. Here's a simple rule to determine if
> > something is a task state or a CPU state.
> >
> > If the state migrates with a task from one CPU to another, it's a task
> > state.
>
> This applies to preempt_count as well.

No it does not, because the very definition of preempt_count means the
task can not migrate. And why would a task want preempt count activated?
Because it is dealing with per cpu data! Again, if we look at the
concept, it's not a task state, its a CPU state, because the reason to
disable preemption is that the task needs to look at stuff specific to
the CPU. There's a reason that we don't want the task to migrate, and
that reason has to do with data specific to the CPU not the task.
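
The canonical use is something like this (my_counter being a made-up
per-CPU variable, purely for illustration):

	DEFINE_PER_CPU(int, my_counter);

	preempt_disable();		/* pin ourselves to this CPU... */
	__this_cpu_inc(my_counter);	/* ...so the per-CPU access is safe */
	preempt_enable();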

>
> >
> > If the state never leaves a CPU with a task, then it's a CPU state.
>
> Per CPU and per task properties aren't mutually exclusive.

No, but using my simple rules (and that's why I made them simple), they
can be. :-)

> Per task obviously implies per cpu. And per cpu doesn't imply per task.

I disagree. A state of a task can move with the task when it moves to
another CPU. That means the task state does not imply per cpu state.

For example, a task may be sleeping and not on any CPU, but it still
has that "state", where no CPU has it.

And I can say per cpu implies per task, as a state of the CPU implies
that it is also the current state of the task running on that CPU.

A CPU always has one task on it (idle if nothing is running), thus any
CPU state is also some task state. But a task may not be on any CPU,
thus a task state does not imply that any CPU has that state as well.

>
> Taking that into account, when you're dealing with a per task property,
> the rest depends on the POV: you can zoom to the lower level and look at
> > things from the CPU POV. Or you can zoom out and look at things from the
> task POV.

But I just negated "that account", so this no longer applies.

>
> RCU extended quiescent states can be viewed from both POVs. RCU just
> cares about the CPU POV.

No it does not!

What is a grace period? Some task has access to some data that is about
to change, but a synchronize_rcu() has been executed, and needs to wait
for the grace period to end before it can continue. If that task
sleeps, then there's no CPU that RCU cares about. It cares *only* about
the tasks that are preventing the grace period from finishing. Why else do
we implement rcu boosting?

Please, lets focus on the concept of RCU not the implementation.

>
> We don't even need a task to set/unset a CPU to/from extended quiescent state.
> It's only a matter of CPU running RCU code. The kernel just happens to use
> tasks to run code.

We are really confusing dynamic ticks and NO_HZ with RCU here. I hate
the term "extended quiescent state", when it really just means the CPU
is not using RCU anymore. Yes, that is a CPU state. But I think we are
getting things backwards. We are saying it's an RCU state, when it really
is a CPU state, and this is due to our implementation and not about RCU
itself.

The CPU can not be in this state if the task does something that
requires accessing RCU critical data. Basically, there's a state that
says the task won't be doing any RCU work (going idle, or going into
userland), and thus, we have an implementation to optimize RCU not to
care about this CPU. The confusion here is that this is an optimization
implementation, and not a "concept" of RCU.

It's really a CPU and task state that say "we don't need RCU, so RCU go
do your business someplace else".

But this is implemented in the RCU system, and thus we are saying that
RCU is dictating what is going on here, when it really is the task and
CPU state that are doing the dictating.

This is why I liked "rcu_ignored()" as it really is the real state of
affairs. We are basically saying that the task, or tasks, running on
the CPU is not going to do any RCU work, so the RCU system can safely
ignore this CPU as an optimization implementation.

And this is where all POVs are getting confused :-)



>
> It's a bit the same with spinlocks. Spinlocks aren't a task synchronization
> but a CPU synchronization. It's low level. Of course a task can't sleep
> with a spinlock held (not talking about -rt) so it could be defined as a per
> task property. But it's just not relevant.

Again, this is where we get into trouble. No it is not a CPU
synchronization. We only disable preemption because of implementation
details. Not the concept. A spin lock is only used to protect critical
data, and not to disable preemption. Those that use it to disable
preemption have caused us issues in -rt.

This is again the problem with confusing implementation with concepts.
-rt proved that a spin lock has nothing to do with cpu state, nor
preemption.

>
> Mutexes, OTOH, really are task synchronisation. They could be considered from
> the CPU view. That's just not relevant either.

Mutexes are just a different implementation of a lock. And yes, the
sleeping nature of a mutex makes it have a different concept, but
that's because we then can not use it for places that can not sleep.

>
>
> >
> > According to the above rules, rcu_is_cpu_idle() is a task state, and
> > really should be in task_struct, and preempt_count is a CPU state, and
> > should be a per_cpu variable.
>
> No, according to your description, preempt count is a task state.

No, I already stated that it is not a task state.

> Eventually it just happens to be both. But more relevant from the CPU POV.

No it is not.

>
> And really the fact that a state can be viewed per task doesn't mean it's
> always right to store it in the task struct. The semantics also
> play a big role here.

I'll say that implementation can play a big role. If there's a fast way
to implement something, we need to do it that way, even if one looks at
it and says "hmm, that doesn't match the concept". But those need to be
commented.

I'm not saying that we must follow the concept in our implementation,
I'm saying that names of functions and comments need to be focused on
concept, where as the implementation can focus on optimizations, and
comment where they differ.

For example, preempt_count really is a per cpu state. But because of
issues accessing it in assembly, it was added to a task state. And when
things change like this, we have to add hacks when concept and
implementation differ. For example, when interrupts came in and used a
different stack, we now can not use the stack info to access preempt
count. This hack has changed over the years, but just points out how
things go apart when concept and implementation differ.


-- Steve

2013-09-09 15:24:55

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013, Peter Zijlstra wrote:

> > Slander. Certainly validation is good. Its just that PREEMPT kernels are
> > not in use
>
> Complete bullshit, its part of the mainline kernel, lots of people run
> them -- including me, and any patch is supposed to keep it working.

Nonsense. There is no main line distro that supports full preempt. Its an
academic exercise.

> > and AFAICT the full preempt stuff requires significant developer
> > support and complicates the code without much benefit.
>
> More bullshit, each and every patch submitted must fully support all
> preemption modes supported by the kernel. CONFIG_PREEMPT is in, no two
> ways about it. Breaking it is a regression and reason to revert stuff.

Right, so you are enforcing that developers spend time to maintain a
useless kernel option. We have lots of other things to worry about.

> Therefore every Linux developer supports it per definition. And clearly
> the complication is worth it for enough people, otherwise it wouldn't be
> there.

Where is the worth? The only thing that I heard last time that I
brought it up is that there is some audio specialty distro. Guess that
needs it to support broken old audio boards? Cannot believe that a kernel
with PREEMPT_VOLUNTARY cannot handle audio. All my Linux workstations do
just fine without full preemption.

It seems that even RT is moving away from full preemption due to the
performance issue. Wake up!!

2013-09-09 15:39:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 11:20:57 -0400
Steven Rostedt <[email protected]> wrote:

> > It's a bit the same with spinlocks. Spinlocks aren't a task synchronization
> > but a CPU synchronization. It's low level. Of course a task can't sleep
> > with a spinlock held (not talking about -rt) so it could be defined as a per
> > task property. But it's just not relevant.
>
> Again, this is where we get into trouble. No it is not a CPU
> synchronization. We only disable preemption because of implementation
> details. Not the concept. A spin lock is only used to protect critical
> data, and not to disable preemption. Those that use it to disable
> preemption have caused us issues in -rt.
>
> This is again the problem with confusing implementation with concepts.
> -rt proved that a spin lock has nothing to do with cpu state, nor
> preemption.
>

Let me expand on this. Note, using an implementation detail of an item
is known as relying on a side effect, and is frowned upon.

In fact, when spin_locks() were created, it was just to point out where
critical sections are that prevent more than one task from accessing
some data at the same time. This was needed for multiple CPUs. This was
done before CONFIG_PREEMPT was even created.

Then Robert Love built on that concept where these same locations
had a characteristic that showed where two tasks can not access the
same data, and used that as preemption points. Points where we can not
be preempted, and let the kernel become preemptible.

Then -rt built further on the concept, and made these locations able to
sleep by removing the areas that could not sleep before (by threading
IRQs).

Again, the concept of a spin lock is not about the CPU or even the
task. It is about accessing some data in a safe way. When we stick to
concepts, we can expand on them as we did with CONFIG_PREEMPT and
CONFIG_PREEMPT_RT. It's when people use side effects (disabled
preemption) that breaks this expansion (like those that use spin_locks
and access per_cpu data).

-- Steve

2013-09-09 15:41:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 15:24:55 +0000
Christoph Lameter <[email protected]> wrote:

> On Mon, 9 Sep 2013, Peter Zijlstra wrote:
>
> > > Slander. Certainly validation is good. Its just that PREEMPT kernels are
> > > not in use
> >
> > Complete bullshit, its part of the mainline kernel, lots of people run
> > them -- including me, and any patch is supposed to keep it working.
>
> Nonsense. There is no main line distro that supports full preempt. Its an
> academic exercise.

Red Hat MRG supports PREEMPT_RT. And Debian has an option to use
PREEMPT_RT too. Two big main line distros.

So please, cut out the "academic exercise" crap.

-- Steve

2013-09-09 15:47:05

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 11:41:16 -0400
Steven Rostedt <[email protected]> wrote:

> On Mon, 9 Sep 2013 15:24:55 +0000
> Christoph Lameter <[email protected]> wrote:
>
> > On Mon, 9 Sep 2013, Peter Zijlstra wrote:
> >
> > > > Slander. Certainly validation is good. Its just that PREEMPT kernels are
> > > > not in use
> > >
> > > Complete bullshit, its part of the mainline kernel, lots of people run
> > > them -- including me, and any patch is supposed to keep it working.
> >
> > Nonsense. There is no main line distro that supports full preempt. Its an
> > academic exercise.
>
> Red Hat MRG supports PREEMPT_RT. And Debian has an option to use
> PREEMPT_RT too. Two big main line distros.
>
> So please, cut out the "academic exercise" crap.

Ubuntu Lucid supplies a CONFIG_PREEMPT kernel too:

http://kernel.ubuntu.com/~kernel-ppa/configs/lucid/amd64--preempt

-- Steve

2013-09-09 16:00:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?


* Christoph Lameter <[email protected]> wrote:

> On Mon, 9 Sep 2013, Peter Zijlstra wrote:
>
> > > Slander. Certainly validation is good. Its just that PREEMPT kernels
> > > are not in use
> >
> > Complete bullshit, its part of the mainline kernel, lots of people run
> > them -- including me, and any patch is supposed to keep it working.
>
> Nonsense. There is no main line distro that supports full preempt. Its
> an academic exercise.

You are dead wrong on multiple levels.

Firstly, here's a "popularity list" of preempt kernel config options,
distilled from kernel configs sent to lkml, from a time span of a couple
of months:

73 CONFIG_PREEMPT_NONE=y
81 CONFIG_PREEMPT_VOLUNTARY=y
135 CONFIG_PREEMPT=y

CONFIG_PREEMPT=y is in fact more popular than the other two modes of
preemption, amongst kernel testers who post configs to lkml.

Secondly, even if, hypothetically, in an alternate universe,
CONFIG_PREEMPT=y was used only rarely, the preempt debug checks are still
very important for automated testing: they often catch bugs that are
relevant on !PREEMPT as well (such as accidental unlocked access). The
last thing we want to do is to reduce the 'reach' of debug checks.

Distros typically go for the lowest overhead preemption option with server
loads in mind - but even they enjoy the bug fixes generated by the
preempt checks.

So my NAK stands: you are still in denial, you should stop the silly
arguing and you should stop wasting maintainer time. You need to address
PeterZ's review feedback and fix the bugs in your patches, ASAP.

Thanks,

Ingo

2013-09-09 16:03:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 18:00:24 +0200
Ingo Molnar <[email protected]> wrote:

> So my NAK stands: you are still in denial, you should stop the silly
> arguing and you should stop wasting maintainer time. You need to address
> PeterZ's review feedback and fix the bugs in your patches, ASAP.

To Christoph's credit. He did post patches with debug checks. We just
need to get around to review them.

-- Steve

2013-09-09 16:03:49

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 11:39:05AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 11:20:57 -0400
> Steven Rostedt <[email protected]> wrote:
>
> > > It's a bit the same with spinlocks. Spinlocks aren't a task synchronization
> > > but a CPU synchronization. It's low level. Of course a task can't sleep
> > > with a spinlock held (not talking about -rt) so it could be defined as a per
> > > task property. But it's just not relevant.
> >
> > Again, this is where we get into trouble. No it is not a CPU
> > synchronization. We only disable preemption because of implementation
> > details. Not the concept. A spin lock is only used to protect critical
> > data, and not to disable preemption. Those that use it to disable
> > preemption have caused us issues in -rt.
> >
> > This is again the problem with confusing implementation with concepts.
> > -rt proved that a spin lock has nothing to do with cpu state, nor
> > preemption.
> >
>
> Let me expand on this. Note, using an implementation detail of an item
> is known as relying on a side effect, and is frowned upon.
>
> In fact, when spin_locks() were created, it was just to point out where
> critical sections are that prevent more than one task from accessing
> some data at the same time. This was needed for multiple CPUs. This was
> done before CONFIG_PREEMPT was even created.
>
> Then Robert Love built on that concept where these same locations
> had a characteristic that showed where two tasks can not access the
> same data, and used that as preemption points. Points where we can not
> be preempted, and let the kernel become preemptible.
>
> Then -rt built further on the concept, and made these locations able to
> sleep by removing the areas that could not sleep before (by threading
> IRQs).
>
> Again, the concept of a spin lock is not about the CPU or even the
> task. It is about accessing some data in a safe way. When we stick to
> concepts, we can expand on them as we did with CONFIG_PREEMPT and
> CONFIG_PREEMPT_RT. It's when people use side effects (disabled
> preemption) that breaks this expansion (like those that use spin_locks
> and access per_cpu data).

Well, I was considering strict basic spinlocks, sticking to the name.
Of course sleeping spinlocks involve the scheduler and the concept of "tasks", and as such
complicate the debate :)

2013-09-09 16:10:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 11:20:57AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 16:40:38 +0200
> Frederic Weisbecker <[email protected]> wrote:
>
> > On Mon, Sep 09, 2013 at 09:29:17AM -0400, Steven Rostedt wrote:
> > > On Mon, 9 Sep 2013 09:21:42 -0400
> > > Steven Rostedt <[email protected]> wrote:
> > >
> > >
> > > > Especially since the function name itself is "rcu_is_cpu_idle()" which
> > > > tells you it's a cpu state, and not a task state. Why would you care in
> > > > RCU if CPU 1 is idle if you are on CPU 2? The name should be changed.
> > >
> > > > Actually, preempt_count is more a CPU state than a task state. When
> > > > preempt_count is set, that CPU can not schedule out the current task.
> > > > It really has nothing to do with the task itself. It's the CPU that
> > > > can't do something. Preempt count should never traverse with a task
> > > > from one CPU to another.
> > >
> > > I'll take this a step further. Here's a simple rule to determine if
> > > something is a task state or a CPU state.
> > >
> > > If the state migrates with a task from one CPU to another, it's a task
> > > state.
> >
> > This applies to preempt_count as well.
>
> No it does not, because the very definition of preempt_count means the
> task can not migrate. And why would a task want preempt count activated?
> Because it is dealing with per cpu data! Again, if we look at the
> concept, it's not a task state, its a CPU state, because the reason to
> disable preemption is that the task needs to look at stuff specific to
> the CPU. There's a reason that we don't want the task to migrate, and
> that reason has to do with data specific to the CPU not the task.
>
> > > If the state never leaves a CPU with a task, then it's a CPU state.
> >
> > Per CPU and per task properties aren't mutually exclusive.
>
> No, but using my simple rules (and that's why I made them simple), they
> can be. :-)

Unless you are RCU and would like to avoid too tight of dependencies on
exactly when context switch happens.

> > Per task obviously implies per cpu. And per cpu doesn't imply per task.
>
> I disagree. A state of a task can move with the task when it moves to
> another CPU. That means the task state does not imply per cpu state.
>
> For example, a task may be sleeping and not on any CPU, but it still
> has that "state", where no CPU has it.
>
> And I can say per cpu implies per task, as a state of the CPU implies
> that it is also the current state of the task running on that CPU.
>
> A CPU always has one task on it (idle if nothing is running), thus any
> CPU state is also some task state. But a task may not be on any CPU,
> thus a task state does not imply that any CPU has that state as well.

In many cases, yes. But context switches do not happen atomically,
so in other cases you do need to look behind the curtain in order to
make things work reliably.

> > Taking that into account, when you're dealing with a per task property,
> > the rest depends on the POV: you can zoom to the lower level and look at
> > things from the CPU POV. Or you can zoom out and look at things from the
> > task POV.
>
> But I just negated "that account", so this no longer applies.
>
> > RCU extended quiescent states can be viewed from both POVs. RCU just
> > cares about the CPU POV.
>
> No it does not!
>
> What is a grace period? Some task has access to some data that is about
> to change, but a synchronize_rcu() has been executed, and needs to wait
> for the grace period to end before it can continue. If that task
> sleeps, then there's no CPU that RCU cares about. It cares *only* about
> the tasks that are preventing the grace period from finishing. Why else do
> we implement rcu boosting?
>
> Please, lets focus on the concept of RCU not the implementation.

Fine, but only until you suggest changes to RCU. Then we need to focus
on the implementation.

> > We don't even need a task to set/unset a CPU to/from extended quiescent state.
> > It's only a matter of CPU running RCU code. The kernel just happens to use
> > tasks to run code.
>
> We are really confusing dynamic ticks and NO_HZ with RCU here. I hate
> the term "extended quiescent state", when it really just means the CPU
> is not using RCU anymore. Yes, that is a CPU state. But I think we are
> getting things backwards. We are saying it's an RCU state, when it really
> is a CPU state, and this is due to our implementation and not about RCU
> itself.
>
> The CPU can not be in this state if the task does something that
> requires accessing RCU critical data. Basically, there's a state that
> says the task won't be doing any RCU work (going idle, or going into
> userland), and thus, we have an implementation to optimize RCU not to
> care about this CPU. The confusion here is that this is an optimization
> implementation, and not a "concept" of RCU.
>
> It's really a CPU and task state that say "we don't need RCU, so RCU go
> do your business someplace else".
>
> But this is implemented in the RCU system, and thus we are saying that
> RCU is dictating what is going on here, when it really is the task and
> CPU state that are doing the dictating.
>
> This is why I liked "rcu_ignored()" as it really is the real state of
> affairs. We are basically saying that the task, or tasks, running on
> the CPU is not going to do any RCU work, so the RCU system can safely
> ignore this CPU as an optimization implementation.
>
> And this is where all POVs are getting confused :-)

Focusing on only one POV is not a strategy to win with RCU. You instead
need to make sure that things are working in all the applicable POVs.

> > It's a bit the same with spinlocks. Spinlocks aren't a task synchronization
> > but a CPU synchronization. It's low level. Of course a task can't sleep
> > with a spinlock held (not talking about -rt) so it could be defined as a per
> > task property. But it's just not relevant.
>
> Again, this is where we get into trouble. No it is not a CPU
> synchronization. We only disable preemption because of implementation
> details. Not the concept. A spin lock is only used to protect critical
> data, and not to disable preemption. Those that use it to disable
> preemption have caused us issues in -rt.
>
> This is again the problem with confusing implementation with concepts.
> -rt proved that a spin lock has nothing to do with cpu state, nor
> preemption.

The classical definition of spinlock does in fact require disabling
preemption. What really happened is that -rt implemented the "spinlock"
API with something that is not a spinlock, and fixed up a bunch of things,
including RCU, so that it all worked out OK. The spinlock API in -rt is
thus a fiction, but a fiction that has extremely useful outcomes, like
extremely low scheduling and interrupt latencies.

When is a spinlock not a spinlock? When you are running -rt. ;-)
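
Very roughly, the fiction looks like this (sketch only; the names are
approximations of the -rt patch set of this era, not mainline):

	#ifdef CONFIG_PREEMPT_RT_FULL
	/* Same API, but backed by a sleeping, priority-inheriting lock. */
	# define spin_lock(lock)	rt_spin_lock(lock)
	#else
	/* The classical definition: spin with preemption disabled. */
	# define spin_lock(lock)	raw_spin_lock(lock)
	#endif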

> > Mutexes, OTOH, really are task synchronisation. They could be considered from
> > the CPU view. That's just not relevant either.
>
> Mutexes are just a different implementation of a lock. And yes, the
> sleeping nature of a mutex makes it have a different concept, but
> that's because we then can not use it for places that can not sleep.

This part I agree with.

> > > According to the above rules, rcu_is_cpu_idle() is a task state, and
> > > really should be in task_struct, and preempt_count is a CPU state, and
> > > should be a per_cpu variable.
> >
> > No, according to your description, preempt count is a task state.
>
> No, I already stated that it is not a task state.

> > Eventually it just happens to be both. But more relevant from the CPU POV.
>
> No it is not.

Yes it is! No it is not! Yes it is! ... ;-)

> > And really the fact that a state can be viewed per task doesn't mean it's
> > always right to store it in the task struct. The semantics also
> > play a big role here.
>
> I'll say that implementation can play a big role. If there's a fast way
> to implement something, we need to do it that way, even if one looks at
> it and says "hmm, that doesn't match the concept". But those need to be
> commented.
>
> I'm not saying that we must follow the concept in our implementation,
> I'm saying that names of functions and comments need to be focused on
> concept, where as the implementation can focus on optimizations, and
> comment where they differ.
>
> For example, preempt_count really is a per cpu state. But because of
> issues accessing it in assembly, it was added to a task state. And when
> things change like this, we have to add hacks when concept and
> implementation differ. For example, when interrupts came in and used a
> different stack, we now can not use the stack info to access preempt
> count. This hack has changed over the years, but just points out how
> things go apart when concept and implementation differ.

Oddly enough, the earliest implementations of RCU that I worked with
back in the early 1990s had implementation but no concept. The concepts
came later. Something about how dragging people through 1500 lines
of highly optimized parallel C code not being a very effective way of
teaching RCU. Surprisingly enough, there actually were some people who
managed to learn it that way...

Thanx, Paul

2013-09-09 16:11:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?


* Steven Rostedt <[email protected]> wrote:

> On Mon, 9 Sep 2013 18:00:24 +0200
> Ingo Molnar <[email protected]> wrote:
>
> > So my NAK stands: you are still in denial, you should stop the silly
> > arguing and you should stop wasting maintainer time. You need to
> > address PeterZ's review feedback and fix the bugs in your patches,
> > ASAP.
>
> To Christoph's credit. He did post patches with debug checks. We just
> need to get around to review them.

I saw those, he posted 'needs testing' patches. He still behaved
passive-aggressively, pretending that it was some difficult task to
perform, as if we were pulling his teeth.

And in this thread he is still arguing nonsense in the middle of the merge
window, claiming that CONFIG_PREEMPT=y is 'academic' - when just a cursory
look at lkml or just about anywhere else would tell him that amongst bug
reporters on lkml it's as popular as the other preempt options.

Adding and keeping preempt checks is not rocket science.

The thing is, we should not be forced to shout at him at all: Christoph
should be _proactive_ in addressing the shortcomings that were readily
pointed out literally years ago during review in a friendly fashion,
instead of wasting a lot of people's time trying to argue around it...

Thanks,

Ingo

2013-09-09 16:16:38

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:24:55PM +0000, Christoph Lameter wrote:
> On Mon, 9 Sep 2013, Peter Zijlstra wrote:
>
> > > Slander. Certainly validation is good. Its just that PREEMPT kernels are
> > > not in use
> >
> > Complete bullshit, its part of the mainline kernel, lots of people run
> > them -- including me, and any patch is supposed to keep it working.
>
> Nonsense. There is no main line distro that supports full preempt. Its an
> academic exercise.

You have a weird definition of academia. And you might think the
various distros that do support PREEMPT aren't big enough for you, but
that doesn't mean they're not there.

Also, you don't need a major distro to be 'in use'.

Furthermore, there's RedHat MRG that ships PREEMPT_RT which is a full
superset of PREEMPT. And I'm fairly sure there's other commercial
distros out there doing the same.

> > > and AFAICT the full preempt stuff requires significant developer
> > > support and complicates the code without much benefit.
> >
> > More bullshit, each and every patch submitted must fully support all
> > preemption modes supported by the kernel. CONFIG_PREEMPT is in, no two
> > ways about it. Breaking it is a regression and reason to revert stuff.
>
> Right, so you are enforcing that developers spend time to maintain a
> useless kernel option. We have lots of other things to worry about.

This useless option has found/exposed genuine SMP locking errors, driven
developers to create validation frameworks such as lockdep, and forced
people to think harder about serialization. Yes, all useless things.

> > Therefore every Linux developer supports it per definition. And clearly
> > the complication is worth it for enough people, otherwise it wouldn't be
> > there.
>
> Where is the worth? The only thing that I heard last time that I
> brought it up is that there is some audio specialty distro. Guess that
> needs it to support broken old audio boards? Cannot believe that a kernel
> with PREEMPT_VOLUNTARY cannot handle audio. All my Linux workstations do
> just fine without full preemption.

Right.. you don't use it, therefore its useless. How classic you.

> It seems that even RT is moving away from full preemption due to the
> performance issue. Wake up!!

It is not. The main goal of RT is still to have minimal and worst case
bounded context switch time for a higher priority task.

2013-09-09 16:17:31

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 10:16:29AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 06:56:56 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
>
> > Indeed, there is an ongoing naming debate as well. About the only point
> > of agreement thus far is that the current names are inadequate. ;-)
> >
> > My current feeling is that rcu_is_cpu_idle() should be called
> > rcu_watching_this_cpu() and what is called rcu_watching_this_cpu()
> > in my local tree should be called __rcu_watching_this_cpu().
>
> I disagree. Then it would not make sense if we take a return value of
> "__rcu_watching_this_cpu()" and use it on another CPU to make other
> decisions for that other CPU.

Frederic and I both went through why this works.

> I still think we are confusing concepts with implementation. Yes, the
> RCU implementation tracks CPU state, but the concept is still based on
> the task.

You keep asserting this, but I am not seeing it. Sure, you can argue
that grace periods are based on tasks as well as or instead of CPUs.
But I am not convinced that it helps at the dynticks interface.

> But you are right, with dynamic ticks, things get a little more
> complex, as dynamic ticks is a CPU state, not a task state, as it can
> be something other than the running task that changes the state
> (another task gets scheduled on that CPU).
>
> But I think we are coupling RCU a bit too much with dynamic ticks here.
> Maybe we need to take a step back to visualize concepts again.

If we don't couple it pretty tightly, it won't work. And whatever we
want to call this thing that determines what RCU is paying attention to
has to be at the implementation level. For things like rcu_read_lock()
and synchronize_rcu(), yes, the task view is important -- and in recent
documentation it is the POV I use.

> The state of being in dynamic tick mode is determined by what a task or
> tasks are doing on the CPU. One of those things is if the task needs to
> be tracked by RCU. And here is where I think we are getting our
> confusion from. The dynamic tick state needs to check if the running
> task is requiring RCU or not, and thus we ask for "is rcu needed on
> this CPU?" when the real question is "is the task running on this CPU
> requiring RCU?"
>
> Again, if we keep things in a conceptual mode, and not look too much at
> the implementation details, I think more people will understand what's
> going on. Especially those that don't know why something was
> implemented the way it was.

All this aside, do you have a name you are nominating?

Thanx, Paul

2013-09-09 16:19:12

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 06:53:20AM -0700, Paul E. McKenney wrote:
> On Mon, Sep 09, 2013 at 03:36:04PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> > > And guys, I have to say that the advice on which per-CPU primitive to use
> > > varies wildly and randomly. For all I know, each of you individually
> > > might well be sticking to the same story, but taken together, your
> > > collective advice is strongly resembling white noise.
> >
> > Its partly because cl and I disagree on things. He doesn't seem to care
> > much about validation and believes that people will not make mistakes
> > with this stuff.
>
> At some point, my only recourse would be to require an Acked-by from one
> of you for any per-CPU changes proposed by the other. I really hope that
> it doesn't come to that, but this situation is getting a bit annoying.

Nah, I'll not hold that over you. I think the current storm-in-a-teacup
was mostly due to us seeing something suspicious and not understanding the
explanation well or so.

> > And partly because I didn't get what was going on. While Frederic's
> > explanation might be correct it was incomprehensible for me.
>
> And I freely admit that the comments on rcu_is_cpu_idle() are completely
> inadequate, and I do apologize for that. Seemed clear at the time.
> But then it always does, doesn't it? ;-)

Yeah, I'm only all too familiar with this problem :/

2013-09-09 16:22:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 04:40:38PM +0200, Frederic Weisbecker wrote:
> > If the state migrates with a task from one CPU to another, it's a task
> > state.
>
> This applies to preempt_count as well.

This is in fact true. Its something I'd like to cure, but currently the
PREEMPT_ACTIVE state makes this so.

Hence the need to save/restore the per-cpu preempt state in the per-cpu
preempt_count patches.
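
Roughly like so (illustrative sketch only; field placement approximate):

	/* At context-switch time the preempt state follows the task: */
	prev->saved_preempt_count = this_cpu_read(__preempt_count);
	this_cpu_write(__preempt_count, next->saved_preempt_count);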

2013-09-09 16:26:53

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 04:21:55PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 09, 2013 at 06:23:43AM -0700, Paul E. McKenney wrote:
> > Peter, in the general case, you are quite correct. But this is a special
> > case where it really does work.
> >
> > The key point here is that preemption and migration cannot move a task
> > from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
>
> But there's no constraint placed on the migration mask (aka
> task_struct::cpus_allowed) and therefore a task can be moved exactly that way.
>
> What you're trying to say is that by the time the task is running on
> another cpu, that cpu's state will match the state of the previous cpu,
> no?

Yep! Might be a better way to put it as well.

> > So yes, by the time the task sees the return value from rcu_is_cpu_idle(),
> > that task might be running on some other CPU. But that is OK, because
> > if RCU was paying attention to the old CPU, then RCU must also be paying
> > attention to the new CPU.
>
> OK, fair enough.
>
> > Here is an example of how this works:
> >
> > 1. Some task running on a CPU 0 (which RCU is paying attention to)
> > calls rcu_is_cpu_idle(), which disables preemption, checks the
> > per-CPU variable, sets ret to zero, then enables preemption.
> >
> > At this point, the task is preempted by some high-priority task.
> >
> > 2. CPU 1 is currently idle, so RCU is -not- paying attention to it.
> > However, it is decided that our low-priority task should migrate
> > to CPU 1.
> >
> > 3. CPU 1 is sent an IPI, which forces this CPU out of idle. This
> > causes rcu_idle_exit() to be called, which causes RCU to start
> > paying attention to CPU 1.
>
> Just a nit, we typically try to avoid using IPIs to wake idle CPUs,
> doesn't change the story much though.

K, if I get to this level of detail in the comments, I will leave IPIs
out, and just say that the CPU is forced out of idle.

> > 4. CPU 1 switches to the low-priority task, which now sees the
> > return value of rcu_is_cpu_idle(). Now, this return value did
> > in fact reflect the old state of CPU 0, and the state of CPU 0
> > might have changed. (For example, the high-priority task might
> > have blocked, so that CPU 0 is now idle, which in turn would
> > mean that RCU is no longer paying attention to it, so that
> > if rcu_is_cpu_idle() was called right now, it would return
> > true rather than the false return computed in step 1 above.)
> >
> > 5. But that is OK. Because of the way RCU and idle interact,
> > if a call from a given task to rcu_is_cpu_idle() returned false
> > some time in the past, a call from that same task will also
> > return false right now.
> >
> > So yes, in general it is wrong to disable preemption, grab the value
> > of a per-CPU variable, re-enable preemption, and then return the result.
> > But there are a number of special cases where it is OK, and this is
> > one of them.
>
> Right, worthy of comments though :-)

No argument there!

Now if we can agree on the naming and the exact per-CPU incantation... ;-)

Thanx, Paul

2013-09-09 16:30:31

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:09:20 -0700
"Paul E. McKenney" <[email protected]> wrote:

> Oddly enough, the earliest implementations of RCU that I worked with
> back in the early 1990s had implementation but no concept. The concepts
> came later. Something about how dragging people through 1500 lines
> of highly optimized parallel C code not being a very effective way of
> teaching RCU. Surprisingly enough, there actually were some people who
> managed to learn it that way...

This is actually the point of my arguments :-)

Things that are shown outside of the rcu files should be an interface
that reflects the concept, not the implementation (if possible). Just
like there's nothing in "preempt_count()" that says the preempt count
is stored in a per-task field.

We can't expect the rest of the community to understand the RCU
implementation, when there are still many that don't understand the
concept ;-)

You can implement RCU anyway you seem fit. My concern is where very
smart people like Peter Zijlstra stumble over code used by others that
shows:

preempt_disable();
ret = yada_yada();
preempt_enable();

return ret;

Where the function name states that we want CPU state, but if that's
really true, then whoever wants CPU state must not be preempted here.

The point being, here the concept actually makes more sense than the
implementation, because I think one thing we all can agree on is that
the interface throughout the kernel should be something people can
understand quickly without too much effort. The better we all can
understand, the better the kernel will be.

I think we are starting to bike shed a bit too much. The real issue is,
what can "rcu_is_cpu_idle()" be called that makes sense with those that
need to look at this code?


"rcu_is_not_active()" or "rcu_is_ignored()" where a user can see that,
"Oh, RCU is ignored here, I shouldn't be using rcu_read_lock()" or if
their code is called within something that does "rcu_is_ignored()" and
they see that they used "rcu_read_lock()" they would understand what
the issue is.

If they see "rcu_is_cpu_idle()" they would get confused: "Hmm, the cpu
isn't idle here?"

Perhaps "rcu_watching_this_cpu()" may make sense, but again, if I see a
the above preempt_enable() in that code, I would think it's a bug,
because then the return value is not guaranteed to be on this cpu.

-- Steve

2013-09-09 16:34:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:17:08 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Sep 09, 2013 at 10:16:29AM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 06:56:56 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> >
> > > Indeed, there is an ongoing naming debate as well. About the only point
> > > of agreement thus far is that the current names are inadequate. ;-)
> > >
> > > My current feeling is that rcu_is_cpu_idle() should be called
> > > rcu_watching_this_cpu() and what is called rcu_watching_this_cpu()
> > > in my local tree should be called __rcu_watching_this_cpu().
> >
> > I disagree. Then it would not make sense if we take a return value of
> > "__rcu_watching_this_cpu()" and use it on another CPU to make other
> > decisions for that other CPU.
>
> Frederic and I both went through why this works.


My concern is people stumbling over why preemption can be enabled here.

If it must *always* be called with preemption disabled (no
rcu_watching_this_cpu() version that disables preemption for you) then
I would be OK with it.

The problem I'm having is, anything that uses "this_cpu()" can cause
problems with understanding the code, because the first thing I think
is "if we get the result for 'this_cpu', it may not be 'this_cpu' when
I use it".

>
> > I still think we are confusing concepts with implementation. Yes, the
> > RCU implementation tracks CPU state, but the concept is still based on
> > the task.
>
> You keep asserting this, but I am not seeing it. Sure, you can argue
> that grace periods are based on tasks as well as or instead of CPUs.
> But I am not convinced that it helps at the dynticks interface.
>
> > But you are right, with dynamic ticks, things get a little more
> > complex, as dynamic ticks is a CPU state, not a task state, as it can
> > be something other than the running task that changes the state
> > (another task gets scheduled on that CPU).
> >
> > But I think we are coupling RCU a bit too much with dynamic ticks here.
> > Maybe we need to take a step back to visualize concepts again.
>
> If we don't couple it pretty tightly, it won't work. And whatever we
> want to call this thing that determines what RCU is paying attention to
> has to be at the implementation level. For things like rcu_read_lock()
> and synchronize_rcu(), yes, the task view is important -- and in recent
> documentation is the POV I use.
>
> > The state of being in dynamic tick mode is determined by what a task or
> > tasks are doing on the CPU. One of those things is if the task needs to
> > be tracked by RCU. And here, is where I think we are getting our
> > confusion from. The dynamic tick state needs to check if the running
> > task is requiring RCU or not, and thus we ask for "is rcu needed on
> > this CPU?" when the real question is "is the task running on this CPU
> > requiring RCU?"
> >
> > Again, if we keep things in a conceptual mode, and not look too much at
> > the implementation details, I think more people will understand what's
> > going on. Especially those that don't know why something was
> > implemented the way it was.
>
> All this aside, do you have a name you are nominating?

Something that doesn't specify "this_cpu" or "cpu" if the result can be
used on another cpu correctly.

"rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"

-- Steve

2013-09-09 16:37:45

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 09:55:11AM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 06:46:05 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
>
> > > Also, if it's per-task, why don't we have this in the task struct? The
> > > current scheme makes the context switch more expensive -- is this the
> > > right trade-off?
> >
> > There are constraints based on the task, but RCU really is
> > paying attention to CPUs, not tasks. (With the exception of
> > TREE_PREEMPT_RCU, which does keep lists of tasks that it has to pay
> > attention to, namely those that have been preempted within their current
> > RCU read-side critical section.)
>
> Conceptually wise, RCU keeps track of task state, not CPU state. In all
> your diagrams in your presentations, where you talk about grace periods
> and quiescent states, you show tasks, not CPUs.
>
> RCU's implementation is based on CPUs, and only when rcu_read_lock()
> prevents preemption. As you stated above, TREE_PREEMPT_RCU needs to
> keep track of tasks.

Actually, in TINY_RCU and TREE_RCU, preemption is disabled to begin
with, so that rcu_read_lock() doesn't need to do anything. I left
the preempt_disable() in rcu_read_lock() and the preempt_enable() in
rcu_read_unlock() in case we ever have need to run either TINY_RCU or
TREE_RCU in a CONFIG_PREEMPT=y kernel.

That said, TREE_PREEMPT_RCU's implementation does track tasks sometimes,
but only in the (hopefully) uncommon case where an RCU read-side critical
section is preempted.

However, the API we are arguing about is deep within the implementation.
It is not at the level of rcu_read_lock(). It is something that should
not have that many invocations -- after all, the things using it are
binding themselves unusually close to RCU.

> I think you are too deep into the implementation, that you are
> forgetting the concept that you created :-)

Like I said before (though admittedly after you wrote the above), the
implementation came first and the concepts much later. ;-)

Thanx, Paul

2013-09-09 16:40:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:22:15 -0700
"Paul E. McKenney" <[email protected]> wrote:


> However, the API we are arguing about is deep within the implementation.
> It is not at the level of rcu_read_lock(). It is something that should
> not have that many invocations -- after all, the things using it are
> binding themselves unusually close to RCU.
>

Is it? I guess the question is, is dynamic ticks an extension of RCU,
or is it just using the RCU implementation as a convenience?

Also the OP patch is for function tracing, something not coupled to RCU
at all. Just a way to know if it is safe to call functions that use RCU
or not.

That can have "this_cpu()" by the way as a way to tell us that we must
disable preemption before hand. Which is what caused this thread to
start with, as it was suggested to combine rcu_is_cpu_idle() which
brought up why that function disables preemption.

-- Steve

2013-09-09 16:42:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:26:32 -0700
"Paul E. McKenney" <[email protected]> wrote:


> Now if we can agree on the naming and the exact per-CPU incantation... ;-)

s/per-CPU/per-task/

/me runs away!!!

-- Steve

2013-09-09 16:56:37

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:30:26PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:09:20 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > Oddly enough, the earliest implementations of RCU that I worked with
> > back in the early 1990s had implementation but no concept. The concepts
> > came later. Something about how dragging people through 1500 lines
> > of highly optimized parallel C code not being a very effective way of
> > teaching RCU. Surprisingly enough, there actually were some people who
> > managed to learn it that way...
>
> This is actually the point of my arguments :-)
>
> Things that are shown outside of the rcu files should be an interface
> that reflects the concept, not the implementation (if possible). Just
> like there's nothing in "preempt_count()" that says the preempt count
> is stored in a per-task field.
>
> We can't expect the rest of the community to understand the RCU
> implementation, when there's still many that don't understand the
> concept ;-)
>
> You can implement RCU any way you see fit. My concern is where very
> smart people like Peter Zijlstra stumble over code used by others that
> show:
>
> preempt_disable();
> ret = yada_yada();
> preempt_enable();
>
> return ret;
>
> Where the function name states that we want CPU state, but if that's
> really true, then whoever wants that CPU state must not allow preemption here.
>
> The point being, here the concept actually makes more sense than the
> implementation, because I think one thing we all can agree on, is that
> the interface throughout the kernel should be something people can
> understand quickly without too much effort. The better we all can
> understand, the better the kernel will be.
>
> I think we are starting to bikeshed a bit too much. The real issue is,
> what can "rcu_is_cpu_idle()" be renamed to that makes sense to those who
> need to look at this code?
>
>
> "rcu_is_not_active()" or "rcu_is_ignored()" where a user can see that,
> "Oh, RCU is ignored here, I shouldn't be using rcu_read_lock()" or if
> their code is called within something that does "rcu_is_ignored()" and
> they see that they used "rcu_read_lock()" they would understand what
> the issue is.
>
> If they see "rcu_is_cpu_idle()" they would get confused: "Hmm, the cpu
> isn't idle here?"
>
> Perhaps "rcu_watching_this_cpu()" may make sense, but again, if I see a
> the above preempt_enable() in that code, I would think it's a bug,
> because then the return value is not guaranteed to be on this cpu.

Indeed, the preempt_disable()/preempt_enable() pair will need a big
fat comment whatever the name turns out to be. Without such a comment,
regardless of the name, smart people would likely have objections to the
"yada_yada()" containing references to a per-CPU variable whose result
is passed out of the preempt_disable() region.
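
Something like the following, perhaps (a sketch only -- the name is
still under debate, so rcu_watching_this_cpu() and
__rcu_watching_this_cpu() below are just the candidates from earlier
in this thread, standing in for whatever we end up with):

/*
 * Big fat comment: the per-CPU result below is deliberately allowed
 * to "leak" out of the preempt_disable() region.  If RCU was watching
 * the CPU this task was running on when the value was read, the way
 * RCU and idle interact guarantees that RCU is still watching
 * whatever CPU this task is running on when the caller looks at the
 * return value.
 */
bool rcu_watching_this_cpu(void)
{
	bool ret;

	preempt_disable();
	ret = __rcu_watching_this_cpu();
	preempt_enable();
	return ret;
}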

Thanx, Paul

2013-09-09 16:58:48

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:34:22PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:17:08 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Mon, Sep 09, 2013 at 10:16:29AM -0400, Steven Rostedt wrote:
> > > On Mon, 9 Sep 2013 06:56:56 -0700
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > >
> > > > Indeed, there is an ongoing naming debate as well. About the only point
> > > > of agreement thus far is that the current names are inadequate. ;-)
> > > >
> > > > My current feeling is that rcu_is_cpu_idle() should be called
> > > > rcu_watching_this_cpu() and what is called rcu_watching_this_cpu()
> > > > in my local tree should be called __rcu_watching_this_cpu().
> > >
> > > I disagree. Then it would not make sense if we take a return value of
> > > "__rcu_watching_this_cpu()" and use it on another CPU to make other
> > > decisions for that other CPU.
> >
> > Frederic and I both went through why this works.
>
> My concern is people stumbling over why preemption can be enabled here.
>
> If it must *always* be called with preemption disabled (no
> rcu_watching_this_cpu() version that disables preemption for you) then
> I would be OK with it.
>
> The problem I'm having is, anything that uses "this_cpu()" can cause
> problems with understanding the code, because the first thing I think
> is "if we get the result for 'this_cpu', it may not be 'this_cpu' when
> I use it".

Yep, this is an exception to the usual rule about not passing per-CPU
variables out of preempt_disable() regions, and will need to be commented
appropriately.

> > > I still think we are confusing concepts with implementation. Yes, the
> > > RCU implementation tracks CPU state, but the concept is still based on
> > > the task.
> >
> > You keep asserting this, but I am not seeing it. Sure, you can argue
> > that grace periods are based on tasks as well as or instead of CPUs.
> > But I am not convinced that it helps at the dynticks interface.
> >
> > > But you are right, with dynamic ticks, things get a little more
> > > complex, as dynamic ticks is a CPU state, not a task state, as it can
> > > be something other than the running task that changes the state
> > > (another task gets scheduled on that CPU).
> > >
> > > But I think we are coupling RCU a bit too much with dynamic ticks here.
> > > Maybe we need to take a step back to visualize concepts again.
> >
> > If we don't couple it pretty tightly, it won't work. And whatever we
> > want to call this thing that determines what RCU is paying attention to
> > has to be at the implementation level. For things like rcu_read_lock()
> > and synchronize_rcu(), yes, the task view is important -- and in recent
> > documentation is the POV I use.
> >
> > > The state of being in dynamic tick mode is determined by what a task or
> > > tasks are doing on the CPU. One of those things is if the task needs to
> > > be tracked by RCU. And here, is where I think we are getting our
> > > confusion from. The dynamic tick state needs to check if the running
> > > task is requiring RCU or not, and thus we ask for "is rcu needed on
> > > this CPU?" when the real question is "is the task running on this CPU
> > > requiring RCU?"
> > >
> > > Again, if we keep things in a conceptual mode, and not look too much at
> > > the implementation details, I think more people will understand what's
> > > going on. Especially those that don't know why something was
> > > implemented the way it was.
> >
> > All this aside, do you have a name you are nominating?
>
> Something that doesn't specify "this_cpu" or "cpu" if the result can be
> used on another cpu correctly.
>
> "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"

You know, I am strongly tempted by "rcu_is_watching_you()", but I have
this feeling that it is too cute for its own good. ;-)

Thanx, Paul

2013-09-09 16:59:57

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:42:05PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:26:32 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
>
> > Now if we can agree on the naming and the exact per-CPU incantation... ;-)
>
> s/per-CPU/per-task/
>
> /me runs away!!!

Me too, given my previous experiences attempting to bind RCU that
closely to the scheduler! ;-)

Thanx, Paul

2013-09-09 17:06:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 09:58:36 -0700
"Paul E. McKenney" <[email protected]> wrote:


> > "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"
>
> You know, I am strongly tempted by "rcu_is_watching_you()", but I have
> this feeling that it is too cute for its own good. ;-)

Sign of the times...

s/c/S/
s/r/N/
s/u/A/

-- Steve

2013-09-09 17:29:15

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

* Paul E. McKenney ([email protected]) wrote:
> On Mon, Sep 09, 2013 at 12:34:22PM -0400, Steven Rostedt wrote:
[...]
> > "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"
>
> You know, I am strongly tempted by "rcu_is_watching_you()", but I have
> this feeling that it is too cute for its own good. ;-)

Wow, I just got off the plane, and look at what happened to this thread
;-)

Referring to your earlier question Paul, what I meant by my earlier
email on naming has been addressed by Steven: when exposing a new RCU
API, even if it is just for in-kernel use, we should be really cautious
not to tie it to implementation, but rather to concepts. Basically, my
original thought is that we should be able to express the exact same
concept in the kernel RCU implementation and in Userspace RCU. Here,
basing the name on whether RCU is watching the CPU really makes no
sense for urcu, since all the RCU flavors we currently have are watching
threads, not CPUs.

Hence my proposal for "rcu_read_check()". It could be "rcu_is_active()"
too, I don't really mind. What really matters is: is RCU actively
watching the current execution context? This can be translated to a
runtime check too: is it safe to call rcu_read_lock() from this context?
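
In other words, something a debug build could check at runtime (a
sketch only, using the proposed name; the warning text is made up):

/* Complain if RCU is not watching this execution context. */
WARN_ONCE(!rcu_read_check(),
	  "rcu_read_lock() is not safe in this context");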

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2013-09-09 17:45:46

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 12:40:44PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:22:15 -0700
> "Paul E. McKenney" <[email protected]> wrote:
> > However, the API we are arguing about is deep within the implementation.
> > It is not at the level of rcu_read_lock(). It is something that should
> > not have that many invocations -- after all, the things using it are
> > binding themselves unusually close to RCU.
>
> Is it? I guess the question is, is dynamic ticks an extension of RCU,
> or is it just using the RCU implementation as a convenience?

It is something that RCU didn't need to deal with at all in the "good
old days" before the current level of concern with energy efficiency.
To say nothing of back in the day when the idle loop was a simple tight
loop without all the tracing, C-state manipulation, and who knows what
all else.

Dynamic ticks is separate from RCU, but is something that RCU must pay
close attention to, because RCU must avoid causing dynamic-ticks CPUs
to do anything -- to do otherwise would defeat the energy-efficiency
purpose of dynamic ticks. Therefore, RCU needs to exactly track the
dynamic-ticks state of each CPU and behave accordingly.

The RCU-user-visible piece of this is that RCU read-side critical sections
are not permitted on dynamic-ticks CPUs, which normally only an issue for
code called from within the idle loop. In most cases, code that is called
from the idle loop that needs to use RCU read-side primitives can use the
RCU_NONIDLE() macro.
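
For example (a sketch; do_trace_event() is a made-up stand-in for code
that uses rcu_read_lock() internally):

/* In code that can be reached from the idle loop: */
RCU_NONIDLE(do_trace_event());

RCU_NONIDLE() makes RCU watch the CPU for the duration of the enclosed
statement, then lets RCU go back to ignoring the idle CPU.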

> Also the OP patch is for function tracing, something not coupled to RCU
> at all. Just a way to know if it is safe to call functions that use RCU
> or not.

Your tracing code is an exception because RCU_NONIDLE() is too
heavyweight for your code: surrounding each and every function trace
with RCU_NONIDLE() would make for nice stately and slow function tracing.
I don't expect too many similar exceptions because most code would
have a problem if the answer came back "you can't use RCU read-side
critical sections". I suspect that this includes event tracing invoked
from the idle loop, as I would guess that just refusing to do the
event tracing would not make people happy.

> That can have "this_cpu()" by the way as a way to tell us that we must
> disable preemption before hand. Which is what caused this thread to
> start with, as it was suggested to combine rcu_is_cpu_idle() which
> brought up why that function disables preemption.

Yep, that is how we got here. ;-)

Thanx, Paul

2013-09-09 17:46:15

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 01:06:05PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 09:58:36 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
>
> > > "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"
> >
> > You know, I am strongly tempted by "rcu_is_watching_you()", but I have
> > this feeling that it is too cute for its own good. ;-)
>
> Sign of the times...
>
> s/c/S/
> s/r/N/
> s/u/A/

;-) ;-) ;-)

Thanx, Paul

2013-09-09 17:57:26

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 01:29:08PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney ([email protected]) wrote:
> > On Mon, Sep 09, 2013 at 12:34:22PM -0400, Steven Rostedt wrote:
> [...]
> > > "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"
> >
> > You know, I am strongly tempted by "rcu_is_watching_you()", but I have
> > this feeling that it is too cute for its own good. ;-)
>
> Wow, I just got off the plane, and look at what happened to this thread
> ;-)

I had the same reaction when getting up this morning. ;-)

> Referring to your earlier question Paul, what I meant by my earlier
> email on naming has been addressed by Steven: when exposing a new RCU
> API, even if it is just for in-kernel use, we should be really cautious
> not to tie it to implementation, but rather to concepts. Basically, my
> original thought is that we should be able to express the exact same
> concept in the kernel RCU implementation and in Userspace RCU. Here,
> > basing the name on whether RCU is watching the CPU really makes no
> sense for urcu, since all the RCU flavors we currently have are watching
> threads, not CPUs.

More than that, userspace RCU doesn't have any energy management tie-ins.
It instead expects the application threads to invoke rcu_thread_offline()
when that thread goes idle and rcu_thread_online() when the thread wakes
up again. There is therefore less need for the application to query the
state because it was the application that set the state.
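
In urcu's QSBR flavor, that looks something like the following sketch,
where more_work() and wait_for_work() are made-up application hooks:

#include <urcu-qsbr.h>

static void *worker(void *arg)
{
	rcu_register_thread();		/* RCU starts watching this thread */
	while (more_work()) {
		rcu_read_lock();
		/* ... read RCU-protected data ... */
		rcu_read_unlock();
		rcu_quiescent_state();	/* announce a quiescent state */

		rcu_thread_offline();	/* going idle: RCU stops watching */
		wait_for_work();
		rcu_thread_online();	/* awake: RCU is watching again */
	}
	rcu_unregister_thread();
	return NULL;
}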

In contrast, within the Linux kernel, the RCU-watching state gets set
asynchronously with respect to in-kernel users of RCU.

Given the rest of the userspace RCU primitives, something like
rcu_thread_is_online() might make sense for the userspace RCU if some
application needs to know the state. Or some other name that fits in
with rcu_thread_offline() and rcu_thread_online(). But such a name would
be problematic in the kernel due to CPU hotplug's use of those terms.

> Hence my proposal for "rcu_read_check()". It could be "rcu_is_active()"
> > too, I don't really mind. What really matters is: is RCU actively
> > watching the current execution context? This can be translated to a
> > runtime check too: is it safe to call rcu_read_lock() from this context?

Although I do like rcu_is_active() better than rcu_read_check(), my
concern with rcu_is_active() is that it can easily be mistaken for a
global state rather than a per-CPU/thread/task/whatever state.

Thanx, Paul

2013-09-09 18:36:42

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 10:56:56 -0700
"Paul E. McKenney" <[email protected]> wrote:


> Although I do like rcu_is_active() better than rcu_read_check(), my
> concern with rcu_is_active() is that it can easily be mistaken for a
> global state rather than a per-CPU/thread/task/whatever state.

rcu_is_active_local();

Although, even though it's cute, I think "rcu_is_watching_you()" has no
ambiguity.

-- Steve

2013-09-09 18:51:00

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 02:36:39PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 10:56:56 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
>
> > Although I do like rcu_is_active() better than rcu_read_check(), my
> > concern with rcu_is_active() is that it can easily be mistaken for a
> > global state rather than a per-CPU/thread/task/whatever state.
>
> rcu_is_active_local();

rcu_is_active_here()?

> Although, even though it's cute, I think "rcu_is_watching_you()" has no
> ambiguity.

Like I said, I am tempted...

Thanx, Paul

2013-09-09 21:40:32

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

* Paul E. McKenney ([email protected]) wrote:
> On Mon, Sep 09, 2013 at 01:29:08PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney ([email protected]) wrote:
> > > On Mon, Sep 09, 2013 at 12:34:22PM -0400, Steven Rostedt wrote:
> > [...]
> > > > "rcu_is_ignored()" or "rcu_is_not_active()", "rcu_is_watching_you()"
> > >
> > > You know, I am strongly tempted by "rcu_is_watching_you()", but I have
> > > this feeling that it is too cute for its own good. ;-)
> >
> > Wow, I just got off the plane, and look at what happened to this thread
> > ;-)
>
> I had the same reaction when getting up this morning. ;-)
>
> > Referring to your earlier question Paul, what I meant by my earlier
> > email on naming has been addressed by Steven: when exposing a new RCU
> > API, even if it is just for in-kernel use, we should be really cautious
> > not to tie it to implementation, but rather to concepts. Basically, my
> > original thought is that we should be able to express the exact same
> > concept in the kernel RCU implementation and in Userspace RCU. Here,
> > basing the name on whether RCU is watching the CPU really makes no
> > sense for urcu, since all the RCU flavors we currently have are watching
> > threads, not CPUs.
>
> More than that, userspace RCU doesn't have any energy management tie-ins.

So far! ;)

> It instead expects the application threads to invoke rcu_thread_offline()
> when that thread goes idle and rcu_thread_online() when the thread wakes
> up again. There is therefore less need for the application to query the
> state because it was the application that set the state.
>
> In contrast, within the Linux kernel, the RCU-watching state gets set
> asynchronously with respect to in-kernel users of RCU.
>
> Given the rest of the userspace RCU primitives, something like
> rcu_thread_is_online() might make sense for the userspace RCU if some
> application needs to know the state. Or some other name that fits in
> with rcu_thread_offline() and rcu_thread_online().

I think calling it "rcu_thread_is_online()" would actually give away
some of the implementation details. I would be tempted to go for
something along the lines of "rcu_is_online()".

> But such a name would
>be problematic in the kernel due to CPU hotplug's use of those terms.

The kernel has a clearly defined notion of "online cpu". I don't see the
semantic clash with "online rcu". Whenever the RCU machinery is not
observing an execution context, it is considered "offline". Whether this
is caused by an explicit call (e.g. rcu_thread_offline() for urcu) or
done internally (kernel dynticks) should not matter.

>
> > Hence my proposal for "rcu_read_check()". It could be "rcu_is_active()"
> > too, I don't really mind. It really minds: Is RCU actively watching the
> > current execution context ? This can be translated to a runtime check
> > too: is it safe to call rcu_read_lock() form this context ?
>
> Although I do like rcu_is_active() better than rcu_read_check(), my
> concern with rcu_is_active() is that it can easily be mistaken for a
> global state rather than a per-CPU/thread/task/whatever state.

Agreed. So how about rcu_is_online() ?

Thanks,

Mathieu

>
> Thanx, Paul
>

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2013-09-09 21:59:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013 17:40:26 -0400
Mathieu Desnoyers <[email protected]> wrote:

> Agreed. So how about rcu_is_online() ?

Nope, what about all_your_base_are_belong_to_rcu()?

-- Steve

2013-09-09 22:34:37

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 05:59:17PM -0400, Steven Rostedt wrote:
> On Mon, 9 Sep 2013 17:40:26 -0400
> Mathieu Desnoyers <[email protected]> wrote:
>
> > Agreed. So how about rcu_is_online() ?
>
> Nope, what about all_your_base_are_belong_to_rcu()?

Let's see if I can remember the candidates...

rcu_is_cpu_idle() # reversed sense from the others
rcu_is_ignored() # reversed sense from the others
rcu_is_not_active() # reversed sense from the others
rcu_is_watching_cpu()
rcu_read_check()
rcu_is_active()
rcu_is_active_local()
rcu_is_online()
rcu_is_watching_task()
rcu_is_watching_thread()
rcu_is_watching_you()
all_your_base_are_belong_to_rcu()
rcu_is_active_loco()
rcu_kilroy_was_here()

Maybe I should just lock them all in a room overnight and see which
are still alive in the morning.

Thanx, Paul

2013-09-10 04:07:16

by Mike Galbraith

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 2013-09-09 at 14:49 +0000, Christoph Lameter wrote:

> It's just that PREEMPT kernels are
> not in use and AFAICT the full preempt stuff requires significant developer
> support and complicates the code without much benefit.

The openSUSE desktop kernel is PREEMPT, and presumably has users.

I use VOLUNTARY for my desktop, having zero tight latency constraints,
and not being into inflicting needless pain on poor defenseless boxen,
but lots of folks do use PREEMPT, and some of them may even need it.

-Mike

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013, Ingo Molnar wrote:

> So my NAK stands: you are still in denial, you should stop the silly
> arguing and you should stop wasting maintainer time. You need to address
> PeterZ's review feedback and fix the bugs in your patches, ASAP.

You are NAKing my patches that add the preempt checks? Now I am confused.
I thought you wanted those?

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, 9 Sep 2013, Ingo Molnar wrote:

> I saw those, he posted 'needs testing' patches. He still behaved
> passive-aggressively, pretending that it was some difficult task to
> perform, as if we were pulling his teeth.

I need your review of those. I will rediff as soon as rc1 is out to
send something that can be put into -next. Please tell me by then if
the approach is OK. I don't think we can do anything in the merge window.

> The thing is, we should not be forced to shout at him at all: Christoph's
> should be _proactive_ in addressing the shortcomings that were readily
> pointed out literally years ago during review in a friendly fashion,
> instead of wasting a lot of people's time trying to argue around it...

I fixed all of the issues with suspicious __this_cpu operations that
Steven pointed out a long time ago. He seemed to want to implement the
checks at that point. Not that difficult if one adds new variants of
this_cpu operations, which was an issue initially. The irqsafe_
this_cpu variants were nixed at the time because they were seen as too
complicated.

2013-09-11 14:13:21

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Mon, Sep 09, 2013 at 03:34:26PM -0700, Paul E. McKenney wrote:
> On Mon, Sep 09, 2013 at 05:59:17PM -0400, Steven Rostedt wrote:
> > On Mon, 9 Sep 2013 17:40:26 -0400
> > Mathieu Desnoyers <[email protected]> wrote:
> >
> > > Agreed. So how about rcu_is_online() ?
> >
> > Nope, what about all_your_base_are_belong_to_rcu()?
>
> Let's see if I can remember the candidates...
>
> rcu_is_cpu_idle() # reversed sense from the others
> rcu_is_ignored() # reversed sense from the others
> rcu_is_not_active() # reversed sense from the others
> rcu_is_watching_cpu()
> rcu_read_check()
> rcu_is_active()
> rcu_is_active_local()
> rcu_is_online()
> rcu_is_watching_task()
> rcu_is_watching_thread()
> rcu_is_watching_you()
> all_your_base_are_belong_to_rcu()
> rcu_is_active_loco()
> rcu_kilroy_was_here()
>
> Maybe I should just lock them all in a room overnight and see which
> are still alive in the morning.

And after treating injuries, the survivor is rcu_is_watching().
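
So, for example, a tracing callback would do something like this (a
sketch; the callback itself is hypothetical):

static void my_trace_func(unsigned long ip)
{
	if (!rcu_is_watching())
		return;	/* e.g. called from the idle loop */

	rcu_read_lock();
	/* ... tracing work that depends on RCU ... */
	rcu_read_unlock();
}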

Thanx, Paul

2013-09-11 14:26:11

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Wed, 11 Sep 2013 07:13:02 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Sep 09, 2013 at 03:34:26PM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 09, 2013 at 05:59:17PM -0400, Steven Rostedt wrote:
> > > On Mon, 9 Sep 2013 17:40:26 -0400
> > > Mathieu Desnoyers <[email protected]> wrote:
> > >
> > > > Agreed. So how about rcu_is_online() ?
> > >
> > > Nope, what about all_your_base_are_belong_to_rcu()?
> >
> > Let's see if I can remember the candidates...
> >
> > rcu_is_cpu_idle() # reversed sense from the others
> > rcu_is_ignored() # reversed sense from the others
> > rcu_is_not_active() # reversed sense from the others
> > rcu_is_watching_cpu()
> > rcu_read_check()
> > rcu_is_active()
> > rcu_is_active_local()
> > rcu_is_online()
> > rcu_is_watching_task()
> > rcu_is_watching_thread()
> > rcu_is_watching_you()
> > all_your_base_are_belong_to_rcu()
> > rcu_is_active_loco()
> > rcu_kilroy_was_here()
> >
> > Maybe I should just lock them all in a room overnight and see which
> > are still alive in the morning.
>
> And after treating injuries, the survivor is rcu_is_watching().
>

But, but, but...

That wasn't one of the contenders!

What happened? Did rcu_is_watching_cpu(), rcu_is_watching_task(),
rcu_is_watching_thread() and rcu_is_watching_you() all get together to
gang up on the others, and then combined to be one?

It's another Iraq! Several segments joined together by an outside
force and they don't play well together. And like Iraq (and the US),
hidden inside of this "community" is "rcu_is_watching_you"!

-- Steve

2013-09-11 15:24:00

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Wed, Sep 11, 2013 at 10:26:07AM -0400, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 07:13:02 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Mon, Sep 09, 2013 at 03:34:26PM -0700, Paul E. McKenney wrote:
> > > On Mon, Sep 09, 2013 at 05:59:17PM -0400, Steven Rostedt wrote:
> > > > On Mon, 9 Sep 2013 17:40:26 -0400
> > > > Mathieu Desnoyers <[email protected]> wrote:
> > > >
> > > > > Agreed. So how about rcu_is_online() ?
> > > >
> > > > Nope, what about all_your_base_are_belong_to_rcu()?
> > >
> > > Let's see if I can remember the candidates...
> > >
> > > rcu_is_cpu_idle() # reversed sense from the others
> > > rcu_is_ignored() # reversed sense from the others
> > > rcu_is_not_active() # reversed sense from the others
> > > rcu_is_watching_cpu()
> > > rcu_read_check()
> > > rcu_is_active()
> > > rcu_is_active_local()
> > > rcu_is_online()
> > > rcu_is_watching_task()
> > > rcu_is_watching_thread()
> > > rcu_is_watching_you()
> > > all_your_base_are_belong_to_rcu()
> > > rcu_is_active_loco()
> > > rcu_kilroy_was_here()
> > >
> > > Maybe I should just lock them all in a room overnight and see which
> > > are still alive in the morning.
> >
> > And after treating injuries, the survivor is rcu_is_watching().
> >
>
> But, but, but...
>
> That wasn't one of the contenders!
>
> What happened? Did rcu_is_watching_cpu(), rcu_is_watching_task(),
> rcu_is_watching_thread() and rcu_is_watching_you() all get together to
> gang up on the others, and then combined to be one?
>
> It's another Iraq! Several segments joined together by an outside
> force and they don't play well together. And like Iraq (and the US),
> hidden inside of this "community" is "rcu_is_watching_you"!

C'mon, Steven! I did say "after treating injuries"! In the opinion
of the surgeon, the only option was to amputate what was left of either
the _cpu(), _task(), _thread(), or _you(). Heck, the damage was so
severe that we couldn't even tell which one it was!

Thanx, Paul

2013-09-11 15:49:19

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Wed, 11 Sep 2013 08:23:31 -0700
"Paul E. McKenney" <[email protected]> wrote:

> C'mon, Steven! I did say "after treating injuries"! In the opinion
> of the surgeon, the only option was to amputate what was left of either
> the _cpu(), _task(), _thread(), or _you(). Heck, the damage was so
> severe that we couldn't even tell which one it was!

What a morbid thought. The possibility that "_you" was amputated.

-- Steve

2013-09-11 16:03:31

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Wed, Sep 11, 2013 at 11:49:16AM -0400, Steven Rostedt wrote:
> On Wed, 11 Sep 2013 08:23:31 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > C'mon, Steven! I did say "after treating injuries"! In the opinion
> > of the surgeon, the only option was to amputate what was left of either
> > the _cpu(), _task(), _thread(), or _you(). Heck, the damage was so
> > severe that we couldn't even tell which one it was!
>
> What a morbid thought. The possibility that "_you" was amputated.

;-) ;-) ;-)

Thanx, Paul

2013-09-12 06:38:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?


* Christoph Lameter <[email protected]> wrote:

> On Mon, 9 Sep 2013, Ingo Molnar wrote:
>
> > So my NAK stands: you are still in denial, you should stop the silly
> > arguing and you should stop wasting maintainer time. You need to
> > address PeterZ's review feedback and fix the bugs in your patches,
> > ASAP.
>
> You are NAKing my patches that add the preempt checks? Now I am
> confused. I thought you wanted those?

My NAK of the original patches stands until the debug checks are working
and are put up for a merge. You were talking nonsense all around in this
thread and I simply don't trust your promise, but I'll obviously trust a
queued up fix.

Thanks,

Ingo

2013-09-12 06:39:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?


* Christoph Lameter <[email protected]> wrote:

> On Mon, 9 Sep 2013, Ingo Molnar wrote:
>
> > I saw those, he posted 'needs testing' patches. He still behaved
> > passive-aggressively, pretending that it was some difficult task to
> > perform, as if we were pulling his teeth.
>
> I need your review of those. I will rediff as soon as rc1 is out to send
> something that can be put into -next. Please tell me by then if the
> approach is OK. I don't think we can do anything in the merge window.

The patch looked OK. Have you tested it, such as using a this_cpu op on a
PREEMPT=y kernel in a preemptible section? That should trigger the preempt
warning.
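
Something like this, for instance (a sketch; some_counter is a made-up
per-CPU variable):

DEFINE_PER_CPU(int, some_counter);

int buggy_read(void)
{
	/*
	 * Preemption is not disabled here, so with the debug checks in
	 * place this __this_cpu_read() should emit a warning.
	 */
	return __this_cpu_read(some_counter);
}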

Thanks,

Ingo

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, 12 Sep 2013, Ingo Molnar wrote:

>
> * Christoph Lameter <[email protected]> wrote:
>
> > On Mon, 9 Sep 2013, Ingo Molnar wrote:
> >
> > > I saw those, he posted 'needs testing' patches. He still behaved
> > > passive-aggressively, pretending that it was some difficult task to
> > > perform, as if we were pulling his teeth.
> >
> > I need your review of those. I will rediff as soon as rc1 is out to send
> > something that can be put into -next. Please tell me by then if the
> > approach is OK. I don't think we can do anything in the merge window.
>
> The patch looked OK. Have you tested it, such as using a this_cpu op on a
> PREEMPT=y kernel in a preemptible section? That should trigger the preempt
> warning.

The reason that certain __this_cpu ops were converted to raw_cpu ops is
that they triggered the preemption check. The test was done in a kvm
environment (as evident from the description).


by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

On Thu, 12 Sep 2013, Ingo Molnar wrote:

> My NAK of the original patches stands until the debug checks are working
> and are put up for a merge. You were talking nonsense all around in this
> thread and I simply don't trust your promise, but I'll obviously trust a
> queued up fix.

You could just pick up the existing preempt check patches and fix them up
and merge them now, saving me the work. We will have the checks immediately for
3.12. The conversion of all the source code from __get_cpu_var can be done
later.

It would not be that complicated to do even without my patches.