2013-04-30 02:53:13

by Simon Horman

Subject: [PATCH v2 0/2] sched: Add cond_resched_rcu_lock() helper

Add a helper for use in loops which read data protected by RCU and
may have a large number of iterations. One such example is dumping the list
of connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().

This series also updates the two ip_vs functions mentioned above
to use the helper.

Changes since v1 noted in the changelog of each patch.

Simon Horman (2):
sched: Add cond_resched_rcu_lock() helper
ipvs: Use cond_resched_rcu_lock() helper when dumping connections

include/linux/sched.h | 11 +++++++++++
net/netfilter/ipvs/ip_vs_conn.c | 6 ++----
2 files changed, 13 insertions(+), 4 deletions(-)

--
1.8.2.1


2013-04-30 02:53:19

by Simon Horman

Subject: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

This is intended for use in loops which read data protected by RCU and may
have a large number of iterations. Such an example is dumping the list of
connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().

The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU because
in the case of CONFIG_PREEMPT_RCU, __rcu_read_unlock() will check to see if
the RCU core needs to be informed, so there is no need to invoke
cond_resched() in that case. Thanks to Paul E. McKenney for explaining this.

cond_resched_rcu_lock() suggested by Eric Dumazet.
ifndef guard suggested by Paul E. McKenney and Julian Anastasov.

Cc: Eric Dumazet <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Julian Anastasov <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Simon Horman <[email protected]>

---

v2
* Guard the call to cond_resched()
---
include/linux/sched.h | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e692a02..66da71c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2787,3 +2787,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
 }
 
 #endif
+
+static void inline cond_resched_rcu_lock(void)
+{
+	if (need_resched()) {
+		rcu_read_unlock();
+#ifndef CONFIG_PREEMPT_RCU
+		cond_resched();
+#endif
+		rcu_read_lock();
+	}
+}
--
1.8.2.1

2013-04-30 02:53:16

by Simon Horman

Subject: [PATCH v2 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections

This avoids the situation where a dump of a large number of connections
may prevent scheduling for a long time while also avoiding excessive
calls to rcu_read_unlock() and rcu_read_lock().

Note that in the case of !CONFIG_PREEMPT_RCU this will
add a call to cond_resched().

Compile tested only.

Cc: Eric Dumazet <[email protected]>
Cc: Julian Anastasov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
---
net/netfilter/ipvs/ip_vs_conn.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index a083bda..42a7b33 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -975,8 +975,7 @@ static void *ip_vs_conn_array(struct seq_file *seq, loff_t pos)
 				return cp;
 			}
 		}
-		rcu_read_unlock();
-		rcu_read_lock();
+		cond_resched_rcu_lock();
 	}
 
 	return NULL;
@@ -1015,8 +1014,7 @@ static void *ip_vs_conn_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 			iter->l = &ip_vs_conn_tab[idx];
 			return cp;
 		}
-		rcu_read_unlock();
-		rcu_read_lock();
+		cond_resched_rcu_lock();
 	}
 	iter->l = NULL;
 	return NULL;
--
1.8.2.1

2013-04-30 07:11:47

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Tue, 30 Apr 2013, Simon Horman wrote:

> This is intended for use in loops which read data protected by RCU and may
> have a large number of iterations. Such an example is dumping the list of
> connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
>
> The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU as in
> the case of CONFIG_PREEMPT_RCU _rcu_read_unlock() will check to see if the
> RCU core needs to be informed, so there is no need to invoke cond_resched()
> in that case. Thanks to Paul E. McKenney for explaining this.
>
> cond_resched_rcu_lock() suggested by Eric Dumazet.
> ifndef guard suggested by Paul E. McKenney and Julian Anastasov.
>
> Cc: Eric Dumazet <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> Cc: Julian Anastasov <[email protected]>
> Acked-by: Ingo Molnar <[email protected]>
> Signed-off-by: Simon Horman <[email protected]>
>
> ---
>
> v2
> * Add guard the call to cond_resched()
> ---
> include/linux/sched.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index e692a02..66da71c 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2787,3 +2787,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
> }
>
> #endif
> +
> +static void inline cond_resched_rcu_lock(void)
> +{
> + if (need_resched()) {

Oops, it should be without the above need_resched().

> + rcu_read_unlock();
> +#ifndef CONFIG_PREEMPT_RCU
> + cond_resched();
> +#endif
> + rcu_read_lock();
> + }
> +}
> --
> 1.8.2.1

Regards

--
Julian Anastasov <[email protected]>

2013-04-30 07:29:49

by Simon Horman

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Tue, Apr 30, 2013 at 10:12:38AM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Tue, 30 Apr 2013, Simon Horman wrote:
>
> > This is intended for use in loops which read data protected by RCU and may
> > have a large number of iterations. Such an example is dumping the list of
> > connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
> >
> > The call to cond_resched() is guarded by #ifndef CONFIG_PREEMPT_RCU as in
> > the case of CONFIG_PREEMPT_RCU _rcu_read_unlock() will check to see if the
> > RCU core needs to be informed, so there is no need to invoke cond_resched()
> > in that case. Thanks to Paul E. McKenney for explaining this.
> >
> > cond_resched_rcu_lock() suggested by Eric Dumazet.
> > ifndef guard suggested by Paul E. McKenney and Julian Anastasov.
> >
> > Cc: Eric Dumazet <[email protected]>
> > Cc: Paul E. McKenney <[email protected]>
> > Cc: Julian Anastasov <[email protected]>
> > Acked-by: Ingo Molnar <[email protected]>
> > Signed-off-by: Simon Horman <[email protected]>
> >
> > ---
> >
> > v2
> > * Add guard the call to cond_resched()
> > ---
> > include/linux/sched.h | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index e692a02..66da71c 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2787,3 +2787,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
> > }
> >
> > #endif
> > +
> > +static void inline cond_resched_rcu_lock(void)
> > +{
> > + if (need_resched()) {
>
> Ops, it should be without above need_resched.

Thanks, to clarify, just this:

static void inline cond_resched_rcu_lock(void)
{
	rcu_read_unlock();
#ifndef CONFIG_PREEMPT_RCU
	cond_resched();
#endif
	rcu_read_lock();
}

2013-04-30 07:51:32

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Tue, 30 Apr 2013, Simon Horman wrote:

> > > +static void inline cond_resched_rcu_lock(void)
> > > +{
> > > + if (need_resched()) {
> >
> > Ops, it should be without above need_resched.
>
> Thanks, to clarify, just this:
>
> static void inline cond_resched_rcu_lock(void)
> {
> rcu_read_unlock();
> #ifndef CONFIG_PREEMPT_RCU
> cond_resched();
> #endif
> rcu_read_lock();
> }

Yes, thanks!

--
Julian Anastasov <[email protected]>

2013-05-01 09:12:00

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Tue, 30 Apr 2013, Simon Horman wrote:
>
> > > > +static void inline cond_resched_rcu_lock(void)
> > > > +{
> > > > + if (need_resched()) {
> > >
> > > Ops, it should be without above need_resched.
> >
> > Thanks, to clarify, just this:
> >
> > static void inline cond_resched_rcu_lock(void)
> > {
> > rcu_read_unlock();
> > #ifndef CONFIG_PREEMPT_RCU
> > cond_resched();
> > #endif
> > rcu_read_lock();
> > }
>
> Yes, thanks!

OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
dropping rcu_read_lock() at all?

That is; the thing that makes sense to me is:

static void inline cond_resched_rcu_lock(void)
{
#ifdef CONFIG_PREEMPT_RCU
	if (need_resched()) {
		rcu_read_unlock();
		cond_resched();
		rcu_read_lock();
	}
#endif /* CONFIG_PREEMPT_RCU */
}

That would have an rcu_read_lock() break and voluntary preemption point for
non-preemptible RCU and not bother with the stuff for preemptible RCU.

2013-05-01 12:46:51

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 11:10:12AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Tue, 30 Apr 2013, Simon Horman wrote:
> >
> > > > > +static void inline cond_resched_rcu_lock(void)
> > > > > +{
> > > > > + if (need_resched()) {
> > > >
> > > > Ops, it should be without above need_resched.
> > >
> > > Thanks, to clarify, just this:
> > >
> > > static void inline cond_resched_rcu_lock(void)
> > > {
> > > rcu_read_unlock();
> > > #ifndef CONFIG_PREEMPT_RCU
> > > cond_resched();
> > > #endif
> > > rcu_read_lock();
> > > }
> >
> > Yes, thanks!
>
> OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> dropping rcu_read_lock() at all?

Good point, I was assuming that the goal was to let grace periods end
as well as to allow preemption. The momentary dropping out of the
RCU read-side critical section allows the grace periods to end.

> That is; the thing that makes sense to me is:
>
> static void inline cond_resched_rcu_lock(void)
> {
> #ifdef CONFIG_PREEMPT_RCU
> if (need_resched()) {
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> }
> #endif /* CONFIG_PREEMPT_RCU */
> }
>
> That would have an rcu_read_lock() break and voluntary preemption point for
> non-preemptible RCU and not bother with the stuff for preemptible RCU.

If the only goal is to allow preemption, and if long grace periods are
not a concern, then this alternate approach would work fine as well.

Of course, both approaches assume that the caller is in a place
where having all RCU-protected data disappear is OK!

Thanx, Paul

2013-05-01 14:26:20

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Wed, 1 May 2013, Peter Zijlstra wrote:

> On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Tue, 30 Apr 2013, Simon Horman wrote:
> >
> > > Thanks, to clarify, just this:
> > >
> > > static void inline cond_resched_rcu_lock(void)
> > > {
> > > rcu_read_unlock();
> > > #ifndef CONFIG_PREEMPT_RCU
> > > cond_resched();
> > > #endif
> > > rcu_read_lock();
> > > }
> >
> > Yes, thanks!
>
> OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> dropping rcu_read_lock() at all?
>
> That is; the thing that makes sense to me is:
>
> static void inline cond_resched_rcu_lock(void)
> {
> #ifdef CONFIG_PREEMPT_RCU

You mean '#ifndef' here, right? But in the non-preempt
case, is the need_resched() check needed? rcu_read_unlock()
and rcu_read_lock() do not generate code.

> if (need_resched()) {
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> }
> #endif /* CONFIG_PREEMPT_RCU */
> }
>
> That would have an rcu_read_lock() break and voluntary preemption point for
> non-preemptible RCU and not bother with the stuff for preemptible RCU.

I see. So, can we choose one of both variants:

1. Your variant but with ifndef:

static void inline cond_resched_rcu_lock(void)
{
#ifndef CONFIG_PREEMPT_RCU
	if (need_resched()) {
		rcu_read_unlock();
		cond_resched();
		rcu_read_lock();
	}
#endif
}

2. Same without need_resched because cond_resched already
performs the same checks:

static void inline cond_resched_rcu_lock(void)
{
#ifndef CONFIG_PREEMPT_RCU
	rcu_read_unlock();
	cond_resched();
	rcu_read_lock();
#endif
}

Regards

--
Julian Anastasov <[email protected]>

2013-05-01 14:33:33

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> On Wed, May 01, 2013 at 11:10:12AM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 30, 2013 at 10:52:38AM +0300, Julian Anastasov wrote:
> > >
> > > Hello,
> > >
> > > On Tue, 30 Apr 2013, Simon Horman wrote:
> > >
> > > > > > +static void inline cond_resched_rcu_lock(void)
> > > > > > +{
> > > > > > + if (need_resched()) {
> > > > >
> > > > > Ops, it should be without above need_resched.
> > > >
> > > > Thanks, to clarify, just this:
> > > >
> > > > static void inline cond_resched_rcu_lock(void)
> > > > {
> > > > rcu_read_unlock();
> > > > #ifndef CONFIG_PREEMPT_RCU
> > > > cond_resched();
> > > > #endif
> > > > rcu_read_lock();
> > > > }
> > >
> > > Yes, thanks!
> >
> > OK, now I'm confused.. PREEMPT_RCU would preempt in any case, so why bother
> > dropping rcu_read_lock() at all?
>
> Good point, I was assuming that the goal was to let grace periods end
> as well as to allow preemption. The momentary dropping out of the
> RCU read-side critical section allows the grace periods to end.
>
> > That is; the thing that makes sense to me is:
> >
> > static void inline cond_resched_rcu_lock(void)
> > {
> > #ifdef CONFIG_PREEMPT_RCU
> > if (need_resched()) {
> > rcu_read_unlock();
> > cond_resched();
> > rcu_read_lock();
> > }
> > #endif /* CONFIG_PREEMPT_RCU */
> > }
> >
> > That would have an rcu_read_lock() break and voluntary preemption point for
> > non-preemptible RCU and not bother with the stuff for preemptible RCU.
>
> If the only goal is to allow preemption, and if long grace periods are
> not a concern, then this alternate approach would work fine as well.

But now that I think about it, there is one big advantage to the
unconditional exiting and reentering the RCU read-side critical section:
It allows easy placement of unconditional lockdep debug code to catch
the following type of bug:

rcu_read_lock();
...
	rcu_read_lock();
	...
	cond_resched_rcu_lock();
	...
	rcu_read_unlock();
...
rcu_read_unlock();

Here is how to detect this:

static void inline cond_resched_rcu_lock(void)
{
	rcu_read_unlock();
	WARN_ON_ONCE(rcu_read_lock_held());
#ifndef CONFIG_PREEMPT_RCU
	cond_resched();
#endif
	rcu_read_lock();
}

Of course, we could do this in your implementation as well:

static void inline cond_resched_rcu_lock(void)
{
#ifdef CONFIG_PREEMPT_RCU
	if (need_resched()) {
		rcu_read_unlock();
		WARN_ON_ONCE(rcu_read_lock_held());
		cond_resched();
		rcu_read_lock();
	}
#endif /* CONFIG_PREEMPT_RCU */
}

But this would fail to detect the bug -- and would silently fail -- on
!CONFIG_PREEMPT_RCU systems.

Thanx, Paul

> Of course, both approaches assume that the caller is in a place
> where having all RCU-protected data disappear is OK!
>
> Thanx, Paul

2013-05-01 15:19:47

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:

> If the only goal is to allow preemption, and if long grace periods are
> not a concern, then this alternate approach would work fine as well.

Hmm.. if that were the goal I'd like it to have a different name;
cond_resched*() has always been about preemption.

> Of course, both approaches assume that the caller is in a place
> where having all RCU-protected data disappear is OK!

Quite :-)

2013-05-01 15:30:06

by Eric Dumazet

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
>
> > If the only goal is to allow preemption, and if long grace periods are
> > not a concern, then this alternate approach would work fine as well.
>
> Hmm.. if that were the goal I'd like it to have a different name;
> cond_resched*() has always been about preemption.

BTW, I do not remember why cond_resched() is not an empty macro
when CONFIG_PREEMPT=y ?

2013-05-01 15:56:48

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 05:22:05PM +0300, Julian Anastasov wrote:

> > static void inline cond_resched_rcu_lock(void)
> > {
> > #ifdef CONFIG_PREEMPT_RCU
>
> You mean '#ifndef' here, right? But in the non-preempt
> case is using the need_resched() needed? rcu_read_unlock
> and rcu_read_lock do not generate code.

Uhm... yes!

> > if (need_resched()) {
> > rcu_read_unlock();
> > cond_resched();
> > rcu_read_lock();
> > }
> > #endif /* CONFIG_PREEMPT_RCU */
> > }
> >
> > That would have an rcu_read_lock() break and voluntary preemption point for
> > non-preemptible RCU and not bother with the stuff for preemptible RCU.
>
> I see. So, can we choose one of both variants:
>
> 1. Your variant but with ifndef:
>
> static void inline cond_resched_rcu_lock(void)
> {
> #ifndef CONFIG_PREEMPT_RCU
> if (need_resched()) {
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> }
> #endif
> }
>
> 2. Same without need_resched because cond_resched already
> performs the same checks:
>
> static void inline cond_resched_rcu_lock(void)
> {
> #ifndef CONFIG_PREEMPT_RCU
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> #endif
> }

Ah, so the 'problem' with this last version is that it does an unconditional /
unnecessary rcu_read_unlock().

The below would be in line with all the other cond_resched*() implementations.

---
include/linux/sched.h | 7 +++++++
kernel/sched/core.c | 14 ++++++++++++++
2 files changed, 21 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 802a751..fd2c77f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2449,6 +2449,13 @@ extern int __cond_resched_softirq(void);
 	__cond_resched_softirq();		\
 })
 
+extern int __cond_resched_rcu(void);
+
+#define cond_resched_rcu() ({			\
+	__might_sleep(__FILE__, __LINE__, 0);	\
+	__cond_resched_rcu();			\
+})
+
 /*
  * Does a critical section need to be broken due to another
  * task waiting?: (technically does not depend on CONFIG_PREEMPT,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7d7901a..2b3b4e6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4358,6 +4358,20 @@ int __sched __cond_resched_softirq(void)
 }
 EXPORT_SYMBOL(__cond_resched_softirq);
 
+int __sched __cond_resched_rcu(void)
+{
+#ifndef CONFIG_PREEMPT_RCU
+	if (should_resched()) {
+		rcu_read_unlock();
+		__cond_resched();
+		rcu_read_lock();
+		return 1;
+	}
+#endif
+	return 0;
+}
+EXPORT_SYMBOL(__cond_resched_rcu);
+
 /**
  * yield - yield the current processor to other threads.
  *

2013-05-01 16:01:44

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 08:29:55AM -0700, Eric Dumazet wrote:
> On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> > On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> >
> > > If the only goal is to allow preemption, and if long grace periods are
> > > not a concern, then this alternate approach would work fine as well.
> >
> > Hmm.. if that were the goal I'd like it to have a different name;
> > cond_resched*() has always been about preemption.
>
> BTW, I do not remember why cond_resched() is not an empty macro
> when CONFIG_PREEMPT=y ?

Good question.. or at least, only the __might_sleep() construct. Ingo, do you
happen to remember why this is? Most of this infrastructure is from before my
time.

2013-05-01 16:02:47

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 08:29:55AM -0700, Eric Dumazet wrote:
> On Wed, 2013-05-01 at 17:17 +0200, Peter Zijlstra wrote:
> > On Wed, May 01, 2013 at 05:46:37AM -0700, Paul E. McKenney wrote:
> >
> > > If the only goal is to allow preemption, and if long grace periods are
> > > not a concern, then this alternate approach would work fine as well.
> >
> > Hmm.. if that were the goal I'd like it to have a different name;
> > cond_resched*() has always been about preemption.
>
> BTW, I do not remember why cond_resched() is not an empty macro
> when CONFIG_PREEMPT=y ?

My guess would be for the case where sched_preempt_enable_no_resched()
is followed some time later by cond_resched().

Thanx, Paul

2013-05-01 16:59:37

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 09:02:18AM -0700, Paul E. McKenney wrote:
> My guess would be for the case where sched_preempt_enable_no_resched()
> is followed some time later by cond_resched().

I might hope not.. preempt_enable_no_resched() is nasty and you're likely to be
hit with a frozen fish of sorts by tglx if you try to use it ;-)

2013-05-01 17:31:37

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 06:57:49PM +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2013 at 09:02:18AM -0700, Paul E. McKenney wrote:
> > My guess would be for the case where sched_preempt_enable_no_resched()
> > is followed some time later by cond_resched().
>
> I might hope not.. preempt_enable_no_resched() is nasty and you're likely to be
> hit with a frozen fish of sorts by tglx if you try to use it ;-)

I will stick with my guess, though I agree that if I am correct,
this situation almost certainly predates tglx's Linux-related use of
frozen fish as projectile weapons. ;-)

Thanx, Paul

2013-05-01 18:22:12

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Wed, 1 May 2013, Peter Zijlstra wrote:

> On Wed, May 01, 2013 at 05:22:05PM +0300, Julian Anastasov wrote:
>
> > 2. Same without need_resched because cond_resched already
> > performs the same checks:
> >
> > static void inline cond_resched_rcu_lock(void)
> > {
> > #ifndef CONFIG_PREEMPT_RCU
> > rcu_read_unlock();
> > cond_resched();
> > rcu_read_lock();
> > #endif
> > }
>
> Ah so the 'problem' with this last version is that it does an unconditional /
> unnecessary rcu_read_unlock().

It is just a barrier() for the non-preempt case.

> The below would be in line with all the other cond_resched*() implementations.

...

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 802a751..fd2c77f 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2449,6 +2449,13 @@ extern int __cond_resched_softirq(void);
> __cond_resched_softirq(); \
> })
>
> +extern int __cond_resched_rcu(void);
> +
> +#define cond_resched_rcu() ({ \
> + __might_sleep(__FILE__, __LINE__, 0); \

I see your goal. But digging into __might_sleep()
I see that rcu_sleep_check() will scream for the non-preempt
case because we are under rcu_read_lock.

What about such an inline version:

static void inline cond_resched_rcu(void)
{
#ifndef CONFIG_PREEMPT_RCU
	rcu_read_unlock();
	__might_sleep(__FILE__, __LINE__, 0);
	cond_resched();
	rcu_read_lock();
#else
	__might_sleep(__FILE__, __LINE__, 0);
	rcu_lockdep_assert(rcu_preempt_depth() == 1,
			   "Illegal cond_resched_rcu() context");
#endif
}

It will restrict callers to a single RCU lock level for all
RCU implementations. But we don't use a separate __cond_resched_rcu
helper, for two reasons:

- __might_sleep() uses __FILE__ and __LINE__
- only cond_resched() generates code, so the need_resched() check is
avoided

> + __cond_resched_rcu(); \
> +})
> +
> /*
> * Does a critical section need to be broken due to another
> * task waiting?: (technically does not depend on CONFIG_PREEMPT,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7d7901a..2b3b4e6 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4358,6 +4358,20 @@ int __sched __cond_resched_softirq(void)
> }
> EXPORT_SYMBOL(__cond_resched_softirq);
>
> +int __sched __cond_resched_rcu(void)
> +{
> +#ifndef CONFIG_PREEMPT_RCU
> + if (should_resched()) {
> + rcu_read_unlock();
> + __cond_resched();
> + rcu_read_lock();
> + return 1;
> + }
> +#endif
> + return 0;
> +}
> +EXPORT_SYMBOL(__cond_resched_rcu);
> +

Regards

--
Julian Anastasov <[email protected]>

2013-05-01 19:04:44

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 09:22:08PM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Wed, 1 May 2013, Peter Zijlstra wrote:
>
> > On Wed, May 01, 2013 at 05:22:05PM +0300, Julian Anastasov wrote:
> >
> > > 2. Same without need_resched because cond_resched already
> > > performs the same checks:
> > >
> > > static void inline cond_resched_rcu_lock(void)
> > > {
> > > #ifndef CONFIG_PREEMPT_RCU
> > > rcu_read_unlock();
> > > cond_resched();
> > > rcu_read_lock();
> > > #endif
> > > }
> >
> > Ah so the 'problem' with this last version is that it does an unconditional /
> > unnecessary rcu_read_unlock().
>
> It is just a barrier() for the non-preempt case.
>
> > The below would be in line with all the other cond_resched*() implementations.
>
> ...
>
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 802a751..fd2c77f 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2449,6 +2449,13 @@ extern int __cond_resched_softirq(void);
> > __cond_resched_softirq(); \
> > })
> >
> > +extern int __cond_resched_rcu(void);
> > +
> > +#define cond_resched_rcu() ({ \
> > + __might_sleep(__FILE__, __LINE__, 0); \
>
> I see your goal. But digging into __might_sleep()
> I see that rcu_sleep_check() will scream for the non-preempt
> case because we are under rcu_read_lock.
>
> What about such inline version:
>
> static void inline cond_resched_rcu(void)
> {
> #ifndef CONFIG_PREEMPT_RCU
> rcu_read_unlock();
> __might_sleep(__FILE__, __LINE__, 0);
> cond_resched();
> rcu_read_lock();
> #else
> __might_sleep(__FILE__, __LINE__, 0);
> rcu_lockdep_assert(rcu_preempt_depth() == 1,
> "Illegal cond_resched_rcu() context");

The above requires that include/linux/sched.h be included. This might
be OK, but please check the current intended uses.

Thanx, Paul

> #endif
> }
>
> It will restrict to single RCU lock level for all
> RCU implementations. But we don't have _cond_resched_rcu
> helper for two reasons:
>
> - __might_sleep uses __FILE__, __LINE__
> - only cond_resched generates code, so need_resched() is
> avoided
>
> > + __cond_resched_rcu(); \
> > +})
> > +
> > /*
> > * Does a critical section need to be broken due to another
> > * task waiting?: (technically does not depend on CONFIG_PREEMPT,
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 7d7901a..2b3b4e6 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4358,6 +4358,20 @@ int __sched __cond_resched_softirq(void)
> > }
> > EXPORT_SYMBOL(__cond_resched_softirq);
> >
> > +int __sched __cond_resched_rcu(void)
> > +{
> > +#ifndef CONFIG_PREEMPT_RCU
> > + if (should_resched()) {
> > + rcu_read_unlock();
> > + __cond_resched();
> > + rcu_read_lock();
> > + return 1;
> > + }
> > +#endif
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(__cond_resched_rcu);
> > +
>
> Regards
>
> --
> Julian Anastasov <[email protected]>
>

2013-05-02 07:28:12

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 09:22:08PM +0300, Julian Anastasov wrote:
> > +extern int __cond_resched_rcu(void);
> > +
> > +#define cond_resched_rcu() ({ \
> > + __might_sleep(__FILE__, __LINE__, 0); \
>
> I see your goal. But digging into __might_sleep()
> I see that rcu_sleep_check() will scream for the non-preempt
> case because we are under rcu_read_lock.


#ifdef CONFIG_PREEMPT_RCU
#define PREEMPT_RCU_OFFSET 0
#else
#define PREEMPT_RCU_OFFSET 1
#endif

#define cond_resched_rcu() ({					\
	__might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET);	\
	__cond_resched_rcu();					\
})

Should work I think..

2013-05-02 07:29:22

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Wed, May 01, 2013 at 07:32:58AM -0700, Paul E. McKenney wrote:
> Here is how to detect this:
>
> static void inline cond_resched_rcu_lock(void)
> {
> rcu_read_unlock();
> WARN_ON_ONCE(rcu_read_lock_held());
> #ifndef CONFIG_PREEMPT_RCU
> cond_resched();
> #endif
> rcu_read_lock();
> }

The __might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET) I posted in the other
email just now should deal with this.

2013-05-02 10:05:25

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Peter Zijlstra wrote:

> On Wed, May 01, 2013 at 09:22:08PM +0300, Julian Anastasov wrote:
> > > +extern int __cond_resched_rcu(void);
> > > +
> > > +#define cond_resched_rcu() ({ \
> > > + __might_sleep(__FILE__, __LINE__, 0); \
> >
> > I see your goal. But digging into __might_sleep()
> > I see that rcu_sleep_check() will scream for the non-preempt
> > case because we are under rcu_read_lock.
>
>
> #ifdef CONFIG_PREEMPT_RCU
> #define PREEMPT_RCU_OFFSET 0
> #else
> #define PREEMPT_RCU_OFFSET 1
> #endif
>
> #define cond_resched_rcu() ({ \
> __might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET); \
> __cond_resched_rcu(); \
> })
>
> Should work I think..

Looks like CONFIG_DEBUG_ATOMIC_SLEEP selects
CONFIG_PREEMPT_COUNT, so PREEMPT_RCU_OFFSET should be
1 in all cases: preempt_disable() adds 1 in the non-preempt
case, while in the CONFIG_PREEMPT_RCU case rcu_preempt_depth()
should return 1:

#ifdef CONFIG_PREEMPT_RCU
#define PREEMPT_RCU_OFFSET 1
#else
#define PREEMPT_RCU_OFFSET PREEMPT_CHECK_OFFSET
#endif

Now the remaining part is to fix rcu_sleep_check() for
the non-preempt case. As there are no nesting depths in this
case, I don't see a solution so far. We can provide
some argument to rcu_preempt_sleep_check() to compare the
depth with preempt_count(), but currently I don't know
how to differentiate cond_resched_lock() from cond_resched_rcu()
when calling __might_sleep: in both cases we provide
PREEMPT_OFFSET. Maybe some trick is needed here, without
adding a new arg to __might_sleep, so that we can properly
check for rcu_lock_map.

Regards

--
Julian Anastasov <[email protected]>

2013-05-02 15:53:43

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Peter Zijlstra wrote:

> On Wed, May 01, 2013 at 09:22:08PM +0300, Julian Anastasov wrote:
> > > +extern int __cond_resched_rcu(void);
> > > +
> > > +#define cond_resched_rcu() ({ \
> > > + __might_sleep(__FILE__, __LINE__, 0); \
> >
> > I see your goal. But digging into __might_sleep()
> > I see that rcu_sleep_check() will scream for the non-preempt
> > case because we are under rcu_read_lock.
>
>
> #ifdef CONFIG_PREEMPT_RCU
> #define PREEMPT_RCU_OFFSET 0
> #else
> #define PREEMPT_RCU_OFFSET 1
> #endif
>
> #define cond_resched_rcu() ({ \
> __might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET); \
> __cond_resched_rcu(); \
> })
>
> Should work I think..

I implemented your idea.

I tested the following patch in 2 variants,
TINY_RCU and CONFIG_TREE_PREEMPT_RCU. I see the
error if an extra rcu_read_lock() is added for testing.

I'm using the PREEMPT_ACTIVE flag to indicate
that we are already under lock. It should work because
__might_sleep is not called with such a bit. I also tried to
add a new flag in include/linux/hardirq.h, but PREEMPT_ACTIVE
depends on the arch, so this alternative looked difficult to
implement.

include/linux/rcupdate.h | 7 ++++---
include/linux/sched.h | 14 ++++++++++++++
kernel/sched/core.c | 20 ++++++++++++++++++--
3 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index b758ce1..b594759 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -480,9 +480,10 @@ static inline void rcu_preempt_sleep_check(void)
}
#endif /* #else #ifdef CONFIG_PROVE_RCU */

-#define rcu_sleep_check() \
+#define rcu_sleep_check(locked) \
do { \
- rcu_preempt_sleep_check(); \
+ if (!(locked)) \
+ rcu_preempt_sleep_check(); \
rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map), \
"Illegal context switch in RCU-bh" \
" read-side critical section"); \
@@ -494,7 +495,7 @@ static inline void rcu_preempt_sleep_check(void)
#else /* #ifdef CONFIG_PROVE_RCU */

#define rcu_lockdep_assert(c, s) do { } while (0)
-#define rcu_sleep_check() do { } while (0)
+#define rcu_sleep_check(locked) do { } while (0)

#endif /* #else #ifdef CONFIG_PROVE_RCU */

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e692a02..027deea 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2608,6 +2608,20 @@ extern int __cond_resched_softirq(void);
__cond_resched_softirq(); \
})

+#ifdef CONFIG_PREEMPT_RCU
+#define PREEMPT_RCU_OFFSET 1
+#else
+#define PREEMPT_RCU_OFFSET PREEMPT_CHECK_OFFSET
+#endif
+
+extern int __cond_resched_rcu(void);
+
+#define cond_resched_rcu() ({ \
+ __might_sleep(__FILE__, __LINE__, PREEMPT_ACTIVE | \
+ PREEMPT_RCU_OFFSET); \
+ __cond_resched_rcu(); \
+})
+
/*
* Does a critical section need to be broken due to another
* task waiting?: (technically does not depend on CONFIG_PREEMPT,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 67d0465..2724be7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2793,7 +2793,7 @@ static inline void schedule_debug(struct task_struct *prev)
*/
if (unlikely(in_atomic_preempt_off() && !prev->exit_state))
__schedule_bug(prev);
- rcu_sleep_check();
+ rcu_sleep_check(0);

profile_hit(SCHED_PROFILING, __builtin_return_address(0));

@@ -4364,6 +4364,20 @@ int __sched __cond_resched_softirq(void)
}
EXPORT_SYMBOL(__cond_resched_softirq);

+int __sched __cond_resched_rcu(void)
+{
+#ifndef CONFIG_PREEMPT_RCU
+ if (should_resched()) {
+ rcu_read_unlock();
+ __cond_resched();
+ rcu_read_lock();
+ return 1;
+ }
+#endif
+ return 0;
+}
+EXPORT_SYMBOL(__cond_resched_rcu);
+
/**
* yield - yield the current processor to other threads.
*
@@ -7062,7 +7076,9 @@ void __might_sleep(const char *file, int line, int preempt_offset)
{
static unsigned long prev_jiffy; /* ratelimiting */

- rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
+ /* WARN_ON_ONCE() by default, no rate limit reqd. */
+ rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE);
+ preempt_offset &= ~PREEMPT_ACTIVE;
if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
system_state != SYSTEM_RUNNING || oops_in_progress)
return;
--
1.7.3.4


Regards

--
Julian Anastasov <[email protected]>

2013-05-02 17:35:27

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Thu, May 02, 2013 at 06:54:05PM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 2 May 2013, Peter Zijlstra wrote:
>
> > On Wed, May 01, 2013 at 09:22:08PM +0300, Julian Anastasov wrote:
> > > > +extern int __cond_resched_rcu(void);
> > > > +
> > > > +#define cond_resched_rcu() ({ \
> > > > + __might_sleep(__FILE__, __LINE__, 0); \
> > >
> > > I see your goal. But digging into __might_sleep()
> > > I see that rcu_sleep_check() will scream for the non-preempt
> > > case because we are under rcu_read_lock.
> >
> >
> > #ifdef CONFIG_PREEMPT_RCU
> > #define PREEMPT_RCU_OFFSET 0
> > #else
> > #define PREEMPT_RCU_OFFSET 1
> > #endif
> >
> > #define cond_resched_rcu() ({ \
> > __might_sleep(__FILE__, __LINE__, PREEMPT_RCU_OFFSET); \
> > __cond_resched_rcu(); \
> > })
> >
> > Should work I think..
>
> I implemented your idea.
>
> I tested the following patch in 2 variants,
> TINY_RCU and CONFIG_TREE_PREEMPT_RCU. I see the

Could you please also try CONFIG_TREE_RCU?

> error if extra rcu_read_lock is added for testing.
>
> I'm using the PREEMPT_ACTIVE flag to indicate
> that we are already under lock. It should work because
> __might_sleep is not called with such bit. I also tried to
> add new flag in include/linux/hardirq.h but PREEMPT_ACTIVE
> depends on the arch, so this alternative looked difficult to
> implement.
>
> include/linux/rcupdate.h | 7 ++++---
> include/linux/sched.h | 14 ++++++++++++++
> kernel/sched/core.c | 20 ++++++++++++++++++--
> 3 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index b758ce1..b594759 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -480,9 +480,10 @@ static inline void rcu_preempt_sleep_check(void)
> }
> #endif /* #else #ifdef CONFIG_PROVE_RCU */
>
> -#define rcu_sleep_check() \
> +#define rcu_sleep_check(locked) \
> do { \
> - rcu_preempt_sleep_check(); \
> + if (!(locked)) \
> + rcu_preempt_sleep_check(); \
> rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map), \
> "Illegal context switch in RCU-bh" \
> " read-side critical section"); \
> @@ -494,7 +495,7 @@ static inline void rcu_preempt_sleep_check(void)
> #else /* #ifdef CONFIG_PROVE_RCU */
>
> #define rcu_lockdep_assert(c, s) do { } while (0)
> -#define rcu_sleep_check() do { } while (0)
> +#define rcu_sleep_check(locked) do { } while (0)
>
> #endif /* #else #ifdef CONFIG_PROVE_RCU */
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index e692a02..027deea 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2608,6 +2608,20 @@ extern int __cond_resched_softirq(void);
> __cond_resched_softirq(); \
> })
>
> +#ifdef CONFIG_PREEMPT_RCU
> +#define PREEMPT_RCU_OFFSET 1
> +#else
> +#define PREEMPT_RCU_OFFSET PREEMPT_CHECK_OFFSET
> +#endif
> +
> +extern int __cond_resched_rcu(void);
> +
> +#define cond_resched_rcu() ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_ACTIVE | \
> + PREEMPT_RCU_OFFSET); \
> + __cond_resched_rcu(); \
> +})
> +
> /*
> * Does a critical section need to be broken due to another
> * task waiting?: (technically does not depend on CONFIG_PREEMPT,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 67d0465..2724be7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2793,7 +2793,7 @@ static inline void schedule_debug(struct task_struct *prev)
> */
> if (unlikely(in_atomic_preempt_off() && !prev->exit_state))
> __schedule_bug(prev);
> - rcu_sleep_check();
> + rcu_sleep_check(0);
>
> profile_hit(SCHED_PROFILING, __builtin_return_address(0));
>
> @@ -4364,6 +4364,20 @@ int __sched __cond_resched_softirq(void)
> }
> EXPORT_SYMBOL(__cond_resched_softirq);
>
> +int __sched __cond_resched_rcu(void)
> +{
> +#ifndef CONFIG_PREEMPT_RCU
> + if (should_resched()) {
> + rcu_read_unlock();
> + __cond_resched();
> + rcu_read_lock();
> + return 1;
> + }
> +#endif
> + return 0;
> +}
> +EXPORT_SYMBOL(__cond_resched_rcu);
> +
> /**
> * yield - yield the current processor to other threads.
> *
> @@ -7062,7 +7076,9 @@ void __might_sleep(const char *file, int line, int preempt_offset)
> {
> static unsigned long prev_jiffy; /* ratelimiting */
>
> - rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
> + /* WARN_ON_ONCE() by default, no rate limit reqd. */
> + rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE);

Color me confused.

From what I can see, the two values passed in through preempt_offset
are PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET. PREEMPT_ACTIVE
is normally a high-order bit, above PREEMPT_MASK, SOFTIRQ_MASK, and
HARDIRQ_MASK.

PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET have only low-order bits,
so I don't see how rcu_sleep_check() is passed anything other than zero.
Am I going blind, or what?

Thanx, Paul

> + preempt_offset &= ~PREEMPT_ACTIVE;
> if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
> system_state != SYSTEM_RUNNING || oops_in_progress)
> return;
> --
> 1.7.3.4
>
>
> Regards
>
> --
> Julian Anastasov <[email protected]>

2013-05-02 18:54:53

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Paul E. McKenney wrote:

> On Thu, May 02, 2013 at 06:54:05PM +0300, Julian Anastasov wrote:
> >
> > I tested the following patch in 2 variants,
> > TINY_RCU and CONFIG_TREE_PREEMPT_RCU. I see the
>
> Could you please also try CONFIG_TREE_RCU?

Note that I'm testing on a 9-year-old
UP system, i.e. 1 CPU. Now I have enabled SMP to test CONFIG_TREE_RCU,
and the results are the same. I think it should be just like
TINY_RCU in terms of this debugging (non-preempt). An extra
rcu_read_lock gives me "Illegal context switch in RCU read-side
critical section" in addition to the "BUG: sleeping function
called from invalid context" message.

> > error if extra rcu_read_lock is added for testing.
> >
> > I'm using the PREEMPT_ACTIVE flag to indicate
> > that we are already under lock. It should work because
> > __might_sleep is not called with such bit. I also tried to
> > add new flag in include/linux/hardirq.h but PREEMPT_ACTIVE
> > depends on the arch, so this alternative looked difficult to
> > implement.

> > +extern int __cond_resched_rcu(void);
> > +
> > +#define cond_resched_rcu() ({ \
> > + __might_sleep(__FILE__, __LINE__, PREEMPT_ACTIVE | \
> > + PREEMPT_RCU_OFFSET); \
> > + __cond_resched_rcu(); \
> > +})
> > +

> > @@ -7062,7 +7076,9 @@ void __might_sleep(const char *file, int line, int preempt_offset)
> > {
> > static unsigned long prev_jiffy; /* ratelimiting */
> >
> > - rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
> > + /* WARN_ON_ONCE() by default, no rate limit reqd. */
> > + rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE);
>
> Color me confused.
>
> From what I can see, the two values passed in through preempt_offset
> are PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET. PREEMPT_ACTIVE
> is normally a high-order bit, above PREEMPT_MASK, SOFTIRQ_MASK, and
> HARDIRQ_MASK.
>
> PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET have only low-order bits,
> so I don't see how rcu_sleep_check() is passed anything other than zero.
> Am I going blind, or what?

Only the new cond_resched_rcu() macro provides
the PREEMPT_ACTIVE flag to skip the rcu_preempt_sleep_check()
call. The old macros provide locked=0, as you noticed. Does that
answer your question, or am I missing something?

Regards

--
Julian Anastasov <[email protected]>

2013-05-02 19:24:05

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Julian Anastasov wrote:

> Note that I'm testing on some 9-year old
> UP system, i.e. 1 CPU. Now I enabled SMP to test CONFIG_TREE_RCU
> and the results are same. I think, it should be just like
> the TINY_RCU in terms of these debuggings (non-preempt). Extra
> rcu_read_lock gives me "Illegal context switch in RCU read-side
> critical section" in addition to the "BUG: sleeping function
> called from invalid context" message.

Just to clarify the test with the extra
rcu_read_lock, because the above paragraph is very confusing:

- The __might_sleep call with PREEMPT_ACTIVE | PREEMPT_RCU_OFFSET
just warns with "BUG: sleeping function called from invalid context"
because its rcu_sleep_check is silenced. We match the
nesting depth only.

- but __cond_resched -> __schedule -> schedule_debug warns about
the extra rcu_read_lock() with "BUG: scheduling while atomic" and
then with "Illegal context switch in RCU read-side critical section"
from rcu_sleep_check(0).

Regards

--
Julian Anastasov <[email protected]>

2013-05-02 19:36:27

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Thu, May 02, 2013 at 09:55:54PM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 2 May 2013, Paul E. McKenney wrote:
>
> > On Thu, May 02, 2013 at 06:54:05PM +0300, Julian Anastasov wrote:
> > >
> > > I tested the following patch in 2 variants,
> > > TINY_RCU and CONFIG_TREE_PREEMPT_RCU. I see the
> >
> > Could you please also try CONFIG_TREE_RCU?
>
> Note that I'm testing on some 9-year old
> UP system, i.e. 1 CPU. Now I enabled SMP to test CONFIG_TREE_RCU
> and the results are same. I think, it should be just like
> the TINY_RCU in terms of these debuggings (non-preempt). Extra
> rcu_read_lock gives me "Illegal context switch in RCU read-side
> critical section" in addition to the "BUG: sleeping function
> called from invalid context" message.

OK...

> > > error if extra rcu_read_lock is added for testing.
> > >
> > > I'm using the PREEMPT_ACTIVE flag to indicate
> > > that we are already under lock. It should work because
> > > __might_sleep is not called with such bit. I also tried to
> > > add new flag in include/linux/hardirq.h but PREEMPT_ACTIVE
> > > depends on the arch, so this alternative looked difficult to
> > > implement.
>
> > > +extern int __cond_resched_rcu(void);
> > > +
> > > +#define cond_resched_rcu() ({ \
> > > + __might_sleep(__FILE__, __LINE__, PREEMPT_ACTIVE | \
> > > + PREEMPT_RCU_OFFSET); \
> > > + __cond_resched_rcu(); \
> > > +})
> > > +
>
> > > @@ -7062,7 +7076,9 @@ void __might_sleep(const char *file, int line, int preempt_offset)
> > > {
> > > static unsigned long prev_jiffy; /* ratelimiting */
> > >
> > > - rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
> > > + /* WARN_ON_ONCE() by default, no rate limit reqd. */
> > > + rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE);
> >
> > Color me confused.
> >
> > From what I can see, the two values passed in through preempt_offset
> > are PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET. PREEMPT_ACTIVE
> > is normally a high-order bit, above PREEMPT_MASK, SOFTIRQ_MASK, and
> > HARDIRQ_MASK.
> >
> > PREEMPT_LOCK_OFFSET and SOFTIRQ_DISABLE_OFFSET have only low-order bits,
> > so I don't see how rcu_sleep_check() is passed anything other than zero.
> > Am I going blind, or what?
>
> Only the new cond_resched_rcu() macro provides
> PREEMPT_ACTIVE flag to skip the rcu_preempt_sleep_check()
> call. The old macros provide locked=0 as you noticed. Does it
> answer your question or I'm missing something?

PREEMPT_ACTIVE's value is usually 0x10000000. Did it change
since 3.9? If not, rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE)
is the same as rcu_sleep_check(0).

Thanx, Paul

2013-05-02 20:17:59

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Paul E. McKenney wrote:

> > Only the new cond_resched_rcu() macro provides
> > PREEMPT_ACTIVE flag to skip the rcu_preempt_sleep_check()
> > call. The old macros provide locked=0 as you noticed. Does it
> > answer your question or I'm missing something?
>
> PREEMPT_ACTIVE's value is usually 0x10000000. Did it change
> since 3.9? If not, rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE)
> is the same as rcu_sleep_check(0).

Yes, the different platforms use different bits;
that is why I mentioned my failed attempt at
changing hardirq.h. PREEMPT_ACTIVE is always != 0.

But I don't understand what you mean by
'preempt_offset & PREEMPT_ACTIVE' being always 0.
It is always 0 for cond_resched(), cond_resched_lock()
and cond_resched_softirq(), not for cond_resched_rcu():

(PREEMPT_ACTIVE | PREEMPT_RCU_OFFSET) & PREEMPT_ACTIVE
should give PREEMPT_ACTIVE, not 0. We have 2 cases in
rcu_sleep_check() for the if:

1. !(PREEMPT_ACTIVE) => FALSE for cond_resched_rcu
2. !(0) => TRUE for other cond_resched_* macros

On x86 the code is:

__might_sleep:
pushl %ebp #
testl $268435456, %ecx #, preempt_offset
...
jne .L240 #,
// rcu_lock_map checked when PREEMPT_ACTIVE is missing
.L240:
// rcu_bh_lock_map checked

Regards

--
Julian Anastasov <[email protected]>

2013-05-02 22:31:19

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Thu, May 02, 2013 at 11:19:12PM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 2 May 2013, Paul E. McKenney wrote:
>
> > > Only the new cond_resched_rcu() macro provides
> > > PREEMPT_ACTIVE flag to skip the rcu_preempt_sleep_check()
> > > call. The old macros provide locked=0 as you noticed. Does it
> > > answer your question or I'm missing something?
> >
> > PREEMPT_ACTIVE's value is usually 0x10000000. Did it change
> > since 3.9? If not, rcu_sleep_check(preempt_offset & PREEMPT_ACTIVE)
> > is the same as rcu_sleep_check(0).
>
> Yes, the different platforms use different bit,
> that is why I mentioned about my failed attempt at
> changing hardirq.h. PREEMPT_ACTIVE is always != 0.
>
> But I don't understand what do you mean by
> 'preempt_offset & PREEMPT_ACTIVE' being always 0.
> It is always 0 for cond_resched(), cond_resched_lock()
> and cond_resched_softirq(), not for cond_resched_rcu():
>
> (PREEMPT_ACTIVE | PREEMPT_RCU_OFFSET) & PREEMPT_ACTIVE
> should give PREEMPT_ACTIVE, not 0. We have 2 cases in
> rcu_sleep_check() for the if:
>
> 1. !(PREEMPT_ACTIVE) => FALSE for cond_resched_rcu
> 2. !(0) => TRUE for other cond_resched_* macros
>
> On x86 the code is:
>
> __might_sleep:
> pushl %ebp #
> testl $268435456, %ecx #, preempt_offset
> ...
> jne .L240 #,
> // rcu_lock_map checked when PREEMPT_ACTIVE is missing
> .L240:
> // rcu_bh_lock_map checked

OK, apologies -- I was looking at the calls to __might_sleep() in
mainline, and missed the one that you added. Revisiting that, a
question:

> +#ifdef CONFIG_PREEMPT_RCU
> +#define PREEMPT_RCU_OFFSET 1

Does this really want to be "1" instead of PREEMPT_OFFSET?

> +#else
> +#define PREEMPT_RCU_OFFSET PREEMPT_CHECK_OFFSET
> +#endif
> +
> +extern int __cond_resched_rcu(void);
> +
> +#define cond_resched_rcu() ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_ACTIVE | \
> + PREEMPT_RCU_OFFSET); \
> + __cond_resched_rcu(); \
> +})
> +

For the rest, I clearly need to revisit when more alert, because right
now I am not seeing the connection to preemptible RCU's rcu_read_lock()
implementation.

Thanx, Paul

2013-05-03 07:53:10

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Thu, 2 May 2013, Paul E. McKenney wrote:

> mainline, and missed the one that you added. Revisiting that, a
> question:
>
> > +#ifdef CONFIG_PREEMPT_RCU
> > +#define PREEMPT_RCU_OFFSET 1
>
> Does this really want to be "1" instead of PREEMPT_OFFSET?

In this case when CONFIG_PREEMPT_RCU is enabled
we (RCU) do not touch the preempt counters. Instead, the units
are accounted in current->rcu_read_lock_nesting:

#define rcu_preempt_depth() (current->rcu_read_lock_nesting)

__rcu_read_lock:
current->rcu_read_lock_nesting++;

and the path is __might_sleep -> preempt_count_equals ->
rcu_preempt_depth

For now both places do not use PREEMPT_OFFSET:

- #define inc_preempt_count() add_preempt_count(1)
- __rcu_read_lock: current->rcu_read_lock_nesting++;

so, ... it does not matter much for me. In short,
the trick is in preempt_count_equals() where preempt_offset
is a combination of preempt count and RCU preempt depth:

#ifdef CONFIG_PREEMPT_RCU
#define PREEMPT_RCU_OFFSET (0 /* preempt */ + 1 /* RCU */)
#else
#define PREEMPT_RCU_OFFSET (PREEMPT_CHECK_OFFSET + 0 /* RCU */)
#endif

Let me know for your preference about this definition...

Regards

--
Julian Anastasov <[email protected]>

2013-05-03 16:31:11

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Fri, May 03, 2013 at 10:52:36AM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 2 May 2013, Paul E. McKenney wrote:
>
> > mainline, and missed the one that you added. Revisiting that, a
> > question:
> >
> > > +#ifdef CONFIG_PREEMPT_RCU
> > > +#define PREEMPT_RCU_OFFSET 1
> >
> > Does this really want to be "1" instead of PREEMPT_OFFSET?
>
> In this case when CONFIG_PREEMPT_RCU is enabled
> we (RCU) do not touch the preempt counters. Instead, the units
> are accounted in current->rcu_read_lock_nesting:
>
> #define rcu_preempt_depth() (current->rcu_read_lock_nesting)
>
> __rcu_read_lock:
> current->rcu_read_lock_nesting++;
>
> and the path is __might_sleep -> preempt_count_equals ->
> rcu_preempt_depth
>
> For now both places do not use PREEMPT_OFFSET:
>
> - #define inc_preempt_count() add_preempt_count(1)
> - __rcu_read_lock: current->rcu_read_lock_nesting++;
>
> so, ... it does not matter much for me. In short,
> the trick is in preempt_count_equals() where preempt_offset
> is a combination of preempt count and RCU preempt depth:
>
> #ifdef CONFIG_PREEMPT_RCU
> #define PREEMPT_RCU_OFFSET (0 /* preempt */ + 1 /* RCU */)
> #else
> #define PREEMPT_RCU_OFFSET (PREEMPT_CHECK_OFFSET + 0 /* RCU */)
> #endif
>
> Let me know for your preference about this definition...

OK, after getting some sleep, I might have located the root cause of
my confusion yesterday.

The key point is that I don't understand why we cannot get the effect
we are looking for with the following in sched.h (or wherever):

static inline int cond_resched_rcu(void)
{
#if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
rcu_read_unlock();
cond_resched();
rcu_read_lock();
#endif
}

This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
does the checking in debug builds, and allows voluntary preemption in
!CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
(illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
will get "scheduling while atomic" in response to an outer rcu_read_lock()
in !CONFIG_PREEMPT_RCU builds.

It also seems to me a lot simpler.

Does this work, or am I still missing something?

Thanx, Paul

2013-05-03 17:06:25

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

> The key point is that I don't understand why we cannot get the effect
> we are looking for with the following in sched.h (or wherever):
>
> static inline int cond_resched_rcu(void)
> {
> #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> #endif
> }
>
> This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
> does the checking in debug builds, and allows voluntary preemption in
> !CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
> (illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
> will get "scheduling while atomic" in response to an outer rcu_read_lock()
> in !CONFIG_PREEMPT_RCU builds.
>
> It also seems to me a lot simpler.
>
> Does this work, or am I still missing something?

It can do quite a number of superfluous rcu_read_unlock()/lock() pairs for
voluntary preemption kernels?

2013-05-03 17:39:04

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Fri, May 03, 2013 at 07:04:47PM +0200, Peter Zijlstra wrote:
> > The key point is that I don't understand why we cannot get the effect
> > we are looking for with the following in sched.h (or wherever):
> >
> > static inline int cond_resched_rcu(void)
> > {
> > #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> > rcu_read_unlock();
> > cond_resched();
> > rcu_read_lock();
> > #endif
> > }
> >
> > This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
> > does the checking in debug builds, and allows voluntary preemption in
> > !CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
> > (illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
> > will get "scheduling while atomic" in response to an outer rcu_read_lock()
> > in !CONFIG_PREEMPT_RCU builds.
> >
> > It also seems to me a lot simpler.
> >
> > Does this work, or am I still missing something?
>
> It can do quite a number of superfluous rcu_read_unlock()/lock() pairs for
> voluntary preemption kernels?

This happens in only two cases:

1. CONFIG_PREEMPT_RCU=n kernels. But in this case, rcu_read_unlock()
and rcu_read_lock() are free, at least for CONFIG_PROVE_LOCKING=n
kernels. And if you have CONFIG_PROVE_LOCKING=y, any contribution
from rcu_read_unlock() and rcu_read_lock() will be in the noise.

2. CONFIG_DEBUG_ATOMIC_SLEEP=y kernels -- but in this case, you
-want- the debugging.

So either the overhead is non-existent, or you explicitly asked for the
resulting debugging.

In particular, CONFIG_PREEMPT_RCU=y kernels have an empty static inline
function, which is free -- unless CONFIG_DEBUG_ATOMIC_SLEEP=y, in which
case you again explicitly asked for the debugging.

So I do not believe that the extra rcu_read_unlock()/lock() pairs are a
problem in any of the possible combinations of configurations.

Thanx, Paul

2013-05-03 17:48:06

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Fri, 3 May 2013, Paul E. McKenney wrote:

> OK, after getting some sleep, I might have located the root cause of
> my confusion yesterday.
>
> The key point is that I don't understand why we cannot get the effect
> we are looking for with the following in sched.h (or wherever):
>
> static inline int cond_resched_rcu(void)
> {
> #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> #endif
> }
>
> This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
> does the checking in debug builds, and allows voluntary preemption in
> !CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
> (illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
> will get "scheduling while atomic" in response to an outer rcu_read_lock()
> in !CONFIG_PREEMPT_RCU builds.
>
> It also seems to me a lot simpler.
>
> Does this work, or am I still missing something?

It should work. It is a better version of
the 2nd variant I mentioned here:

http://marc.info/?l=linux-netdev&m=136741839021257&w=2

I'll stick to this version; I hope Peter Zijlstra
agrees. Playing with PREEMPT_ACTIVE or another bit makes
things more complex.

To summarize:

- CONFIG_PREEMPT_RCU:
- no empty functions called
- CONFIG_DEBUG_ATOMIC_SLEEP can catch errors even
for this case

- non-CONFIG_PREEMPT_RCU:
- rcu_read_lock and rcu_read_unlock are barrier(),
so it expands just to cond_resched()

I'll repeat the tests tomorrow and, if there are
no problems, will post an official version after the merge window.

Regards

--
Julian Anastasov <[email protected]>

2013-05-03 18:11:04

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

> This happens in only two cases:
>
> 1. CONFIG_PREEMPT_RCU=n kernels. But in this case, rcu_read_unlock()
> and rcu_read_lock() are free, at least for CONFIG_PROVE_LOCKING=n
> kernels. And if you have CONFIG_PROVE_LOCKING=y, any contribution
> from rcu_read_unlock() and rcu_read_lock() will be in the noise.

Oh argh.. I completely overlooked that rcu_read_{,un}lock() are NOPs for
PREEMPT=n kernels.

/me crawls back under his stone..

2013-05-04 07:22:16

by Julian Anastasov

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper


Hello,

On Fri, 3 May 2013, Paul E. McKenney wrote:

> static inline int cond_resched_rcu(void)
> {
> #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> rcu_read_unlock();
> cond_resched();
> rcu_read_lock();
> #endif
> }
>
> This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
> does the checking in debug builds, and allows voluntary preemption in
> !CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
> (illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
> will get "scheduling while atomic" in response to an outer rcu_read_lock()
> in !CONFIG_PREEMPT_RCU builds.
>
> It also seems to me a lot simpler.
>
> Does this work, or am I still missing something?

Works perfectly. Thanks! Tested CONFIG_PREEMPT_RCU=y/n,
and the error messages with an extra RCU lock.

Regards

--
Julian Anastasov <[email protected]>

2013-05-04 18:03:29

by Paul E. McKenney

Subject: Re: [PATCH v2 1/2] sched: Add cond_resched_rcu_lock() helper

On Sat, May 04, 2013 at 10:23:31AM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Fri, 3 May 2013, Paul E. McKenney wrote:
>
> > static inline int cond_resched_rcu(void)
> > {
> > #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> > rcu_read_unlock();
> > cond_resched();
> > rcu_read_lock();
> > #endif
> > }
> >
> > This adds absolutely no overhead in non-debug builds of CONFIG_PREEMPT_RCU,
> > does the checking in debug builds, and allows voluntary preemption in
> > !CONFIG_PREEMPT_RCU builds. CONFIG_PROVE_RCU builds will check for an
> > (illegal) outer rcu_read_lock() in CONFIG_PREEMPT_RCU builds, and you
> > will get "scheduling while atomic" in response to an outer rcu_read_lock()
> > in !CONFIG_PREEMPT_RCU builds.
> >
> > It also seems to me a lot simpler.
> >
> > Does this work, or am I still missing something?
>
> Works perfectly. Thanks! Tested CONFIG_PREEMPT_RCU=y/n,
> the errors messages with extra RCU lock.

Very good! Please send the patch along, and I will ack it.

Thanx, Paul