2006-09-30 00:11:52

by Thomas Gleixner

[permalink] [raw]
Subject: [patch 08/23] dynticks: prepare the RCU code

From: Ingo Molnar <[email protected]>

prepare the RCU code for dynticks/nohz. Since on nohz kernels there
is no guaranteed timer IRQ that processes RCU callbacks, the idle
code has to make sure that all RCU callbacks that can be finished
off are indeed finished off. This patch adds the necessary APIs:
rcu_advance_callbacks() [to register quiescent state] and
rcu_process_callbacks() [to finish finishable RCU callbacks].

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
--
include/linux/rcupdate.h | 2 ++
kernel/rcupdate.c | 13 ++++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)

Index: linux-2.6.18-mm2/include/linux/rcupdate.h
===================================================================
--- linux-2.6.18-mm2.orig/include/linux/rcupdate.h 2006-09-30 01:41:13.000000000 +0200
+++ linux-2.6.18-mm2/include/linux/rcupdate.h 2006-09-30 01:41:16.000000000 +0200
@@ -271,6 +271,7 @@ extern int rcu_needs_cpu(int cpu);

extern void rcu_init(void);
extern void rcu_check_callbacks(int cpu, int user);
+extern void rcu_advance_callbacks(int cpu, int user);
extern void rcu_restart_cpu(int cpu);
extern long rcu_batches_completed(void);
extern long rcu_batches_completed_bh(void);
@@ -283,6 +284,7 @@ extern void FASTCALL(call_rcu_bh(struct
extern void synchronize_rcu(void);
void synchronize_idle(void);
extern void rcu_barrier(void);
+extern void rcu_process_callbacks(unsigned long unused);

#endif /* __KERNEL__ */
#endif /* __LINUX_RCUPDATE_H */
Index: linux-2.6.18-mm2/kernel/rcupdate.c
===================================================================
--- linux-2.6.18-mm2.orig/kernel/rcupdate.c 2006-09-30 01:41:13.000000000 +0200
+++ linux-2.6.18-mm2/kernel/rcupdate.c 2006-09-30 01:41:16.000000000 +0200
@@ -460,7 +460,7 @@ static void __rcu_process_callbacks(stru
rcu_do_batch(rdp);
}

-static void rcu_process_callbacks(unsigned long unused)
+void rcu_process_callbacks(unsigned long unused)
{
__rcu_process_callbacks(&rcu_ctrlblk, &__get_cpu_var(rcu_data));
__rcu_process_callbacks(&rcu_bh_ctrlblk, &__get_cpu_var(rcu_bh_data));
@@ -515,6 +515,17 @@ int rcu_needs_cpu(int cpu)
return (!!rdp->curlist || !!rdp_bh->curlist || rcu_pending(cpu));
}

+void rcu_advance_callbacks(int cpu, int user)
+{
+ if (user ||
+ (idle_cpu(cpu) && !in_softirq() &&
+ hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
+ rcu_qsctr_inc(cpu);
+ rcu_bh_qsctr_inc(cpu);
+ } else if (!in_softirq())
+ rcu_bh_qsctr_inc(cpu);
+}
+
void rcu_check_callbacks(int cpu, int user)
{
if (user ||

--


2006-09-30 08:40:21

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 08/23] dynticks: prepare the RCU code

On Fri, 29 Sep 2006 23:58:27 -0000
Thomas Gleixner <[email protected]> wrote:

> From: Ingo Molnar <[email protected]>
>
> prepare the RCU code for dynticks/nohz. Since on nohz kernels there
> is no guaranteed timer IRQ that processes RCU callbacks, the idle
> code has to make sure that all RCU callbacks that can be finished
> off are indeed finished off. This patch adds the necessary APIs:
> rcu_advance_callbacks() [to register quiescent state] and
> rcu_process_callbacks() [to finish finishable RCU callbacks].
>
> ...
>
> +void rcu_advance_callbacks(int cpu, int user)
> +{
> + if (user ||
> + (idle_cpu(cpu) && !in_softirq() &&
> + hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> + rcu_qsctr_inc(cpu);
> + rcu_bh_qsctr_inc(cpu);
> + } else if (!in_softirq())
> + rcu_bh_qsctr_inc(cpu);
> +}
> +

I hope this function is immediately clear to the RCU maintainers, because it's
complete mud to me.

An introductory comment which describes what this function does and how it
does it seems appropriate. And some words which decrypt the tests in there
might be needed too.

2006-09-30 12:25:27

by Dipankar Sarma

[permalink] [raw]
Subject: Re: [patch 08/23] dynticks: prepare the RCU code

On Sat, Sep 30, 2006 at 01:36:41AM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 23:58:27 -0000
> Thomas Gleixner <[email protected]> wrote:
>
> > From: Ingo Molnar <[email protected]>
> >
> > prepare the RCU code for dynticks/nohz. Since on nohz kernels there
> > is no guaranteed timer IRQ that processes RCU callbacks, the idle
> > code has to make sure that all RCU callbacks that can be finished
> > off are indeed finished off. This patch adds the necessary APIs:
> > rcu_advance_callbacks() [to register quiescent state] and
> > rcu_process_callbacks() [to finish finishable RCU callbacks].
> >
> > ...
> >
> > +void rcu_advance_callbacks(int cpu, int user)
> > +{
> > + if (user ||
> > + (idle_cpu(cpu) && !in_softirq() &&
> > + hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> > + rcu_qsctr_inc(cpu);
> > + rcu_bh_qsctr_inc(cpu);
> > + } else if (!in_softirq())
> > + rcu_bh_qsctr_inc(cpu);
> > +}
> > +
>
> I hope this function is immediately clear to the RCU maintainers, because it's
> complete mud to me.
>

Ingo,

It is duplicating code. That can be easily fixed, but we need to figure
out what we really want from RCU when we are about to switch off
the ticks. It is hard if you want to finish off all the pending
RCUs and go to nohz state. Can you live with backing out if
there are pending RCUs ?

Thanks
Dipankar

2006-09-30 13:18:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 08/23] dynticks: prepare the RCU code


* Dipankar Sarma <[email protected]> wrote:

> It is duplicating code. That can be easily fixed, but we need to
> figure out what we really want from RCU when we are about to switch
> off the ticks. It is hard if you want to finish off all the pending
> RCUs and go to nohz state. Can you live with backing out if there are
> pending RCUs ?

the thing is that when we go idle we /want/ to process whatever delayed
work there might be - rate limited or not. Do you agree with that
approach? I consider this a performance feature as well: this way we can
utilize otherwise lost idle time. It is not a problem that we dont
'batch' this processing: we are really idle and we've got free cycles to
burn. We could even do an RCU processing loop that immediately breaks
out if need_resched() gets set [by an IRQ or by another CPU].

secondly, i think i saw functionality problems when RCU was not
completed before going idle - for example synchronize_rcu() on another
CPU would hang.

what approach would you suggest to achieve these goals?

Ingo

2006-09-30 13:52:40

by Dipankar Sarma

[permalink] [raw]
Subject: Re: [patch 08/23] dynticks: prepare the RCU code

On Sat, Sep 30, 2006 at 03:09:58PM +0200, Ingo Molnar wrote:
> * Dipankar Sarma <[email protected]> wrote:
>
> > It is duplicating code. That can be easily fixed, but we need to
> > figure out what we really want from RCU when we are about to switch
> > off the ticks. It is hard if you want to finish off all the pending
> > RCUs and go to nohz state. Can you live with backing out if there are
> > pending RCUs ?
>
> the thing is that when we go idle we /want/ to process whatever delayed
> work there might be - rate limited or not. Do you agree with that
> approach? I consider this a performance feature as well: this way we can
> utilize otherwise lost idle time. It is not a problem that we dont
> 'batch' this processing: we are really idle and we've got free cycles to
> burn. We could even do an RCU processing loop that immediately breaks
> out if need_resched() gets set [by an IRQ or by another CPU].

If you don't care about rate limiting RCU processing (you wouldn't
in CONFIG_PREEMPT_RT), you still have to deal with the situation
that one CPU going idle doesn't guarantee that you can process
all pending RCUs. You can process the finished ones, but what
about the ones that are still waiting for the grace period
beyond the current cpu ?

>
> secondly, i think i saw functionality problems when RCU was not
> completed before going idle - for example synchronize_rcu() on another
> CPU would hang.

That is probably because of what I mention above. In the original
CONFIG_NO_IDLE_HZ, we don't go into a nohz state if there are
RCUs pending in that cpu.

>
> what approach would you suggest to achieve these goals?

There is no way to finish all the RCUs in a given cpu
unless you are prepared to wait for a grace period or so.
So, you go to idle, keep checking in the timer tick and as soon
as all RCUs are done, go to nohz state. You can keep
processing RCUs in every idle tick so that if you have only
finished RCU callbacks on that cpu, you can go to nohz rightaway.
I can add that API.

Of course, you can do what cpu hotplug does - move the RCUs
to another CPU. But that is an expensive operaation.

Thanks
Dipankar

2006-09-30 21:44:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 08/23] dynticks: prepare the RCU code


* Dipankar Sarma <[email protected]> wrote:

> > secondly, i think i saw functionality problems when RCU was not
> > completed before going idle - for example synchronize_rcu() on
> > another CPU would hang.
>
> That is probably because of what I mention above. In the original
> CONFIG_NO_IDLE_HZ, we don't go into a nohz state if there are RCUs
> pending in that cpu.

hm. I just tried it and it seems completing RCU processing isnt even
necessary. I'll drop the RCU hackery. If we need anything then in
synchronize_rcu [which is a rare and slowpath op]: there (on NO_HZ) we
should tickle all cpus via an smp_call_function().

Ingo