2019-08-27 01:35:24

by Joel Fernandes

[permalink] [raw]
Subject: [RFC v1 0/2] RCU dyntick nesting counter cleanups

These patches clean up the usage of dynticks nesting counters simplifying the
code, while preserving the usecases.

It is a much needed simplification, makes the code less confusing, and prevents
future bugs such as those that arise from forgetting that the
dynticks_nmi_nesting counter is not a simple counter and can be "crowbarred" in
common situations.

Several nights of rcutorture testing with CONFIG_RCU_EQS_DEBUG on all RCU
kernel configurations have survived without any splats.

Further testing is in progress, hence marked as RFC!

thanks,

- Joel

Joel Fernandes (Google) (2):
rcu/tree: Clean up dynticks counter usage
rcu/tree: Remove dynticks_nmi_nesting counter

.../Data-Structures/Data-Structures.rst | 31 ++----
Documentation/RCU/stallwarn.txt | 6 +-
kernel/rcu/rcu.h | 4 -
kernel/rcu/tree.c | 98 +++++++++----------
kernel/rcu/tree.h | 4 +-
kernel/rcu/tree_stall.h | 4 +-
6 files changed, 64 insertions(+), 83 deletions(-)

--
2.23.0.187.g17f5b7556c-goog


2019-08-28 00:13:41

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [RFC v1 0/2] RCU dyntick nesting counter cleanups

On Mon, Aug 26, 2019 at 09:33:52PM -0400, Joel Fernandes (Google) wrote:
> These patches clean up the usage of dynticks nesting counters simplifying the
> code, while preserving the usecases.
>
> It is a much needed simplification, makes the code less confusing, and prevents
> future bugs such as those that arise from forgetting that the
> dynticks_nmi_nesting counter is not a simple counter and can be "crowbarred" in
> common situations.
>
> Several nights of rcutorture testing with CONFIG_RCU_EQS_DEBUG on all RCU
> kernel configurations have survived without any splats.
>
> Further testing is in progress, hence marked as RFC!

My intent was to review this today, but this ran afoul of recent 3.10
work that made for some "interesting" review comments. Fortunately,
I realized my mistake before sending the email. I then reviewed the
current 2019 RCU dyntick-idle code.

I am doing some (possibly redundant) rcutorture runs on them here, and
will take a fresh look tomorrow.

Thanx, Paul

> thanks,
>
> - Joel
>
> Joel Fernandes (Google) (2):
> rcu/tree: Clean up dynticks counter usage
> rcu/tree: Remove dynticks_nmi_nesting counter
>
> .../Data-Structures/Data-Structures.rst | 31 ++----
> Documentation/RCU/stallwarn.txt | 6 +-
> kernel/rcu/rcu.h | 4 -
> kernel/rcu/tree.c | 98 +++++++++----------
> kernel/rcu/tree.h | 4 +-
> kernel/rcu/tree_stall.h | 4 +-
> 6 files changed, 64 insertions(+), 83 deletions(-)
>
> --
> 2.23.0.187.g17f5b7556c-goog
>

2019-08-28 18:28:11

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH] rcu/dyntick-idle: Add better tracing

The dyntick-idle traces are a bit confusing. This patch makes it simpler
and adds some missing cases such as EQS-enter because user vs idle mode.

Following are the changes:
(1) Add a new context field to trace_rcu_dyntick tracepoint. This
context field can be "USER", "IDLE" or "IRQ".

(2) Remove the "++=" and "--=" strings and replace them with
"StillNonIdle". This is much easier on the eyes, and the -- and ++
are easily apparent in the dynticks_nesting counters we are printing
anyway.

This patch is based on the previous patches to simplify rcu_dyntick
counters [1] and with these traces, I have verified the counters are
working properly.

[1]
Link: https://lore.kernel.org/patchwork/patch/1120021/
Link: https://lore.kernel.org/patchwork/patch/1120022/

Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
include/trace/events/rcu.h | 13 ++++++++-----
kernel/rcu/tree.c | 17 +++++++++++------
2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 66122602bd08..474c1f7e7104 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -449,12 +449,14 @@ TRACE_EVENT_RCU(rcu_fqs,
*/
TRACE_EVENT_RCU(rcu_dyntick,

- TP_PROTO(const char *polarity, long oldnesting, long newnesting, atomic_t dynticks),
+ TP_PROTO(const char *polarity, const char *context, long oldnesting,
+ long newnesting, atomic_t dynticks),

- TP_ARGS(polarity, oldnesting, newnesting, dynticks),
+ TP_ARGS(polarity, context, oldnesting, newnesting, dynticks),

TP_STRUCT__entry(
__field(const char *, polarity)
+ __field(const char *, context)
__field(long, oldnesting)
__field(long, newnesting)
__field(int, dynticks)
@@ -462,14 +464,15 @@ TRACE_EVENT_RCU(rcu_dyntick,

TP_fast_assign(
__entry->polarity = polarity;
+ __entry->context = context;
__entry->oldnesting = oldnesting;
__entry->newnesting = newnesting;
__entry->dynticks = atomic_read(&dynticks);
),

- TP_printk("%s %lx %lx %#3x", __entry->polarity,
- __entry->oldnesting, __entry->newnesting,
- __entry->dynticks & 0xfff)
+ TP_printk("%s %s %lx %lx %#3x", __entry->polarity,
+ __entry->context, __entry->oldnesting, __entry->newnesting,
+ __entry->dynticks & 0xfff)
);

/*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1465a3e406f8..1a65919ec800 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -570,7 +570,8 @@ static void rcu_eqs_enter(bool user)
}

lockdep_assert_irqs_disabled();
- trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdp->dynticks);
+ trace_rcu_dyntick(TPS("Start"), (user ? TPS("USER") : TPS("IDLE")),
+ rdp->dynticks_nesting, 0, rdp->dynticks);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
rdp = this_cpu_ptr(&rcu_data);
do_nocb_deferred_wakeup(rdp);
@@ -642,15 +643,17 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
* leave it in non-RCU-idle state.
*/
if (rdp->dynticks_nesting != 1) {
- trace_rcu_dyntick(TPS("--="), rdp->dynticks_nesting,
- rdp->dynticks_nesting - 2, rdp->dynticks);
+ trace_rcu_dyntick(TPS("StillNonIdle"), TPS("IRQ"),
+ rdp->dynticks_nesting, rdp->dynticks_nesting - 2,
+ rdp->dynticks);
WRITE_ONCE(rdp->dynticks_nesting, /* No store tearing. */
rdp->dynticks_nesting - 2);
return;
}

/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
- trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nesting, 0, rdp->dynticks);
+ trace_rcu_dyntick(TPS("Start"), TPS("IRQ"), rdp->dynticks_nesting, 0,
+ rdp->dynticks);
WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid store tearing. */

if (irq)
@@ -733,7 +736,8 @@ static void rcu_eqs_exit(bool user)
rcu_dynticks_task_exit();
rcu_dynticks_eqs_exit();
rcu_cleanup_after_idle();
- trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdp->dynticks);
+ trace_rcu_dyntick(TPS("End"), (user ? TPS("USER") : TPS("IDLE")),
+ rdp->dynticks_nesting, 1, rdp->dynticks);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
WRITE_ONCE(rdp->dynticks_nesting, 1);

@@ -825,7 +829,8 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
}

- trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
+ trace_rcu_dyntick(incby == 1 ? TPS("End") : TPS("StillNonIdle"),
+ TPS("IRQ"),
rdp->dynticks_nesting,
rdp->dynticks_nesting + incby, rdp->dynticks);

--
2.23.0.187.g17f5b7556c-goog

2019-08-28 23:07:34

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu/dyntick-idle: Add better tracing

On Wed, Aug 28, 2019 at 02:26:13PM -0400, Joel Fernandes (Google) wrote:
> The dyntick-idle traces are a bit confusing. This patch makes it simpler
> and adds some missing cases such as EQS-enter because user vs idle mode.
>
> Following are the changes:
> (1) Add a new context field to trace_rcu_dyntick tracepoint. This
> context field can be "USER", "IDLE" or "IRQ".
>
> (2) Remove the "++=" and "--=" strings and replace them with
> "StillNonIdle". This is much easier on the eyes, and the -- and ++
> are easily apparent in the dynticks_nesting counters we are printing
> anyway.
>
> This patch is based on the previous patches to simplify rcu_dyntick
> counters [1] and with these traces, I have verified the counters are
> working properly.
>
> [1]
> Link: https://lore.kernel.org/patchwork/patch/1120021/
> Link: https://lore.kernel.org/patchwork/patch/1120022/
>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Looks plausible to me.

Thanx, Paul

> ---
> include/trace/events/rcu.h | 13 ++++++++-----
> kernel/rcu/tree.c | 17 +++++++++++------
> 2 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index 66122602bd08..474c1f7e7104 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -449,12 +449,14 @@ TRACE_EVENT_RCU(rcu_fqs,
> */
> TRACE_EVENT_RCU(rcu_dyntick,
>
> - TP_PROTO(const char *polarity, long oldnesting, long newnesting, atomic_t dynticks),
> + TP_PROTO(const char *polarity, const char *context, long oldnesting,
> + long newnesting, atomic_t dynticks),
>
> - TP_ARGS(polarity, oldnesting, newnesting, dynticks),
> + TP_ARGS(polarity, context, oldnesting, newnesting, dynticks),
>
> TP_STRUCT__entry(
> __field(const char *, polarity)
> + __field(const char *, context)
> __field(long, oldnesting)
> __field(long, newnesting)
> __field(int, dynticks)
> @@ -462,14 +464,15 @@ TRACE_EVENT_RCU(rcu_dyntick,
>
> TP_fast_assign(
> __entry->polarity = polarity;
> + __entry->context = context;
> __entry->oldnesting = oldnesting;
> __entry->newnesting = newnesting;
> __entry->dynticks = atomic_read(&dynticks);
> ),
>
> - TP_printk("%s %lx %lx %#3x", __entry->polarity,
> - __entry->oldnesting, __entry->newnesting,
> - __entry->dynticks & 0xfff)
> + TP_printk("%s %s %lx %lx %#3x", __entry->polarity,
> + __entry->context, __entry->oldnesting, __entry->newnesting,
> + __entry->dynticks & 0xfff)
> );
>
> /*
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1465a3e406f8..1a65919ec800 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -570,7 +570,8 @@ static void rcu_eqs_enter(bool user)
> }
>
> lockdep_assert_irqs_disabled();
> - trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdp->dynticks);
> + trace_rcu_dyntick(TPS("Start"), (user ? TPS("USER") : TPS("IDLE")),
> + rdp->dynticks_nesting, 0, rdp->dynticks);
> WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> rdp = this_cpu_ptr(&rcu_data);
> do_nocb_deferred_wakeup(rdp);
> @@ -642,15 +643,17 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> * leave it in non-RCU-idle state.
> */
> if (rdp->dynticks_nesting != 1) {
> - trace_rcu_dyntick(TPS("--="), rdp->dynticks_nesting,
> - rdp->dynticks_nesting - 2, rdp->dynticks);
> + trace_rcu_dyntick(TPS("StillNonIdle"), TPS("IRQ"),
> + rdp->dynticks_nesting, rdp->dynticks_nesting - 2,
> + rdp->dynticks);
> WRITE_ONCE(rdp->dynticks_nesting, /* No store tearing. */
> rdp->dynticks_nesting - 2);
> return;
> }
>
> /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> - trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nesting, 0, rdp->dynticks);
> + trace_rcu_dyntick(TPS("Start"), TPS("IRQ"), rdp->dynticks_nesting, 0,
> + rdp->dynticks);
> WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid store tearing. */
>
> if (irq)
> @@ -733,7 +736,8 @@ static void rcu_eqs_exit(bool user)
> rcu_dynticks_task_exit();
> rcu_dynticks_eqs_exit();
> rcu_cleanup_after_idle();
> - trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdp->dynticks);
> + trace_rcu_dyntick(TPS("End"), (user ? TPS("USER") : TPS("IDLE")),
> + rdp->dynticks_nesting, 1, rdp->dynticks);
> WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> WRITE_ONCE(rdp->dynticks_nesting, 1);
>
> @@ -825,7 +829,8 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
> }
>
> - trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
> + trace_rcu_dyntick(incby == 1 ? TPS("End") : TPS("StillNonIdle"),
> + TPS("IRQ"),
> rdp->dynticks_nesting,
> rdp->dynticks_nesting + incby, rdp->dynticks);
>
> --
> 2.23.0.187.g17f5b7556c-goog
>