2023-07-10 19:26:30

by Paul E. McKenney

Subject: Re: [PATCH 2/2] rcu: Don't dump the stalled CPU on where RCU GP kthread last ran twice

On Wed, Jul 05, 2023 at 03:30:20PM +0800, Zhen Lei wrote:
> The stacks of all stalled CPUs will be dumped. If the CPU on where RCU GP
> kthread last ran is stalled, its stack does not need to be dumped again.
>
> Signed-off-by: Zhen Lei <[email protected]>

This one looks good. Please feel free to rebase it before 1/2 and repost.

Thanx, Paul

> ---
> kernel/rcu/tree_stall.h | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index dcfaa3d5db2cbc7..cc884cd49e026a3 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -534,12 +534,14 @@ static void rcu_check_gp_kthread_starvation(void)
>  		       data_race(READ_ONCE(rcu_state.gp_state)),
>  		       gpk ? data_race(READ_ONCE(gpk->__state)) : ~0, cpu);
>  		if (gpk) {
> +			struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> +
>  			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
>  			pr_err("RCU grace-period kthread stack dump:\n");
>  			sched_show_task(gpk);
>  			if (cpu_is_offline(cpu)) {
>  				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
> -			} else  {
> +			} else if (!(data_race(READ_ONCE(rdp->mynode->qsmask)) & rdp->grpmask)) {
>  				pr_err("Stack dump where RCU GP kthread last ran:\n");
>  				dump_cpu_task(cpu);
>  			}
> --
> 2.25.1
>
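
[Editorial sketch, not part of the original message.] The condition the patch adds can be modeled in user space roughly as follows. This is a minimal sketch under stated assumptions: model_node, model_data, gp_cpu_action and gp_kthread_cpu_action() are hypothetical stand-ins invented for illustration, not kernel types or functions, and the data_race()/READ_ONCE() annotations are dropped because nothing here runs concurrently. It also assumes, as the commit message says, that a CPU whose bit is still set in its leaf node's qsmask has its stack dumped by the stall-warning path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's rcu_node and rcu_data. */
struct model_node {
	unsigned long qsmask;	/* Set bit: that CPU still owes a quiescent state. */
};

struct model_data {
	struct model_node *mynode;	/* Leaf node covering this CPU. */
	unsigned long grpmask;		/* This CPU's bit in mynode->qsmask. */
};

enum gp_cpu_action {
	REPORT_OFFLINE,	/* Print the "last ran on offline CPU" message. */
	DUMP_STACK,	/* CPU is not flagged as stalled: dump its stack here. */
	SKIP_DUMP,	/* CPU is flagged as stalled: assumed dumped elsewhere. */
};

/* Mirrors the branch structure of the patched if/else-if chain. */
enum gp_cpu_action gp_kthread_cpu_action(const struct model_data *rdp, bool offline)
{
	assert(rdp != NULL && rdp->mynode != NULL);
	if (offline)
		return REPORT_OFFLINE;
	if (!(rdp->mynode->qsmask & rdp->grpmask))
		return DUMP_STACK;
	return SKIP_DUMP;
}
```

With qsmask set to 0x5, a CPU whose grpmask is 0x1 yields SKIP_DUMP, while a CPU whose grpmask is 0x2 yields DUMP_STACK. An offline CPU yields REPORT_OFFLINE regardless of the mask.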


2023-07-10 20:03:10

by Joel Fernandes

Subject: Re: [PATCH 2/2] rcu: Don't dump the stalled CPU on where RCU GP kthread last ran twice

On Mon, Jul 10, 2023 at 3:06 PM Paul E. McKenney <[email protected]> wrote:
>
> On Wed, Jul 05, 2023 at 03:30:20PM +0800, Zhen Lei wrote:
> > The stacks of all stalled CPUs will be dumped. If the CPU on where RCU GP
> > kthread last ran is stalled, its stack does not need to be dumped again.
> >
> > Signed-off-by: Zhen Lei <[email protected]>
>
> This one looks good. Please feel free to rebase it before 1/2 and repost.

Just a small comment:
I wondered if this would make it harder to identify which stack among
the various CPU stacks corresponds to the one the GP kthread is
running on. However, this line does print the CPU number of the
thread, so it is perhaps not an issue:

pr_err("%s kthread starved for %ld jiffies! g%ld f%#x
%s(%d) ->state=%#x ->cpu=%d\n",

Reviewed-by: Joel Fernandes (Google) <[email protected]>

Thanks.

2023-07-10 20:44:23

by Paul E. McKenney

Subject: Re: [PATCH 2/2] rcu: Don't dump the stalled CPU on where RCU GP kthread last ran twice

On Mon, Jul 10, 2023 at 03:55:16PM -0400, Joel Fernandes wrote:
> On Mon, Jul 10, 2023 at 3:06 PM Paul E. McKenney <[email protected]> wrote:
> >
> > On Wed, Jul 05, 2023 at 03:30:20PM +0800, Zhen Lei wrote:
> > > The stacks of all stalled CPUs will be dumped. If the CPU on where RCU GP
> > > kthread last ran is stalled, its stack does not need to be dumped again.
> > >
> > > Signed-off-by: Zhen Lei <[email protected]>
> >
> > This one looks good. Please feel free to rebase it before 1/2 and repost.
>
> Just a small comment:
> I wondered if this would make it harder to identify which stack among
> the various CPU stacks corresponds to the one the GP kthread is
> running on. However, this line does print the CPU number of the
> thread, so it is perhaps not an issue:
>
> pr_err("%s kthread starved for %ld jiffies! g%ld f%#x
> %s(%d) ->state=%#x ->cpu=%d\n",
>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>

Thank you! Zhen Lei, please feel free to add Joel's Reviewed-by on your
next posting.

Thanx, Paul