2022-05-09 01:36:49

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: Add rnp->cbovldmask check in rcutree_migrate_callbacks()

On Thu, May 05, 2022 at 11:52:36PM +0800, Zqiang wrote:
> Currently, the rnp's cbovlmask is set in call_rcu(). when CPU going
> offline, the outgoing CPU's callbacks is migrated to target CPU, the
> number of callbacks on the my_rdp may be overloaded, if overload and
> there is no call_rcu() call on target CPU for a long time, the rnp's
> cbovldmask is not set in time. in order to fix this situation, add
> check_cb_ovld_locked() in rcutree_migrate_callbacks() to help CPU more
> quickly reach quiescent states.
>
> Signed-off-by: Zqiang <[email protected]>

Doesn't this get set right at the end of the current grace period?
Given that there is a callback overload, there should be a grace
period in progress.

See this code in rcu_gp_cleanup():

if (rcu_is_leaf_node(rnp))
for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
rdp = per_cpu_ptr(&rcu_data, cpu);
check_cb_ovld_locked(rdp, rnp);
}

So what am I missing here? Or are you planning to remove the above code?

If so, wouldn't you also need to clear the indication for the CPU that
is going offline, being careful to handle the case where the two CPUs
have different leaf rcu_node structures?

Thanx, Paul

> ---
> kernel/rcu/tree.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9dc4c4e82db6..bcc5876c9753 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4577,6 +4577,7 @@ void rcutree_migrate_callbacks(int cpu)
> needwake = needwake || rcu_advance_cbs(my_rnp, my_rdp);
> rcu_segcblist_disable(&rdp->cblist);
> WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) != !rcu_segcblist_n_cbs(&my_rdp->cblist));
> + check_cb_ovld_locked(my_rdp, my_rnp);
> if (rcu_rdp_is_offloaded(my_rdp)) {
> raw_spin_unlock_rcu_node(my_rnp); /* irqs remain disabled. */
> __call_rcu_nocb_wake(my_rdp, true, flags);
> --
> 2.25.1
>


2022-05-09 09:23:12

by Zqiang

[permalink] [raw]
Subject: RE: [PATCH] rcu: Add rnp->cbovldmask check in rcutree_migrate_callbacks()

On Thu, May 05, 2022 at 11:52:36PM +0800, Zqiang wrote:
> Currently, the rnp's cbovlmask is set in call_rcu(). when CPU going
> offline, the outgoing CPU's callbacks is migrated to target CPU, the
> number of callbacks on the my_rdp may be overloaded, if overload and
> there is no call_rcu() call on target CPU for a long time, the rnp's
> cbovldmask is not set in time. in order to fix this situation, add
> check_cb_ovld_locked() in rcutree_migrate_callbacks() to help CPU more
> quickly reach quiescent states.
>
> Signed-off-by: Zqiang <[email protected]>

>Doesn't this get set right at the end of the current grace period?
>Given that there is a callback overload, there should be a grace period in progress.
>
>See this code in rcu_gp_cleanup():
>
> if (rcu_is_leaf_node(rnp))
> for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
> rdp = per_cpu_ptr(&rcu_data, cpu);
> check_cb_ovld_locked(rdp, rnp);
> }
>
>So what am I missing here? Or are you planning to remove the above code?

We only checked the overloaded rdp at the end of current grace period, for
my_rdp overloaded cause by migration callbacks to it, if the my_rdp overloaded,
and the my_rdp->mynode 's cbovldmask is empty, the my_rdp overloaded may be
not checked at end of the current grace period.

I have another question, why don't we call check_cb_ovld_locked() when rdp's n_cbs decreases.
for example call check_cb_ovld_locked() in rcu_do_bacth(), not at the end of grace period.

>
>If so, wouldn't you also need to clear the indication for the CPU that is going offline, being careful to handle the case where the two CPUs have different leaf rcu_node structures?

Yes the offline CPU need to clear.

Thanks,
Zqiang

>
> Thanx, Paul
>
> ---
> kernel/rcu/tree.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index
> 9dc4c4e82db6..bcc5876c9753 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4577,6 +4577,7 @@ void rcutree_migrate_callbacks(int cpu)
> needwake = needwake || rcu_advance_cbs(my_rnp, my_rdp);
> rcu_segcblist_disable(&rdp->cblist);
> WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) !=
> !rcu_segcblist_n_cbs(&my_rdp->cblist));
> + check_cb_ovld_locked(my_rdp, my_rnp);
> if (rcu_rdp_is_offloaded(my_rdp)) {
> raw_spin_unlock_rcu_node(my_rnp); /* irqs remain disabled. */
> __call_rcu_nocb_wake(my_rdp, true, flags);
> --
> 2.25.1
>