2022-01-29 12:04:38

by Zqiang

[permalink] [raw]
Subject: [PATCH] rcu: When rcuog kthreads are in polling mode, wakeup of the waitqueue is not required

During grace-period cleanup, the rcuog kthreads waiting on the sq
waitqueue are awakened. However, if 'rcu_nocb_poll' is set, the sq
waitqueue is always empty, so in that case return directly and skip
the wakeup.

Signed-off-by: Zqiang <[email protected]>
---
kernel/rcu/tree_nocb.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 636d0546a4e9..9e106c590e56 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -201,6 +201,8 @@ static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
  */
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
 {
+	if (rcu_nocb_poll)
+		return;
 	swake_up_all(sq);
 }

--
2.25.1


2022-01-31 23:03:12

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: When rcuog kthreads are in polling mode, wakeup of the waitqueue is not required

On Fri, Jan 28, 2022 at 11:13:46AM +0800, Zqiang wrote:
> During grace-period cleanup, the rcuog kthreads waiting on the sq
> waitqueue are awakened. However, if 'rcu_nocb_poll' is set, the sq
> waitqueue is always empty, so in that case return directly and skip
> the wakeup.

This does decrease grace-period-cleanup overhead in kernels built with
CONFIG_RCU_NOCB_CPU=y and booted with the rcu_nocb_poll kernel boot
parameter set. On the other hand, it increases grace-period-cleanup
overhead in kernels built with CONFIG_RCU_NOCB_CPU=y but booted without
the rcu_nocb_poll kernel boot parameter set.

Last I checked, more kernels were booted without the rcu_nocb_poll kernel
boot parameter set. If this is still the case, this patch would slow
things down for most systems.

Or are there now lots of systems booted with rcu_nocb_poll?

Thanx, Paul

> Signed-off-by: Zqiang <[email protected]>
> ---
> kernel/rcu/tree_nocb.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 636d0546a4e9..9e106c590e56 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -201,6 +201,8 @@ static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
>   */
>  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
>  {
> +	if (rcu_nocb_poll)
> +		return;
>  	swake_up_all(sq);
>  }
>
> --
> 2.25.1
>

2022-01-31 23:04:39

by Zqiang

[permalink] [raw]
Subject: RE: [PATCH] rcu: When rcuog kthreads are in polling mode, wakeup of the waitqueue is not required


On Fri, Jan 28, 2022 at 11:13:46AM +0800, Zqiang wrote:
> During grace-period cleanup, the rcuog kthreads waiting on the sq
> waitqueue are awakened. However, if 'rcu_nocb_poll' is set, the sq
> waitqueue is always empty, so in that case return directly and skip
> the wakeup.

>This does decrease grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y and booted with the rcu_nocb_poll kernel boot parameter set. On the other hand, it increases grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y but booted without the rcu_nocb_poll kernel boot parameter set.
>
>Last I checked, more kernels were booted without the rcu_nocb_poll kernel boot parameter set. If this is still the case, this patch would slow things down for most systems.
>
>Or are there now lots of systems booted with rcu_nocb_poll?

Hi Paul,
I found that some of our customers set the rcu_nocb_poll boot parameter on PREEMPT_RT kernels.
At each grace-period cleanup we wake up the sq waitqueue, but when the rcuog kthreads are in
polling mode that wakeup is not required, because the sq waitqueue is always empty.
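
To illustrate why the queue stays empty, here is a simplified sketch (not the verbatim kernel
code) of the wait step in the rcuog kthread's loop in nocb_gp_wait(): under rcu_nocb_poll the
kthread only sleeps for a jiffy and re-scans, so it never enqueues itself on the rnp->nocb_gp_wq
swait queue that rcu_nocb_gp_cleanup() wakes.

	/*
	 * Simplified sketch of the wait step in nocb_gp_wait(); tracing and
	 * the exact wait condition are omitted, and wait_condition_met below
	 * is a placeholder, not a real kernel symbol.
	 */
	if (rcu_nocb_poll) {
		/* Polling mode: brief sleep, no swait queue involved. */
		schedule_timeout_idle(1);
	} else {
		/* Blocking mode: sleep on the rcu_node's grace-period swait queue, */
		/* the queue that rcu_nocb_gp_cleanup() later wakes via swake_up_all(). */
		swait_event_interruptible_exclusive(
			rnp->nocb_gp_wq[rcu_seq_ctr(wait_gp_seq) & 0x1],	/* per-GP queue */
			wait_condition_met);					/* placeholder */
	}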

Thanks,
Zqiang

>
> Thanx, Paul

> Signed-off-by: Zqiang <[email protected]>
> ---
> kernel/rcu/tree_nocb.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 636d0546a4e9..9e106c590e56 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -201,6 +201,8 @@ static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
>   */
>  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
>  {
> +	if (rcu_nocb_poll)
> +		return;
>  	swake_up_all(sq);
>  }
>
> --
> 2.25.1
>

2022-02-01 03:11:45

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: When rcuog kthreads are in polling mode, wakeup of the waitqueue is not required

On Sat, Jan 29, 2022 at 05:55:34AM +0000, Zhang, Qiang1 wrote:
>
> On Fri, Jan 28, 2022 at 11:13:46AM +0800, Zqiang wrote:
> > During grace-period cleanup, the rcuog kthreads waiting on the sq
> > waitqueue are awakened. However, if 'rcu_nocb_poll' is set, the sq
> > waitqueue is always empty, so in that case return directly and skip
> > the wakeup.
>
> >This does decrease grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y and booted with the rcu_nocb_poll kernel boot parameter set. On the other hand, it increases grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y but booted without the rcu_nocb_poll kernel boot parameter set.
> >
> >Last I checked, more kernels were booted without the rcu_nocb_poll kernel boot parameter set. If this is still the case, this patch would slow things down for most systems.
> >
> >Or are there now lots of systems booted with rcu_nocb_poll?
>
> Hi Paul,
> I found that some of our customers set the rcu_nocb_poll boot parameter on PREEMPT_RT kernels.
> At each grace-period cleanup we wake up the sq waitqueue, but when the rcuog kthreads are in
> polling mode that wakeup is not required, because the sq waitqueue is always empty.

OK, fair enough. But was there any difference in performance measurable
at the system level? Let's take a look at swake_up_all():

void swake_up_all(struct swait_queue_head *q)
{
	struct swait_queue *curr;
	LIST_HEAD(tmp);

	raw_spin_lock_irq(&q->lock);
	list_splice_init(&q->task_list, &tmp);
	while (!list_empty(&tmp)) {
		curr = list_first_entry(&tmp, typeof(*curr), task_list);

		wake_up_state(curr->task, TASK_NORMAL);
		list_del_init(&curr->task_list);

		if (list_empty(&tmp))
			break;

		raw_spin_unlock_irq(&q->lock);
		raw_spin_lock_irq(&q->lock);
	}
	raw_spin_unlock_irq(&q->lock);
}

If the list is empty, we acquire an uncontended lock, splice an empty
list, check a pair of pointers for equality, and release that lock.
We do this once per 16 CPUs per grace period, which normally will be
every few milliseconds or less frequently.

What is the system-level performance difference and how did you measure
it?

Please don't get me wrong. If this really is causing your users trouble,
we clearly do need to fix it. But if so, let's do so in a way that
doesn't risk penalizing the many users who do not set rcu_nocb_poll.
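
For example (purely a sketch, not something I have tested): one shape that avoids adding a new
rcu_nocb_poll branch would be to skip the wakeup only when the queue is observably empty, using
swait_active(), which is just a lockless list_empty() check. Whether that lockless check is safe
against the wait side's ordering here would of course need the usual waitqueue_active()-style
audit.

	static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
	{
		/*
		 * Hypothetical variant for illustration only: bail out when
		 * nobody is queued.  swait_active() reads q->task_list without
		 * holding the lock, so the memory-ordering implications would
		 * need to be reviewed before relying on this.
		 */
		if (!swait_active(sq))
			return;
		swake_up_all(sq);
	}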

Thanx, Paul

> Thanks,
> Zqiang
>
> >
> > Thanx, Paul
>
> > Signed-off-by: Zqiang <[email protected]>
> > ---
> > kernel/rcu/tree_nocb.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 636d0546a4e9..9e106c590e56 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -201,6 +201,8 @@ static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
> >   */
> >  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
> >  {
> > +	if (rcu_nocb_poll)
> > +		return;
> >  	swake_up_all(sq);
> >  }
> >
> > --
> > 2.25.1
> >

2022-02-01 09:33:16

by Zqiang

[permalink] [raw]
Subject: RE: [PATCH] rcu: When rcuog kthreads are in polling mode, wakeup of the waitqueue is not required


On Sat, Jan 29, 2022 at 05:55:34AM +0000, Zhang, Qiang1 wrote:
>
> On Fri, Jan 28, 2022 at 11:13:46AM +0800, Zqiang wrote:
> > During grace-period cleanup, the rcuog kthreads waiting on the sq
> > waitqueue are awakened. However, if 'rcu_nocb_poll' is set, the sq
> > waitqueue is always empty, so in that case return directly and skip
> > the wakeup.
>
> >This does decrease grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y and booted with the rcu_nocb_poll kernel boot parameter set. On the other hand, it increases grace-period-cleanup overhead in kernels built with CONFIG_RCU_NOCB_CPU=y but booted without the rcu_nocb_poll kernel boot parameter set.
> >
> >Last I checked, more kernels were booted without the rcu_nocb_poll kernel boot parameter set. If this is still the case, this patch would slow things down for most systems.
> >
> >Or are there now lots of systems booted with rcu_nocb_poll?
>
> Hi Paul,
> I found that some of our customers set the rcu_nocb_poll boot parameter on PREEMPT_RT kernels.
> At each grace-period cleanup we wake up the sq waitqueue, but when the rcuog kthreads are in
> polling mode that wakeup is not required, because the sq waitqueue is always empty.

>>> OK, fair enough. But was there any difference in performance measurable at the system level? Let's take a look at swake_up_all():
>>>
>>> void swake_up_all(struct swait_queue_head *q)
>>> {
>>> 	struct swait_queue *curr;
>>> 	LIST_HEAD(tmp);
>>>
>>> 	raw_spin_lock_irq(&q->lock);
>>> 	list_splice_init(&q->task_list, &tmp);
>>> 	while (!list_empty(&tmp)) {
>>> 		curr = list_first_entry(&tmp, typeof(*curr), task_list);
>>>
>>> 		wake_up_state(curr->task, TASK_NORMAL);
>>> 		list_del_init(&curr->task_list);
>>>
>>> 		if (list_empty(&tmp))
>>> 			break;
>>>
>>> 		raw_spin_unlock_irq(&q->lock);
>>> 		raw_spin_lock_irq(&q->lock);
>>> 	}
>>> 	raw_spin_unlock_irq(&q->lock);
>>> }
>>>
>>>If the list is empty, we acquire an uncontended lock, splice an empty list, check a pair of pointers for equality, and release that lock.
>>>We do this once per 16 CPUs per grace period, which normally will be every few milliseconds or less frequently.
>>>
>>>What is the system-level performance difference and how did you measure it?

Sorry, I overlooked that. I did not measure the performance difference at the system level.

>>>
>>>Please don't get me wrong. If this really is causing your users trouble, we clearly do need to fix it. But if so, let's do so in a way that doesn't risk penalizing the many users who do not set rcu_nocb_poll.

Thank you for the detailed analysis. The original intention of my change was to avoid unnecessary
wakeup operations when rcu_nocb_poll is set. In polling mode the rcuog kthreads do not use the sq
waitqueue, yet at every grace-period cleanup all rnp nodes are still traversed for the wakeup
operation. I will measure the system-level impact and follow up.
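
For reference, the traversal I mean is the per-node loop in rcu_gp_cleanup() in
kernel/rcu/tree.c, roughly like this (simplified sketch, not the verbatim code):

	/* Simplified: per-node processing at the end of each grace period. */
	rcu_for_each_node_breadth_first(rnp) {
		raw_spin_lock_irq_rcu_node(rnp);
		/* ... advance rnp->gp_seq and do related bookkeeping ... */
		sq = rcu_nocb_gp_get(rnp);	/* this grace period's nocb_gp_wq[] entry */
		raw_spin_unlock_irq_rcu_node(rnp);
		rcu_nocb_gp_cleanup(sq);	/* the swake_up_all() under discussion */
	}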

Thanks
Zqiang

>>>
>>> Thanx, Paul
>>>
> Thanks,
> Zqiang
>
> >
> > Thanx, Paul
>
> > Signed-off-by: Zqiang <[email protected]>
> > ---
> > kernel/rcu/tree_nocb.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 636d0546a4e9..9e106c590e56 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -201,6 +201,8 @@ static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
> >   */
> >  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
> >  {
> > +	if (rcu_nocb_poll)
> > +		return;
> >  	swake_up_all(sq);
> >  }
> >
> > --
> > 2.25.1
> >