2018-05-25 02:47:17

by Steven Rostedt

Subject: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks


From: Steven Rostedt (VMware) <[email protected]>

Joel Fernandes found that synchronize_rcu_tasks() was taking a
significant amount of time. He demonstrated it with the following test:

# cd /sys/kernel/tracing
# while [ 1 ]; do x=1; done &
# echo '__schedule_bug:traceon' > set_ftrace_filter
# time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real 0m1.064s
user 0m0.000s
sys 0m0.004s

It takes a little over a second to perform the synchronization because,
when there is a task that must be waited for, the loop sleeps a full
second at a time while waiting for tasks to pass through their
quiescent points.

After discussion, we came up with a simple approach: start with a short
wait for holdouts and lengthen the wait on each iteration of the loop,
capping it at one full second.
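
To illustrate the effect, here is a rough user-space sketch of the wait
schedule this produces (HZ=1000 is assumed purely for the arithmetic;
this is an illustration, not the kernel code):

#include <stdio.h>

#define HZ 1000                 /* assumed tick rate, illustration only */

int main(void)
{
        int fract = 10;         /* start off with HZ/10, as in the patch */
        int total_ms = 0;
        int pass;

        for (pass = 1; pass <= 12; pass++) {
                total_ms += HZ / fract;
                printf("pass %2d: wait %4d ms (total %5d ms)\n",
                       pass, HZ / fract, total_ms);
                if (fract > 1)
                        fract--;        /* back off toward a 1-second wait */
        }
        return 0;
}

A holdout that clears quickly is now noticed after roughly 100ms instead
of a full second, while a stubborn holdout is still polled only about
once a second from the tenth pass onward.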

With the new patch we have:

# time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real 0m0.131s
user 0m0.000s
sys 0m0.004s

Which drops it down to 13% of what the original wait time was.

Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Joel Fernandes (Google) <[email protected]>
Suggested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
---
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 68fa19a5e7bd..452e47841a86 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -715,6 +715,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 	struct rcu_head *list;
 	struct rcu_head *next;
 	LIST_HEAD(rcu_tasks_holdouts);
+	int fract;
 
 	/* Run on housekeeping CPUs by default. Sysadm can move if desired. */
 	housekeeping_affine(current, HK_FLAG_RCU);
@@ -796,13 +797,25 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 		 * holdouts. When the list is empty, we are done.
 		 */
 		lastreport = jiffies;
-		while (!list_empty(&rcu_tasks_holdouts)) {
+
+		/* Start off with HZ/10 wait and slowly back off to 1 HZ wait */
+		fract = 10;
+
+		for (;;) {
 			bool firstreport;
 			bool needreport;
 			int rtst;
 			struct task_struct *t1;
 
-			schedule_timeout_interruptible(HZ);
+			if (list_empty(&rcu_tasks_holdouts))
+				break;
+
+			/* Slowly back off waiting for holdouts */
+			schedule_timeout_interruptible(HZ/fract);
+
+			if (fract > 1)
+				fract--;
+
 			rtst = READ_ONCE(rcu_task_stall_timeout);
 			needreport = rtst > 0 &&
 				     time_after(jiffies, lastreport + rtst);


2018-05-25 02:46:45

by Steven Rostedt

Subject: Re: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks

On Thu, 24 May 2018 16:19:18 -0700
Joel Fernandes <[email protected]> wrote:

> On Thu, May 24, 2018 at 06:49:46PM -0400, Steven Rostedt wrote:
> > Which drops it down to 13% of what the original wait time was.
>
> Should be 90% of original?

That would be if I said "drops it down X" but I said "drops it down to
X of what the original wait time was". And 0.131 is 13% of 1.064. :-)
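(For the record, 0.131 / 1.064 is about 0.123, i.e. a little over 12%,
which is close enough to call 13%.)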

> Other than minor change log change, looks good to me:
>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>

Awesome, thanks!

-- Steve

2018-05-25 02:47:50

by Joel Fernandes

Subject: Re: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks

On Thu, May 24, 2018 at 06:49:46PM -0400, Steven Rostedt wrote:
> Which drops it down to 13% of what the original wait time was.

Should be 90% of original?

Other than minor change log change, looks good to me:

Reviewed-by: Joel Fernandes (Google) <[email protected]>

thanks,

- Joel


2018-05-25 02:47:57

by Randy Dunlap

Subject: Re: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks

On 05/24/2018 04:22 PM, Steven Rostedt wrote:
> On Thu, 24 May 2018 16:19:18 -0700
> Joel Fernandes <[email protected]> wrote:
>
>> On Thu, May 24, 2018 at 06:49:46PM -0400, Steven Rostedt wrote:
>>> Which drops it down to 13% of what the original wait time was.
>>
>> Should be 90% of original?
>
> That would be if I said "drops it down X" but I said "drops it down to
> X of what the original wait time was". And 0.131 is 13% of 1.064. :-)

I think that you are confusing "drops it down to" with "drops it down by".
You said the former. You should say the latter.
IOW, I agree with Joel.

>> Other than minor change log change, looks good to me:
>>
>> Reviewed-by: Joel Fernandes (Google) <[email protected]>
>
> Awesome, thanks!
>
> -- Steve
>


--
~Randy

2018-05-25 02:50:20

by Randy Dunlap

Subject: Re: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks

On 05/24/2018 04:26 PM, Randy Dunlap wrote:
> On 05/24/2018 04:22 PM, Steven Rostedt wrote:
>> On Thu, 24 May 2018 16:19:18 -0700
>> Joel Fernandes <[email protected]> wrote:
>>
>>> On Thu, May 24, 2018 at 06:49:46PM -0400, Steven Rostedt wrote:
>>>> Which drops it down to 13% of what the original wait time was.
>>>
>>> Should be 90% of original?
>>
>> That would be if I said "drops it down X" but I said "drops it down to
>> X of what the original wait time was". And 0.131 is 13% of 1.064. :-)
>
> I think that you are confusing "drops it down to" with "drops it down by".
> You said the former. You should say the latter.
> IOW, I agree with Joel.


Please forget this. After reading the numbers, your comments look correct.


--
~Randy

2018-05-25 20:10:47

by Paul E. McKenney

Subject: Re: [PATCH v4] rcu: Speed up calling of RCU tasks callbacks

On Thu, May 24, 2018 at 06:49:46PM -0400, Steven Rostedt wrote:
> Link: http://lkml.kernel.org/r/[email protected]
> Reported-by: Joel Fernandes (Google) <[email protected]>
> Suggested-by: Joel Fernandes (Google) <[email protected]>
> Signed-off-by: Steven Rostedt (VMware) <[email protected]>

I queued both commits, thank you all!

Thanx, Paul
