LinuxLists.cc - [RFC] rcu: Prevent expedite reporting within RCU read-side section

2018-03-06 05:34:42

Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section

Hello Paul and RCU folks,

I am not sure I understand the code correctly or that my fix is right.
But I really wonder why sync_rcu_exp_handler() reports the quiescent
state even when the current task is within an RCU read-side section.
Am I missing something?

If my understanding is correct and you agree with it, I can add more
logic that makes the grace period more expedited, by boosting the
current task or making it urgent when we fail to report the quiescent
state on the IPI.

----->8-----
From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001
From: Byungchul Park <[email protected]>
Date: Tue, 6 Mar 2018 13:54:41 +0900
Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section

We report the quiescent state for this CPU if it is outside an RCU
read-side section at the moment the IPI fires during the expedited
grace period.

However, the current code reports the quiescent state even when:

1) the current task is still within an RCU read-side section
2) the current task has blocked within the RCU read-side section

Since the quiescent state has not been reached yet in these cases, we
should not report it but check again later.

Signed-off-by: Byungchul Park <[email protected]>
---
kernel/rcu/tree_exp.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 73e1d3d..cc69d14 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info)
 	/*
 	 * We are either exiting an RCU read-side critical section (negative
 	 * values of t->rcu_read_lock_nesting) or are not in one at all
-	 * (zero value of t->rcu_read_lock_nesting). Or we are in an RCU
-	 * read-side critical section that blocked before this expedited
-	 * grace period started. Either way, we can immediately report
-	 * the quiescent state.
+	 * (zero value of t->rcu_read_lock_nesting). We can immediately
+	 * report the quiescent state.
 	 */
-	rdp = this_cpu_ptr(rsp->rda);
-	rcu_report_exp_rdp(rsp, rdp, true);
+	if (t->rcu_read_lock_nesting <= 0) {
+		rdp = this_cpu_ptr(rsp->rda);
+		rcu_report_exp_rdp(rsp, rdp, true);
+	}
 }

 /**
--
1.9.1
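
For readers following the hunk above, here is a minimal sketch of the
decision it changes. It only models the tail of the handler shown in
the diff. The struct toy_task type and both function names are
hypothetical stand-ins, not kernel API, and the sketch assumes nothing
about the handler code that precedes the hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the single task field the hunk reads. */
struct toy_task {
	int rcu_read_lock_nesting;	/* >0: in a reader, 0: outside, <0: exiting */
};

/* Tail of the handler before the patch: reports unconditionally. */
static bool tail_reports_before_patch(const struct toy_task *t)
{
	(void)t;	/* the nesting count is not consulted here */
	return true;
}

/* Tail of the handler after the patch: reports only outside a reader. */
static bool tail_reports_after_patch(const struct toy_task *t)
{
	return t->rcu_read_lock_nesting <= 0;
}
```

The two functions differ only for a positive nesting count, which is
the case the removed comment text describes as a read-side critical
section that blocked before the expedited grace period started.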

2018-03-06 12:44:49

by Byungchul Park

Subject: Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section

On Mar 6, 2018 2:34 PM, "Byungchul Park" <[email protected]> wrote:
>
> Hello Paul and RCU folks,
>
> I am not sure I understand the code correctly or that my fix is right.
> But I really wonder why sync_rcu_exp_handler() reports the quiescent
> state even when the current task is within an RCU read-side section.
> Am I missing something?

Hello,

I missed the fact that the original code is safe anyway, because
that case will be handled properly in rcu_read_unlock().

This patch merely avoids an unnecessary spin lock/unlock within
*report*(). Please ignore it if you don't think it is worth it. I am
not sure it is, either.

Sorry for bothering you. And thanks.

2018-03-06 13:39:46

On 3/9/2018 5:41 PM, Byungchul Park wrote:
> On Thu, Mar 08, 2018 at 10:01:56AM -0800, Paul E. McKenney wrote:
>> On Thu, Mar 08, 2018 at 07:08:25PM +0900, Byungchul Park wrote:
>
> [...]
>
>>> 2. Clear its bit of ->expmask *only* when it's out of RCU read
>>> sections and keep others unchanged. So it will be cleared at the
>>> end of the RCU read section in that case.
>>>
>>> This option would also work because we anyway check both
>>> ->exp_tasks and ->expmask to finish the expedite-gp.
>>
>> This could be made to work, but one shortcoming is that the grace
>> period would end up waiting on later read-side critical sections
>> that it does not really need to wait on. Also, eventually all the
>
> I don't think it waits on any later ones since ->expmask would be
> cleared at the end of the previous RCU read section.

Now that I have simplified my patch, I can see from the simplified
version what you were saying. Right, the expedited GP might be
extended as you said. That optimization cannot be achieved this way
without a regression.

You're awesome. Thank you for explaining the reason why not.
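
The point about checking both ->exp_tasks and ->expmask can be
illustrated with a toy model. This is only a sketch: struct toy_node
and the toy_*() helpers are hypothetical, and the bool is a deliberate
simplification of ->exp_tasks, not the kernel's data structure.

```c
#include <stdbool.h>

/* Toy stand-in for the two pieces of state named in the thread. */
struct toy_node {
	unsigned long expmask;	/* one bit per CPU that has not reported yet */
	bool exp_tasks;		/* true while a blocked reader is waited on */
};

/* The expedited GP can finish only when both conditions are clear. */
static bool toy_exp_gp_done(const struct toy_node *n)
{
	return n->expmask == 0 && !n->exp_tasks;
}

/* A CPU reports its quiescent state: clear its bit in the mask. */
static void toy_report_cpu(struct toy_node *n, int cpu)
{
	n->expmask &= ~(1UL << cpu);
}

/* The blocked reader leaves its read-side critical section. */
static void toy_blocked_reader_exits(struct toy_node *n)
{
	n->exp_tasks = false;
}
```

In this model, if the CPU's bit is cleared at IPI time, the GP ends as
soon as the blocked reader exits. If the bit is left set, the GP stays
open after that reader exits until something clears the bit, and that
extra wait is the possible extension discussed above.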

--
Thanks,
Byungchul