2021-06-10 15:51:56

by Frederic Weisbecker

Subject: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

Add some missing critical pieces of explanation to understand the need
for full memory barriers throughout the whole grace period state machine,
thanks to Paul's explanations.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Cc: Boqun Feng <[email protected]>
---
.../Tree-RCU-Memory-Ordering.rst | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
index 11cdab037bff..f21432115627 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
@@ -112,6 +112,39 @@ on PowerPC.
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.

++-----------------------------------------------------------------------+
+| **Quick Quiz**: |
++-----------------------------------------------------------------------+
+| But the whole chain of rnp locking is enough for the readers to see |
+| all the pre-grace-period accesses from the updater and for the updater|
+| to see all the accesses from the readers performed before the end of |
+| the grace period. So why do we need to enforce full ordering at all |
+| through smp_mb__after_unlock_lock()? |
++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| Because we still need to take care of the lockless counterparts of |
+| RCU. The first key example here is grace period polling. Using |
+| poll_state_synchronize_rcu() or cond_synchronize_rcu(), an updater |
+| can rely solely on lockless full ordering to benefit from the usual |
+| TREE RCU ordering guarantees. |
+| |
+| The second example lies behind the fact that a grace period still |
+| claims to imply full memory ordering. Therefore in the following |
+| scenario: |
+| |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| synchronize_rcu() smp_mb() |
+| r0 = READ_ONCE(Y) r1 = READ_ONCE(X) |
+| |
+| It must be impossible to have r0 == 0 && r1 == 0 after both CPUs |
+| have completed their sequences, even if CPU 1 is in an RCU extended |
+| quiescent state (idle mode) and thus won't report a quiescent state |
+| throughout the common rnp locking chain. |
++-----------------------------------------------------------------------+
+
This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
--
2.25.1


2021-06-10 16:59:21

by Paul E. McKenney

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
> Add some missing critical pieces of explanation to understand the need
> for full memory barriers throughout the whole grace period state machine,
> thanks to Paul's explanations.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Neeraj Upadhyay <[email protected]>
> Cc: Joel Fernandes <[email protected]>
> Cc: Uladzislau Rezki <[email protected]>
> Cc: Boqun Feng <[email protected]>

Nice!!! And not bad wording either, though I still could not resist the
urge to wordsmith further. Plus I combined your two examples, in order to
provide a trivial example use of the polling interfaces, if nothing else.

Please let me know if I messed anything up.

Thanx, Paul

------------------------------------------------------------------------

commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
Author: Frederic Weisbecker <[email protected]>
Date: Thu Jun 10 17:50:29 2021 +0200

rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

Add some missing critical pieces of explanation to understand the need
for full memory barriers throughout the whole grace period state machine,
thanks to Paul's explanations.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Cc: Boqun Feng <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
index 11cdab037bff..3cd5cb4d86e5 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
@@ -112,6 +112,35 @@ on PowerPC.
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.

++-----------------------------------------------------------------------+
+| **Quick Quiz**: |
++-----------------------------------------------------------------------+
+| But the whole chain of rcu_node-structure locking guarantees that |
+| readers see all pre-grace-period accesses from the updater and |
+| also guarantees that the updater to see all post-grace-period |
+| accesses from the readers. So why do we need all of those calls |
+| to smp_mb__after_unlock_lock()? |
++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| Because we must provide ordering for RCU's polling grace-period |
+| primitives, for example, get_state_synchronize_rcu() and |
+| poll_state_synchronize_rcu(). For example: |
+| |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
+| |
+| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
+| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
+| or offline) and thus won't interact directly with the RCU core |
+| processing at all. |
++-----------------------------------------------------------------------+
+
This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current

2021-06-11 00:32:01

by Akira Yokosawa

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Thu, 10 Jun 2021 09:57:10 -0700, Paul E. McKenney wrote:
> On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
>> Add some missing critical pieces of explanation to understand the need
>> for full memory barriers throughout the whole grace period state machine,
>> thanks to Paul's explanations.
>>
>> Signed-off-by: Frederic Weisbecker <[email protected]>
>> Cc: Neeraj Upadhyay <[email protected]>
>> Cc: Joel Fernandes <[email protected]>
>> Cc: Uladzislau Rezki <[email protected]>
>> Cc: Boqun Feng <[email protected]>
>
> Nice!!! And not bad wording either, though I still could not resist the
> urge to wordsmith further. Plus I combined your two examples, in order to
> provide a trivial example use of the polling interfaces, if nothing else.
>
> Please let me know if I messed anything up.

Hi Paul,

See minor tweaks below to satisfy sphinx.

>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
> Author: Frederic Weisbecker <[email protected]>
> Date: Thu Jun 10 17:50:29 2021 +0200
>
> rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
>
> Add some missing critical pieces of explanation to understand the need
> for full memory barriers throughout the whole grace period state machine,
> thanks to Paul's explanations.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Neeraj Upadhyay <[email protected]>
> Cc: Joel Fernandes <[email protected]>
> Cc: Uladzislau Rezki <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> index 11cdab037bff..3cd5cb4d86e5 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> @@ -112,6 +112,35 @@ on PowerPC.
> The ``smp_mb__after_unlock_lock()`` invocations prevent this
> ``WARN_ON()`` from triggering.
>
> ++-----------------------------------------------------------------------+
> +| **Quick Quiz**: |
> ++-----------------------------------------------------------------------+
> +| But the whole chain of rcu_node-structure locking guarantees that |
> +| readers see all pre-grace-period accesses from the updater and |
> +| also guarantees that the updater to see all post-grace-period |
> +| accesses from the readers. So why do we need all of those calls |
> +| to smp_mb__after_unlock_lock()? |
> ++-----------------------------------------------------------------------+
> +| **Answer**: |
> ++-----------------------------------------------------------------------+
> +| Because we must provide ordering for RCU's polling grace-period |
> +| primitives, for example, get_state_synchronize_rcu() and |
> +| poll_state_synchronize_rcu(). For example: |
> +| |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |

This indent causes warnings from sphinx:

Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:135: WARNING: Unexpected indentation.
Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:137: WARNING: Block quote ends without a blank line; unexpected unindent

> +| r0 = READ_ONCE(Y) |
> +| |
> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> +| or offline) and thus won't interact directly with the RCU core |
> +| processing at all. |
> ++-----------------------------------------------------------------------+
> +
> This approach must be extended to include idle CPUs, which need
> RCU's grace-period memory ordering guarantee to extend to any
> RCU read-side critical sections preceding and following the current

The code block in the answer can be fixed as follows:

++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| Because we must provide ordering for RCU's polling grace-period |
+| primitives, for example, get_state_synchronize_rcu() and |
+| poll_state_synchronize_rcu(). For example:: |
+| |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
+| |
+| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
+| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
+| or offline) and thus won't interact directly with the RCU core |
+| processing at all. |
++-----------------------------------------------------------------------+

Hint: Use of "::" and indented code block.

Thanks, Akira

2021-06-11 00:49:37

by Paul E. McKenney

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Fri, Jun 11, 2021 at 09:28:10AM +0900, Akira Yokosawa wrote:
> On Thu, 10 Jun 2021 09:57:10 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
> >> Add some missing critical pieces of explanation to understand the need
> >> for full memory barriers throughout the whole grace period state machine,
> >> thanks to Paul's explanations.
> >>
> >> Signed-off-by: Frederic Weisbecker <[email protected]>
> >> Cc: Neeraj Upadhyay <[email protected]>
> >> Cc: Joel Fernandes <[email protected]>
> >> Cc: Uladzislau Rezki <[email protected]>
> >> Cc: Boqun Feng <[email protected]>
> >
> > Nice!!! And not bad wording either, though I still could not resist the
> > urge to wordsmith further. Plus I combined your two examples, in order to
> > provide a trivial example use of the polling interfaces, if nothing else.
> >
> > Please let me know if I messed anything up.
>
> Hi Paul,
>
> See minor tweaks below to satisfy sphinx.
>
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
> > Author: Frederic Weisbecker <[email protected]>
> > Date: Thu Jun 10 17:50:29 2021 +0200
> >
> > rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
> >
> > Add some missing critical pieces of explanation to understand the need
> > for full memory barriers throughout the whole grace period state machine,
> > thanks to Paul's explanations.
> >
> > Signed-off-by: Frederic Weisbecker <[email protected]>
> > Cc: Neeraj Upadhyay <[email protected]>
> > Cc: Joel Fernandes <[email protected]>
> > Cc: Uladzislau Rezki <[email protected]>
> > Cc: Boqun Feng <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> >
> > diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > index 11cdab037bff..3cd5cb4d86e5 100644
> > --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > @@ -112,6 +112,35 @@ on PowerPC.
> > The ``smp_mb__after_unlock_lock()`` invocations prevent this
> > ``WARN_ON()`` from triggering.
> >
> > ++-----------------------------------------------------------------------+
> > +| **Quick Quiz**: |
> > ++-----------------------------------------------------------------------+
> > +| But the whole chain of rcu_node-structure locking guarantees that |
> > +| readers see all pre-grace-period accesses from the updater and |
> > +| also guarantees that the updater to see all post-grace-period |
> > +| accesses from the readers. So why do we need all of those calls |
> > +| to smp_mb__after_unlock_lock()? |
> > ++-----------------------------------------------------------------------+
> > +| **Answer**: |
> > ++-----------------------------------------------------------------------+
> > +| Because we must provide ordering for RCU's polling grace-period |
> > +| primitives, for example, get_state_synchronize_rcu() and |
> > +| poll_state_synchronize_rcu(). For example: |
> > +| |
> > +| CPU 0 CPU 1 |
> > +| ---- ---- |
> > +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> > +| g = get_state_synchronize_rcu() smp_mb() |
> > +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> > +| continue; |
>
> This indent causes warnings from sphinx:
>
> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:135: WARNING: Unexpected indentation.
> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:137: WARNING: Block quote ends without a blank line; unexpected unindent
>
> > +| r0 = READ_ONCE(Y) |
> > +| |
> > +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> > +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> > +| or offline) and thus won't interact directly with the RCU core |
> > +| processing at all. |
> > ++-----------------------------------------------------------------------+
> > +
> > This approach must be extended to include idle CPUs, which need
> > RCU's grace-period memory ordering guarantee to extend to any
> > RCU read-side critical sections preceding and following the current
>
> The code block in the answer can be fixed as follows:
>
> ++-----------------------------------------------------------------------+
> +| **Answer**: |
> ++-----------------------------------------------------------------------+
> +| Because we must provide ordering for RCU's polling grace-period |
> +| primitives, for example, get_state_synchronize_rcu() and |
> +| poll_state_synchronize_rcu(). For example:: |
> +| |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |
> +| r0 = READ_ONCE(Y) |
> +| |
> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> +| or offline) and thus won't interact directly with the RCU core |
> +| processing at all. |
> ++-----------------------------------------------------------------------+
>
> Hint: Use of "::" and indented code block.

Thank you!

As in with the following patch to be merged into Frederic's original,
with attribution?

Thanx, Paul

------------------------------------------------------------------------

diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
index 3cd5cb4d86e5..bc884ebf88bb 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
@@ -125,15 +125,15 @@ The ``smp_mb__after_unlock_lock()`` invocations prevent this
+-----------------------------------------------------------------------+
| Because we must provide ordering for RCU's polling grace-period |
| primitives, for example, get_state_synchronize_rcu() and |
-| poll_state_synchronize_rcu(). For example: |
+| poll_state_synchronize_rcu(). For example:: |
| |
-| CPU 0 CPU 1 |
-| ---- ---- |
-| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
-| g = get_state_synchronize_rcu() smp_mb() |
-| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
-| continue; |
-| r0 = READ_ONCE(Y) |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
| |
| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
| happen, even if CPU 1 is in an RCU extended quiescent state (idle |

2021-06-11 01:03:04

by Akira Yokosawa

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Thu, 10 Jun 2021 17:48:13 -0700, Paul E. McKenney wrote:
> On Fri, Jun 11, 2021 at 09:28:10AM +0900, Akira Yokosawa wrote:
>> On Thu, 10 Jun 2021 09:57:10 -0700, Paul E. McKenney wrote:
>>> On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
>>>> Add some missing critical pieces of explanation to understand the need
>>>> for full memory barriers throughout the whole grace period state machine,
>>>> thanks to Paul's explanations.
>>>>
>>>> Signed-off-by: Frederic Weisbecker <[email protected]>
>>>> Cc: Neeraj Upadhyay <[email protected]>
>>>> Cc: Joel Fernandes <[email protected]>
>>>> Cc: Uladzislau Rezki <[email protected]>
>>>> Cc: Boqun Feng <[email protected]>
>>>
>>> Nice!!! And not bad wording either, though I still could not resist the
>>> urge to wordsmith further. Plus I combined your two examples, in order to
>>> provide a trivial example use of the polling interfaces, if nothing else.
>>>
>>> Please let me know if I messed anything up.
>>
>> Hi Paul,
>>
>> See minor tweaks below to satisfy sphinx.
>>
>>>
>>> Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>> commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
>>> Author: Frederic Weisbecker <[email protected]>
>>> Date: Thu Jun 10 17:50:29 2021 +0200
>>>
>>> rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
>>>
>>> Add some missing critical pieces of explanation to understand the need
>>> for full memory barriers throughout the whole grace period state machine,
>>> thanks to Paul's explanations.
>>>
>>> Signed-off-by: Frederic Weisbecker <[email protected]>
>>> Cc: Neeraj Upadhyay <[email protected]>
>>> Cc: Joel Fernandes <[email protected]>
>>> Cc: Uladzislau Rezki <[email protected]>
>>> Cc: Boqun Feng <[email protected]>
>>> Signed-off-by: Paul E. McKenney <[email protected]>
>>>
>>> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> index 11cdab037bff..3cd5cb4d86e5 100644
>>> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> @@ -112,6 +112,35 @@ on PowerPC.
>>> The ``smp_mb__after_unlock_lock()`` invocations prevent this
>>> ``WARN_ON()`` from triggering.
>>>
>>> ++-----------------------------------------------------------------------+
>>> +| **Quick Quiz**: |
>>> ++-----------------------------------------------------------------------+
>>> +| But the whole chain of rcu_node-structure locking guarantees that |
>>> +| readers see all pre-grace-period accesses from the updater and |
>>> +| also guarantees that the updater to see all post-grace-period |
>>> +| accesses from the readers. So why do we need all of those calls |
>>> +| to smp_mb__after_unlock_lock()? |
>>> ++-----------------------------------------------------------------------+
>>> +| **Answer**: |
>>> ++-----------------------------------------------------------------------+
>>> +| Because we must provide ordering for RCU's polling grace-period |
>>> +| primitives, for example, get_state_synchronize_rcu() and |
>>> +| poll_state_synchronize_rcu(). For example: |
>>> +| |
>>> +| CPU 0 CPU 1 |
>>> +| ---- ---- |
>>> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
>>> +| g = get_state_synchronize_rcu() smp_mb() |
>>> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
>>> +| continue; |
>>
>> This indent causes warnings from sphinx:
>>
>> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:135: WARNING: Unexpected indentation.
>> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:137: WARNING: Block quote ends without a blank line; unexpected unindent
>>
>>> +| r0 = READ_ONCE(Y) |
>>> +| |
>>> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
>>> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>>> +| or offline) and thus won't interact directly with the RCU core |
>>> +| processing at all. |
>>> ++-----------------------------------------------------------------------+
>>> +
>>> This approach must be extended to include idle CPUs, which need
>>> RCU's grace-period memory ordering guarantee to extend to any
>>> RCU read-side critical sections preceding and following the current
>>
>> The code block in the answer can be fixed as follows:
>>
>> ++-----------------------------------------------------------------------+
>> +| **Answer**: |
>> ++-----------------------------------------------------------------------+
>> +| Because we must provide ordering for RCU's polling grace-period |
>> +| primitives, for example, get_state_synchronize_rcu() and |
>> +| poll_state_synchronize_rcu(). For example:: |
>> +| |
>> +| CPU 0 CPU 1 |
>> +| ---- ---- |
>> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
>> +| g = get_state_synchronize_rcu() smp_mb() |
>> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
>> +| continue; |
>> +| r0 = READ_ONCE(Y) |
>> +| |
>> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
>> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>> +| or offline) and thus won't interact directly with the RCU core |
>> +| processing at all. |
>> ++-----------------------------------------------------------------------+
>>
>> Hint: Use of "::" and indented code block.
>
> Thank you!
>
> As in with the following patch to be merged into Frederic's original,
> with attribution?

Sounds good to me!

Thanks, Akira

>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> index 3cd5cb4d86e5..bc884ebf88bb 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> @@ -125,15 +125,15 @@ The ``smp_mb__after_unlock_lock()`` invocations prevent this
> +-----------------------------------------------------------------------+
> | Because we must provide ordering for RCU's polling grace-period |
> | primitives, for example, get_state_synchronize_rcu() and |
> -| poll_state_synchronize_rcu(). For example: |
> +| poll_state_synchronize_rcu(). For example:: |
> | |
> -| CPU 0 CPU 1 |
> -| ---- ---- |
> -| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> -| g = get_state_synchronize_rcu() smp_mb() |
> -| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> -| continue; |
> -| r0 = READ_ONCE(Y) |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |
> +| r0 = READ_ONCE(Y) |
> | |
> | RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> | happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>

2021-06-11 10:37:08

by Frederic Weisbecker

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Thu, Jun 10, 2021 at 09:57:10AM -0700, Paul E. McKenney wrote:
> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> index 11cdab037bff..3cd5cb4d86e5 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> @@ -112,6 +112,35 @@ on PowerPC.
> The ``smp_mb__after_unlock_lock()`` invocations prevent this
> ``WARN_ON()`` from triggering.
>
> ++-----------------------------------------------------------------------+
> +| **Quick Quiz**: |
> ++-----------------------------------------------------------------------+
> +| But the whole chain of rcu_node-structure locking guarantees that |
> +| readers see all pre-grace-period accesses from the updater and |
> +| also guarantees that the updater to see all post-grace-period |

Should it be either "that the updater see" or "the updater to see"?

> +| accesses from the readers.

Is it really post-grace-period that you meant here? The updater can't see
the future. It's rather all reader accesses before the end of the grace period?

> So why do we need all of those calls |
> +| to smp_mb__after_unlock_lock()? |
> ++-----------------------------------------------------------------------+
> +| **Answer**: |
> ++-----------------------------------------------------------------------+
> +| Because we must provide ordering for RCU's polling grace-period |
> +| primitives, for example, get_state_synchronize_rcu() and |
> +| poll_state_synchronize_rcu(). For example: |

Two times "for example" (sorry I'm nitpicking...)

> +| |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |
> +| r0 = READ_ONCE(Y) |

Good point, it's a nice merge of the initial examples!

> +| |
> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |

One "that" has to die here.

> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> +| or offline) and thus won't interact directly with the RCU core |
> +| processing at all. |

Thanks a lot!

> ++-----------------------------------------------------------------------+
> +
> This approach must be extended to include idle CPUs, which need
> RCU's grace-period memory ordering guarantee to extend to any
> RCU read-side critical sections preceding and following the current

2021-06-11 17:26:49

by Paul E. McKenney

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Fri, Jun 11, 2021 at 12:34:32PM +0200, Frederic Weisbecker wrote:
> On Thu, Jun 10, 2021 at 09:57:10AM -0700, Paul E. McKenney wrote:
> > diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > index 11cdab037bff..3cd5cb4d86e5 100644
> > --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > @@ -112,6 +112,35 @@ on PowerPC.
> > The ``smp_mb__after_unlock_lock()`` invocations prevent this
> > ``WARN_ON()`` from triggering.
> >
> > ++-----------------------------------------------------------------------+
> > +| **Quick Quiz**: |
> > ++-----------------------------------------------------------------------+
> > +| But the whole chain of rcu_node-structure locking guarantees that |
> > +| readers see all pre-grace-period accesses from the updater and |
> > +| also guarantees that the updater to see all post-grace-period |
>
> Should it be either "that the updater see" or "the updater to see"?

Good catch, I have reworked this paragraph.

> > +| accesses from the readers.
>
> Is it really post-grace-period that you meant here? The updater can't see
> the future. It's rather all reader accesses before the end of the grace period?

I have reworked this to talk about old and new readers on the one hand
and the updater's pre- and post-grace-period accesses on the other.

> > So why do we need all of those calls |
> > +| to smp_mb__after_unlock_lock()? |
> > ++-----------------------------------------------------------------------+
> > +| **Answer**: |
> > ++-----------------------------------------------------------------------+
> > +| Because we must provide ordering for RCU's polling grace-period |
> > +| primitives, for example, get_state_synchronize_rcu() and |
> > +| poll_state_synchronize_rcu(). For example: |
>
> Two times "for example" (sorry I'm nitpicking...)

But the example has two threads!

Kidding aside, I substituted "Consider this code" for the second
"For example".

> > +| |
> > +| CPU 0 CPU 1 |
> > +| ---- ---- |
> > +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> > +| g = get_state_synchronize_rcu() smp_mb() |
> > +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> > +| continue; |
> > +| r0 = READ_ONCE(Y) |
>
> Good point, it's a nice merge of the initial examples!

Glad you like it!

> > +| |
> > +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
>
> One "that" has to die here.

Can we instead show clemency and banish it to some other paragraph?

> > +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> > +| or offline) and thus won't interact directly with the RCU core |
> > +| processing at all. |
>
> Thanks a lot!

Glad to help, and I will reach out to you should someone make the mistake
of insisting that I write something in French. ;-)

> > ++-----------------------------------------------------------------------+
> > +
> > This approach must be extended to include idle CPUs, which need
> > RCU's grace-period memory ordering guarantee to extend to any
> > RCU read-side critical sections preceding and following the current

How about like this?

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But the chain of rcu_node-structure lock acquisitions guarantees      |
| that new readers will see all of the updater's pre-grace-period       |
| accesses and also guarantees that the updater's post-grace-period     |
| accesses will see all of the old readers' accesses. So why do we      |
| need all of those calls to smp_mb__after_unlock_lock()?               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Because we must provide ordering for RCU's polling grace-period       |
| primitives, for example, get_state_synchronize_rcu() and              |
| poll_state_synchronize_rcu(). Consider this code::                    |
|                                                                       |
|  CPU 0                                     CPU 1                      |
|  ----                                      ----                       |
|  WRITE_ONCE(X, 1)                          WRITE_ONCE(Y, 1)           |
|  g = get_state_synchronize_rcu()           smp_mb()                   |
|  while (!poll_state_synchronize_rcu(g))    r1 = READ_ONCE(X)          |
|          continue;                                                    |
|  r0 = READ_ONCE(Y)                                                    |
|                                                                       |
| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not           |
| happen, even if CPU 1 is in an RCU extended quiescent state           |
| (idle or offline) and thus won't interact directly with the RCU       |
| core processing at all.                                               |
+-----------------------------------------------------------------------+
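
For anyone who wants to poke at the shape of this litmus test outside
the kernel, here is a minimal userspace sketch. It is an analogy only,
not RCU: a C11 seq_cst fence stands in on the CPU 0 side for the full
ordering that a completed get_state_synchronize_rcu() /
poll_state_synchronize_rcu() sequence has to supply, and another one
stands in for smp_mb() on the CPU 1 side. The names cpu0(), cpu1(),
go and count_forbidden_outcomes() are made up for this illustration.

```c
/* Userspace analogy only: C11 fences stand in for kernel ordering. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int X, Y;
static atomic_int go;	/* hypothetical start flag so the threads race */
static int r0, r1;	/* read by the parent only after pthread_join() */

/* Plays CPU 0. The fence is an assumed stand-in for the grace period. */
static void *cpu0(void *arg)
{
	(void)arg;
	while (!atomic_load_explicit(&go, memory_order_acquire))
		continue;
	atomic_store_explicit(&X, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r0 = atomic_load_explicit(&Y, memory_order_relaxed);
	return NULL;
}

/* Plays CPU 1. The fence is the userspace counterpart of smp_mb(). */
static void *cpu1(void *arg)
{
	(void)arg;
	while (!atomic_load_explicit(&go, memory_order_acquire))
		continue;
	atomic_store_explicit(&Y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r1 = atomic_load_explicit(&X, memory_order_relaxed);
	return NULL;
}

/* Runs the two-thread test repeatedly, counting r0 == 0 && r1 == 0. */
int count_forbidden_outcomes(int iterations)
{
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t0, t1;
		int ret;

		atomic_store(&X, 0);
		atomic_store(&Y, 0);
		atomic_store(&go, 0);

		ret = pthread_create(&t0, NULL, cpu0, NULL);
		assert(ret == 0);
		ret = pthread_create(&t1, NULL, cpu1, NULL);
		assert(ret == 0);

		/* Release both threads at roughly the same time. */
		atomic_store_explicit(&go, 1, memory_order_release);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (r0 == 0 && r1 == 0)
			forbidden++;
	}
	return forbidden;
}
```

With both fences in place the count should stay at zero, because a
full fence between the store and the load on each side forbids the
r0 == 0 && r1 == 0 outcome. Dropping the fence in cpu0() models a
grace period that provides only lock-based ordering, and the outcome
then becomes permitted, though whether it actually shows up depends
on the hardware and the timing.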

Thanx, Paul

2021-06-11 22:47:15

by Frederic Weisbecker

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Fri, Jun 11, 2021 at 10:25:14AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 11, 2021 at 12:34:32PM +0200, Frederic Weisbecker wrote:
> Glad to help, and I will reach out to you should someone make the mistake
> of insisting that I write something in French. ;-)

If that can help, we still have frenglish for neutral territories such as airports.
Not easy to master though...

>
> > > ++-----------------------------------------------------------------------+
> > > +
> > > This approach must be extended to include idle CPUs, which need
> > > RCU's grace-period memory ordering guarantee to extend to any
> > > RCU read-side critical sections preceding and following the current
>
> How about like this?
>
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | But the chain of rcu_node-structure lock acquisitions guarantees      |
> | that new readers will see all of the updater's pre-grace-period       |
> | accesses and also guarantees that the updater's post-grace-period     |
> | accesses will see all of the old readers' accesses. So why do we      |
> | need all of those calls to smp_mb__after_unlock_lock()?               |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Because we must provide ordering for RCU's polling grace-period       |
> | primitives, for example, get_state_synchronize_rcu() and              |
> | poll_state_synchronize_rcu(). Consider this code::                    |
> |                                                                       |
> |  CPU 0                                     CPU 1                      |
> |  ----                                      ----                       |
> |  WRITE_ONCE(X, 1)                          WRITE_ONCE(Y, 1)           |
> |  g = get_state_synchronize_rcu()           smp_mb()                   |
> |  while (!poll_state_synchronize_rcu(g))    r1 = READ_ONCE(X)          |
> |          continue;                                                    |
> |  r0 = READ_ONCE(Y)                                                    |
> |                                                                       |
> | RCU guarantees that the outcome r0 == 0 && r1 == 0 will not           |
> | happen, even if CPU 1 is in an RCU extended quiescent state           |
> | (idle or offline) and thus won't interact directly with the RCU       |
> | core processing at all.                                               |
> +-----------------------------------------------------------------------+

Very good, thanks a lot :o)

2021-06-11 23:59:20

by Paul E. McKenney

Subject: Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

On Sat, Jun 12, 2021 at 12:45:17AM +0200, Frederic Weisbecker wrote:
> On Fri, Jun 11, 2021 at 10:25:14AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 11, 2021 at 12:34:32PM +0200, Frederic Weisbecker wrote:
> > Glad to help, and I will reach out to you should someone make the mistake
> > of insisting that I write something in French. ;-)
>
> If that can help, we still have frenglish for neutral territories such as airports.
> Not easy to master though...

That does sound dangerous! ;-)

> > > > ++-----------------------------------------------------------------------+
> > > > +
> > > > This approach must be extended to include idle CPUs, which need
> > > > RCU's grace-period memory ordering guarantee to extend to any
> > > > RCU read-side critical sections preceding and following the current
> >
> > How about like this?
> >
> > +-----------------------------------------------------------------------+
> > | **Quick Quiz**:                                                       |
> > +-----------------------------------------------------------------------+
> > | But the chain of rcu_node-structure lock acquisitions guarantees      |
> > | that new readers will see all of the updater's pre-grace-period       |
> > | accesses and also guarantees that the updater's post-grace-period     |
> > | accesses will see all of the old readers' accesses. So why do we      |
> > | need all of those calls to smp_mb__after_unlock_lock()?               |
> > +-----------------------------------------------------------------------+
> > | **Answer**:                                                           |
> > +-----------------------------------------------------------------------+
> > | Because we must provide ordering for RCU's polling grace-period       |
> > | primitives, for example, get_state_synchronize_rcu() and              |
> > | poll_state_synchronize_rcu(). Consider this code::                    |
> > |                                                                       |
> > |  CPU 0                                     CPU 1                      |
> > |  ----                                      ----                       |
> > |  WRITE_ONCE(X, 1)                          WRITE_ONCE(Y, 1)           |
> > |  g = get_state_synchronize_rcu()           smp_mb()                   |
> > |  while (!poll_state_synchronize_rcu(g))    r1 = READ_ONCE(X)          |
> > |          continue;                                                    |
> > |  r0 = READ_ONCE(Y)                                                    |
> > |                                                                       |
> > | RCU guarantees that the outcome r0 == 0 && r1 == 0 will not           |
> > | happen, even if CPU 1 is in an RCU extended quiescent state           |
> > | (idle or offline) and thus won't interact directly with the RCU       |
> > | core processing at all.                                               |
> > +-----------------------------------------------------------------------+
>
> Very good, thanks a lot :o)

And thank you!

Thanx, Paul