2022-10-05 11:27:05

by Parav Pandit

[permalink] [raw]
Subject: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example

The cited commit describes that when using writel(), explcit wmb()
is not needed. wmb() is an expensive barrier. writel() uses the needed
platform specific barrier instead of expensive wmb().

Hence update the example to be more accurate that matches the current
implementation.

commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs. MMIO ordering example")
Signed-off-by: Parav Pandit <[email protected]>

---
changelog:
v1->v2:
- Further improved description of writel() example
- changed commit subject from 'usage' to 'example'
v0->v1:
- Corrected to mention I/O barrier instead of dma_wmb().
- removed numbered references in commit log
- corrected typo 'explcit' to 'explicit' in commit log
---
Documentation/memory-barriers.txt | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 832b5d36e279..49e1433db407 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1927,10 +1927,12 @@ There are some more advanced barrier functions:
before we read the data from the descriptor, and the dma_wmb() allows
us to guarantee the data is written to the descriptor before the device
can see it now has ownership. The dma_mb() implies both a dma_rmb() and
- a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
- to guarantee that the cache coherent memory writes have completed before
- writing to the MMIO region. The cheaper writel_relaxed() does not provide
- this guarantee and must not be used here.
+ a dma_wmb(). Note that, when using writel(), a prior barrier is not
+ needed to guarantee that the cache coherent memory writes have completed
+ before writing to the MMIO region. The cheaper writel_relaxed() does not
+ provide this guarantee and must not be used here. Hence, writeX() is always
+ preferred which inserts needed platform specific barrier before writing to
+ the specified MMIO region.

See the subsection "Kernel I/O barrier effects" for more information on
relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for
--
2.26.2


2022-10-06 02:13:33

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example

On 10/5/22 17:47, Parav Pandit wrote:
> @@ -1927,10 +1927,12 @@ There are some more advanced barrier functions:
> before we read the data from the descriptor, and the dma_wmb() allows
> us to guarantee the data is written to the descriptor before the device
> can see it now has ownership. The dma_mb() implies both a dma_rmb() and
> - a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
> - to guarantee that the cache coherent memory writes have completed before
> - writing to the MMIO region. The cheaper writel_relaxed() does not provide
> - this guarantee and must not be used here.
> + a dma_wmb(). Note that, when using writel(), a prior barrier is not
> + needed to guarantee that the cache coherent memory writes have completed
> + before writing to the MMIO region. The cheaper writel_relaxed() does not
> + provide this guarantee and must not be used here. Hence, writeX() is always
> + preferred which inserts needed platform specific barrier before writing to
> + the specified MMIO region.
>

Did you mean that writeX() is write() function family?

--
An old man doll... just what I always wanted! - Clara

2022-10-06 02:51:44

by Parav Pandit

[permalink] [raw]
Subject: RE: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example


> From: Bagas Sanjaya <[email protected]>
> Sent: Thursday, October 6, 2022 7:31 AM
>
> On 10/5/22 17:47, Parav Pandit wrote:
> > @@ -1927,10 +1927,12 @@ There are some more advanced barrier
> functions:
> > before we read the data from the descriptor, and the dma_wmb()
> allows
> > us to guarantee the data is written to the descriptor before the device
> > can see it now has ownership. The dma_mb() implies both a
> dma_rmb() and
> > - a dma_wmb(). Note that, when using writel(), a prior wmb() is not
> needed
> > - to guarantee that the cache coherent memory writes have completed
> before
> > - writing to the MMIO region. The cheaper writel_relaxed() does not
> provide
> > - this guarantee and must not be used here.
> > + a dma_wmb(). Note that, when using writel(), a prior barrier is not
> > + needed to guarantee that the cache coherent memory writes have
> completed
> > + before writing to the MMIO region. The cheaper writel_relaxed() does
> not
> > + provide this guarantee and must not be used here. Hence, writeX() is
> always
> > + preferred which inserts needed platform specific barrier before writing
> to
> > + the specified MMIO region.
> >
>
> Did you mean that writeX() is write() function family?
Yes.

2022-10-06 02:53:45

by Akira Yokosawa

[permalink] [raw]
Subject: Re: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example

Hi,

Maybe I'm too nit-picky, but see below:

On Wed, 5 Oct 2022 13:47:49 +0300, Parav Pandit wrote:
> The cited commit describes that when using writel(), explcit wmb()
> is not needed. wmb() is an expensive barrier. writel() uses the needed
> platform specific barrier instead of expensive wmb().
>
> Hence update the example to be more accurate that matches the current
> implementation.
>
> commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs. MMIO ordering example")

You can cite the commit in the Changelog text.
Just say:

Commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs.
MMIO ordering example") describes that when using writel(), ...

Also, a blank line is needed above S-o-b tags as a delimiter.

> Signed-off-by: Parav Pandit <[email protected]>
>
> ---
> changelog:
> v1->v2:
> - Further improved description of writel() example
> - changed commit subject from 'usage' to 'example'
> v0->v1:
> - Corrected to mention I/O barrier instead of dma_wmb().
> - removed numbered references in commit log
> - corrected typo 'explcit' to 'explicit' in commit log
> ---
> Documentation/memory-barriers.txt | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 832b5d36e279..49e1433db407 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -1927,10 +1927,12 @@ There are some more advanced barrier functions:
> before we read the data from the descriptor, and the dma_wmb() allows
> us to guarantee the data is written to the descriptor before the device
> can see it now has ownership. The dma_mb() implies both a dma_rmb() and
> - a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
> - to guarantee that the cache coherent memory writes have completed before
> - writing to the MMIO region. The cheaper writel_relaxed() does not provide
> - this guarantee and must not be used here.
> + a dma_wmb(). Note that, when using writel(), a prior barrier is not
If you permit a slightly long line here, this hunk would be much easier to
compare:

+ a dma_wmb(). Note that, when using writel(), a prior barrier is not needed
to guarantee that the cache coherent memory writes have completed before
writing to the MMIO region. The cheaper writel_relaxed() does not provide
+ this guarantee and must not be used here. Hence, writeX() is always
+ preferred which inserts needed platform specific barrier before writing to
+ the specified MMIO region.

That said, I don't feel comfortable with the sentence you added.
It looks to me it is redundant because such a guarantee of writeX() is already
covered in the section of "KERNEL I/O BARRIER EFFECTS".
See the relevant explanation quoted below:

3. A writeX() by a CPU thread to the peripheral will first wait for the
completion of all prior writes to memory either issued by, or
propagated to, the same thread. This ensures that writes by the CPU
to an outbound DMA buffer allocated by dma_alloc_coherent() will be
visible to a DMA engine when the CPU writes to its MMIO control
register to trigger the transfer.

Also please not that this document should not describe any implementation
details of those accessors. This document is not meant as an implementation
guide, but a guide for kernel developers who use them. This is clearly
mentioned in "DISCLAIMER" at the top of this file.

Thanks, Akira

> + needed to guarantee that the cache coherent memory writes have completed
> + before writing to the MMIO region. The cheaper writel_relaxed() does not
> + provide this guarantee and must not be used here. Hence, writeX() is always
> + preferred which inserts needed platform specific barrier before writing to
> + the specified MMIO region.
>
> See the subsection "Kernel I/O barrier effects" for more information on
> relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for

2022-10-06 03:12:34

by Akira Yokosawa

[permalink] [raw]
Subject: Re: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example

On Thu, 6 Oct 2022 11:48:04 +0900, Akira Yokosawa wrote:
> Hi,
>
> Maybe I'm too nit-picky, but see below:
>
> On Wed, 5 Oct 2022 13:47:49 +0300, Parav Pandit wrote:
>> The cited commit describes that when using writel(), explcit wmb()
>> is not needed. wmb() is an expensive barrier. writel() uses the needed
>> platform specific barrier instead of expensive wmb().
>>
>> Hence update the example to be more accurate that matches the current
>> implementation.
>>
>> commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs. MMIO ordering example")
>
> You can cite the commit in the Changelog text.
> Just say:
>
> Commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs.
> MMIO ordering example") describes that when using writel(), ...
>
> Also, a blank line is needed above S-o-b tags as a delimiter.
>
>> Signed-off-by: Parav Pandit <[email protected]>
>>
>> ---
>> changelog:
>> v1->v2:
>> - Further improved description of writel() example
>> - changed commit subject from 'usage' to 'example'
>> v0->v1:
>> - Corrected to mention I/O barrier instead of dma_wmb().
>> - removed numbered references in commit log
>> - corrected typo 'explcit' to 'explicit' in commit log
>> ---
>> Documentation/memory-barriers.txt | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>> index 832b5d36e279..49e1433db407 100644
>> --- a/Documentation/memory-barriers.txt
>> +++ b/Documentation/memory-barriers.txt
>> @@ -1927,10 +1927,12 @@ There are some more advanced barrier functions:
>> before we read the data from the descriptor, and the dma_wmb() allows
>> us to guarantee the data is written to the descriptor before the device
>> can see it now has ownership. The dma_mb() implies both a dma_rmb() and
>> - a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
>> - to guarantee that the cache coherent memory writes have completed before
>> - writing to the MMIO region. The cheaper writel_relaxed() does not provide
>> - this guarantee and must not be used here.
>> + a dma_wmb(). Note that, when using writel(), a prior barrier is not
> If you permit a slightly long line here, this hunk would be much easier to
> compare:
>
> + a dma_wmb(). Note that, when using writel(), a prior barrier is not needed
> to guarantee that the cache coherent memory writes have completed before
> writing to the MMIO region. The cheaper writel_relaxed() does not provide
> + this guarantee and must not be used here. Hence, writeX() is always
> + preferred which inserts needed platform specific barrier before writing to
> + the specified MMIO region.
>
> That said, I don't feel comfortable with the sentence you added.
> It looks to me it is redundant because such a guarantee of writeX() is already
> covered in the section of "KERNEL I/O BARRIER EFFECTS".
> See the relevant explanation quoted below:
>
> 3. A writeX() by a CPU thread to the peripheral will first wait for the
> completion of all prior writes to memory either issued by, or
> propagated to, the same thread. This ensures that writes by the CPU
> to an outbound DMA buffer allocated by dma_alloc_coherent() will be
> visible to a DMA engine when the CPU writes to its MMIO control
> register to trigger the transfer.
>
> Also please not that this document should not describe any implementation
I meant note

> details of those accessors. This document is not meant as an implementation
> guide, but a guide for kernel developers who use them. This is clearly
> mentioned in "DISCLAIMER" at the top of this file.
>
> Thanks, Akira
>
>> + needed to guarantee that the cache coherent memory writes have completed
>> + before writing to the MMIO region. The cheaper writel_relaxed() does not
>> + provide this guarantee and must not be used here. Hence, writeX() is always
>> + preferred which inserts needed platform specific barrier before writing to
>> + the specified MMIO region.
>>
>> See the subsection "Kernel I/O barrier effects" for more information on
>> relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for

2022-10-06 04:05:04

by Parav Pandit

[permalink] [raw]
Subject: RE: [PATCH v2] locking/memory-barriers.txt: Improve documentation for writel() example


> From: Akira Yokosawa <[email protected]>
> Sent: Thursday, October 6, 2022 8:18 AM
>
> Hi,
>
> Maybe I'm too nit-picky, but see below:
>
> On Wed, 5 Oct 2022 13:47:49 +0300, Parav Pandit wrote:
> > The cited commit describes that when using writel(), explcit wmb() is
> > not needed. wmb() is an expensive barrier. writel() uses the needed
> > platform specific barrier instead of expensive wmb().
> >
> > Hence update the example to be more accurate that matches the current
> > implementation.
> >
> > commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs.
> > MMIO ordering example")
>
> You can cite the commit in the Changelog text.
> Just say:
>
I prefer to annotate the long commit as a separate line. This makes the description sentence easy to read.

> Commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs.
> MMIO ordering example") describes that when using writel(), ...
>
> Also, a blank line is needed above S-o-b tags as a delimiter.
>
Adding blank line in v3.

[..]
> If you permit a slightly long line here, this hunk would be much easier to
> compare:
>
True, but document crosses the 80 lines. So I will keep the existing alignment of 80 characters per line.

> + a dma_wmb(). Note that, when using writel(), a prior barrier is
> + not needed
> to guarantee that the cache coherent memory writes have completed
> before
> writing to the MMIO region. The cheaper writel_relaxed() does not
> provide
> + this guarantee and must not be used here. Hence, writeX() is always
> + preferred which inserts needed platform specific barrier before writing
> to
> + the specified MMIO region.
>
> That said, I don't feel comfortable with the sentence you added.
> It looks to me it is redundant because such a guarantee of writeX() is already
> covered in the section of "KERNEL I/O BARRIER EFFECTS".
> See the relevant explanation quoted below:
>
I observed that too, but I thought it is worth short note to add along with this example.
But I agree since the next para refers to it anyway, I will drop this 2nd addition in v3.

> 3. A writeX() by a CPU thread to the peripheral will first wait for the
> completion of all prior writes to memory either issued by, or
> propagated to, the same thread. This ensures that writes by the
> CPU
> to an outbound DMA buffer allocated by dma_alloc_coherent() will
> be
> visible to a DMA engine when the CPU writes to its MMIO control
> register to trigger the transfer.
>
> Also please not that this document should not describe any implementation
> details of those accessors. This document is not meant as an implementation
> guide, but a guide for kernel developers who use them. This is clearly
> mentioned in "DISCLAIMER" at the top of this file.
>
In the disclaimer section I didn’t see any objection to "describe any implementation". :)
For example, disclaimer itself tells that a barrier may be no-op talking about a possible implementation on a platform.

But I agree to your above point that above #3 already covers with my addition.
Hence, I will drop that part in v3.
Thanks for the review.

> Thanks, Akira
>
> > + needed to guarantee that the cache coherent memory writes have
> completed
> > + before writing to the MMIO region. The cheaper writel_relaxed() does
> not
> > + provide this guarantee and must not be used here. Hence, writeX() is
> always
> > + preferred which inserts needed platform specific barrier before writing
> to
> > + the specified MMIO region.
> >
> > See the subsection "Kernel I/O barrier effects" for more information on
> > relaxed I/O accessors and the Documentation/core-api/dma-api.rst
> > file for