2010-07-27 11:20:46

by Stefan Richter

[permalink] [raw]
Subject: [PATCH + an old question] firewire: ohci: use memory barriers to order descriptor updates

When we append to a DMA program, we need to ensure that the PCI device
sees the updates in the intended order. We need:

1. a write memory barrier between initialization of new descriptors and
the update of the last old descriptor to point to the new
descriptors (i.e. branch_address update),

2. a write memory barrier between branch_address update and wake-up of
the DMA unit by MMIO register write.

This patch adds only barrier 1.

Barrier 2 is implicit in writel() on most machines --- or at least I
think it is. See this from arch/alpha/include/asm/io.h:

#define build_mmio_write(name, size, type, reg, barrier) \
static inline void name(type val, volatile void __iomem *addr) \
{ asm volatile("mov" size " %0,%1": :reg (val), \
"m" (*(volatile type __force *)addr) barrier); }

build_mmio_write(writel, "l", unsigned int, "r", :"memory")

Does this order the mmio write relative to previous memory writes?

I am not so sure about barrier semantics of writel() on some less
popular architectures. From arch/alpha/include/asm/io.h:

extern inline void writel(u32 b, volatile void __iomem *addr)
{
__raw_writel(b, addr);
mb();
}

This mb() is nice but we need a barrier in front of the __raw_writel.
Somebody who cares might want to add it in the architecture code or in
hundreds of drivers.

Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/ohci.c | 3 +++
1 file changed, 3 insertions(+)

Index: b/drivers/firewire/ohci.c
===================================================================
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -595,6 +595,7 @@ static int ar_context_add_page(struct ar
ab->descriptor.res_count = cpu_to_le16(PAGE_SIZE - offset);
ab->descriptor.branch_address = 0;

+ wmb(); /* finish init of new descriptors before branch_address update */
ctx->last_buffer->descriptor.branch_address = cpu_to_le32(ab_bus | 1);
ctx->last_buffer->next = ab;
ctx->last_buffer = ab;
@@ -982,6 +983,8 @@ static void context_append(struct contex
d_bus = desc->buffer_bus + (d - desc->buffer) * sizeof(*d);

desc->used += (z + extra) * sizeof(*d);
+
+ wmb(); /* finish init of new descriptors before branch_address update */
ctx->prev->branch_address = cpu_to_le32(d_bus | z);
ctx->prev = find_branch_descriptor(d, z);


--
Stefan Richter
-=====-==-=- -=== ==-==
http://arcgraph.de/sr/


2010-07-27 11:23:05

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH + an old question] firewire: ohci: use memory barriers to order descriptor updates

On 27 Jul, Stefan Richter wrote:
> 2. a write memory barrier between branch_address update and wake-up of
> the DMA unit by MMIO register write.
>
> This patch adds only barrier 1.
>
> Barrier 2 is implicit in writel() on most machines --- or at least I
> think it is. See this from arch/alpha/include/asm/io.h:

Typo, arch/x86/include/asm/io.h was meant.

> #define build_mmio_write(name, size, type, reg, barrier) \
> static inline void name(type val, volatile void __iomem *addr) \
> { asm volatile("mov" size " %0,%1": :reg (val), \
> "m" (*(volatile type __force *)addr) barrier); }
>
> build_mmio_write(writel, "l", unsigned int, "r", :"memory")
>
> Does this order the mmio write relative to previous memory writes?
--
Stefan Richter
-=====-==-=- -=== ==-==
http://arcgraph.de/sr/

2010-07-28 07:28:14

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH + an old question] firewire: ohci: use memory barriers to order descriptor updates

Stefan Richter wrote:
> We need:
> 2. a write memory barrier between branch_address update and wake-up of
> the DMA unit by MMIO register write.
>
> Barrier 2 is implicit in writel() on most machines --- or at least I
> think it is. See this from arch/x86/include/asm/io.h:
>
> #define build_mmio_write(name, size, type, reg, barrier) \
> static inline void name(type val, volatile void __iomem *addr) \
> { asm volatile("mov" size " %0,%1": :reg (val), \
> "m" (*(volatile type __force *)addr) barrier); }
>
> build_mmio_write(writel, "l", unsigned int, "r", :"memory")
>
> Does this order the mmio write relative to previous memory writes?

This asm barrier prevents the compiler from reordering.

The main purpose of writel() and friends is to access the address space
where memory-mapped I/O ranges reside; there are architectures where the
normal memory access commands cannot be used. This does not necessarily
imply anything about reordering semantics.

However, PCI address ranges are marked by the device's config registers
as either cacheable or not, and the kernel heeds this when mapping these
ranges. Registers are, of course, marked as uncacheable.


Regards,
Clemens

2010-07-28 09:09:27

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH + an old question] firewire: ohci: use memory barriers to order descriptor updates

Clemens Ladisch wrote:
> Stefan Richter wrote:
>> We need:
>> 2. a write memory barrier between branch_address update and wake-up of
>> the DMA unit by MMIO register write.
>>
>> Barrier 2 is implicit in writel() on most machines --- or at least I
>> think it is. See this from arch/x86/include/asm/io.h:
>>
>> #define build_mmio_write(name, size, type, reg, barrier) \
>> static inline void name(type val, volatile void __iomem *addr) \
>> { asm volatile("mov" size " %0,%1": :reg (val), \
>> "m" (*(volatile type __force *)addr) barrier); }
>>
>> build_mmio_write(writel, "l", unsigned int, "r", :"memory")
>>
>> Does this order the mmio write relative to previous memory writes?
>
> This asm barrier prevents the compiler from reordering.
>
> The main purpose of writel() and friends is to access the address space
> where memory-mapped I/O ranges reside; there are architectures where the
> normal memory access commands cannot be used. This does not necessarily
> imply anything about reordering semantics.
>
> However, PCI address ranges are marked by the device's config registers
> as either cacheable or not, and the kernel heeds this when mapping these
> ranges. Registers are, of course, marked as uncacheable.

But the memory to which we wrote before that is cachable (though
cache-coherent, allocated by dma_alloc_coherent). This memory write
should not cross the mmio register write.

Documentation/DocBook/deviceiobook.tmpl mentions that mmio reads to a
device are ordered WRT to DMA transactions that the device issued before
that mmio read. But no mentions of mmio write to a device vs. preceding
memory accesses by the CPU.

Well, a quick look how some hopefully well-written drivers use and don't
use wmb() indicates that I don't need to worry. Perhaps it is time to
look for a PCI book.
--
Stefan Richter
-=====-==-=- -=== ===--
http://arcgraph.de/sr/