2006-08-31 08:12:04

by Pierre Ossman

Subject: When to use mmiowb()?

I've been trying to wrap my head around all this memory barrier business,
and I'm slowly grasping the inter-CPU behaviours. Barriers with regard
to devices still have me a bit confused though.

The deviceiobook document and memory-barriers.txt both make it clear
that memory operations to devices are strictly ordered from a single
CPU. When more CPUs are involved, things get a bit fuzzier.
memory-barriers.txt seems to suggest that mmiowb() is only needed before
an unlock under special circumstances, but deviceiobook states that
mmiowb() should be used before all unlocks where the writeX() calls
aren't followed by a readX() (which would flush the writes anyway).

Grepping the tree indicates that mmiowb() isn't used that often, but
according to deviceiobook, it should be plentiful. This leads me to
believe that memory-barriers.txt is closer to the truth, but then the
question is what those special circumstances that require mmiowb() are.

Any clarifications you can provide are very welcome. :)

Rgds
Pierre


2006-08-31 16:33:37

by Jesse Barnes

Subject: Re: When to use mmiowb()?

On Thursday, August 31, 2006 1:11 am, Pierre Ossman wrote:
> I've been trying to wrap my head around all this memory barrier
> business, and I'm slowly grasping the inter-CPU behaviours. Barriers
> with regard to devices still have me a bit confused though.
>
> The deviceiobook document and memory-barriers.txt both make it clear
> that memory operations to devices are strictly ordered from a single
> CPU. When more CPUs are involved, things get a bit fuzzier.
> memory-barriers.txt seems to suggest that mmiowb() is only needed
> before an unlock under special circumstances, but deviceiobook states
> that mmiowb() should be used before all unlocks where the writeX() calls
> aren't followed by a readX() (which would flush the writes anyway).
>
> Grepping the tree indicates that mmiowb() isn't used that often, but
> according to deviceiobook, it should be plentiful. This leads me to
> believe that memory-barriers.txt is closer to the truth, but then the
> question is what those special circumstances that require mmiowb() are.

AFAICT, they're both right. Generally, mmiowb() should be used prior to
unlock in a critical section whose last PIO operation is a writeX().
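
As a rough illustration of that rule (not taken from any real driver), the
pattern might look like the sketch below. The device layout, demo_send()
and the register names are hypothetical, and the lock and I/O helpers are
user-space stand-ins so that the fragment compiles on its own; in a kernel
driver they would be the real spin_lock(), writel() and mmiowb().

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space stand-ins (assumptions for this sketch only). A real driver
 * gets these from the kernel headers instead of defining them.
 */
typedef struct { int locked; } spinlock_t;

static void spin_lock(spinlock_t *l)   { assert(!l->locked); l->locked = 1; }
static void spin_unlock(spinlock_t *l) { assert(l->locked);  l->locked = 0; }
static void writel(uint32_t val, volatile uint32_t *addr) { *addr = val; }

static int mmiowb_calls;                     /* lets a test see the barrier ran */
static void mmiowb(void) { mmiowb_calls++; } /* no-op stand-in for the barrier */

/* Hypothetical device with an address register and a data register. */
enum { DEMO_REG_ADDR = 0, DEMO_REG_DATA = 1, DEMO_NREGS = 2 };

struct demo_dev {
    spinlock_t lock;
    volatile uint32_t regs[DEMO_NREGS];
};

/*
 * The critical section ends with a write, so nothing (such as a readX())
 * has pushed the writes out before the lock is dropped. The mmiowb() keeps
 * them ordered ahead of writes another CPU issues once it takes the lock.
 */
void demo_send(struct demo_dev *dev, uint32_t addr, uint32_t data)
{
    spin_lock(&dev->lock);
    writel(addr, &dev->regs[DEMO_REG_ADDR]);
    writel(data, &dev->regs[DEMO_REG_DATA]);
    mmiowb();
    spin_unlock(&dev->lock);
}
```

If the last access in the critical section were a readX() from the device
instead, the barrier would not be needed, since the read flushes the
earlier writes.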

You're right though: for portability, many more drivers should use this
type of barrier. However, rather than doing an audit of the tree and
inserting mmiowb() everywhere (w/o testing it), we chose to add it on an
as-needed basis for drivers that run on platforms that have weak I/O
ordering. Feel free to add it in other places if you want though (esp.
if you have the hardware to test your changes).

Jesse

2006-08-31 18:26:24

by pg_lkm

Subject: Re: When to use mmiowb()?

>>> On Thu, 31 Aug 2006 10:11:58 +0200, Pierre Ossman said:

drzeus-list> I've been trying to wrap my head around all this
drzeus-list> memory barrier business, and I'm slowly grasping
drzeus-list> the inter-CPU behaviours. Barriers with regard to
drzeus-list> devices still have me a bit confused though. [ ... ]
drzeus-list> This leads me to believe that memory-barriers.txt
drzeus-list> is closer to the truth, but then the question is
drzeus-list> what those special circumstances that require
drzeus-list> mmiowb() are.

I have just been staring at a new driver which contains (more
or less) these lines:

------------------------------------------------------------------------
#if !(defined CONFIG_ARCH_IA64_SN2 || defined CONFIG_ARCH_IA64_GENERIC)
#define mmiowb() ((void) 0)
#endif
------------------------------------------------------------------------

Very funny, isn't it? :-)

Now the story is that there is not one truth, but several. About
as many as there are platforms and configurations.

On most popular platforms and most configurations 'mmiowb' is
not necessary, so people don't bother and ''it works''. On a few
platforms and configurations it matters, so people using them do
apply it rather more extensively.

In theory it should be used everywhere there is a sequence
hazard, but that's an itch that only a minority needs to
scratch.
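
To make the hazard concrete, here is a toy user-space model, not kernel
code. Everything in it (sim_cpu, sim_dev, sim_writel(), sim_drain(),
sim_run()) is invented for illustration. It assumes each CPU has its own
queue of posted writes and that the fabric may deliver the queues in
either order. The barrier is modelled as a full drain of the queue, which
is stronger than what the real primitive promises.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_QUEUE_MAX 8

/* Toy model: each CPU has its own queue of posted MMIO writes. */
struct sim_cpu {
    uint32_t pending[SIM_QUEUE_MAX];
    int npending;
};

/* The device only records the order in which writes reach it. */
struct sim_dev {
    uint32_t seen[2 * SIM_QUEUE_MAX];
    int nseen;
};

/* Posted write: queued on the issuing CPU's path, not yet at the device. */
void sim_writel(struct sim_cpu *cpu, uint32_t val)
{
    assert(cpu->npending < SIM_QUEUE_MAX);
    cpu->pending[cpu->npending++] = val;
}

/* Deliver everything this CPU has queued, in issue order. */
void sim_drain(struct sim_cpu *cpu, struct sim_dev *dev)
{
    int i;

    for (i = 0; i < cpu->npending; i++)
        dev->seen[dev->nseen++] = cpu->pending[i];
    cpu->npending = 0;
}

/*
 * CPU A writes 1, 2 in its critical section, then CPU B writes 3, 4 in
 * its own. The lock is implied by the strict A-then-B order of the calls.
 * Whatever is still queued at the end is delivered with B's queue first.
 */
void sim_run(struct sim_dev *dev, int use_barrier)
{
    struct sim_cpu a, b;

    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    memset(dev, 0, sizeof(*dev));

    sim_writel(&a, 1);
    sim_writel(&a, 2);
    if (use_barrier)
        sim_drain(&a, dev);     /* stands in for mmiowb() before unlock */

    sim_writel(&b, 3);
    sim_writel(&b, 4);
    if (use_barrier)
        sim_drain(&b, dev);     /* stands in for mmiowb() before unlock */

    sim_drain(&b, dev);         /* unlucky delivery order for leftovers */
    sim_drain(&a, dev);
}
```

In this model sim_run(&dev, 0) leaves the device having seen 3, 4, 1, 2,
even though A left its critical section before B entered, while
sim_run(&dev, 1) gives 1, 2, 3, 4.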

Free (and commercial) software is based on the ''social''
definition of ''works'', perhaps regrettably, which means that
if enough people don't see errors, the errors don't ''exist''.

Then there are forward-looking people like Linus, who was using
an Alpha to develop the kernel (and more recently a G5 IIRC)
precisely to create for himself itches to scratch, triggered by
configurations that the overwhelming majority of x86 users would
not see...

BTW, in a related issue Linus has said that one of the big
problems with kernel development is that a lot of people just
don't get race conditions (which are an intrinsically hard
subject anyhow), and that this has influenced his overall kernel
design.