2002-02-19 01:45:11

by Dan Maas

Subject: readl/writel and memory barriers

Are the PCI memory access functions like readl() and writel() supposed to
enforce ordering without explicit memory barriers?

I've heard inconsistent reports - Benjamin Herrenschmidt pointed out that on
PPC, the definitions of readl() and writel() include memory barriers. But
the code example on page 229 of Rubini and Corbet's "Linux Device Drivers"
2nd ed. suggests that an explicit wmb() is needed to preserve ordering of
writel()s.

In a quick survey of architectures that need explicit memory barriers to
enforce ordering of PCI accesses, it seems that alpha and PPC include memory
barriers inside readl() and writel(), whereas MIPS, sparc64, ia64, and s390
do not include them. (I'm not intimately familiar with these architectures
so forgive me if I got some wrong...). What is the official story here?

Regards,
Dan


2002-02-19 09:17:38

by Alan

Subject: Re: readl/writel and memory barriers

> In a quick survey of architectures that need explicit memory barriers to
> enforce ordering of PCI accesses, it seems that alpha and PPC include memory
> barriers inside readl() and writel(), whereas MIPS, sparc64, ia64, and s390

Alpha and PPC include them; on x86 it's handled by the hardware.
__raw_read/write* are a bit more exciting.

> do not include them. (I'm not intimately familiar with these architectures
> so forgive me if I got some wrong...). What is the official story here?

To quote from the Documentation dir..

<para>
The read and write functions are defined to be ordered. That is, the
compiler is not permitted to reorder the I/O sequence. When the
ordering can be compiler optimised, you can use <function>
__readb</function> and friends to indicate the relaxed ordering. Use
this with care. The <function>rmb</function> provides a read memory
barrier. The <function>wmb</function> provides a write memory barrier.
</para>

<para>
While the basic functions are defined to be synchronous with respect
to each other and ordered with respect to each other, the busses the
devices sit on may themselves have asynchronicity. In particular, many
authors are burned by the fact that PCI bus writes are posted
asynchronously. A driver author must issue a read from the same
device to ensure that writes have occurred in the specific cases the
author cares about. This kind of property cannot be hidden from driver
writers in the API.
</para>
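
The read-back idiom described in the quoted documentation can be sketched
in userspace as below. This is only an illustration, not kernel code: the
register layout, the fake_bar array, and the sketch_writel()/sketch_readl()
helpers are hypothetical stand-ins for a mapped PCI BAR and the kernel's
writel()/readl(). On real hardware it is the read from the same device
that forces the posted write to complete; here it merely shows the shape
of the sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices of an imaginary device (not from the thread). */
enum { REG_CTRL = 0, REG_STATUS = 1, REG_COUNT = 2 };

/* Stand-in for a mapped PCI BAR; a real driver would obtain this mapping
 * from the kernel rather than use a static array. */
static volatile uint32_t fake_bar[REG_COUNT];

/* Userspace stand-ins for writel()/readl(); the names are made up. */
static void sketch_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
}

static uint32_t sketch_readl(const volatile uint32_t *addr)
{
    return *addr;
}

/* Write a control value, then read from the same device. On a real PCI
 * bus the read cannot complete until the earlier posted write has
 * reached the device, so the caller knows the write has landed. */
static uint32_t write_ctrl_and_flush(uint32_t val)
{
    sketch_writel(val, &fake_bar[REG_CTRL]);
    return sketch_readl(&fake_bar[REG_CTRL]);
}
```

A caller would use write_ctrl_and_flush() only in the specific places
where it matters that the device has seen the write, since the read
costs a full bus round trip.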

2002-02-19 17:11:10

by David Mosberger

Subject: Re: readl/writel and memory barriers

>>>>> On Mon, 18 Feb 2002 20:45:29 -0500, "Dan Maas" <[email protected]> said:

Dan> In a quick survey of architectures that need explicit memory
Dan> barriers to enforce ordering of PCI accesses, it seems that
Dan> alpha and PPC include memory barriers inside readl() and
Dan> writel(), whereas MIPS, sparc64, ia64, and s390 do not include
Dan> them. (I'm not intimately familiar with these architectures so
Dan> forgive me if I got some wrong...). What is the official story
Dan> here?

On ia64, the fact that readl()/writel() are accessing uncached space
ensures the CPU doesn't reorder the accesses. Furthermore, the
accesses are performed through "volatile" pointers, which ensures that
the compiler doesn't reorder them (and, as a side-effect, such
pointers also generate ordered loads/stores, but this isn't strictly
needed, due to accessing uncached space).

--david

2002-02-19 18:35:51

by Jesse Barnes

Subject: Re: readl/writel and memory barriers

On Tue, Feb 19, 2002 at 09:10:44AM -0800, David Mosberger wrote:
> On ia64, the fact that readl()/writel() are accessing uncached space
> ensures the CPU doesn't reorder the accesses. Furthermore, the
> accesses are performed through "volatile" pointers, which ensures that
> the compiler doesn't reorder them (and, as a side-effect, such
> pointers also generate ordered loads/stores, but this isn't strictly
> needed, due to accessing uncached space).

Making a variable volatile doesn't guarantee that the compiler won't
reorder references to it, AFAIK. And on some platforms, even uncached
I/O references aren't necessarily ordered.

To avoid the overhead of having I/O flushed on every memory barrier
and readX/writeX operation, we've introduced mmiob() on ia64, which
explicitly orders I/O space accesses. Some ports have chosen to take
the performance hit in every readX/writeX, memory barrier, and
spinlock however (e.g. PPC64, MIPS).

Is this a reasonable approach? Is it acceptable to have a separate
barrier operation for I/O space? If so, perhaps other archs would be
willing to add mmiob() ops?

Thanks,
Jesse

2002-02-19 19:33:44

by David Mosberger

Subject: Re: readl/writel and memory barriers

>>>>> On Tue, 19 Feb 2002 10:35:06 -0800, Jesse Barnes <[email protected]> said:

Jesse> Making a variable volatile doesn't guarantee that the
Jesse> compiler won't reorder references to it, AFAIK. And on some
Jesse> platforms, even uncached I/O references aren't necessarily
Jesse> ordered.

It certainly does on an ia64-compliant system. Check section 9.3
"Multi-threaded Code" in the "Itanium Software Conventions and Runtime
Architecture manual".

Jesse> To avoid the overhead of having I/O flushed on every memory
Jesse> barrier and readX/writeX operation, we've introduced mmiob()
Jesse> on ia64, which explicitly orders I/O space accesses. Some
Jesse> ports have chosen to take the performance hit in every
Jesse> readX/writeX, memory barrier, and spinlock however
Jesse> (e.g. PPC64, MIPS).

I think this is a bit of a different problem. On non-NUMA platforms,
the performance hit of enforcing order is not huge. Basically, as
long as the CPU issues the accesses in order, you'll be fine.

Now, with NUMA platforms, where the chipsets/switch may re-order
accesses, the performance hit will be much bigger, so the old scheme
may not be sufficient.

Jesse> Is this a reasonable approach? Is it acceptable to have a
Jesse> separate barrier operation for I/O space? If so, perhaps
Jesse> other archs would be willing to add mmiob() ops?

I'm no NUMA expert, but my guess is that nobody will want to go
through all the existing drivers to change them to use mmiob(). For
new drivers, it might be OK.

--david

2002-02-19 19:42:57

by Jesse Barnes

Subject: Re: readl/writel and memory barriers

On Tue, Feb 19, 2002 at 11:33:22AM -0800, David Mosberger wrote:
> It certainly does on an ia64-compliant system. Check section 9.3
> "Multi-threaded Code" in the "Itanium Software Conventions and Runtime
> Architecture manual".

I don't have that doc handy, but I'll trust your judgement...

> Now, with NUMA platforms, where the chipsets/switch may re-order
> accesses, the performance hit will be much bigger, so the old scheme
> may not be sufficient.

Right. I still have to do some performance measurements, but I
suspect that as the system size goes up, we'll see the I/O ordering
penalty increase. It'll probably get noticeable at around 64p.

> I'm no NUMA expert, but my guess is that nobody will want to go
> through all the existing drivers to change them to use mmiob(). For
> new drivers, it might be OK.

The source level impact should actually be pretty small. An mmiob()
prior to the spin_unlock in a critical section that does I/O usually
suffices. Maybe it would be a good idea to have io_spin_lock and
io_spin_unlock for this purpose?

Thanks,
Jesse
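
The io_spin_lock/io_spin_unlock idea above can be sketched in userspace
with C11 atomics. This is a rough illustration under stated assumptions,
not the ia64 patch: the spinlock is a plain atomic_flag, and
sketch_mmiob() is a hypothetical stand-in that issues an ordinary memory
fence and counts its calls, since the real mmiob() orders I/O space
accesses in a platform-specific way that cannot be reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>

/* A toy spinlock; the kernel's spinlock_t is not used here. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

/* Counts barrier invocations so the behaviour can be observed. */
static int mmiob_calls;

/* Hypothetical stand-in for mmiob(): a real implementation would order
 * outstanding I/O space accesses, which a CPU memory fence does not do. */
static void sketch_mmiob(void)
{
    mmiob_calls++;
    atomic_thread_fence(memory_order_seq_cst);
}

static void io_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ; /* spin until the previous holder releases the lock */
}

/* Issue the I/O barrier first, then release the lock, so the next
 * holder's device accesses cannot mix with ours. */
static void io_spin_unlock(atomic_flag *lock)
{
    sketch_mmiob();
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* A critical section that would contain device register accesses. */
static int demo_io_critical_section(void)
{
    io_spin_lock(&sketch_lock);
    /* readX()/writeX() calls to the device would go here */
    io_spin_unlock(&sketch_lock);
    return mmiob_calls;
}
```

Folding the barrier into the unlock keeps the source-level change small:
a driver swaps its lock and unlock calls around I/O critical sections
and nothing else, and sections that do no I/O keep the cheaper plain
unlock.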

2002-02-19 20:11:33

by Dan Maas

Subject: Re: readl/writel and memory barriers

Jesse Barnes wrote:
> To avoid the overhead of having I/O flushed on every
> memory barrier and readX/writeX operation, we've introduced
> mmiob() on ia64, which explicitly orders I/O space accesses.
> Some ports have chosen to take the performance hit in every
> readX/writeX, memory barrier, and spinlock however
> (e.g. PPC64, MIPS).

I have a hunch that many drivers will break if you change the semantics of
readX/writeX from in-order to out-of-order - lots of drivers are only
developed & tested on x86, which completely hides the issue...

If you consider the performance cost of in-order readX/writeX to be
significant, then I would suggest adding another group of readX/writeX APIs
that explicitly allow out-of-order PCI access. (__raw_readX/__raw_writeX
seem to offer this already on some platforms...)

Regards,
Dan

2002-02-19 20:25:08

by Jesse Barnes

Subject: Re: readl/writel and memory barriers

On Tue, Feb 19, 2002 at 03:11:45PM -0500, Dan Maas wrote:
> I have a hunch that many drivers will break if you change the semantics of
> readX/writeX from in-order to out-of-order - lots of drivers are only
> developed & tested on x86, which completely hides the issue...

Fortunately, I don't think things are quite that bad. As David
pointed out, on ia64 the readX/writeX stuff is ordered coming out of
the CPU, so if you're in a spinlock protected region, for example, all
the reads/writes you do will occur in order. The problem that I'm
trying to solve is that on some platforms, I/O space references won't
necessarily occur in order if they come from different CPUs. E.g.
after you do some I/O and drop a spinlock, another CPU may pick it up
and start doing some I/O that *may* get intermixed with the I/O from
the previous holder of the spinlock unless you explicitly barrier said
I/O.

Any ideas on how to address this issue? I was thinking of either
introducing an I/O space barrier (currently called mmiob() in the 2.5
ia64 patch) or taking the performance hit in mb, rmb, and wmb, as well
as readX/writeX to ensure proper ordering. Or, as I mentioned in
another mail, we could have a special io_spin_unlock routine that does
the barrier for you. Comments?

Thanks,
Jesse

2002-02-19 22:06:16

by Keith Owens

Subject: Re: readl/writel and memory barriers

On Tue, 19 Feb 2002 10:35:06 -0800,
Jesse Barnes <[email protected]> wrote:
>Making a variable volatile doesn't guarantee that the compiler won't
>reorder references to it, AFAIK. And on some platforms, even uncached
>I/O references aren't necessarily ordered.

Ignoring the issue of hardware that reorders I/O, volatile accesses
must not be reordered by the compiler. From a C9X draft (1999, anybody
have the current C standard online?) :-

5.1.2.3 [#2]

Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects which are changes in the state of the execution environment.
Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points,
all side effects of previous evaluations shall be complete and no
side effects of subsequent evaluations shall have taken place.

5.1.2.3 [#6]

The least requirements on a conforming implementation are:

-- At sequence points, volatile objects are stable in the sense
that previous accesses are complete and subsequent accesses have
not yet occurred.

The compiler may not reorder volatile accesses across sequence points.

volatile int *a, *b;
int c;

c = *a + *b; // no sequence point, access order to a, b is undefined

c = *a; // compiler must not convert to the above format, it
c += *b; // must access a then b


2002-02-19 22:17:41

by Jesse Barnes

Subject: Re: readl/writel and memory barriers

On Wed, Feb 20, 2002 at 09:05:37AM +1100, Keith Owens wrote:
> On Tue, 19 Feb 2002 10:35:06 -0800,
> Jesse Barnes <[email protected]> wrote:
> >Making a variable volatile doesn't guarantee that the compiler won't
> >reorder references to it, AFAIK.
>
> Ignoring the issue of hardware that reorders I/O, volatile accesses
> must not be reordered by the compiler. From a C9X draft (1999, anybody
> have the current C standard online?) :-

Of course volatile references must be ordered wrt each other, but a
reference to a volatile doesn't preclude the compiler from moving it
above or below accesses to other variables. That is, it doesn't act
as an optimization barrier. Sound right? I guess I'm getting a
little off-topic here...

Jesse
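
The distinction drawn above, between ordering volatile accesses with
respect to each other and acting as an optimization barrier for other
variables, can be sketched as follows. This assumes a GCC- or
Clang-compatible compiler; sketch_barrier(), doorbell, and descriptor are
hypothetical names. The empty asm with a "memory" clobber is a
compiler-only barrier, similar in spirit to the kernel's barrier(); it
says nothing about CPU or bus reordering.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): the compiler may not move
 * memory accesses across it. It emits no instruction. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

static volatile uint32_t doorbell;  /* stand-in for a device register */
static uint32_t descriptor;         /* ordinary memory the device would read */

static uint32_t post_descriptor(uint32_t value)
{
    /* Plain store: without the barrier below, the compiler would be
     * free to move it after the volatile store to doorbell. */
    descriptor = value;

    /* Keep the descriptor store ahead of the doorbell store in the
     * generated code. */
    sketch_barrier();

    /* Volatile store: ordered only against other volatile accesses. */
    doorbell = 1;

    return descriptor + doorbell;
}
```

Declaring descriptor volatile as well would also keep the two stores in
program order, at the cost of pessimizing every other access to it.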

2002-02-21 00:35:10

by Randy.Dunlap

Subject: Re: readl/writel and memory barriers

On Wed, 20 Feb 2002, Keith Owens wrote:

| On Tue, 19 Feb 2002 10:35:06 -0800,
| Jesse Barnes <[email protected]> wrote:
| >Making a variable volatile doesn't guarantee that the compiler won't
| >reorder references to it, AFAIK. And on some platforms, even uncached
| >I/O references aren't necessarily ordered.
|
| Ignoring the issue of hardware that reorders I/O, volatile accesses
| must not be reordered by the compiler. From a C9X draft (1999, anybody
| have the current C standard online?) :-
PDF file, for about US$18 - US$20, downloaded from ISO.

| 5.1.2.3 [#2]
|
| Accessing a volatile object, modifying an object, modifying a file,
| or calling a function that does any of those operations are all side
| effects which are changes in the state of the execution environment.
| Evaluation of an expression may produce side effects. At certain
| specified points in the execution sequence called sequence points,
| all side effects of previous evaluations shall be complete and no
| side effects of subsequent evaluations shall have taken place.
No changes here.

| 5.1.2.3 [#6]
|
| The least requirements on a conforming implementation are:
|
| -- At sequence points, volatile objects are stable in the sense
| that previous accesses are complete and subsequent accesses have
| not yet occurred.
Same text, although it's #5 now.

| The compiler may not reorder volatile accesses across sequence points.
|
| volatile int *a, *b;
| int c;
|
| c = *a + *b; // no sequence point, access order to a, b is undefined
|
| c = *a; // compiler must not convert to the above format, it
| c += *b; // must access a then b
|
|

---
~Randy


2002-02-25 03:56:37

by Daniel Phillips

Subject: Re: readl/writel and memory barriers

On February 21, 2002 01:29 am, Randy.Dunlap wrote:
> On Wed, 20 Feb 2002, Keith Owens wrote:
> | Ignoring the issue of hardware that reorders I/O, volatile accesses
> | must not be reordered by the compiler. From a C9X draft (1999, anybody
> | have the current C standard online?) :-
> PDF file, for about US$18 - US$20, downloaded from ISO.

The drafts are supposed to be public.

--
Daniel

2002-02-25 16:26:41

by Randy.Dunlap

Subject: Re: readl/writel and memory barriers

On Sat, 23 Feb 2002, Daniel Phillips wrote:

| On February 21, 2002 01:29 am, Randy.Dunlap wrote:
| > On Wed, 20 Feb 2002, Keith Owens wrote:
| > | Ignoring the issue of hardware that reorders I/O, volatile accesses
| > | must not be reordered by the compiler. From a C9X draft (1999, anybody
| > | have the current C standard online?) :-
| > PDF file, for about US$18 - US$20, downloaded from ISO.
|
| The drafts are supposed to be public.

We probably aren't disagreeing here.
I was writing about a released standard, not a draft.
I thought that's what Keith meant by "current C standard."

--
~Randy