2010-04-06 13:58:00

by Jamie Lokier

Subject: Re: [PATCH 1/3] X86: Optimise fls(), ffs() and fls64()

Linus Torvalds wrote:
> On Fri, 26 Mar 2010, Scott Lurndal wrote:
> >
> > I wonder if Intel's EM64 stuff makes this more deterministic, perhaps
> > David's implementation would work for x86_64 only?
>
> Limiting it to x86-64 would certainly remove all the worries about all the
> historical x86 clones.
>
> I'd still worry about it for future Intel chips, though. I absolutely
> _detest_ relying on undocumented features - it pretty much always ends up
> biting you eventually. And conditional writeback is actually pretty nasty
> from a microarchitectural standpoint.

On the same subject of relying on undocumented features:

/* If SMP and !X86_PPRO_FENCE. */
#define smp_rmb() barrier()

I've seen documentation, links posted to lkml ages ago, which implies
this is fine on 64-bit for both Intel and AMD.

But it appears to be relying on undocumented behaviour on 32-bit...

Are you sure it is ok? Has anyone from Intel/AMD ever confirmed it is
ok? Has it been tested? Clones?

-- Jamie


2010-04-06 14:44:37

by Linus Torvalds

Subject: Re: [PATCH 1/3] X86: Optimise fls(), ffs() and fls64()



On Tue, 6 Apr 2010, Jamie Lokier wrote:
>
> On the same subject of relying on undocumented features:
>
> /* If SMP and !X86_PPRO_FENCE. */
> #define smp_rmb() barrier()
>
> I've seen documentation, links posted to lkml ages ago, which implies
> this is fine on 64-bit for both Intel and AMD.
>
> But it appears to be relying on undocumented behaviour on 32-bit...

That memory ordering whitepaper is very much supposed to cover all the
32-bit CPUs too. The people involved were convinced that neither AMD nor
Intel had ever produced anything that would do anything that broke the
rules.

In fact, at least the Intel "memory ordering whitepaper" doesn't even
exist any more. Go to intel.com and search, and you'll find:

"Intel® 64 Architecture Memory Ordering White Paper

This document has been merged into Volume 3A of Intel 64 and IA-32
Architectures Software Developers Manual."

which makes it pretty clear that it's not a 64-bit vs 32-bit issue.

> Are you sure it is ok? Has anyone from Intel/AMD ever confirmed it is
> ok? Has it been tested? Clones?

No clones need apply - nobody ever did very aggressive memory re-ordering,
and clones generally never did SMP either.

There is a VIA chip (I think) that had some relaxed cache mode, but that
needed a cr4 bit enable or similar, and since it wasn't SMP it only
mattered for DMA (and possibly nontemporal stores).

Anyway, it all boils down to: yes, we can depend on the memory ordering.

Linus
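
For readers who want to see what the smp_rmb() question in this thread
amounts to, here is a minimal userspace sketch of the reader/writer pattern
that depends on it. It is not kernel code: it uses C11 atomics and pthreads,
and the names payload, ready, sketch_smp_rmb() and run_message_passing() are
hypothetical. The assumption being illustrated is that on x86 an acquire
fence normally needs no fence instruction and only constrains the compiler,
which is the same effect as defining smp_rmb() as barrier().

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag announcing that payload is valid */

/*
 * Hypothetical stand-in for the reader-side smp_rmb().
 * Assumption: on x86, compilers typically emit no instruction for an
 * acquire fence, so it acts as a compiler barrier only.
 */
static inline void sketch_smp_rmb(void)
{
    atomic_thread_fence(memory_order_acquire);
}

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the payload write is ordered before the flag write. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* Spin until the writer raises the flag. */
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;
    /* Keep the payload load from being reordered before the flag load. */
    sketch_smp_rmb();
    *out = payload;
    return NULL;
}

/* Runs one writer and one reader; returns what the reader saw, or -1. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);
    if (pthread_create(&r, NULL, reader, &seen) != 0)
        return -1;
    if (pthread_create(&w, NULL, writer, NULL) != 0) {
        /* Raise the flag ourselves so the spinning reader can finish. */
        atomic_store_explicit(&ready, 1, memory_order_release);
        pthread_join(r, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

If the hardware could reorder the two loads in reader(), the reader might
see the flag set and still read a stale payload. The thread's conclusion is
that x86 parts do not do this, so the reader side only has to stop the
compiler from reordering.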