LinuxLists.cc - [PATCH 1/10] Cr4 is valid on some 486s

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Pavel Machek wrote:
> Hi!
>
>
>>So some 486 processors do have CR4 register. Allow them to present it in
>>register dumps by using the old fault technique rather than testing processor
>>family.
>
>
> I thought Andi commented this as "way too risky", for little
> good. Nested exceptions are evil.
> Pavel

I think the 486's that have CR4 are the same that have CPUID, and thus
can be tested for by the presence of the ID flag.

-hpa

2005-11-11 18:00:22

by Maciej W. Rozycki

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Fri, 11 Nov 2005, H. Peter Anvin wrote:

> I think the 486's that have CR4 are the same that have CPUID, and thus can be
> tested for by the presence of the ID flag.

That's correct; for our purposes a 486 that would implement CR4 but not
CPUID would not be interesting anyway, as we don't use CR4 elsewhere but
for features discovered through CPUID. And I don't think there's ever
been an implementation that had CPUID but no CR4.

Maciej

2005-11-11 19:38:57

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Pavel Machek wrote:

>Hi!
>
>
>
>>So some 486 processors do have CR4 register. Allow them to present it in
>>register dumps by using the old fault technique rather than testing processor
>>family.
>>
>>
>
>I thought Andi commented this as "way too risky", for little
>good. Nested exceptions are evil.
>
>

I didn't see Andi's comment to that effect. I may have originally
argued that when I made CR4 reads depend on CPU family. But I think it
is useful to know if PSE is enabled, especially on 486s that do support it.

Agree nested exceptions are evil. But where is this called from
exception context?

1) softlockup_tick appears to be perfectly safe call site to handle
exceptions
2) sysrq-p is also a fine site.

I tested this by assembling a hacked safe_read_cr1() macro, and dumped
the contents of my non-existent CR1 register in show_regs to prove the
fault handling correct (although the code already _looks_ correct, I
thought someone might ask the question you just did). :)

Zach

2005-11-11 19:58:42

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Fri, 11 Nov 2005, Zachary Amsden wrote:
>
> Agree nested exceptions are evil. But where is this called from exception
> context?

We have really nice ways of handling these things, so we should just use
them.

For example, you can do

static inline unsigned long read_cr4(void)
{
        unsigned long cr4;
        alternative_input("xorl %0,%0",
                          "movl %%cr4,%0",
                          X86_FEATURE_CR4,
                          "r" (cr4));
        return cr4;
}

and then just add that feature-flag discovery early on in boot (it needs
to be pretty early, since the alternative instruction rewriting happens
early).

We have several "calculated" features already. Things like X86_FEATURE_P4
etc.

Linus

2005-11-11 20:14:05

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Linus Torvalds wrote:

>On Fri, 11 Nov 2005, Zachary Amsden wrote:
>
>
>>Agree nested exceptions are evil. But where is this called from exception
>>context?
>>
>>
>
>We have really nice ways of handling these things, so we should just use
>them.
>
>For example, you can do
>
> static inline unsigned long read_cr4(void)
> {
>         unsigned long cr4;
>         alternative_input("xorl %0,%0",
>                           "movl %%cr4,%0",
>                           X86_FEATURE_CR4,
>                           "r" (cr4));
>         return cr4;
> }
>
>and then just add that feature-flag discovery early on in boot (it needs
>to be pretty early, since the alternative instruction rewriting happens
>early).
>
>We have several "calculated" features already. Things like X86_FEATURE_P4
>etc.
>
>

Yes, this is fine, but is it worth writing the feature discovery code?
I suppose it doesn't matter, as it gets jettisoned after init. I guess
it is just preference.

Considering run time code size, the alternative approach wins, has no
extra branches, and is just nicer. The faulting technique requires two
extra dwords of space that cannot be jettisoned. So obviously, I must
do it (the alternative approach).

Could we consider doing the same with LOCK prefix for SMP kernels booted
on UP? Evil grin.

Zach

2005-11-11 20:22:19

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Fri, 11 Nov 2005, Zachary Amsden wrote:
>
> Yes, this is fine, but is it worth writing the feature discovery code? I
> suppose it doesn't matter, as it gets jettisoned after init. I guess it is
> just preference.

Well, you could do the feature discovery by trying to take a fault early
at boot-time. That's how we verify that write-protect works, and how we
check that math exceptions come in the right way..

> Could we consider doing the same with LOCK prefix for SMP kernels booted on
> UP? Evil grin.

Not so evil - I think it's been discussed. Not with alternates (not worth
it), but it wouldn't be hard to do: just add a new section for "lock
address", and have each inline asm that does a lock prefix do basically

        1:
                lock ; xyzzy

                .section .lock.address
                .long 1b
                .previous

and then just walk the ".lock.address" thing and turn all locks into 0x90
(nop).

Linus

2005-11-13 07:43:15

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Fri, Nov 11, 2005 at 12:22:07PM -0800, Linus Torvalds wrote:
>
>
> On Fri, 11 Nov 2005, Zachary Amsden wrote:
> >
> > Yes, this is fine, but is it worth writing the feature discovery code? I
> > suppose it doesn't matter, as it gets jettisoned after init. I guess it is
> > just preference.
>
> Well, you could do the feature discovery by trying to take a fault early
> at boot-time. That's how we verify that write-protect works, and how we
> check that math exceptions come in the right way..
>
> > Could we consider doing the same with LOCK prefix for SMP kernels booted on
> > UP? Evil grin.
>
> Not so evil - I think it's been discussed. Not with alternates (not worth
> it), but it wouldn't be hard to do: just add a new section for "lock
> address", and have each inline asm that does a lock prefix do basically
>
> 1:
> lock ; xyzzy
>
> .section .lock.address
> .long 1b
> .previous
>
> and then just walk the ".lock.address" thing and turn all locks into 0x90
> (nop).

Looks like the Ubuntu people already did this...

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-2.6.git;a=commitdiff;h=048985336e32efe665cddd348e92e4a4a5351415;hp=1cb630c2b5aaad7cedaa78aa135e6cecf5ab91ac

Dave

2005-11-13 11:00:15

by Andi Kleen

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Dave Jones writes:
>
> Looks like the Ubuntu people already did this...
>
> http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-2.6.git;a=commitdiff;h=048985336e32efe665cddd348e92e4a4a5351415;hp=1cb630c2b5aaad7cedaa78aa135e6cecf5ab91ac

It's probably not needed. At least AMD K7/K8 has a SYSCFG MSR bit to
do this (or rather, it disables bus cycles for locks, which makes them
very cheap). Intel has one too, in a different MSR, that looks similar.
With some luck they're even already set by the BIOS on UP systems. I
know they are on some AMD systems.

But overall the feature doesn't help longer term because single
threaded CPUs are on their way out.

-Andi

2005-11-13 16:55:59

by Alan

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sul, 2005-11-13 at 11:59 +0100, Andi Kleen wrote:
> Dave Jones writes:
> >
> > Looks like the Ubuntu people already did this...
> >
> > http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-2.6.git;a=commitdiff;h=048985336e32efe665cddd348e92e4a4a5351415;hp=1cb630c2b5aaad7cedaa78aa135e6cecf5ab91ac
>
> It's probably not needed. At least AMD K7/K8 has a SYSCFG MSR bit to
> do this (or rather they disable bus cycles for locks that makes them
> very cheap) Intel has one too in a different MSR that looks similar.
> With some luck they're even already set by the BIOS on UP systems. I
> know they are on some AMD systems.

I'd hope the vendors are not doing that by default because we have
kernel code that uses lock against not other processors but other bus
masters. The ECC code is one example. Is there any good info on the AMD
one so I can make the EDAC code put the processor back in x86 compatible
mode so that it behaves safely when scrubbing?

Alan

2005-11-13 17:10:54

by Eric W. Biederman

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Alan Cox writes:

> On Sul, 2005-11-13 at 11:59 +0100, Andi Kleen wrote:
>> Dave Jones writes:
>> >
>> > Looks like the Ubuntu people already did this...
>> >
>> >
> http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-2.6.git;a=commitdiff;h=048985336e32efe665cddd348e92e4a4a5351415;hp=1cb630c2b5aaad7cedaa78aa135e6cecf5ab91ac
>>
>> It's probably not needed. At least AMD K7/K8 has a SYSCFG MSR bit to
>> do this (or rather they disable bus cycles for locks that makes them
>> very cheap) Intel has one too in a different MSR that looks similar.
>> With some luck they're even already set by the BIOS on UP systems. I
>> know they are on some AMD systems.
>
> I'd hope the vendors are not doing that by default because we have
> kernel code that uses lock against not other processors but other bus
> masters. The ECC code is one example. Is there any good info on the AMD
> one so I can make the EDAC code put the processor back in x86 compatible
> mode so that it behaves safely when scrubbing.

Check out AMD's BIOS and Kernel Programmer Guide for the K8. The
appropriate bits are documented, although the documentation is quite
terse.

Eric

2005-11-13 18:59:45

by Andi Kleen

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sunday 13 November 2005 18:26, Alan Cox wrote:

> I'd hope the vendors are not doing that by default because we have
> kernel code that uses lock against not other processors but other bus
> masters. The ECC code is one example.

It's a bad hack anyways. It would probably be better to use an uncached WC write.
I would rather use that.

-Andi

2005-11-13 19:08:24

by Eric W. Biederman

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Andi Kleen writes:

> On Sunday 13 November 2005 18:26, Alan Cox wrote:
>
>> I'd hope the vendors are not doing that by default because we have
>> kernel code that uses lock against not other processors but other bus
>> masters. The ECC code is one example.
>
> It's a bad hack anyways. Better would be probably to use a uncached WC write.
> I would rather use that.

For read modify write?

The point is to make the cache line dirty so that the
memory controller will write the data back.

The interesting sequence is:
        lock; addl $0, (%reg)

I'm not actually sure the lock is even necessary. Mostly this is
for brain-dead chipsets, chipsets you can't trust, or at least
chipsets that won't do a background scrub for you.

I don't think it is possible to do an uncached read modify write?

Eric

2005-11-13 19:10:57

by Alan

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sul, 2005-11-13 at 20:00 +0100, Andi Kleen wrote:
> It's a bad hack anyways. Better would be probably to use a uncached WC write.
> I would rather use that.

I'm not clear that anything but lock operations have the required
guarantee of atomicity relative to bus masters which are not processors.
Especially so on intel.

2005-11-13 19:25:19

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sun, 13 Nov 2005, Dave Jones wrote:
>
> Looks like the Ubuntu people already did this...

Yeah, that looks like a sane patch, although I dislike the #ifdef config
option thing (either it works or it doesn't).

It also does it the right way: using LOCK_PREFIX means that you catch
exactly the users that depend on SMP, and not _all_ "lock" prefixes (as
mentioned, some of the lock prefixes are there as memory fences and are
valid and needed even on UP). So me likee.

The only question being whether you'd actually want to nop out the
spinlock instructions _entirely_ (in addition to changing the lock
prefixes to nops on things like semaphores). Without the lock, they're
not that expensive, but hey, it's still a useless (memory-modifying)
instruction.

Linus

2005-11-13 19:37:06

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sun, 13 Nov 2005, Alan Cox wrote:
>
> On Sul, 2005-11-13 at 20:00 +0100, Andi Kleen wrote:
> > It's a bad hack anyways. Better would be probably to use a uncached WC write.
> > I would rather use that.
>
> I'm not clear that anything but lock operations have the required
> guarantee of atomicity relative to bus masters which are not processors.
> Especially so on intel.

The thing is, we wouldn't ever remove _all_ lock prefixes. Only the ones
that already depend on SMP.

So the memory barriers etc that have lock prefixes even on UP would be
totally untouched.

Linus

2005-11-13 19:57:43

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Alan Cox wrote:
> On Sul, 2005-11-13 at 11:59 +0100, Andi Kleen wrote:
>
>>Dave Jones writes:
>>
>>>Looks like the Ubuntu people already did this...
>>>
>>>http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-2.6.git;a=commitdiff;h=048985336e32efe665cddd348e92e4a4a5351415;hp=1cb630c2b5aaad7cedaa78aa135e6cecf5ab91ac
>>
>>It's probably not needed. At least AMD K7/K8 has a SYSCFG MSR bit to
>>do this (or rather they disable bus cycles for locks that makes them
>>very cheap) Intel has one too in a different MSR that looks similar.
>>With some luck they're even already set by the BIOS on UP systems. I
>>know they are on some AMD systems.
>
> I'd hope the vendors are not doing that by default because we have
> kernel code that uses lock against not other processors but other bus
> masters. The ECC code is one example. Is there any good info on the AMD
> one so I can make the EDAC code put the processor back in x86 compatible
> mode so that it behaves safely when scrubbing.
>

I can't speak about AMD, but on Transmeta's CPUs operations against
cached memory are *always* atomic; the atomicity is guaranteed by the
cache hierarchy. The LOCK prefix does have effects against uncached memory.

-hpa

2005-11-13 20:30:08

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sun, 13 Nov 2005, Linus Torvalds wrote:
>
> The only question being whether you'd actually want to nop out the
> spinlock instructions _entirely_ (in addition to changing the nops on
> things like semaphores). Without the lock, they're not that expensive, but
> hey, it's still a useless (memory-modifying) instruction.

Actually, that may turn out to be a dangerous idea.

Sad but true: There's a few tests like

#define assert_spin_locked(x) BUG_ON(!spin_is_locked(x))

and

#define __raw_spin_unlock_wait(lock) \
do { while (__raw_spin_is_locked(lock)) cpu_relax(); } while (0)

that would also need to be nopped out if we nop out the code that updates
the spinlock (right now they are just disabled entirely on UP, exactly
because tests like this don't work without the lock being instantiated).

But it would be wonderful if we could just nop out the whole call to the
spinlock (most of them are out-of-line). It would help I$ footprint, and
likely help improve dynamic scheduling around that call on many CPU's too.

So we can easily remove the lock prefix on the spinlock ops, but sadly we
can't do some other "obvious" optimizations.

We _could_ nop out the actual conditional on the lock result for a
spinlock, and turn

lock ; decb %0
js ...

into

nop ; decb %0
multi-byte-nop

which would help avoid some unnecessary branch prediction etc.

Linus

2005-11-13 21:01:15

by Alan

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sul, 2005-11-13 at 11:36 -0800, Linus Torvalds wrote:
> The thing is, we wouldn't ever remove _all_ lock prefixes. Only the ones
> that already depend on SMP.
>
> So the memory barriers etc that have lock prefixes even on UP would be
> totally untouched.

That much makes sense. Having some magic MSR reloaded to turn lock
effects off is a bit more of a problem for ECC scrubbing however.

2005-11-14 07:46:39

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Sun, 2005-11-13 at 21:32 +0000, Alan Cox wrote:
> On Sul, 2005-11-13 at 11:36 -0800, Linus Torvalds wrote:
> > The thing is, we wouldn't ever remove _all_ lock prefixes. Only the ones
> > that already depend on SMP.
> >
> > So the memory barriers etc that have lock prefixes even on UP would be
> > totally untouched.
>
> That much makes sense. Having some magic MSR reloaded to turn lock
> effects off is a bit more of a problem for ECC scrubbing however.

Well... you can expect many BIOSes to have done the MSR hack for you
already... so if you can't cope with that you have to set the MSR to the
value you want it to have regardless.

2005-11-14 15:06:47

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Hi,

> We _could_ nop out the actual conditional on the lock result for a
> spinlock, and turn
>
> lock ; decb %0
> js ...
>
> into
>
> nop ; decb %0
> multi-byte-nop

Throwing another patch into the discussion ;)

Comes from some Xen guy. If I read the thing correctly it builds an ELF
section containing a table with both SMP and UP versions of the code
path, then patches in the one needed at runtime. Allows patching in both
directions (UP->SMP, SMP->UP) at runtime, for hotplugging (virtual)
CPUs. I'm not an inline asm expert though ...

Comments on that one?

Gerd

Attachments:

smp-alts.patch (17.87 kB)

2005-11-14 19:26:39

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Mon, 14 Nov 2005, Gerd Knorr wrote:
>
> Throwing another patch into the discussion ;)

Ouch, this one is really ugly.

If you want to go this way, then you should instead add an X86_FEATURE_SMP
that gets cleared on UP and on SMP with just one core (and detect when CPU
hotplug ain't gonna happen ;), and then do

#ifdef CONFIG_SMP
#define smp_alternative(x,y) alternative(x,y,X86_FEATURE_SMP)
#else
#define smp_alternative(x,y) asm(x)
#endif

or something similar, instead of creating a totally new infrastructure to
do the thing that "alternative()" already does.

(Yeah, the above doesn't really work, since usually the SMP form is the
longer one, and "alternative()" wants the long complex one first. So maybe
the x86 feature needs to be "X86_FEATURE_UP" instead, since it's now a
"feature" to only have one core ;)

Linus

2005-11-14 19:46:09

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Linus Torvalds wrote:

>On Mon, 14 Nov 2005, Gerd Knorr wrote:
>
>
>>Throwing another patch into the discussion ;)
>>
>>
>
>Ouch, this one is really ugly.
>
>If you want to go this way, then you should instead add an X86_FEATURE_SMP
>that gets cleared on UP and on SMP with just one core (and detect when CPU
>hotplug ain't gonna happen ;), and then do
>
> #ifdef CONFIG_SMP
> #define smp_alternative(x,y) alternative(x,y,X86_FEATURE_SMP)
> #else
> #define smp_alternative(x,y) asm(x)
> #endif
>
>or something similar, instead of creating a totally new infrastructure to
>do the thing that "alternative()" already does.
>
>(Yeah, the above doesn't really work, since usually the SMP form is the
>longer one, and "alternative()" wants the long complex one first. So maybe
>the x86 feature needs to be "X86_FEATURE_UP" instead, since it's now a
>"feature" to only have one core ;)
>
>

It seems that SMP vs. UP lock / spinlock overhead is relevant even for
future, multi-core CPUs in a virtualization context, as the notion of
hotplug here is based on scheduling constraints of the virtualization
engine, and the kernel can quite readily end up with only one VCPU.

But it also seems that there are separate, competing mechanisms for
implementing this dynamic code change, which is undesirable. The notion
of boot-time dynamic code change for SMP is useful for native hardware.
Run-time dynamic code change is useful for virtual hardware, and
minimally useful for hardware CPU hotplug. Run-time dynamic code change
is also useful on virtual hardware if you consider live kernel
migrations across CPUs from different vendors, or with different
features. Again, this is minimally useful for hardware CPU hotplug.

But in essence, there should be one nice way to encapsulate this code
modification that lives for both run-time and boot-time code. The
boot-time modifiers can jettison the alternative tables, and the
run-time guys (which might include CPU hotplug) can keep those
alternatives around so they can be unapplied later. One can even
imagine more complex alternative features (if I have SSE2, use code X,
but if SSE3 is available use code Y, else fall back to code Z) being
useful at some point.

Both points combined are a basic argument for providing an alternative
choice function in apply_alternatives, which takes as input the
alternative specification, and returns a pointer to the chosen code.
This function can be driven by dynamic data (number of plugged CPUs), or
by static specifications (feature spec in the alternative section).

Zach

2005-11-14 19:53:29

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Mon, 2005-11-14 at 11:46 -0800, Zachary Amsden wrote:

> It seems that SMP vs. UP lock / spinlock overhead is relevant even for
> future, multi-core CPUs in a virtualization context, as the notion of
> hotplug here is based on scheduling constraints of the virtualization
> engine, and the kernel can quite readily end up with only one VCPU.

this assumes that you don't just always want to assume and use SMP
primitives in a virtualized context. I sort of question that assumption;
sure these things have overhead, especially "lock", but if the solution
is more complexity and weird things to hide that half-percent or less of
performance difference... then do remember that such complexity is not
free either. Runtime tricks cost.

2005-11-14 20:34:09

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Arjan van de Ven wrote:

>On Mon, 2005-11-14 at 11:46 -0800, Zachary Amsden wrote:
>
>
>
>>It seems that SMP vs. UP lock / spinlock overhead is relevant even for
>>future, multi-core CPUs in a virtualization context, as the notion of
>>hotplug here is based on scheduling constraints of the virtualization
>>engine, and the kernel can quite readily end up with only one VCPU.
>>
>>
>
>
>this assumes that you don't just always want to assume and use SMP
>primitives in a virtualized context. I sort of question that assumption;
>sure these things have overhead, especially "lock", but if the solution
>is more complexity and weird things to hide that half-percent or less of
>performance difference... then do remember that such complexity is not
>free either. Runtime tricks cost.
>
>

Runtime tricks that increase complexity cost, yes. It's all a question
of measured gain vs. complexity. But a couple of percent gained on an
overall basis can be magnified enormously if you are looking at a
workload that stresses a particular path. I would expect some of those
gains to be non-trivial, especially if considering the optimizations you
could do on page table updates knowing you needn't worry about SMP
issues anymore. Even UP has (still?) some places where additional locks
are present here, and could benefit from having SMP alternatives.

Zach

2005-11-14 20:52:59

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

>
> Runtime tricks that increase complexity cost, yes. It's all a question
> of measured gain vs. complexity. But a couple of percent gained on an
> overall basis can be magnified enormously if you are looking at a
> workload that stresses a particular path.

A couple of percent sounds really really high to me. If it's really
that then I think Andi's conclusion is wrong with respect to that
locking cliff; if we spend a few percent of our performance on locks in
the uncontended case we're way over the edge in my opinion.

> I would expect some of those
> gains to be non-trivial, especially if considering the optimizations you
> could do on page table updates knowing you needn't worry about SMP

Page table updates happen in the hypervisor in a Xen-like
paravirtualized setup, right? So that happens outside the kernel.

2005-11-15 14:12:22

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Linus Torvalds wrote:
>
> On Mon, 14 Nov 2005, Gerd Knorr wrote:
>> Throwing another patch into the discussion ;)
>
> Ouch, this one is really ugly.

I somehow expected that answer; it took me quite some time to figure
out what the patch does. It certainly needs at least a number of cleanups
before I'd consider it mergeable. The alternative() macro is much easier
to read.

> If you want to go this way, then you should instead add an X86_FEATURE_SMP
> that gets cleared on UP and on SMP with just one core (and detect when CPU
> hotplug ain't gonna happen ;), and then do

Well, the "no hotplug" probably is exactly the reason why the patch
doesn't use the existing alternatives mechanism, it's a boot-time
one-way ticket. The xenified linux kernel actually switches both ways
at runtime if you plug in/out a second virtual CPU.

> #ifdef CONFIG_SMP
> #define smp_alternative(x,y) alternative(x,y,X86_FEATURE_SMP)
> #else
> #define smp_alternative(x,y) asm(x)
> #endif

I don't like the idea very much. That covers only 50% of what the patch
does, you can patch SMP => UP but not the other way around. Doesn't
matter much on real hardware, but for virtual it is quite useful.

> or something similar, instead of creating a totally new infrastructure to
> do the thing that "alternative()" already does.

Yep, extending alternatives is probably better than duplicating the
code. Maybe having some alternative_smp() macro which places both code
versions into the .altinstr_replacement table? If that sounds ok I'll
try to come up with an experimental patch. If not: other ideas are welcome.

cheers,

Gerd

2005-11-15 16:01:33

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

> Yep, extending alternatives is probably better than duplicating the
> code. Maybe having some alternative_smp() macro which places both code
> versions into the .altinstr_replacement table? If that sounds ok I'll
> try to come up with a experimental patch.

i.e. something like this (as basic idea, patch is far away from doing
anything useful ...)?

Gerd

Attachments:

smp-alternatives.diff (3.00 kB)

2005-11-15 16:04:35

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Gerd Knorr wrote:

>> Yep, extending alternatives is probably better than duplicating the
>> code. Maybe having some alternative_smp() macro which places both
>> code versions into the .altinstr_replacement table? If that sounds
>> ok I'll try to come up with a experimental patch.
>
>
> i.e. something like this (as basic idea, patch is far away from doing
> anything useful ...)?

You still need to preserve the originals so that you can patch in both
directions. In the dynamic scenario, you need a multi-way set of
alternatives, with the most conservative of those compiled in inline.

Zach

2005-11-15 16:06:40

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, 2005-11-15 at 08:04 -0800, Zachary Amsden wrote:
> Gerd Knorr wrote:
>
> >> Yep, extending alternatives is probably better than duplicating the
> >> code. Maybe having some alternative_smp() macro which places both
> >> code versions into the .altinstr_replacement table? If that sounds
> >> ok I'll try to come up with a experimental patch.
> >
> >
> > i.e. something like this (as basic idea, patch is far away from doing
> > anything useful ...)?
>
>
> You still need to preserve the originals so that you can patch in both
> directions.

why do you insist on both directions? That still sounds like real
overkill to me.

2005-11-15 16:08:46

by Roland Dreier

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

> +#define alternative_smp(smpinstr, upinstr) asm(upinstr, ##input)

this wouldn't build with CONFIG_SMP=n -- you forgot the input param here.

also, given this:

> + BUG_ON(a->replacementlen > a->instrlen);

is there any way to at least catch it at compile time if the UP
alternative ends up longer than the SMP alternative?

- R.

2005-11-15 16:11:34

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, Nov 15, 2005 at 05:06:03PM +0100, Arjan van de Ven wrote:

> > You still need to preserve the originals so that you can patch in both
> > directions.
>
> why do you insist on both directions? That still sounds like real
> overkill to me.

cpu hotplug going from UP to SMP ? :)

Dave

2005-11-15 16:13:30

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, 15 Nov 2005, Gerd Knorr wrote:
>
> i.e. something like this (as basic idea, patch is far away from doing anything
> useful ...)?

Can't work. The altinstructions are in init-code/data, and will be free'd
after boot. Which is as it should be. But it means that any setup that
expects to use them to switch back and forth is broken (not that your
patch does so now, but if that's what you are moving toward..)

Linus

2005-11-15 16:16:11

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Dave Jones wrote:
> On Tue, Nov 15, 2005 at 05:06:03PM +0100, Arjan van de Ven wrote:
>
> > > You still need to preserve the originals so that you can patch in both
> > > directions.
> >
> > why do you insist on both directions? That still sounds like real
> > overkill to me.
>
> cpu hotplug going from UP to SMP ? :)
>

If you have CPU hotplug enabled, you can run SMP code!

-hpa

2005-11-15 16:16:41

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Zachary Amsden wrote:
> You still need to preserve the originals so that you can patch in both
> directions. In the dynamic scenario, you need a multi-way set of
> alternatives, with the most conservative of those compiled in inline.

Sure, alternative_smp() puts both versions into the
.altinstr_replacement section because of that ;)

The idea is to have SMP compiled in and let the normal
apply_alternatives() handle the SMP->UP patching case using the new
feature bit. apply_alternatives_smp() handles UP->SMP patching when you
plug in a new virtual CPU.

cheers,

Gerd

2005-11-15 16:17:18

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, Nov 15, 2005 at 08:12:36AM -0800, Linus Torvalds wrote:

> On Tue, 15 Nov 2005, Gerd Knorr wrote:
> > i.e. something like this (as basic idea, patch is far away from doing anything
> > useful ...)?
>
> Can't work. The altinstructions are in init-code/data, and will be free'd
> after boot. Which is as it should be.

Hmmm, what about modules ?

Dave

2005-11-15 16:20:13

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, Nov 15, 2005 at 08:14:29AM -0800, H. Peter Anvin wrote:
> Dave Jones wrote:
> >On Tue, Nov 15, 2005 at 05:06:03PM +0100, Arjan van de Ven wrote:
> >
> > > > You still need to preserve the originals so that you can patch in
> > > > both directions.
> > >
> > > why do you insist on both directions? That still sounds like real
> > > overkill to me.
> >
> >cpu hotplug going from UP to SMP ? :)
> >
>
> If you have CPU hotplug enabled, you can run SMP code!

Sure, but if you boot with 1 CPU, spinlocks get nop'd to emulate UP,
and on an 'installed a new CPU' hotplug event, they all come back.

Dave

2005-11-15 16:25:49

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Arjan van de Ven wrote:

>On Tue, 2005-11-15 at 08:04 -0800, Zachary Amsden wrote:
>
>
>>Gerd Knorr wrote:
>>
>>
>>
>>>>Yep, extending alternatives is probably better than duplicating the
>>>>code. Maybe having some alternative_smp() macro which places both
>>>>code versions into the .altinstr_replacement table? If that sounds
>>>>ok I'll try to come up with an experimental patch.
>>>>
>>>>
>>>i.e. something like this (as basic idea, patch is far away from doing
>>>anything useful ...)?
>>>
>>>
>>You still need to preserve the originals so that you can patch in both
>>directions.
>>
>>
>
>why do you insist on both directions? That still sounds like real
>overkill to me.
>
>

It's not overkill in the virtualization context, and there are
(struggling, but infinite possibilities) opportunities for native here
as well. Run-time SMP->UP->SMP can benefit hotplug (albeit slightly).
But once you have a basic, generic mechanism for run-time code
modularization, there is very little cost to adding other features.
Run-time PAE / non-PAE conversion is far more radical, but not outside
the realm of possibility - and useful (in both directions) for memory
hotplug. Run-time CPU vendor migration is possible if you, say, hotplug
an AMD chip into a previously Intel socket.

Sure, most of this is science fiction. But the possibilities are great
- it's another tool you can use towards modularizing functionality -
specifically, scattered functionality like CPU instructions, spinlocks,
and MMU operations that really do deserve to be inlined, and really can
benefit from taking advantage of faster hardware instruction sequences.
That the tool already exists in a limited form means that with natural
extensions, it could easily be refined to allow bi-directional or
multidirectional run-time choices.

Basically, it removes a lot of the barriers that force configuration
time choices on the running kernel, and you can start to look at even
deeply entrenched parts of the kernel as modular.

Zach

2005-11-15 16:25:51

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, 2005-11-15 at 11:19 -0500, Dave Jones wrote:
> On Tue, Nov 15, 2005 at 08:14:29AM -0800, H. Peter Anvin wrote:
> > Dave Jones wrote:
> > >On Tue, Nov 15, 2005 at 05:06:03PM +0100, Arjan van de Ven wrote:
> > >
> > > > > You still need to preserve the originals so that you can patch in
> > > > > both directions.
> > > >
> > > > why do you insist on both directions? That still sounds like real
> > > > overkill to me.
> > >
> > >cpu hotplug going from UP to SMP ? :)
> > >
> >
> > If you have CPU hotplug enabled, you can run SMP code!
>
> Sure, but if you boot with 1 CPU, spinlocks get nop'd to emulate UP,
> and on an 'installed a new cpu' hotplug event, they all come back.

the good news is that all hotplugable x86 cpus will have HT or dual core
support.. so you always work in pairs of 2

2005-11-15 16:27:56

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Dave Jones wrote:
> On Tue, Nov 15, 2005 at 08:12:36AM -0800, Linus Torvalds wrote:
>
> > On Tue, 15 Nov 2005, Gerd Knorr wrote:
> > > i.e. something like this (as basic idea, patch is far away from doing anything
> > > useful ...)?
> >
> > Can't work. The altinstructions are in init-code/data, and will be free'd
> > after boot. Which is as it should be.

Good point, so better place the ones we need at runtime into a separate
table ...

> Hmmm, what about modules ?

We'll need some new fields in struct module ...

Is already on the list in my head, but I didn't bother yet for that
proof-of-concept discussion patch ;)

Gerd

2005-11-15 16:30:15

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Dave Jones wrote:
> >
> > If you have CPU hotplug enabled, you can run SMP code!
>
> Sure, but if you boot with 1 CPU, spinlocks get nop'd to emulate UP,
> and on an 'installed a new cpu' hotplug event, they all come back.
>

The point is that you don't nop if you have hotplug enabled (which is not
the norm).

-hpa

2005-11-15 16:34:21

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Arjan van de Ven wrote:

>the good news is that all hotplugable x86 cpus will have HT or dual core
>support.. so you always work in pairs of 2
>
>

While there are good arguments for a pure SMP hardware world, there are
good arguments in the virtualization world for virtual uniprocessors.
No one can be sure which combination of HT / polycore / package
isolation we will end up with, but we can be sure that legacy systems
will be around for longer than anyone wants to think about. As this CR4
valid on 486s patch proves :)

2005-11-15 16:53:27

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

On Tue, 15 Nov 2005, Zachary Amsden wrote:
>
> It's not overkill in the virtualization context, and there are (struggling,
> but infinite possibilities) opportunities for native here as well.

No, there are almost no opportunities for native.

Especially with SMP, doing on-line code switching is really really nasty.
You basically have to shut down all CPU's to make sure there are no races
with other CPU's executing the code while it's being rewritten.

I'd be very very nervous about it. It would have to be some major
performance feature for it to make sense over a simple "switch function
pointers around" approach.

Linus

2005-11-16 09:58:48

Subject: Re: [PATCH 1/10] Cr4 is valid on some 486s

Roland Dreier wrote:
> > +#define alternative_smp(smpinstr, upinstr) asm(upinstr, ##input)
>
> this wouldn't build with CONFIG_SMP=n -- you forgot the input param here.

Yep, and I've noticed meanwhile that it becomes quite messy if you try
to do that with asm instructions which have both input and output
parameters. One way around that would be to use named parameters in the
inline assembler. Problem with that is that only gcc >= 3.1 understands
those and at the moment the minimum required compiler for the kernel
still is gcc 2.95.3 according to Documentation/Changes ...

Is it an option to raise the required gcc version to 3.x, given that
even Debian/stable ships with gcc 3.3 these days?

cheers,

Gerd

2005-11-16 16:13:09

Subject: [RFC] SMP alternatives

Gerd Knorr wrote:

> i.e. something like this (as basic idea, patch is far away from doing
> anything useful ...)?

Adapting $subject to the actual topic, so other lkml readers can catch up ;)

OK, here is a new version of the SMP alternatives patch. It features:

* it actually compiles and boots, so you can start playing with it ;)
* reuses the alternatives bits we have already as far as possible.
* separate table for the SMP alternatives, so we can keep them and
  switch at runtime between SMP and UP (for virtual CPU hotplug).
* two new alternatives macros, one generic which can handle quite
  complex stuff such as spinlocks, one for the "lock prefix" case.

TODO list:

* convert more code (bitops, ...).
* module support (using modules is fine, they run the safe SMP
  version of the code, they just don't benefit from the optimizations
  yet).
* integrate with xen bits and CPU hotplug, at the moment it's a
  boot-time only thing.
* benchmark it.
* x86_64 version.
* drop the printk's placed into the code for debugging.
* probably more ...

How it works right now:

* The patch switches to UP unconditionally when doing the usual
  alternatives stuff at boot time.
* Just before booting the second CPU it switches to SMP.

How to test:

* boot with "maxcpus=1" to run the UP code.

Comments are welcome.

cheers,

Gerd

Attachments:

smp-alternatives-7.diff (11.77 kB)

2005-11-22 17:48:29