2006-10-06 21:39:07

by john stultz

Subject: [RFC] Avoid PIT SMP lockups

Hey Andi,
Mind testing this patch on the AMD SMP box you were using earlier w/
acpi=off? I have spent a bit of time trying to hunt down the cause of
the reported SMP boxes hanging when they use the PIT for a clocksource,
and have not been able to root cause it. Removing the first three PIT io
instructions from pit_read() seemed to avoid the issue, but I can't see
why.

My current theory is that we're livelocking somehow:

timer_interrupt:
	seq_write_lock_irqsave(xtime_lock)
	spin_lock_irqsave(i8253_lock)
	portio()
	spin_unlock_irqrestore(i8253_lock)
	seq_write_unlock_irqrestore(xtime_lock)

gettime:
	do {
		seq = read_seqbegin(xtime_lock)
		spin_lock_irqsave(i8253_lock)
		portio()
		spin_unlock_irqrestore(i8253_lock)
	} while (read_seqretry(&xtime_lock, seq))


Where maybe one cpu is running gettime, spinning like mad grabbing and
releasing the i8253_lock, while another cpu is in the timer_interrupt
thread already holding the xtime lock, trying to grab the i8253_lock.
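
To make the theory concrete, here is a minimal userspace sketch of the sequence-counter half of that pattern. It is not the kernel's seqlock code: struct seq_counter and the seq_* helpers are hypothetical names, and only C11 atomics are assumed. It shows why a reader that overlaps a writer has to go around again; in gettime, every extra pass would take and release i8253_lock, which is the lock the writer is waiting for.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a seqlock's counter; not kernel code. */
struct seq_counter {
	atomic_uint sequence;	/* odd while a writer is inside its section */
};

/* Writer enters: the counter becomes odd. */
static void seq_write_begin(struct seq_counter *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/* Writer leaves: the counter becomes even again. */
static void seq_write_end(struct seq_counter *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/* Reader samples the counter before reading the protected data. */
static unsigned seq_read_begin(struct seq_counter *s)
{
	return atomic_load(&s->sequence);
}

/* True when the read overlapped a writer and has to be repeated. */
static bool seq_read_retry(struct seq_counter *s, unsigned start)
{
	return (start & 1u) || atomic_load(&s->sequence) != start;
}
```

A reader would loop with do { start = seq_read_begin(&s); ... } while (seq_read_retry(&s, start)); so as long as the writer has not finished, the loop body keeps running.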

Yeah, it's a weak theory (and the sysrq-t output doesn't support it), but I
don't have a clue otherwise. Your thoughts?

Anyway, since I can't figure it out, this patch should avoid the issue,
by disabling the PIT on SMP boxes (and makes a minor change so we
properly fall back to jiffies if the TSC is bad and there's nothing
else).

S.Çağlar: Could you give it a whirl to see if it changes your vmware
issue?

thanks
-john




This patch avoids possible PIT livelock issues seen on SMP systems, by
not allowing it as a clocksource on SMP boxes.

However, since the PIT may no longer be present, we have to properly
handle the cases where SMP systems have TSC skew and fall back from the
TSC. Since the PIT isn't there, it would "fall back" to the TSC again.
So this changes the jiffies rating to 1, and the TSC-bad rating value to
0.

Thus you will get the following behavior priority on i386 systems:

tsc [if present & stable]
hpet [if present]
cyclone [if present]
acpi_pm [if present]
pit [if UP]
jiffies

Rather than the current, more complicated:
tsc [if present & stable]
hpet [if present]
cyclone [if present]
acpi_pm [if present]
pit [if cpus < 4]
tsc [if present & unstable]
jiffies
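
The effect of the rating change can be sketched as follows, assuming the clocksource core simply picks the registered entry with the highest rating. struct clocksource_entry and select_best() are hypothetical stand-ins, not the kernel's API; only the ratings 50, 0 and 1 come from this patch.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a registered clocksource. */
struct clocksource_entry {
	const char *name;
	int rating;		/* higher rating wins */
};

/*
 * Return the entry with the highest rating, or NULL for an empty list.
 * On a tie, the entry that appears first in the list is kept.
 */
static const struct clocksource_entry *
select_best(const struct clocksource_entry *list, size_t n)
{
	const struct clocksource_entry *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (best == NULL || list[i].rating > best->rating)
			best = &list[i];
	}
	return best;
}
```

On an SMP box with an unstable TSC and nothing else registered, the old ratings (tsc 50, jiffies 0) would select the TSC again, while the new ones (tsc 0, jiffies 1) would select jiffies.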

Signed-off-by: John Stultz <[email protected]>

diff --git a/arch/i386/kernel/i8253.c b/arch/i386/kernel/i8253.c
index 477b24d..9a0060b 100644
--- a/arch/i386/kernel/i8253.c
+++ b/arch/i386/kernel/i8253.c
@@ -109,7 +109,7 @@ static struct clocksource clocksource_pi
 
 static int __init init_pit_clocksource(void)
 {
-	if (num_possible_cpus() > 4) /* PIT does not scale! */
+	if (num_possible_cpus() > 1) /* PIT does not scale! */
 		return 0;
 
 	clocksource_pit.mult = clocksource_hz2mult(CLOCK_TICK_RATE, 20);
diff --git a/arch/i386/kernel/tsc.c b/arch/i386/kernel/tsc.c
index b8fa0a8..fbc9582 100644
--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -349,8 +349,8 @@ static int tsc_update_callback(void)
 	int change = 0;
 
 	/* check to see if we should switch to the safe clocksource: */
-	if (clocksource_tsc.rating != 50 && check_tsc_unstable()) {
-		clocksource_tsc.rating = 50;
+	if (clocksource_tsc.rating != 0 && check_tsc_unstable()) {
+		clocksource_tsc.rating = 0;
 		clocksource_reselect();
 		change = 1;
 	}
@@ -461,7 +461,7 @@ static int __init init_tsc_clocksource(v
 				clocksource_tsc.shift);
 	/* lower the rating if we already know its unstable: */
 	if (check_tsc_unstable())
-		clocksource_tsc.rating = 50;
+		clocksource_tsc.rating = 0;
 
 	init_timer(&verify_tsc_freq_timer);
 	verify_tsc_freq_timer.function = verify_tsc_freq;
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 126bb30..a99b2a6 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -57,7 +57,7 @@ static cycle_t jiffies_read(void)
 
 struct clocksource clocksource_jiffies = {
 	.name = "jiffies",
-	.rating = 0, /* lowest rating*/
+	.rating = 1, /* lowest valid rating*/
 	.read = jiffies_read,
 	.mask = 0xffffffff, /*32bits*/
 	.mult = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */



2006-10-07 15:50:26

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Sat, 07 Oct 2006 at 00:38, john stultz wrote:
> S.Çağlar: Could you give it a whirl to see if it changes your vmware
> issue?

Sorry for the late response. I'll try this with VMware tonight and will also
send backtraces and logs to Andi.

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-10 09:11:48

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Sat, 07 Oct 2006 at 00:38, john stultz wrote:
> S.Çağlar: Could you give it a whirl to see if it changes your vmware
> issue?

Nothing changes inside VMware; the same panics occurred as before :(

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-10 18:29:13

by john stultz

Subject: Re: [RFC] Avoid PIT SMP lockups

On Tue, 2006-10-10 at 12:11 +0300, S.Çağlar Onur wrote:
> On Sat, 07 Oct 2006 at 00:38, john stultz wrote:
> > S.Çağlar: Could you give it a whirl to see if it changes your vmware
> > issue?
>
> Nothing changes inside VMware; the same panics occurred as before :(

Hmm.. Did you manage to grab the full log?

thanks
-john

2006-10-11 10:49:29

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Tue, 10 Oct 2006 at 21:27, john stultz wrote:
> On Tue, 2006-10-10 at 12:11 +0300, S.Çağlar Onur wrote:
> > On Sat, 07 Oct 2006 at 00:38, john stultz wrote:
> > > S.Çağlar: Could you give it a whirl to see if it changes your vmware
> > > issue?
> >
> > Nothing changes inside VMware; the same panics occurred as before :(
>
> Hmm.. Did you manage to grab the full log?

Yep, the whole screen and the config used are at [1], and as Andi suggested
I recompiled this kernel [pure vanilla 2.6.18] from scratch.

[1] http://cekirdek.pardus.org.tr/~caglar/2.6.18/

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-11 17:59:22

by john stultz

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 2006-10-11 at 13:49 +0300, S.Çağlar Onur wrote:
> On Tue, 10 Oct 2006 at 21:27, john stultz wrote:
> > On Tue, 2006-10-10 at 12:11 +0300, S.Çağlar Onur wrote:
> > > On Sat, 07 Oct 2006 at 00:38, john stultz wrote:
> > > > S.Çağlar: Could you give it a whirl to see if it changes your vmware
> > > > issue?
> > >
> > > Nothing changes inside VMware; the same panics occurred as before :(
> >
> > Hmm.. Did you manage to grab the full log?
>
> Yep, the whole screen and the config used are at [1], and as Andi suggested
> I recompiled this kernel [pure vanilla 2.6.18] from scratch.
>
> [1] http://cekirdek.pardus.org.tr/~caglar/2.6.18/

Huh.. that's an odd trace. Looks like the alternative code is involved.

Mind booting w/ "noreplacement" to see if that avoids it?


thanks
-john


2006-10-11 18:36:57

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 11 Oct 2006 at 20:59, john stultz wrote:
> > Yep, the whole screen and the config used are at [1], and as Andi suggested
> > I recompiled this kernel [pure vanilla 2.6.18] from scratch.
> >
> > [1] http://cekirdek.pardus.org.tr/~caglar/2.6.18/
>
> Huh.. that's an odd trace. Looks like the alternative code is involved.
>
> Mind booting w/ "noreplacement" to see if that avoids it?

Booting with "noreplacement" solved the panics for both the vanilla kernel
and the one with your patch applied (I tried booting 10 times with each).

By the way, I just realized that the panic occurs between the

Checking if this processor honours the WP bit even in supervisor mode... Ok.

and

Calibrating delay using timer specific routine.. xxxxx BogoMIPS (lpj=xxxxx)

lines, and the system waits there for about 5 seconds, maybe more (no matter
whether it panics or somehow continues to boot).

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-11 18:44:09

by john stultz

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 2006-10-11 at 21:37 +0300, S.Çağlar Onur wrote:
> On Wed, 11 Oct 2006 at 20:59, john stultz wrote:
> > > Yep, the whole screen and the config used are at [1], and as Andi suggested
> > > I recompiled this kernel [pure vanilla 2.6.18] from scratch.
> > >
> > > [1] http://cekirdek.pardus.org.tr/~caglar/2.6.18/
> >
> > Huh.. that's an odd trace. Looks like the alternative code is involved.
> >
> > Mind booting w/ "noreplacement" to see if that avoids it?
>
> Booting with "noreplacement" solved the panics for both the vanilla kernel
> and the one with your patch applied (I tried booting 10 times with each).

Hey Gerd,
Looks like the smp replacements code in 2.6.18 is breaking with vmware.
I'm guessing we're taking an interrupt while apply_replacements is
running. Any ideas?


> By the way, I just realized that the panic occurs between the
>
> Checking if this processor honours the WP bit even in supervisor mode... Ok.
>
> and
>
> Calibrating delay using timer specific routine.. xxxxx BogoMIPS (lpj=xxxxx)
>
> lines, and the system waits there for about 5 seconds, maybe more (no matter
> whether it panics or somehow continues to boot).

S.Çağlar: Didn't follow this bit at all. Could you explain a bit more?

thanks
-john


2006-10-11 19:09:35

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 11 Oct 2006 at 21:43, john stultz wrote:
> S.Çağlar: Didn't follow this bit at all. Could you explain a bit more?

Of course. While the system boots, the kernel waits ~5 seconds (maybe more)
after printing the "Checking if this processor honours the WP bit even in
supervisor mode... Ok." line, without any visible activity, and after that
waiting period the kernel panics.

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-11 19:26:47

by john stultz

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 2006-10-11 at 22:09 +0300, S.Çağlar Onur wrote:
> On Wed, 11 Oct 2006 at 21:43, john stultz wrote:
> > S.Çağlar: Didn't follow this bit at all. Could you explain a bit more?
>
> Of course. While the system boots, the kernel waits ~5 seconds (maybe more)
> after printing the "Checking if this processor honours the WP bit even in
> supervisor mode... Ok." line, without any visible activity, and after that
> waiting period the kernel panics.

And this results in the same panic you linked to earlier?

thanks
-john


2006-10-11 19:31:47

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Wed, 11 Oct 2006 at 22:26, john stultz wrote:
> On Wed, 2006-10-11 at 22:09 +0300, S.Çağlar Onur wrote:
> > On Wed, 11 Oct 2006 at 21:43, john stultz wrote:
> > > S.Çağlar: Didn't follow this bit at all. Could you explain a bit more?
> >
> > Of course. While the system boots, the kernel waits ~5 seconds (maybe more)
> > after printing the "Checking if this processor honours the WP bit even in
> > supervisor mode... Ok." line, without any visible activity, and after that
> > waiting period the kernel panics.
>
> And this results in the same panic you linked to earlier?

Yes, the kernel only panics if it waits there. If it somehow gets past that
point (as I wrote before, it boots normally on roughly 1 in 10 reboots), no
panic occurs.

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-12 07:28:51

by Gerd Hoffmann

Subject: Re: [RFC] Avoid PIT SMP lockups

john stultz wrote:
> Hey Gerd,
> Looks like the smp replacements code in 2.6.18 is breaking with vmware.
> I'm guessing we're taking an interrupt while apply_replacements is
> running. Any ideas?

Try switching the VMware configuration to "other OS". This turns off
OS-specific binary patching. The alternatives code might have broken
assumptions VMware makes about the Linux kernel code ...

cheers,

Gerd

--
Gerd Hoffmann <[email protected]>
http://www.suse.de/~kraxel/julika-dora.jpeg

2006-10-12 07:45:54

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Thu, 12 Oct 2006 at 10:28, Gerd Hoffmann wrote:
> Try switching the vmware configuration to "other OS". This turns off
> os-specific binary patching. The alternatives code might have broken
> assumptions vmware does about the linux kernel code ...

I did that before. I tried these combinations:

* Guest Os: Linux, Version: Other Linux 2.6.x kernel
* Guest Os: Linux, Version: Other Linux
* Guest Os: Other, Version: Other

All of them end in a panic.

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-16 16:08:38

by Gerd Hoffmann

Subject: Re: [RFC] Avoid PIT SMP lockups

john stultz wrote:
> Hey Gerd,
> Looks like the smp replacements code in 2.6.18 is breaking with vmware.
> I'm guessing we're taking an interrupt while apply_replacements is
> running. Any ideas?

It's not the SMP alternatives code, it's the one for processor-specific
instructions. The eip offset for alternative_instructions() in the
trace suggests it is the first call to apply_replacements. The second
one is the one for the SMP alternatives (which doesn't do anything, btw,
as we only patch away the lock prefixes).

cheers,

Gerd

--
Gerd Hoffmann <[email protected]>
http://www.suse.de/~kraxel/julika-dora.jpeg

2006-10-16 16:22:54

by Andi Kleen

Subject: Vmware problems was Re: [RFC] Avoid PIT SMP lockups

On Monday 16 October 2006 18:08, Gerd Hoffmann wrote:
> john stultz wrote:
> > Hey Gerd,
> > Looks like the smp replacements code in 2.6.18 is breaking with vmware.
> > I'm guessing we're taking an interrupt while apply_replacements is
> > running. Any ideas?
>
> It's not the smp alternatives code, its the one for processor-specific
> instructions. The eip offset for alternative_instructions() in the
> trace suggests it is the first call to apply_replacements. The second
> one is the one for the smp alternatives (which doesn't do anything btw
> as we patch away the lock prefixes only).

I would have expected that they trap those writes and invalidate the cache.
Even qemu and valgrind do that fine.

Perhaps Zach has some clue or can refer to someone who has.

-Andi

2006-10-16 22:16:13

by Petr Vandrovec

Subject: Re: Vmware problems was Re: [RFC] Avoid PIT SMP lockups

Andi Kleen wrote:
> On Monday 16 October 2006 18:08, Gerd Hoffmann wrote:
>
>>john stultz wrote:
>>
>>>Hey Gerd,
>>> Looks like the smp replacements code in 2.6.18 is breaking with vmware.
>>>I'm guessing we're taking an interrupt while apply_replacements is
>>>running. Any ideas?
>>
>>It's not the smp alternatives code, its the one for processor-specific
>>instructions. The eip offset for alternative_instructions() in the
>>trace suggests it is the first call to apply_replacements. The second
>>one is the one for the smp alternatives (which doesn't do anything btw
>>as we patch away the lock prefixes only).
>
>
> I would have expected that they trap those writes and invalidate the cache.
> Even qemu and valgrind do that fine.
>
> Perhaps Zach has some clue or can refer to someone who has.

Why do you think it has something to do with VMware's emulation - except
timing? From what I see, there were bytes

0xF0 0x83 0x44 0x24 0x00 0x00

before apply_alternatives() was entered (binutils 2.17 would generate
one byte shorter code, 0xF0 0x83 0x04 0x24 0x00, but that's another
story - 2.16 and older treat 0(%esp) differently from (%esp)). Now the
alternatives update comes in and starts overwriting the instruction - note
that it does that in two steps - first it memcpy()s the alternative, then
it memcpy()s the nop padding. So after the first memcpy() the code looks like
0x0F 0xAE 0xE8 0x24 0x00 0x00

Now a timer interrupt arrives, and these bytes are interpreted as

lfence; andb $0,%al; add %cl,0x465B9415(%ebx)

If it did not crash, then once the interrupt returned, 0x24 0x00 0x00 would
get overwritten with a 3-byte NOP sequence and everybody would be happy.
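
The two-step overwrite described above can be replayed on a plain byte buffer. This is only an illustrative sketch: patch_step1() and patch_step2() are hypothetical names, and the padding uses three single-byte 0x90 NOPs as a placeholder, since the actual 3-byte NOP the kernel picks depends on the CPU. The original and replacement bytes are the ones quoted in this mail.

```c
#include <string.h>

enum { SLOT_LEN = 6, REPL_LEN = 3 };

/* lock; addl $0,0(%esp) as assembled by binutils 2.16 and older */
static const unsigned char original_insn[SLOT_LEN] = {
	0xF0, 0x83, 0x44, 0x24, 0x00, 0x00
};

/* lfence */
static const unsigned char replacement[REPL_LEN] = { 0x0F, 0xAE, 0xE8 };

/* Placeholder padding (assumption): three one-byte NOPs. */
static const unsigned char padding[SLOT_LEN - REPL_LEN] = { 0x90, 0x90, 0x90 };

/* Step 1: copy the replacement over the start of the slot. */
static void patch_step1(unsigned char *slot)
{
	memcpy(slot, replacement, REPL_LEN);
}

/* Step 2: pad the rest of the slot with NOPs. */
static void patch_step2(unsigned char *slot)
{
	memcpy(slot + REPL_LEN, padding, SLOT_LEN - REPL_LEN);
}
```

Between the two steps the slot holds 0x0F 0xAE 0xE8 0x24 0x00 0x00, so anything that executes it at that moment sees the lfence followed by the stale tail of the old instruction.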

AFAICT you do not see this because you have to use old binutils to reproduce
it - I'm unable to reproduce this on a Debian box with binutils 2.17, as
then the byte sequence is a valid instruction even if partially overwritten -
it just clears %al to zero, but nobody notices that...

So as far as I can tell, interrupts (and NMIs?) should be disabled when
apply_alternatives() runs if interrupt handlers are using alternatives -
and as was just proven, they do.
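
A userspace analogy of "disable interrupts around the patching" can be sketched with POSIX signals, where SIGALRM stands in for the timer interrupt and sigprocmask() plays the role of local_irq_save()/local_irq_restore(). All function names here are hypothetical, and the sketch assumes a single-threaded process; it is not how the kernel does it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts how many times the stand-in "timer interrupt" handler has run. */
static volatile sig_atomic_t timer_ticks;

static void on_timer(int signo)
{
	(void)signo;
	timer_ticks++;
}

/* Install the SIGALRM handler; returns 0 on success. */
static int install_timer_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_timer;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGALRM, &sa, NULL);
}

/* Run patch(arg) with SIGALRM blocked, then restore the previous mask. */
static int run_with_timer_blocked(void (*patch)(void *), void *arg)
{
	sigset_t block, saved;

	sigemptyset(&block);
	sigaddset(&block, SIGALRM);
	if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
		return -1;
	patch(arg);		/* the handler cannot run in here */
	return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Example critical section: a "timer interrupt" arrives mid-patch. */
static void patch_with_pending_timer(void *arg)
{
	int *ticks_seen_inside = arg;

	raise(SIGALRM);		/* stays pending while blocked */
	*ticks_seen_inside = timer_ticks;
}
```

The signal raised inside the critical section is held pending and only delivered once the old mask is restored, which is the behaviour the patching code needs from the timer interrupt.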

The reason you see this happen often in a VM is that this first
alternatives run invalidates a lot of internal state, and it takes so long
that the next timer interrupt is sure to be pending as soon as the first "rep
movsb" in alternatives finishes.

Petr

2006-10-16 22:17:27

by Zachary Amsden

Subject: Re: [RFC] Avoid PIT SMP lockups

diff -r 2b8ef2e0e25f arch/i386/kernel/alternative.c
--- a/arch/i386/kernel/alternative.c Mon Oct 16 02:30:58 2006 -0700
+++ b/arch/i386/kernel/alternative.c Mon Oct 16 02:34:29 2006 -0700
@@ -389,6 +389,7 @@ extern struct paravirt_patch *__start_pa
 
 void __init alternative_instructions(void)
 {
+	unsigned long flags;
 	if (no_replacement) {
 		printk(KERN_INFO "(SMP-)alternatives turned off\n");
 		free_init_pages("SMP alternatives",
@@ -396,6 +397,8 @@ void __init alternative_instructions(voi
 				(unsigned long)__smp_alt_end);
 		return;
 	}
+
+	local_irq_save(flags);
 	apply_alternatives(__alt_instructions, __alt_instructions_end);
 
 	/* switch to patch-once-at-boottime-only mode and free the
@@ -433,4 +436,5 @@ void __init alternative_instructions(voi
 		alternatives_smp_switch(0);
 	}
 #endif
-}
+	local_irq_restore(flags);
+}


Attachments:
hotfix-alternatives-irq-safety.patch (843.00 B)

2006-10-16 22:23:01

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Tue, 17 Oct 2006 at 01:17, Zachary Amsden wrote:
> My nasty quick patch might not apply - the only tree I've got is a very
> hacked 2.6.18-rc6-mm1+local-patches thing, but the fix should be obvious
> enough.

Ok, I'll test and report back...

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-16 22:41:00

by Andi Kleen

Subject: Re: [RFC] Avoid PIT SMP lockups


> It might only happen with SMP because the difficulty of getting good
> enough TSC / timer IRQ synchronization during boot increases
> exponentially with SMP configurations. And it might pass 10% of the time
> because you were lucky enough not to fire off another timer interrupt yet.

We have the same problem with NMI watchdog events unfortunately.
Need to call something in the nmi watchdog code to make sure it is
not renewed and then reenabled.
Or maybe it's better to figure out a way that yields atomic patches.

I think the best way is to make sure all alternative() patches
are always done before the code can be ever executed - this
means doing it very early for the main kernel. The only exception
would be the LOCK prefix patching, which should be atomic.
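
The reason LOCK prefix patching can be treated as atomic is that it changes a single byte, so there is no half-written state for an interrupt to run into. A minimal sketch on a byte buffer follows, assuming the prefix (0xF0) is swapped for a one-byte NOP (0x90); the function names are hypothetical.

```c
enum { LOCK_PREFIX = 0xF0, NOP_1BYTE = 0x90 };

/* SMP -> UP: drop the lock prefix by turning it into a one-byte NOP. */
static void patch_lock_prefix_for_up(unsigned char *insn)
{
	if (insn[0] == LOCK_PREFIX)
		insn[0] = NOP_1BYTE;	/* single byte store */
}

/* UP -> SMP: put the lock prefix back. */
static void patch_lock_prefix_for_smp(unsigned char *insn)
{
	if (insn[0] == NOP_1BYTE)
		insn[0] = LOCK_PREFIX;	/* single byte store */
}
```

Applied to 0xF0 0x83 0x04 0x24 0x00 (lock; addl $0,(%esp)), the result is 0x90 0x83 0x04 0x24 0x00, that is a nop followed by the same addl without the prefix; both the old and the new byte sequence decode as valid instructions.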

iirc there was some more patching besides lock prefixes going on for
SMP<->UP transitions, but last time I checked it didn't look
particularly useful and could probably be eliminated.

-Andi

2006-10-16 23:25:37

by Zachary Amsden

Subject: Re: [RFC] Avoid PIT SMP lockups

Andi Kleen wrote:
>> It might only happen with SMP because the difficulty of getting good
>> enough TSC / timer IRQ synchronization during boot increases
>> exponentially with SMP configurations. And it might pass 10% of the time
>> because you were lucky enough not to fire off another timer interrupt yet.
>>
>
> We have the same problem with NMI watchdog events unfortunately.
> Need to call something in the nmi watchdog code to make sure it is
> not renewed and then reenabled.
> Or maybe it's better to figure out a way that yields atomic patches.
>

> I think the best way is to make sure all alternative() patches
> are always done before the code can be ever executed - this
> means doing it very early for the main kernel. The only exception
> would be the LOCK prefix patching, which should be atomic

Yes, this solves the problem in most cases. Lock patching is fine no
matter when you do it. I think the problem with alternative patching in
check_bugs() is that it happens way too late; the patching really has
nothing at all to do with check_bugs(), and should be a separate step,
probably part of setup_arch.


The paravirt-ops stuff also has some patching code. Fortunately, there,
we can probably skirt the NMI issue by simply disallowing NMIs, but the
issue pops up again in stop_machine_run - what happens if you take NMIs
during stop_machine_run? Debug traps? Module unload is fine, but code
patching done using stop_machine_run is not safe.

Zach

2006-10-17 12:06:56

by S.Çağlar Onur

Subject: Re: [RFC] Avoid PIT SMP lockups

On Tue, 17 Oct 2006 at 01:21, S.Çağlar Onur wrote:
> On Tue, 17 Oct 2006 at 01:17, Zachary Amsden wrote:
> > My nasty quick patch might not apply - the only tree I've got is a very
> > hacked 2.6.18-rc6-mm1+local-patches thing, but the fix should be obvious
> > enough.
>
> Ok, I'll test and report back...

Both 2.6.18 and 2.6.18.1 boot without any problem (and, of course, without
the noreplacement workaround) with that patch.

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2006-10-17 12:17:31

by Andi Kleen

Subject: Re: [RFC] Avoid PIT SMP lockups

On Tuesday 17 October 2006 14:05, S.Çağlar Onur wrote:
> On Tue, 17 Oct 2006 at 01:21, S.Çağlar Onur wrote:
> > On Tue, 17 Oct 2006 at 01:17, Zachary Amsden wrote:
> > > My nasty quick patch might not apply - the only tree I've got is a very
> > > hacked 2.6.18-rc6-mm1+local-patches thing, but the fix should be obvious
> > > enough.
> >
> > Ok, I'll test and report back...
>
> Both 2.6.18 and 2.6.18.1 boot without any problem (and, of course, without
> the noreplacement workaround) with that patch.

Ok thanks.

I still think we need a solution for the NMIs though. I will think
about it.

-Andi

2006-10-19 08:00:56

by Zachary Amsden

Subject: [PATCH] Fix potential interrupts during alternative patching [was Re: [RFC] Avoid PIT SMP lockups]

Interrupts must be disabled during alternative instruction patching.
On systems with high timer IRQ rates, or when running in an emulator,
timing differences can result in random kernel panics because of
running partially patched instructions. This doesn't yet fix NMIs,
which requires extricating the patch code from the late bug checking
and is logically separate (and also less likely to cause problems).

Signed-off-by: Zachary Amsden <[email protected]>


diff -r 773ac0ebfeb4 arch/i386/kernel/alternative.c
--- a/arch/i386/kernel/alternative.c Wed Oct 18 06:03:56 2006 -0700
+++ b/arch/i386/kernel/alternative.c Wed Oct 18 06:07:03 2006 -0700
@@ -344,6 +344,7 @@ void alternatives_smp_switch(int smp)
 
 void __init alternative_instructions(void)
 {
+	unsigned long flags;
 	if (no_replacement) {
 		printk(KERN_INFO "(SMP-)alternatives turned off\n");
 		free_init_pages("SMP alternatives",
@@ -351,6 +352,8 @@ void __init alternative_instructions(voi
 				(unsigned long)__smp_alt_end);
 		return;
 	}
+
+	local_irq_save(flags);
 	apply_alternatives(__alt_instructions, __alt_instructions_end);
 
 	/* switch to patch-once-at-boottime-only mode and free the
@@ -386,4 +389,5 @@ void __init alternative_instructions(voi
 		alternatives_smp_switch(0);
 	}
 #endif
-}
+	local_irq_restore(flags);
+}


Attachments:
hotfix-alternative-irq-safety.patch (1.27 kB)

2006-10-19 08:46:58

by Jeremy Fitzhardinge

Subject: Re: [PATCH] Fix potential interrupts during alternative patching [was Re: [RFC] Avoid PIT SMP lockups]

Zachary Amsden wrote:
> So this patch is an obvious bugfix - please apply, and to stable as
> well. I'm not sure when this broke, but taking interrupts in the
> middle of self modifying code is not a pretty sight.

I had actually seen this when I built the Xen paravirt kernel with SMP
on, but I assumed it was something in the pv_ops tree rather than
mainline...

J

2006-10-19 09:00:08

by Zachary Amsden

Subject: Re: [PATCH] Fix potential interrupts during alternative patching [was Re: [RFC] Avoid PIT SMP lockups]

Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>> So this patch is an obvious bugfix - please apply, and to stable as
>> well. I'm not sure when this broke, but taking interrupts in the
>> middle of self modifying code is not a pretty sight.
>
> I had actually seen this when I built the Xen paravirt kernel with SMP
> on, but I assumed it was something in the pv_ops tree rather than
> mainline...

Very likely to show up in qemu as well, if you use that.

2006-10-20 05:56:39

by Greg KH

Subject: Re: [PATCH] Fix potential interrupts during alternative patching [was Re: [RFC] Avoid PIT SMP lockups]

On Thu, Oct 19, 2006 at 01:00:55AM -0700, Zachary Amsden wrote:
> S.Çağlar Onur wrote:
> >On Tue, 17 Oct 2006 at 01:21, S.Çağlar Onur wrote:
> >
> >>On Tue, 17 Oct 2006 at 01:17, Zachary Amsden wrote:
> >>
> >>>My nasty quick patch might not apply - the only tree I've got is a very
> >>>hacked 2.6.18-rc6-mm1+local-patches thing, but the fix should be obvious
> >>>enough.
> >>>
> >>Ok, I'll test and report back...
> >>
> >
> >Both 2.6.18 and 2.6.18.1 boot without any problem (and, of course, without
> >the noreplacement workaround) with that patch.
> >
> >Cheers
> >
>
> So this patch is an obvious bugfix - please apply, and to stable as
> well. I'm not sure when this broke, but taking interrupts in the middle
> of self modifying code is not a pretty sight.

Please send -stable patches to [email protected], not to me directly (we
are a team and hand off ownership to each other; sending it to the
alias makes sure that nothing gets lost in our individual mail
boxes.)

Also, please let stable know when this is upstream, we don't want to
apply it before then.

thanks,

greg k-h

2006-10-20 10:36:08

by S.Çağlar Onur

Subject: Re: [PATCH] Fix potential interrupts during alternative patching [was Re: [RFC] Avoid PIT SMP lockups]

On Thu, 19 Oct 2006 at 12:00, Zachary Amsden wrote:
> Jeremy Fitzhardinge wrote:
> > Zachary Amsden wrote:
> >> So this patch is an obvious bugfix - please apply, and to stable as
> >> well. I'm not sure when this broke, but taking interrupts in the
> >> middle of self modifying code is not a pretty sight.
> >
> > I had actually seen this when I built the Xen paravirt kernel with SMP
> > on, but I assumed it was something in the pv_ops tree rather than
> > mainline...
>
> Very likely to show up in qemu as well, if you use that.

I can confirm that qemu and Virtual PC 2004 give the same exception without
that patch.

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!

