2006-10-20 00:09:25

by Zachary Amsden

[permalink] [raw]
Subject: [PATCH 1/5] Skip timer works.patch

Add a way to disable the timer IRQ routing check via a boot option. The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings. It fires
100% of the time when using a paravirtual delay mechanism instead of
using a realtime delay, since there is no elapsed real time, and the 4 timer
IRQs have not yet been delivered.

In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.

While here, make check_timer be __init.

Signed-off-by: Zachary Amsden <[email protected]>

===================================================================
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -603,8 +603,6 @@ and is between 256 and 4096 characters.

hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages.

- noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
-
i8042.direct [HW] Put keyboard port into non-translated mode
i8042.dumbkbd [HW] Pretend that controller can only read data from
keyboard and cannot control its state
@@ -1060,8 +1058,13 @@ and is between 256 and 4096 characters.
in certain environments such as networked servers or
real-time systems.

+ noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
+
noirqdebug [IA-32] Disables the code which attempts to detect and
disable unhandled interrupt sources.
+
+ noirqtest [IA-32,APIC] Disables the code which tests for broken
+ timer IRQ sources.

noisapnp [ISAPNP] Disables ISA PnP code.

===================================================================
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1864,6 +1864,15 @@ static void __init setup_ioapic_ids_from
static void __init setup_ioapic_ids_from_mpc(void) { }
#endif

+int timer_irq_really_works __initdata;
+int __init irqtest_disable(char *str)
+{
+ timer_irq_really_works = 1;
+ return 1;
+}
+
+__setup("noirqtest", irqtest_disable);
+
/*
* There is a nasty bug in some older SMP boards, their mptable lies
* about the timer IRQ. We do the following to work around the situation:
@@ -1872,9 +1881,12 @@ static void __init setup_ioapic_ids_from
* - if this function detects that timer IRQs are defunct, then we fall
* back to ISA timer IRQs
*/
-static int __init timer_irq_works(void)
+int __init timer_irq_works(void)
{
unsigned long t1 = jiffies;
+
+ if (timer_irq_really_works)
+ return 1;

local_irq_enable();
/* Let ten ticks pass... */
@@ -2146,7 +2158,7 @@ int timer_uses_ioapic_pin_0;
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
-static inline void check_timer(void)
+static inline void __init check_timer(void)
{
int apic1, pin1, apic2, pin2;
int vector;


2006-10-27 14:56:53

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote:
> Add a way to disable the timer IRQ routing check via a boot option. The
> VMI timer code uses this to avoid triggering the pester Mingo code, which
> probes for some very unusual and broken motherboard routings. It fires
> 100% of the time when using a paravirtual delay mechanism instead of
> using a realtime delay, since there is no elapsed real time, and the 4 timer
> IRQs have not yet been delivered.

You mean paravirtualized udelay will not actually wait?

This implies that you can't ever use any real timer in that kind of guest,
right?

>
> In addition, it is entirely possible, though improbable, that this bug
> could surface on real hardware which picks a particularly bad time to enter
> SMM mode, causing a long latency during one of the timer IRQs.

We already have a no timer check option. But:

>
> While here, make check_timer be __init.

So how is this supposed to work? The hypervisor would always pass that
option? If yes that would seem rather hackish to me. We should probably
instead probe in some way if we have the required timer hardware.
The paravirt kernel should know anyways it is paravirt and that it doesn't
need to probe for flakey hardware.

-Andi

2006-10-27 19:09:44

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

Andi Kleen wrote:
> On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote:
>
>> Add a way to disable the timer IRQ routing check via a boot option. The
>> VMI timer code uses this to avoid triggering the pester Mingo code, which
>> probes for some very unusual and broken motherboard routings. It fires
>> 100% of the time when using a paravirtual delay mechanism instead of
>> using a realtime delay, since there is no elapsed real time, and the 4 timer
>> IRQs have not yet been delivered.
>>
>
> You mean paravirtualized udelay will not actually wait?
>

Yes, but even putting that problem aside, the timing element here is
tricky to get right in a VM.

> This implies that you can't ever use any real timer in that kind of guest,
> right?
>

No. You can use a real timer just fine. But there is no reason ever to
use udelay to busy wait for "hardware" in a virtual machine. Drivers
which are used for real hardware may turn udelay back on selectively;
but this is another patch.


>> In addition, it is entirely possible, though improbable, that this bug
>> could surface on real hardware which picks a particularly bad time to enter
>> SMM mode, causing a long latency during one of the timer IRQs.
>>
>
> We already have a no timer check option. But:
>

Really? I didn't see one that disabled the broken motherboard detection
/ workaround code, which is what we are trying to avoid here.


>> While here, make check_timer be __init.
>>
>
> So how is this supposed to work? The hypervisor would always pass that
> option? If yes that would seem rather hackish to me. We should probably
> instead probe in some way if we have the required timer hardware.
> The paravirt kernel should know anyways it is paravirt and that it doesn't
> need to probe for flakey hardware.
>

That is what this patch is building towards, but the boot option is
"free", so why not? In the meantime, it helps non-paravirt kernels
booted in a VM.

Zach

2006-10-27 21:19:04

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

On Friday 27 October 2006 12:09, Zachary Amsden wrote:
> Andi Kleen wrote:
> > On Thu, Oct 19, 2006 at 05:09:22PM -0700, Zachary Amsden wrote:
> >> Add a way to disable the timer IRQ routing check via a boot option. The
> >> VMI timer code uses this to avoid triggering the pester Mingo code,
> >> which probes for some very unusual and broken motherboard routings. It
> >> fires 100% of the time when using a paravirtual delay mechanism instead
> >> of using a realtime delay, since there is no elapsed real time, and the
> >> 4 timer IRQs have not yet been delivered.
> >
> > You mean paravirtualized udelay will not actually wait?
>
> Yes, but even putting that problem aside, the timing element here is
> tricky to get right in a VM.
>
> > This implies that you can't ever use any real timer in that kind of
> > guest, right?
>
> No. You can use a real timer just fine. But there is no reason ever to
> use udelay to busy wait for "hardware" in a virtual machine. Drivers
> which are used for real hardware may turn udelay back on selectively;
> but this is another patch.
>
> >> In addition, it is entirely possible, though improbable, that this bug
> >> could surface on real hardware which picks a particularly bad time to
> >> enter SMM mode, causing a long latency during one of the timer IRQs.
> >
> > We already have a no timer check option. But:
>
> Really? I didn't see one that disabled the broken motherboard detection
> / workaround code, which is what we are trying to avoid here.

no_timer_check. But it's only there on x86-64 in mainline - although there
were some patches to add it to i386 too.

> That is what this patch is building towards, but the boot option is
> "free", so why not? In the meantime, it helps non-paravirt kernels
> booted in a VM.

Hmm, you meant they paniced before? If they just fail a few tests
that is not particularly worrying (real hardware does that often too)

-Andi

2006-10-30 20:54:57

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

Andi Kleen wrote:
> no_timer_check. But it's only there on x86-64 in mainline - although there
> were some patches to add it to i386 too.
>

I can rename to match the x86-64 name.


>> That is what this patch is building towards, but the boot option is
>> "free", so why not? In the meantime, it helps non-paravirt kernels
>> booted in a VM.
>>
>
> Hmm, you meant they paniced before? If they just fail a few tests
> that is not particularly worrying (real hardware does that often too)
>

Yes, they sometimes fail to boot, and the failure message used to ask us
to pester mingo.

Zach

2006-10-30 22:50:19

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote:
> Andi Kleen wrote:
> >no_timer_check. But it's only there on x86-64 in mainline - although there
> >were some patches to add it to i386 too.
> >
>
> I can rename to match the x86-64 name.

I will do that in my tree.

> >>That is what this patch is building towards, but the boot option is
> >>"free", so why not? In the meantime, it helps non-paravirt kernels
> >>booted in a VM.
> >>
> >
> >Hmm, you meant they paniced before? If they just fail a few tests
> >that is not particularly worrying (real hardware does that often too)
> >
>
> Yes, they sometimes fail to boot, and the failure message used to ask us
> to pester mingo.

I still think we should figure that out automatically. Letting
the Hypervisor pass magic boot options seems somehow unclean.

But i suppose it will only work for the paravirtualized case,
not for the case of kernel running "native" under a hypervisor
I suppose? Or does that one not panic?


-Andi

2006-10-30 23:09:22

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

Andi Kleen wrote:
> On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote:
>
>> Andi Kleen wrote:
>>
>>> no_timer_check. But it's only there on x86-64 in mainline - although there
>>> were some patches to add it to i386 too.
>>>
>>>
>> I can rename to match the x86-64 name.
>>
>
> I will do that in my tree.
>
>
>>>> That is what this patch is building towards, but the boot option is
>>>> "free", so why not? In the meantime, it helps non-paravirt kernels
>>>> booted in a VM.
>>>>
>>>>
>>> Hmm, you meant they paniced before? If they just fail a few tests
>>> that is not particularly worrying (real hardware does that often too)
>>>
>>>
>> Yes, they sometimes fail to boot, and the failure message used to ask us
>> to pester mingo.
>>
>
> I still think we should figure that out automatically. Letting
> the Hypervisor pass magic boot options seems somehow unclean.
>
> But i suppose it will only work for the paravirtualized case,
> not for the case of kernel running "native" under a hypervisor
> I suppose? Or does that one not panic?
>

That is the one that can panic, for now. Fixing the paravirtualized
case is easy, but we can't assume paravirtualization just yet.

Zach


2006-10-30 23:12:53

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

> That is the one that can panic, for now. Fixing the paravirtualized
> case is easy, but we can't assume paravirtualization just yet.

Hmm, this means standard vmware boot is not reliable unless that magic option
is set? That doesn't sound good.

-Andi

2006-10-30 23:24:57

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

Andi Kleen wrote:
>> That is the one that can panic, for now. Fixing the paravirtualized
>> case is easy, but we can't assume paravirtualization just yet.
>>
>
> Hmm, this means standard vmware boot is not reliable unless that magic option
> is set? That doesn't sound good.
>

It doesn't happen often, but it is a possibility that the kernel
calibrates the delay wrong because of timing glitches caused by CPU
migration, paging, or other phenomena which are supposed to be
transparent to the kernel (but cause temporal lapse). In that case, the
kernel may not make enough progress in a spin delay loop to properly
reach the number of microseconds required for N number of timer ticks to
occur. In theory this can happen on a real machine, as SMM mode could
be active, doing USB device emulation or something that takes a while
during the lpj calibration and throwing the computation off.

By changing the parameters (N ticks at K Hz in T seconds), it is easy to
create an unstable measurement that can achieve high failure rates,
although in practice the Linux parameters appear to be reasonable enough
that it is not a major problem.

Zach

2006-10-30 23:51:00

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch


> It doesn't happen often, but it is a possibility that the kernel
> calibrates the delay wrong because of timing glitches caused by CPU
> migration, paging, or other phenomena which are supposed to be
> transparent to the kernel (but cause temporal lapse).

We're supposed to handle those because they happen on real hardware
too with long running SMM handlers. Or at least there was a effort some time ago
to do this. If it wasn't enough we'll likely need to fix the code.

> In that case, the
> kernel may not make enough progress in a spin delay loop to properly
> reach the number of microseconds required for N number of timer ticks to
> occur.

Hmm, mdelay is polling RDTSC and assumes it makes forward progress
and waits until the time that was estimated at the original TSC<->PIT
calibration passed. While there is a spin loop it is definitely
polling a timer that is supposed to tick properly even in virtualization.

You're saying that doesn't work on vmware? Does it have trouble
with RDTSC?

Anyways if polling against TSC doesn't work I suppose we could
change it to poll against some other timer.

> In theory this can happen on a real machine, as SMM mode could
> be active, doing USB device emulation or something that takes a while
> during the lpj calibration and throwing the computation off.

Yep.

> By changing the parameters (N ticks at K Hz in T seconds), it is easy to
> create an unstable measurement that can achieve high failure rates,
> although in practice the Linux parameters appear to be reasonable enough
> that it is not a major problem.

Hmm, why exactly?

-Andi

2006-11-15 08:01:35

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

* Andi Kleen ([email protected]) wrote:
> On Mon, Oct 30, 2006 at 12:54:55PM -0800, Zachary Amsden wrote:
> > Andi Kleen wrote:
> > >no_timer_check. But it's only there on x86-64 in mainline - although there
> > >were some patches to add it to i386 too.
> > >
> >
> > I can rename to match the x86-64 name.
>
> I will do that in my tree.

Looks like this one might have been lost. Here's the renamed version
(it's against 2.6.19-rc5-mm2, which should have your tree in it).

thanks,
-chris
--

From: Zachary Amsden <[email protected]>

Add a way to disable the timer IRQ routing check via a boot option. The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings. It fires
100% of the time when using a paravirtual delay mechanism instead of
using a realtime delay, since there is no elapsed real time, and the 4 timer
IRQs have not yet been delivered.

In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.

While here, make check_timer be __init.

Signed-off-by: Zachary Amsden <[email protected]>
[chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request]
Signed-off-by: Chris Wright <[email protected]>

===================================================================
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -610,8 +610,6 @@ and is between 256 and 4096 characters.

hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages.

- noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
-
i8042.direct [HW] Put keyboard port into non-translated mode
i8042.dumbkbd [HW] Pretend that controller can only read data from
keyboard and cannot control its state
@@ -1081,8 +1079,13 @@ and is between 256 and 4096 characters.
in certain environments such as networked servers or
real-time systems.

+ noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
+
noirqdebug [IA-32] Disables the code which attempts to detect and
disable unhandled interrupt sources.
+
+ no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
+ broken timer IRQ sources.

noisapnp [ISAPNP] Disables ISA PnP code.

===================================================================
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1931,6 +1931,15 @@ static void __init setup_ioapic_ids_from
static void __init setup_ioapic_ids_from_mpc(void) { }
#endif

+static int no_timer_check __initdata;
+
+static int __init notimercheck(char *s)
+{
+ no_timer_check = 1;
+ return 1;
+}
+__setup("no_timer_check", notimercheck);
+
/*
* There is a nasty bug in some older SMP boards, their mptable lies
* about the timer IRQ. We do the following to work around the situation:
@@ -1939,9 +1948,12 @@ static void __init setup_ioapic_ids_from
* - if this function detects that timer IRQs are defunct, then we fall
* back to ISA timer IRQs
*/
-static int __init timer_irq_works(void)
+int __init timer_irq_works(void)
{
unsigned long t1 = jiffies;
+
+ if (no_timer_check)
+ return 1;

local_irq_enable();
/* Let ten ticks pass... */
@@ -2219,7 +2231,7 @@ int timer_uses_ioapic_pin_0;
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
-static inline void check_timer(void)
+static inline void __init check_timer(void)
{
int apic1, pin1, apic2, pin2;
int vector;

2006-11-16 22:53:56

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

On Thu, 19 Oct 2006 17:09:22 -0700
Zachary Amsden <[email protected]> wrote:

> Add a way to disable the timer IRQ routing check via a boot option. The
> VMI timer code uses this to avoid triggering the pester Mingo code, which
> probes for some very unusual and broken motherboard routings. It fires
> 100% of the time when using a paravirtual delay mechanism instead of
> using a realtime delay, since there is no elapsed real time, and the 4 timer
> IRQs have not yet been delivered.
>
> In addition, it is entirely possible, though improbable, that this bug
> could surface on real hardware which picks a particularly bad time to enter
> SMM mode, causing a long latency during one of the timer IRQs.
>
> While here, make check_timer be __init.
>

Andi seems to have merged this patch but from somewhere I picked up a
different version, below.

I think the version I have is better. Because the patch Andi has merged is
cast in terms of "irq testing", which is broad. But that's not what the
patch does - the patch handles only timers.

IOW, this:

> +
> + noirqtest [IA-32,APIC] Disables the code which tests for broken
> + timer IRQ sources.

is misleadingly named. This:

+ no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
+ broken timer IRQ sources.
+

is better, no?

But right now, I'll settle for anything which usually compiles.



From: Zachary Amsden <[email protected]>

Add a way to disable the timer IRQ routing check via a boot option. The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings. It fires
100% of the time when using a paravirtual delay mechanism instead of using
a realtime delay, since there is no elapsed real time, and the 4 timer IRQs
have not yet been delivered.

In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.

While here, make check_timer be __init.

Signed-off-by: Zachary Amsden <[email protected]>
[chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request]
Signed-off-by: Chris Wright <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

Documentation/kernel-parameters.txt | 7 +++++--
arch/i386/kernel/io_apic.c | 16 ++++++++++++++--
2 files changed, 19 insertions(+), 4 deletions(-)

diff -puN arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option arch/i386/kernel/io_apic.c
--- a/arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option
+++ a/arch/i386/kernel/io_apic.c
@@ -1930,6 +1930,15 @@ static void __init setup_ioapic_ids_from
static void __init setup_ioapic_ids_from_mpc(void) { }
#endif

+static int no_timer_check __initdata;
+
+static int __init notimercheck(char *s)
+{
+ no_timer_check = 1;
+ return 1;
+}
+__setup("no_timer_check", notimercheck);
+
/*
* There is a nasty bug in some older SMP boards, their mptable lies
* about the timer IRQ. We do the following to work around the situation:
@@ -1938,10 +1947,13 @@ static void __init setup_ioapic_ids_from
* - if this function detects that timer IRQs are defunct, then we fall
* back to ISA timer IRQs
*/
-static int __init timer_irq_works(void)
+int __init timer_irq_works(void)
{
unsigned long t1 = jiffies;

+ if (no_timer_check)
+ return 1;
+
local_irq_enable();
/* Let ten ticks pass... */
mdelay((10 * 1000) / HZ);
@@ -2212,7 +2224,7 @@ int timer_uses_ioapic_pin_0;
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
-static inline void check_timer(void)
+static inline void __init check_timer(void)
{
int apic1, pin1, apic2, pin2;
int vector;
diff -puN Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option Documentation/kernel-parameters.txt
--- a/Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option
+++ a/Documentation/kernel-parameters.txt
@@ -599,8 +599,6 @@ and is between 256 and 4096 characters.

hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages.

- noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
-
i8042.direct [HW] Put keyboard port into non-translated mode
i8042.dumbkbd [HW] Pretend that controller can only read data from
keyboard and cannot control its state
@@ -1056,9 +1054,14 @@ and is between 256 and 4096 characters.
in certain environments such as networked servers or
real-time systems.

+ noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
+
noirqdebug [IA-32] Disables the code which attempts to detect and
disable unhandled interrupt sources.

+ no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
+ broken timer IRQ sources.
+
noisapnp [ISAPNP] Disables ISA PnP code.

noinitrd [RAM] Tells the kernel not to load any configured
_

2006-11-16 23:08:42

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

Andrew Morton wrote:
> On Thu, 19 Oct 2006 17:09:22 -0700
> Zachary Amsden <[email protected]> wrote:
>
>
>> Add a way to disable the timer IRQ routing check via a boot option. The
>> VMI timer code uses this to avoid triggering the pester Mingo code, which
>> probes for some very unusual and broken motherboard routings. It fires
>> 100% of the time when using a paravirtual delay mechanism instead of
>> using a realtime delay, since there is no elapsed real time, and the 4 timer
>> IRQs have not yet been delivered.
>>
>> In addition, it is entirely possible, though improbable, that this bug
>> could surface on real hardware which picks a particularly bad time to enter
>> SMM mode, causing a long latency during one of the timer IRQs.
>>
>> While here, make check_timer be __init.
>>
>>
>
> Andi seems to have merged this patch but from somewhere I picked up a
> different version, below.
>
> I think the version I have is better. Because the patch Andi has merged is
> cast in terms of "irq testing", which is broad. But that's not what the
> patch does - the patch handles only timers.
>
> IOW, this:
>
>
>> +
>> + noirqtest [IA-32,APIC] Disables the code which tests for broken
>> + timer IRQ sources.
>>
>
> is misleadingly named. This:
>
> + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
> + broken timer IRQ sources.
> +
>
> is better, no?
>
> But right now, I'll settle for anything which usually compiles.
>
>

Yes, the name sucks. There is no real reason to actually have a boot
parameter at all once the paravirt / VMI patches are in, but I wanted
something to be able to set timer_irq_really_works until then to avoid
someone accidentally removing it.

Zach

2006-11-16 23:08:19

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch

* Andrew Morton ([email protected]) wrote:
> Andi seems to have merged this patch but from somewhere I picked up a
> different version, below.

That was from me. The earlier one had fallen through the cracks, so
I picked it up and reworked it (so that it uses no_timer_check) as per
Andi's suggestion.

> I think the version I have is better.

Agreed ;-)

> Because the patch Andi has merged is
> cast in terms of "irq testing", which is broad. But that's not what the
> patch does - the patch handles only timers.
>
> IOW, this:
>
> > +
> > + noirqtest [IA-32,APIC] Disables the code which tests for broken
> > + timer IRQ sources.
>
> is misleadingly named. This:
>
> + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
> + broken timer IRQ sources.
> +
>
> is better, no?

Clearer, and also same as x86_64.

> But right now, I'll settle for anything which usually compiles.

I believe the current round has fixed all the know compilation issues.
There's one outstanding runtime issue from Andi, which I believe is due
to an old glibc which requires COMPAT_VDSO (and which CONFIG_PARAVIRT
must disable). So current set is clean from the reports I have (and
assuming above vdso analysis is correct).

thanks,
-chris
--
> From: Zachary Amsden <[email protected]>
>
> Add a way to disable the timer IRQ routing check via a boot option. The
> VMI timer code uses this to avoid triggering the pester Mingo code, which
> probes for some very unusual and broken motherboard routings. It fires
> 100% of the time when using a paravirtual delay mechanism instead of using
> a realtime delay, since there is no elapsed real time, and the 4 timer IRQs
> have not yet been delivered.
>
> In addition, it is entirely possible, though improbable, that this bug
> could surface on real hardware which picks a particularly bad time to enter
> SMM mode, causing a long latency during one of the timer IRQs.
>
> While here, make check_timer be __init.
>
> Signed-off-by: Zachary Amsden <[email protected]>
> [chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request]
> Signed-off-by: Chris Wright <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> Documentation/kernel-parameters.txt | 7 +++++--
> arch/i386/kernel/io_apic.c | 16 ++++++++++++++--
> 2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff -puN arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option arch/i386/kernel/io_apic.c
> --- a/arch/i386/kernel/io_apic.c~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option
> +++ a/arch/i386/kernel/io_apic.c
> @@ -1930,6 +1930,15 @@ static void __init setup_ioapic_ids_from
> static void __init setup_ioapic_ids_from_mpc(void) { }
> #endif
>
> +static int no_timer_check __initdata;
> +
> +static int __init notimercheck(char *s)
> +{
> + no_timer_check = 1;
> + return 1;
> +}
> +__setup("no_timer_check", notimercheck);
> +
> /*
> * There is a nasty bug in some older SMP boards, their mptable lies
> * about the timer IRQ. We do the following to work around the situation:
> @@ -1938,10 +1947,13 @@ static void __init setup_ioapic_ids_from
> * - if this function detects that timer IRQs are defunct, then we fall
> * back to ISA timer IRQs
> */
> -static int __init timer_irq_works(void)
> +int __init timer_irq_works(void)
> {
> unsigned long t1 = jiffies;
>
> + if (no_timer_check)
> + return 1;
> +
> local_irq_enable();
> /* Let ten ticks pass... */
> mdelay((10 * 1000) / HZ);
> @@ -2212,7 +2224,7 @@ int timer_uses_ioapic_pin_0;
> * is so screwy. Thanks to Brian Perkins for testing/hacking this beast
> * fanatically on his truly buggy board.
> */
> -static inline void check_timer(void)
> +static inline void __init check_timer(void)
> {
> int apic1, pin1, apic2, pin2;
> int vector;
> diff -puN Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option Documentation/kernel-parameters.txt
> --- a/Documentation/kernel-parameters.txt~i386-add-a-way-to-disable-the-timer-irq-routing-check-via-a-boot-option
> +++ a/Documentation/kernel-parameters.txt
> @@ -599,8 +599,6 @@ and is between 256 and 4096 characters.
>
> hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages.
>
> - noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
> -
> i8042.direct [HW] Put keyboard port into non-translated mode
> i8042.dumbkbd [HW] Pretend that controller can only read data from
> keyboard and cannot control its state
> @@ -1056,9 +1054,14 @@ and is between 256 and 4096 characters.
> in certain environments such as networked servers or
> real-time systems.
>
> + noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing
> +
> noirqdebug [IA-32] Disables the code which attempts to detect and
> disable unhandled interrupt sources.
>
> + no_timer_check [IA-32,X86_64,APIC] Disables the code which tests for
> + broken timer IRQ sources.
> +
> noisapnp [ISAPNP] Disables ISA PnP code.
>
> noinitrd [RAM] Tells the kernel not to load any configured
> _

2006-11-17 05:05:35

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/5] Skip timer works.patch


> Andi seems to have merged this patch but from somewhere I picked up a
> different version, below.
>
> I think the version I have is better. Because the patch Andi has merged is
> cast in terms of "irq testing", which is broad. But that's not what the
> patch does - the patch handles only timers.

Agreed. I updated to your version now.

-Andi