2008-07-01 00:13:38

by Maciej W. Rozycki

[permalink] [raw]
Subject: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

From: Matthew Garrett <[email protected]>

Some HP laptops have a problem with their DSDT reporting as
HP/SB400/10000, which includes some code which overrides all temperature
trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
input is incorrectly designated the ISA IRQ 0 via an interrupt source
override even though it is wired to the output of the master 8259A and
INTIN0 is not connected at all. So far two models have been identified,
namely nx6125 and nx6325.

Use a knob provided by the I/O APIC interrupt registration code to
abandon any attempts to route IRQ 0 through the I/O APIC for these
systems.

Signed-off-by: Maciej W. Rozycki <[email protected]>
---
Hello,

Matthew, you have not added your sign-off with the original patch, please
do so now.

Rafael, please try this change together with
patch-2.6.26-rc1-20080505-mpparse-acpi-noirq0-2 and see if this fixes your
system.

I have tried the patches with a wildcard entry added to acpi_dmi_table[]
so that it trips for my system. The result is as follows:

ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=-1 pin1=-1 apic2=-1 pin2=-1
...trying to set up timer as Virtual Wire IRQ... works.

(note the absence of any timer pin), which means the changes work as
expected. Here the timer has been set up as a native local APIC interrupt
routed through the 8259A set up as a "virtual wire".

Ingo, please apply the two patches where appropriate.

Maciej

patch-2.6.26-rc1-20080505-mjg59-acpi-noirq0-0
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/acpi/boot.c linux-2.6.26-rc1-20080505/arch/x86/kernel/acpi/boot.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/acpi/boot.c 2008-05-05 02:55:24.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/acpi/boot.c 2008-06-30 23:31:48.000000000 +0000
@@ -1049,6 +1049,17 @@ static int __init force_acpi_ht(const st
}

/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
+static int __init
+dmi_disable_irq0_through_ioapic(const struct dmi_system_id *d)
+{
+ pr_notice("%s detected: disabling IRQ 0 through I/O APIC\n", d->ident);
+ set_disable_irq0_through_ioapic(1);
+ return 0;
+}
+
+/*
* If your system is blacklisted here, but you find that acpi=force
* works for you, please contact [email protected]
*/
@@ -1215,6 +1226,32 @@ static struct dmi_system_id __initdata a
DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 360"),
},
},
+ /*
+ * HP laptops which use a DSDT reporting as HP/SB400/10000,
+ * which includes some code which overrides all temperature
+ * trip points to 16C if the INTIN2 input of the I/O APIC
+ * is enabled. This input is incorrectly designated the
+ * ISA IRQ 0 via an interrupt source override even though
+ * it is wired to the output of the master 8259A and INTIN0
+ * is not connected at all. Abandon any attempts to route
+ * IRQ 0 through the I/O APIC therefore.
+ */
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6125 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6125"),
+ },
+ },
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6325 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
+ },
+ },
{}
};


2008-07-01 00:17:54

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

Assuming it works for Rafael (still haven't had time to pull my nx6125
out of storage):

Signed-off-by: [email protected]

--
Matthew Garrett | [email protected]

2008-07-01 08:47:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems


* Matthew Garrett <[email protected]> wrote:

> Assuming it works for Rafael (still haven't had time to pull my nx6125
> out of storage):
>
> Signed-off-by: [email protected]

Rafael, could you please try the latest tip/master that i've just pushed
out:

http://people.redhat.com/mingo/tip.git/README

it has the final form of these changes integrated - it should in theory
work out of box on your system, with no boot parameters or other
explicit quirks needed anywhere.

Ingo

2008-07-01 19:57:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tuesday, 1 of July 2008, Ingo Molnar wrote:
>
> * Matthew Garrett <[email protected]> wrote:
>
> > Assuming it works for Rafael (still haven't had time to pull my nx6125
> > out of storage):
> >
> > Signed-off-by: [email protected]
>
> Rafael, could you please try the latest tip/master that i've just pushed
> out:
>
> http://people.redhat.com/mingo/tip.git/README
>
> it has the final form of these changes integrated - it should in theory
> work out of box on your system, with no boot parameters or other
> explicit quirks needed anywhere.

I tested patches [1/2] (your version) and [2/2] on top of today's linux-next
and they work just fine.

Thanks,
Rafael

2008-07-01 20:25:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems


* Rafael J. Wysocki <[email protected]> wrote:

> On Tuesday, 1 of July 2008, Ingo Molnar wrote:
> >
> > * Matthew Garrett <[email protected]> wrote:
> >
> > > Assuming it works for Rafael (still haven't had time to pull my nx6125
> > > out of storage):
> > >
> > > Signed-off-by: [email protected]
> >
> > Rafael, could you please try the latest tip/master that i've just pushed
> > out:
> >
> > http://people.redhat.com/mingo/tip.git/README
> >
> > it has the final form of these changes integrated - it should in
> > theory work out of box on your system, with no boot parameters or
> > other explicit quirks needed anywhere.
>
> I tested patches [1/2] (your version) and [2/2] on top of today's
> linux-next and they work just fine.

thanks Rafael!

Ingo

2008-07-03 07:12:49

by Robert Hancock

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

Maciej W. Rozycki wrote:
> From: Matthew Garrett <[email protected]>
>
> Some HP laptops have a problem with their DSDT reporting as
> HP/SB400/10000, which includes some code which overrides all temperature
> trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
> input is incorrectly designated the ISA IRQ 0 via an interrupt source
> override even though it is wired to the output of the master 8259A and
> INTIN0 is not connected at all. So far two models have been identified,
> namely nx6125 and nx6325.
>
> Use a knob provided by the I/O APIC interrupt registration code to
> abandon any attempts to route IRQ 0 through the I/O APIC for these
> systems.

Just a thought.. has anyone recently looked into using the RTC timer
instead of the PIT as the default clockevent source? There still seem to
be systems like these seeping through with broken PIT through IOAPIC
since Windows apparently doesn't use that config at all. If we could use
the RTC instead (which is apparently what Windows does) we could avoid
that problem.

The last I read on the subject, the issue was that the RTC timer
frequencies didn't match the existing Linux HZ values, but I'm not sure
if that's still an issue with the current clockevents infrastructure..

2008-07-04 21:19:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tuesday, 1 of July 2008, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <[email protected]> wrote:
>
> > On Tuesday, 1 of July 2008, Ingo Molnar wrote:
> > >
> > > * Matthew Garrett <[email protected]> wrote:
> > >
> > > > Assuming it works for Rafael (still haven't had time to pull my nx6125
> > > > out of storage):
> > > >
> > > > Signed-off-by: [email protected]
> > >
> > > Rafael, could you please try the latest tip/master that i've just pushed
> > > out:
> > >
> > > http://people.redhat.com/mingo/tip.git/README
> > >
> > > it has the final form of these changes integrated - it should in
> > > theory work out of box on your system, with no boot parameters or
> > > other explicit quirks needed anywhere.
> >
> > I tested patches [1/2] (your version) and [2/2] on top of today's
> > linux-next and they work just fine.
>
> thanks Rafael!

Can you please push them to linux-next, btw?

They still apply cleanly to linux-next for me.

Thanks,
Rafael

2008-07-06 23:00:39

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tuesday, 1 of July 2008, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <[email protected]> wrote:
>
> > On Tuesday, 1 of July 2008, Ingo Molnar wrote:
> > >
> > > * Matthew Garrett <[email protected]> wrote:
> > >
> > > > Assuming it works for Rafael (still haven't had time to pull my nx6125
> > > > out of storage):
> > > >
> > > > Signed-off-by: [email protected]
> > >
> > > Rafael, could you please try the latest tip/master that i've just pushed
> > > out:
> > >
> > > http://people.redhat.com/mingo/tip.git/README
> > >
> > > it has the final form of these changes integrated - it should in
> > > theory work out of box on your system, with no boot parameters or
> > > other explicit quirks needed anywhere.
> >
> > I tested patches [1/2] (your version) and [2/2] on top of today's
> > linux-next and they work just fine.
>
> thanks Rafael!

Well, I'm afraid that my information wasn't correct. The patches actually
don't work, but I had my own additional patch that fixed the problem applied,
which I've only just discovered. Sorry for the confusion and the issue is
still unfixed.

Thanks,
Rafael

2008-07-07 01:18:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Rafael J. Wysocki wrote:
> On Tuesday, 1 of July 2008, Ingo Molnar wrote:
> >
> > * Rafael J. Wysocki <[email protected]> wrote:
> >
> > > On Tuesday, 1 of July 2008, Ingo Molnar wrote:
> > > >
> > > > * Matthew Garrett <[email protected]> wrote:
> > > >
> > > > > Assuming it works for Rafael (still haven't had time to pull my nx6125
> > > > > out of storage):
> > > > >
> > > > > Signed-off-by: [email protected]
> > > >
> > > > Rafael, could you please try the latest tip/master that i've just pushed
> > > > out:
> > > >
> > > > http://people.redhat.com/mingo/tip.git/README
> > > >
> > > > it has the final form of these changes integrated - it should in
> > > > theory work out of box on your system, with no boot parameters or
> > > > other explicit quirks needed anywhere.
> > >
> > > I tested patches [1/2] (your version) and [2/2] on top of today's
> > > linux-next and they work just fine.
> >
> > thanks Rafael!
>
> Well, I'm afraid that my information wasn't correct. The patches actually
> don't work, but I had my own additional patch that fixed the problem applied,
> which I've only just discovered. Sorry for the confusion and the issue is
> still unfixed.

Appended is a patch that actually works for me (it's missing the i386 part,
but I can't test that anyway).

Please note two things:
(1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
wouldn't work on x86-64 no matter what. I removed that dependecy, but
I have no idea why it was there and so I'm not sure if that's correct.
(2) The clear_IO_APIC_pin(apic1, pin1) done if disable_irq0_through_ioapic is
true is absolutely essential. The symptoms are 100% reproducible without it.

Thanks,
Rafael


---
arch/x86/kernel/acpi/boot.c | 43 +++++++++++++++++++++++++++++++++++++------
arch/x86/kernel/io_apic_64.c | 19 ++++++++++++++-----
2 files changed, 51 insertions(+), 11 deletions(-)

Index: linux-next/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-next.orig/arch/x86/kernel/acpi/boot.c
+++ linux-next/arch/x86/kernel/acpi/boot.c
@@ -1363,8 +1363,6 @@ static void __init acpi_process_madt(voi
return;
}

-#ifdef __i386__
-
static int __init disable_acpi_irq(const struct dmi_system_id *d)
{
if (!acpi_force) {
@@ -1415,6 +1413,17 @@ static int __init force_acpi_ht(const st
}

/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
+static int __init
+dmi_disable_irq0_through_ioapic(const struct dmi_system_id *d)
+{
+ pr_notice("%s detected: disabling IRQ 0 through I/O APIC\n", d->ident);
+ force_disable_irq0_through_ioapic();
+ return 0;
+}
+
+/*
* If your system is blacklisted here, but you find that acpi=force
* works for you, please contact [email protected]
*/
@@ -1581,11 +1590,35 @@ static struct dmi_system_id __initdata a
DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 360"),
},
},
+ /*
+ * HP laptops which use a DSDT reporting as HP/SB400/10000,
+ * which includes some code which overrides all temperature
+ * trip points to 16C if the INTIN2 input of the I/O APIC
+ * is enabled. This input is incorrectly designated the
+ * ISA IRQ 0 via an interrupt source override even though
+ * it is wired to the output of the master 8259A and INTIN0
+ * is not connected at all. Abandon any attempts to route
+ * IRQ 0 through the I/O APIC therefore.
+ */
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6125 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6125"),
+ },
+ },
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6325 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
+ },
+ },
{}
};

-#endif /* __i386__ */
-
/*
* acpi_boot_table_init() and acpi_boot_init()
* called from setup_arch(), always.
@@ -1613,9 +1646,7 @@ int __init acpi_boot_table_init(void)
{
int error;

-#ifdef __i386__
dmi_check_system(acpi_dmi_table);
-#endif

/*
* If acpi_disabled, bail out
Index: linux-next/arch/x86/kernel/io_apic_64.c
===================================================================
--- linux-next.orig/arch/x86/kernel/io_apic_64.c
+++ linux-next/arch/x86/kernel/io_apic_64.c
@@ -94,6 +94,13 @@ static int no_timer_check;

static int disable_timer_pin_1 __initdata;

+static bool disable_irq0_through_ioapic __initdata;
+
+void __init force_disable_irq0_through_ioapic(void)
+{
+ disable_irq0_through_ioapic = true;
+}
+
int timer_through_8259 __initdata;

/* Where if anywhere is the i8259 connect in external int mode */
@@ -1714,12 +1721,14 @@ static inline void __init check_timer(vo
apic2 = apic1;
}

- replace_pin_at_irq(0, 0, 0, apic1, pin1);
- apic1 = 0;
- pin1 = 0;
- setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
+ if (disable_irq0_through_ioapic) {
+ clear_IO_APIC_pin(apic1, pin1);
+ } else {
+ replace_pin_at_irq(0, 0, 0, apic1, pin1);
+ apic1 = 0;
+ pin1 = 0;
+ setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);

- if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?
*/

2008-07-07 03:54:19

by Zhao Forrest

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

> Hello,
>
> Matthew, you have not added your sign-off with the original patch, please
> do so now.
>
> Rafael, please try this change together with
> patch-2.6.26-rc1-20080505-mpparse-acpi-noirq0-2 and see if this fixes your
> system.
>
> I have tried the patches with a wildcard entry added to acpi_dmi_table[]
> so that it trips for my system. The result is as follows:

Since we add a new entry to acpi_dmi_table[], I think it's necessary
for this new entry to preserve the semantics of this table, that is
"acpi=force overrules DMI blacklist", am I right?

Thanks,
Forrest

2008-07-07 07:17:38

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems


* Rafael J. Wysocki <[email protected]> wrote:

> Appended is a patch that actually works for me (it's missing the i386
> part, but I can't test that anyway).
>
> Please note two things:
> (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> wouldn't work on x86-64 no matter what. I removed that dependecy, but
> I have no idea why it was there and so I'm not sure if that's correct.

> (2) The clear_IO_APIC_pin(apic1, pin1) done if
> disable_irq0_through_ioapic is true is absolutely essential. The
> symptoms are 100% reproducible without it.

thanks, applied to tip/x86, to give this some more testing.

the clear_IO_APIC_pin() is the most worrisome aspect of the change - but
since we are already in limited quirk mode, does it hurt? Maciej, any
preferences?

Ingo

2008-07-07 07:37:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems


* Ingo Molnar <[email protected]> wrote:

> > Please note two things:
> > (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> > wouldn't work on x86-64 no matter what. I removed that dependecy, but
> > I have no idea why it was there and so I'm not sure if that's correct.
>
> > (2) The clear_IO_APIC_pin(apic1, pin1) done if
> > disable_irq0_through_ioapic is true is absolutely essential. The
> > symptoms are 100% reproducible without it.
>
> thanks, applied to tip/x86, to give this some more testing.

i know you havent submitted this for inclusion yet, but FYI it wont
build on 32-bit with a config like:

http://redhat.com/~mingo/misc/config-Mon_Jul__7_09_22_45_CEST_2008.bad

Ingo

2008-07-07 11:43:56

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Ingo Molnar wrote:

> > Please note two things:
> > (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> > wouldn't work on x86-64 no matter what. I removed that dependecy, but
> > I have no idea why it was there and so I'm not sure if that's correct.

Well spottedd -- perhaps the x86-64 was though to be perfect. ;)

> > (2) The clear_IO_APIC_pin(apic1, pin1) done if
> > disable_irq0_through_ioapic is true is absolutely essential. The
> > symptoms are 100% reproducible without it.
>
> thanks, applied to tip/x86, to give this some more testing.
>
> the clear_IO_APIC_pin() is the most worrisome aspect of the change - but
> since we are already in limited quirk mode, does it hurt? Maciej, any
> preferences?

It makes absolutely no sense and should be harmful to call
clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
equal to -1 here. If it has to be called, then I suppose the DMI matching
did not work and the workaround has not been enabled.

Rafael, can you please provide a full bootstrap log obtained with all the
changes, but *without* this part? Also, can you please verify DMI IDs of
your system (dmidecode or /sys/class/dmi, I am told)?

I do think we should record DMI vendor and name information in the
bootstrap log. It won't take a lot of space and will be more useful than
some other IDs we never use for anything else, which we obtain from some
other tables.

Maciej

2008-07-07 12:22:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Ingo Molnar wrote:
>
> > > Please note two things:
> > > (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> > > wouldn't work on x86-64 no matter what. I removed that dependecy, but
> > > I have no idea why it was there and so I'm not sure if that's correct.
>
> Well spottedd -- perhaps the x86-64 was though to be perfect. ;)
>
> > > (2) The clear_IO_APIC_pin(apic1, pin1) done if
> > > disable_irq0_through_ioapic is true is absolutely essential. The
> > > symptoms are 100% reproducible without it.
> >
> > thanks, applied to tip/x86, to give this some more testing.
> >
> > the clear_IO_APIC_pin() is the most worrisome aspect of the change - but
> > since we are already in limited quirk mode, does it hurt? Maciej, any
> > preferences?
>
> It makes absolutely no sense and should be harmful to call
> clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
> equal to -1 here. If it has to be called, then I suppose the DMI matching
> did not work and the workaround has not been enabled.

Do you realize that the clear_IO_APIC_pin(apic1, pin1) thing is _only_ called
_IF_ the DMI matching did work?

> Rafael, can you please provide a full bootstrap log obtained with all the
> changes, but *without* this part?

I assume you want the boot log with the ACPI debugging messages enabled?
That'll have to wait until I get back home.

> Also, can you please verify DMI IDs of your system (dmidecode or
> /sys/class/dmi, I am told)?

See above.

Thanks,
Rafael

2008-07-07 12:22:34

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > > Please note two things:
> > > (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> > > wouldn't work on x86-64 no matter what. I removed that dependecy, but
> > > I have no idea why it was there and so I'm not sure if that's correct.
> >
> > > (2) The clear_IO_APIC_pin(apic1, pin1) done if
> > > disable_irq0_through_ioapic is true is absolutely essential. The
> > > symptoms are 100% reproducible without it.
> >
> > thanks, applied to tip/x86, to give this some more testing.
>
> i know you havent submitted this for inclusion yet, but FYI it wont
> build on 32-bit with a config like:
>
> http://redhat.com/~mingo/misc/config-Mon_Jul__7_09_22_45_CEST_2008.bad

Well, I wrote it was missing the i386 part. ;-)

Thanks,
Rafael

2008-07-07 12:26:54

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Ingo Molnar wrote:
>
> > > Please note two things:
> > > (1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
> > > wouldn't work on x86-64 no matter what. I removed that dependecy, but
> > > I have no idea why it was there and so I'm not sure if that's correct.
>
> Well spottedd -- perhaps the x86-64 was though to be perfect. ;)
>
> > > (2) The clear_IO_APIC_pin(apic1, pin1) done if
> > > disable_irq0_through_ioapic is true is absolutely essential. The
> > > symptoms are 100% reproducible without it.
> >
> > thanks, applied to tip/x86, to give this some more testing.
> >
> > the clear_IO_APIC_pin() is the most worrisome aspect of the change - but
> > since we are already in limited quirk mode, does it hurt? Maciej, any
> > preferences?
>
> It makes absolutely no sense and should be harmful to call
> clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
> equal to -1 here. If it has to be called, then I suppose the DMI matching
> did not work and the workaround has not been enabled.

BTW, did you even to look at the code _as_ _is_ in linux-next?

In fact, it is _impossible_ that either apic1 or pin1 are equal to -1 at this
point, because of this part:

/*
* Some BIOS writers are clueless and report the ExtINTA
* I/O APIC input from the cascaded 8259A as the timer
* interrupt input. So just in case, if only one pin
* was found above, try it both directly and through the
* 8259A.
*/
if (pin1 == -1) {
pin1 = pin2;
apic1 = apic2;
no_pin1 = 1;
} else if (pin2 == -1) {
pin2 = pin1;
apic2 = apic1;
}

that originates from your patch.

End even without this part apic1 and pin1 are _not_ equal to -1 on this box
(apic2 and pin2 are, but that's a different matter).

Thanks,
Rafael

2008-07-07 12:32:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Zhao Forrest wrote:
> > Hello,
> >
> > Matthew, you have not added your sign-off with the original patch, please
> > do so now.
> >
> > Rafael, please try this change together with
> > patch-2.6.26-rc1-20080505-mpparse-acpi-noirq0-2 and see if this fixes your
> > system.
> >
> > I have tried the patches with a wildcard entry added to acpi_dmi_table[]
> > so that it trips for my system. The result is as follows:
>
> Since we add a new entry to acpi_dmi_table[], I think it's necessary
> for this new entry to preserve the semantics of this table, that is
> "acpi=force overrules DMI blacklist", am I right?

Can you please elaborate?

What exactly are we supposed to do apart from adding the new entries to the
table?

Thanks,
Rafael

2008-07-07 14:03:03

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> > It makes absolutely no sense and should be harmful to call
> > clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
> > equal to -1 here. If it has to be called, then I suppose the DMI matching
> > did not work and the workaround has not been enabled.
>
> Do you realize that the clear_IO_APIC_pin(apic1, pin1) thing is _only_ called
> _IF_ the DMI matching did work?

Well, it is the very intent of the DMI quirk to set apic1 and pin1 both
to -1, as a result of IRQ0 being absent from our I/O APIC interrupt
routing table. Therefore if the quirk did indeed work, a call to
clear_IO_APIC_pin() is useless and likely harmful as its callees don't do
range checking (my understanding of code is it results in random poking at
the local APIC through the FIX_APIC_BASE fixmap). There should be nothing
to clear too, as interrupt redirection entries for all the I/O APIC inputs
are cleared (the mask is set to 1 and the remaining fields zeroed) when
clear_IO_APIC() is called from enable_IO_APIC() upon initialization and
all the unused ones (not referred to from anywhere in the interrupt
routing table) are never touched afterwards.

Maciej

2008-07-07 17:12:22

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> BTW, did you even to look at the code _as_ _is_ in linux-next?

Well, I hope it has not changed too much since my recent work in this
area...

> In fact, it is _impossible_ that either apic1 or pin1 are equal to -1 at this
> point, because of this part:

With the workaround activated it is virtually certain.

> /*
> * Some BIOS writers are clueless and report the ExtINTA
> * I/O APIC input from the cascaded 8259A as the timer
> * interrupt input. So just in case, if only one pin
> * was found above, try it both directly and through the
> * 8259A.
> */
> if (pin1 == -1) {
> pin1 = pin2;
> apic1 = apic2;
> no_pin1 = 1;
> } else if (pin2 == -1) {
> pin2 = pin1;
> apic2 = apic1;
> }
>
> that originates from your patch.

And the conclusion is? If we leave MP-table setups aside as irrelevant
for this configuration, we can be almost certain apic2 and pin2 are both
equal to -1 at this point. This is because (unlike the MP table) ACPI has
no provisions in its tables for ExtINTA APIC interrupt inputs. Therefore
the only case apic2 and pin2 may not be equal -1 here is when the firmware
had set up one of the inputs as such in the hardware before initiating
bootstrap which has been subsequently noted by a piece of code in
enable_IO_APIC() which examines the I/O APIC for such a condition.

I have taken these circumstances very much into account when preparing
the workaround, based on the assumption that if the firmware has set up an
I/O APIC line as an ExtINTA interrupt, then it means it considers it
suitable to use in such a manner. This furthermore implies the line
should be safe to be used in any valid 8259A mode of operation, such as
one we use to forward IRQ0 transparently through the 8259A (we
double-check it just in case though, as workarounds for hardware bugs in
the past made it not always true). The workaround therefore applies to
genuine IRQ0 routing as reported by ACPI only and not any possible legacy
ExtINTA fallback that we may attempt to use.

Of course, as determined previously, the ExtINTA line is not safe to be
used on your box, but it has not been set up by the firmware as an ExtINTA
interrupt either, so the assumption mentioned above remains valid and has
no negative impact on your system. At this point all of apic1, apic2,
pin1 and pin2 should be equal -1, which means the reassignments you quoted
make no changes to the variables.

> End even without this part apic1 and pin1 are _not_ equal to -1 on this box
> (apic2 and pin2 are, but that's a different matter).

Which means the workaround has not triggered and the rest of cosideration
is therefore irrelevant. Please get us these DMI IDs, so that we can see
what's wrong with the quirk.

Maciej

2008-07-07 17:51:48

by Ray Lee

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, Jul 7, 2008 at 10:10 AM, Maciej W. Rozycki <[email protected]> wrote:
> Which means the workaround has not triggered and the rest of cosideration
> is therefore irrelevant. Please get us these DMI IDs, so that we can see
> what's wrong with the quirk.

I have the nx6125 version of the laptop, though it's not long for this
world. Regardless, dmi data below. Earlier, I looked at your DMI match
strings up-thread and they seemed good, so you may be barking up the
wrong tree here.

ray@phoenix:~$ sudo grep . /sys/class/dmi/id/*
/sys/class/dmi/id/bios_date:03/10/2006
/sys/class/dmi/id/bios_vendor:Hewlett-Packard
/sys/class/dmi/id/bios_version:68DTT Ver. F.0E
/sys/class/dmi/id/board_name:308B
/sys/class/dmi/id/board_vendor:Hewlett-Packard
/sys/class/dmi/id/board_version:KBC Version 45.26
/sys/class/dmi/id/chassis_asset_tag:
/sys/class/dmi/id/chassis_serial:CND5500R4K
/sys/class/dmi/id/chassis_type:10
/sys/class/dmi/id/chassis_vendor:Hewlett-Packard
/sys/class/dmi/id/modalias:dmi:bvnHewlett-Packard:bvr68DTTVer.F.0E:bd03/10/2006:svnHewlett-Packard:pnHPCompaqnx6125(PZ880UA#ABA):pvrF.0E:rvnHewlett-Packard:rn308B:rvrKBCVersion45.26:cvnHewlett-Packard:ct10:cvr:
/sys/class/dmi/id/product_name:HP Compaq nx6125 (PZ880UA#ABA)
/sys/class/dmi/id/product_serial:CND5500R4K
/sys/class/dmi/id/product_uuid:4D0AF52B-ED6E-DA11-1880-66990A524D29
/sys/class/dmi/id/product_version:F.0E
/sys/class/dmi/id/sys_vendor:Hewlett-Packard
/sys/class/dmi/id/uevent:MODALIAS=dmi:bvnHewlett-Packard:bvr68DTTVer.F.0E:bd03/10/2006:svnHewlett-Packard:pnHPCompaqnx6125(PZ880UA#ABA):pvrF.0E:rvnHewlett-Packard:rn308B:rvrKBCVersion45.26:cvnHewlett-Packard:ct10:cvr:

2008-07-07 18:01:57

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:
>
> > > It makes absolutely no sense and should be harmful to call
> > > clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
> > > equal to -1 here. If it has to be called, then I suppose the DMI matching
> > > did not work and the workaround has not been enabled.
> >
> > Do you realize that the clear_IO_APIC_pin(apic1, pin1) thing is _only_ called
> > _IF_ the DMI matching did work?
>
> Well, it is the very intent of the DMI quirk to set apic1 and pin1 both
> to -1, as a result of IRQ0 being absent from our I/O APIC interrupt
> routing table. Therefore if the quirk did indeed work, a call to
> clear_IO_APIC_pin() is useless and likely harmful as its callees don't do
> range checking (my understanding of code is it results in random poking at
> the local APIC through the FIX_APIC_BASE fixmap). There should be nothing
> to clear too, as interrupt redirection entries for all the I/O APIC inputs
> are cleared (the mask is set to 1 and the remaining fields zeroed) when
> clear_IO_APIC() is called from enable_IO_APIC() upon initialization and
> all the unused ones (not referred to from anywhere in the interrupt
> routing table) are never touched afterwards.

Sorry, the patch I posted was _instead_ of your previous patch with the quirk,
because that patch didn't work. I don't know why it didn't work, however, I
can only say it didn't work after removing the __i386__ dependency of
acpi_dmi_table[].

My patch is on top of the linux-next tree that didn't contain your patch.
So, my patch adds a quirk that sets disable_irq0_through_ioapic to 1 (this
variable is defined differently in my patch) and uses it to skip the part of
check_timer() that breaks my box.

I hope that makes things clear.

Thanks,
Rafael

2008-07-07 18:07:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:
>
> > BTW, did you even to look at the code _as_ _is_ in linux-next?
>
> Well, I hope it has not changed too much since my recent work in this
> area...

Please look at it.

> > In fact, it is _impossible_ that either apic1 or pin1 are equal to -1 at this
> > point, because of this part:
>
> With the workaround activated it is virtually certain.

Which workaround?

There are no any workarounds related to this problem in the 20080704 linux-next
and that's what my patch was against.

> > /*
> > * Some BIOS writers are clueless and report the ExtINTA
> > * I/O APIC input from the cascaded 8259A as the timer
> > * interrupt input. So just in case, if only one pin
> > * was found above, try it both directly and through the
> > * 8259A.
> > */
> > if (pin1 == -1) {
> > pin1 = pin2;
> > apic1 = apic2;
> > no_pin1 = 1;
> > } else if (pin2 == -1) {
> > pin2 = pin1;
> > apic2 = apic1;
> > }
> >
> > that originates from your patch.
>
> And the conclusion is? If we leave MP-table setups aside as irrelevant
> for this configuration, we can be almost certain apic2 and pin2 are both
> equal to -1 at this point. This is because (unlike the MP table) ACPI has
> no provisions in its tables for ExtINTA APIC interrupt inputs. Therefore
> the only case apic2 and pin2 may not be equal -1 here is when the firmware
> had set up one of the inputs as such in the hardware before initiating
> bootstrap which has been subsequently noted by a piece of code in
> enable_IO_APIC() which examines the I/O APIC for such a condition.
>
> I have taken these circumstances very much into account when preparing
> the workaround, based on the assumption that if the firmware has set up an
> I/O APIC line as an ExtINTA interrupt, then it means it considers it
> suitable to use in such a manner. This furthermore implies the line
> should be safe to be used in any valid 8259A mode of operation, such as
> one we use to forward IRQ0 transparently through the 8259A (we
> double-check it just in case though, as workarounds for hardware bugs in
> the past made it not always true). The workaround therefore applies to
> genuine IRQ0 routing as reported by ACPI only and not any possible legacy
> ExtINTA fallback that we may attempt to use.
>
> Of course, as determined previously, the ExtINTA line is not safe to be
> used on your box, but it has not been set up by the firmware as an ExtINTA
> interrupt either, so the assumption mentioned above remains valid and has
> no negative impact on your system. At this point all of apic1, apic2,
> pin1 and pin2 should be equal -1, which means the reassignments you quoted
> make no changes to the variables.
>
> > End even without this part apic1 and pin1 are _not_ equal to -1 on this box
> > (apic2 and pin2 are, but that's a different matter).
>
> Which means the workaround has not triggered and the rest of cosideration
> is therefore irrelevant. Please get us these DMI IDs, so that we can see
> what's wrong with the quirk.

I used those DMI IDs and your quirk didn't work, even after removing the
dependency of acpi_dmi_table[] from __i386__. For this reason, I prepared an
alternative patch that did work and posted it for your information.

Now, you're saying that my patch couldn't work, while in fact it did.

Thanks,
Rafael

2008-07-07 20:01:21

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> @@ -1714,12 +1721,14 @@ static inline void __init check_timer(vo
> apic2 = apic1;
> }
>
> - replace_pin_at_irq(0, 0, 0, apic1, pin1);
> - apic1 = 0;
> - pin1 = 0;
> - setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
> + if (disable_irq0_through_ioapic) {
> + clear_IO_APIC_pin(apic1, pin1);
> + } else {
> + replace_pin_at_irq(0, 0, 0, apic1, pin1);
> + apic1 = 0;
> + pin1 = 0;
> + setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
>
> - if (pin1 != -1) {
> /*
> * Ok, does IRQ0 through the IOAPIC work?
> */

This is completely broken -- you cannot blindly assume IRQ0 is wired to
the pin #0 of the I/O APIC #0. You have to respect routing information
provided by the system.

Ingo, from the sequence above, I gather this code is currently in the
tree:

- replace_pin_at_irq(0, 0, 0, apic1, pin1);
- apic1 = 0;
- pin1 = 0;

Please revert the change which introduced it. While I recall posting a
patch which added code like this, I clearly stated it was solely for
diagnostics of Rafael's system and not to apply to any tree.

Maciej

2008-07-07 20:12:32

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> Sorry, the patch I posted was _instead_ of your previous patch with the quirk,
> because that patch didn't work. I don't know why it didn't work, however, I
> can only say it didn't work after removing the __i386__ dependency of
> acpi_dmi_table[].

I don't recall seeing a note that my patch was reverted. I had a careful
look at yours now and it applies on top of some diagnostic code which
should never have been applied to the tree in the first place as it was
not meant to and would also break the majority of systems out there. I am
fairly sure this piece of code is the reason my implementation of the
workaround did not work for you.

> My patch is on top of the linux-next tree that didn't contain your patch.
> So, my patch adds a quirk that sets disable_irq0_through_ioapic to 1 (this
> variable is defined differently in my patch) and uses it to skip the part of
> check_timer() that breaks my box.
>
> I hope that makes things clear.

It does, thanks. However I insist on getting this issue dealt with the
way I proposed -- check_timer() is too complicated and too fragile to mess
with as our history has already shown. If IRQ0 is not to be routed
through the I/O APIC, then it should never be registered as an I/O APIC
interrupt in the first place.

I am fairly with the offending change removed my fix will work for you as
expected as I went through the effort of checking at the run time that
once activated it makes the kernel correctly avoid the sequence in
check_timer() that hits your system, so it is only the DMI ID matching
that could have gone wrong, but your change indicates this is actually not
the case.

Maciej

2008-07-07 20:15:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:
>
> > @@ -1714,12 +1721,14 @@ static inline void __init check_timer(vo
> > apic2 = apic1;
> > }
> >
> > - replace_pin_at_irq(0, 0, 0, apic1, pin1);
> > - apic1 = 0;
> > - pin1 = 0;
> > - setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
> > + if (disable_irq0_through_ioapic) {
> > + clear_IO_APIC_pin(apic1, pin1);
> > + } else {
> > + replace_pin_at_irq(0, 0, 0, apic1, pin1);
> > + apic1 = 0;
> > + pin1 = 0;
> > + setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
> >
> > - if (pin1 != -1) {
> > /*
> > * Ok, does IRQ0 through the IOAPIC work?
> > */
>
> This is completely broken -- you cannot blindly assume IRQ0 is wired to
> the pin #0 of the I/O APIC #0. You have to respect routing information
> provided by the system.
>
> Ingo, from the sequence above, I gather this code is currently in the
> tree:
>
> - replace_pin_at_irq(0, 0, 0, apic1, pin1);
> - apic1 = 0;
> - pin1 = 0;
>
> Please revert the change which introduced it. While I recall posting a
> patch which added code like this, I clearly stated it was solely for
> diagnostics of Rafael's system and not to apply to any tree.

Shouldn't the setup_timer_IRQ0_pin(apic1, pin1, cfg->vector) be removed as
well?

Rafael

2008-07-07 20:30:16

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:
>
> > Sorry, the patch I posted was _instead_ of your previous patch with the quirk,
> > because that patch didn't work. I don't know why it didn't work, however, I
> > can only say it didn't work after removing the __i386__ dependency of
> > acpi_dmi_table[].
>
> I don't recall seeing a note that my patch was reverted. I had a careful
> look at yours now and it applies on top of some diagnostic code which
> should never have been applied to the tree in the first place as it was
> not meant to and would also break the majority of systems out there. I am
> fairly sure this piece of code is the reason my implementation of the
> workaround did not work for you.
>
> > My patch is on top of the linux-next tree that didn't contain your patch.
> > So, my patch adds a quirk that sets disable_irq0_through_ioapic to 1 (this
> > variable is defined differently in my patch) and uses it to skip the part of
> > check_timer() that breaks my box.
> >
> > I hope that makes things clear.
>
> It does, thanks. However I insist on getting this issue dealt with the
> way I proposed -- check_timer() is too complicated and too fragile to mess
> with as our history has already shown. If IRQ0 is not to be routed
> through the I/O APIC, then it should never be registered as an I/O APIC
> interrupt in the first place.
>
> I am fairly with the offending change removed my fix will work for you as
> expected as I went through the effort of checking at the run time that
> once activated it makes the kernel correctly avoid the sequence in
> check_timer() that hits your system, so it is only the DMI ID matching
> that could have gone wrong, but your change indicates this is actually not
> the case.

So I'm going to apply the appended combined patch on top of linux-next and
retest.

Thanks,
Rafael

---
arch/x86/kernel/acpi/boot.c | 53 ++++++++++++++++++++++++++++++++++++++-----
arch/x86/kernel/io_apic_64.c | 5 ----
2 files changed, 47 insertions(+), 11 deletions(-)

Index: linux-next/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-next.orig/arch/x86/kernel/acpi/boot.c
+++ linux-next/arch/x86/kernel/acpi/boot.c
@@ -83,6 +83,8 @@ int acpi_lapic;
int acpi_ioapic;
int acpi_strict;

+static int disable_irq0_through_ioapic __initdata;
+
u8 acpi_sci_flags __initdata;
int acpi_sci_override_gsi __initdata;
int acpi_skip_timer_override __initdata;
@@ -990,6 +992,10 @@ void __init mp_override_legacy_irq(u8 bu
int pin;
struct mp_config_intsrc mp_irq;

+ /* Skip the 8254 timer interrupt (IRQ 0) if requested. */
+ if (bus_irq == 0 && disable_irq0_through_ioapic)
+ return;
+
/*
* Convert 'gsi' to 'ioapic.pin'.
*/
@@ -1056,6 +1062,10 @@ void __init mp_config_acpi_legacy_irqs(v
for (i = 0; i < 16; i++) {
int idx;

+ /* Skip the 8254 timer interrupt (IRQ 0) if requested. */
+ if (i == 0 && disable_irq0_through_ioapic)
+ continue;
+
for (idx = 0; idx < mp_irq_entries; idx++) {
struct mp_config_intsrc *irq = mp_irqs + idx;

@@ -1363,8 +1373,6 @@ static void __init acpi_process_madt(voi
return;
}

-#ifdef __i386__
-
static int __init disable_acpi_irq(const struct dmi_system_id *d)
{
if (!acpi_force) {
@@ -1415,6 +1423,17 @@ static int __init force_acpi_ht(const st
}

/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
+static int __init
+dmi_disable_irq0_through_ioapic(const struct dmi_system_id *d)
+{
+ pr_notice("%s detected: disabling IRQ 0 through I/O APIC\n", d->ident);
+ disable_irq0_through_ioapic = 1;
+ return 0;
+}
+
+/*
* If your system is blacklisted here, but you find that acpi=force
* works for you, please contact [email protected]
*/
@@ -1581,11 +1600,35 @@ static struct dmi_system_id __initdata a
DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 360"),
},
},
+ /*
+ * HP laptops which use a DSDT reporting as HP/SB400/10000,
+ * which includes some code which overrides all temperature
+ * trip points to 16C if the INTIN2 input of the I/O APIC
+ * is enabled. This input is incorrectly designated the
+ * ISA IRQ 0 via an interrupt source override even though
+ * it is wired to the output of the master 8259A and INTIN0
+ * is not connected at all. Abandon any attempts to route
+ * IRQ 0 through the I/O APIC therefore.
+ */
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6125 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6125"),
+ },
+ },
+ {
+ .callback = dmi_disable_irq0_through_ioapic,
+ .ident = "HP NX6325 laptop",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
+ },
+ },
{}
};

-#endif /* __i386__ */
-
/*
* acpi_boot_table_init() and acpi_boot_init()
* called from setup_arch(), always.
@@ -1613,9 +1656,7 @@ int __init acpi_boot_table_init(void)
{
int error;

-#ifdef __i386__
dmi_check_system(acpi_dmi_table);
-#endif

/*
* If acpi_disabled, bail out
Index: linux-next/arch/x86/kernel/io_apic_64.c
===================================================================
--- linux-next.orig/arch/x86/kernel/io_apic_64.c
+++ linux-next/arch/x86/kernel/io_apic_64.c
@@ -1714,11 +1714,6 @@ static inline void __init check_timer(vo
apic2 = apic1;
}

- replace_pin_at_irq(0, 0, 0, apic1, pin1);
- apic1 = 0;
- pin1 = 0;
- setup_timer_IRQ0_pin(apic1, pin1, cfg->vector);
-
if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?

2008-07-07 20:31:55

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Maciej W. Rozycki wrote:

> Ingo, from the sequence above, I gather this code is currently in the
> tree:
>
> - replace_pin_at_irq(0, 0, 0, apic1, pin1);
> - apic1 = 0;
> - pin1 = 0;
>
> Please revert the change which introduced it. While I recall posting a
> patch which added code like this, I clearly stated it was solely for
> diagnostics of Rafael's system and not to apply to any tree.

I have had a look at the tree and there are two commits containing
diagnostic code that should have never been applied to the tree. Their
IDs in the linux-next tree are as follows:

90221a61a71b7ad659d8741cf1e404506b174982
a74a1cc3df0be89658bc735c8aed80c8392e2c15

Please revert them both. Thanks.

Maciej

2008-07-07 21:39:25

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> > Ingo, from the sequence above, I gather this code is currently in the
> > tree:
> >
> > - replace_pin_at_irq(0, 0, 0, apic1, pin1);
> > - apic1 = 0;
> > - pin1 = 0;
> >
> > Please revert the change which introduced it. While I recall posting a
> > patch which added code like this, I clearly stated it was solely for
> > diagnostics of Rafael's system and not to apply to any tree.
>
> Shouldn't the setup_timer_IRQ0_pin(apic1, pin1, cfg->vector) be removed as
> well?

Yes. The whole commit should be reverted as mentioned in the other
e-mail (it didn't even bear my sign-off mark). This would include the
line you are referring to as well. I have submitted a patch to add a
64-bit variation of replace_pin_at_irq() separately.

Maciej

2008-07-07 21:41:01

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:

> So I'm going to apply the appended combined patch on top of linux-next and
> retest.

Wonderful! Thanks for your patience and cooperation.

Maciej

2008-07-07 22:08:56

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Monday, 7 of July 2008, Maciej W. Rozycki wrote:
> On Mon, 7 Jul 2008, Rafael J. Wysocki wrote:
>
> > So I'm going to apply the appended combined patch on top of linux-next and
> > retest.
>
> Wonderful! Thanks for your patience and cooperation.

Unfortunately, it hangs solid very early on boot, before earlyprintk=vga
kicks in.

Thanks,
Rafael

Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tue, Jul 01, 2008 at 01:12:06AM +0100, Maciej W. Rozycki wrote:
> From: Matthew Garrett <[email protected]>
>
> Some HP laptops have a problem with their DSDT reporting as
> HP/SB400/10000, which includes some code which overrides all temperature
> trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
> input is incorrectly designated the ISA IRQ 0 via an interrupt source
> override even though it is wired to the output of the master 8259A and
> INTIN0 is not connected at all. So far two models have been identified,
> namely nx6125 and nx6325.

Out of sheer curiosity. What makes you think that IRQ0 is not
connected to INT0 of IO APIC? The IRQ0 pin2 override?

Why would we trust that BIOS information if the broken BIOS does
weird things if INT2 of IO APIC is _not_ masked?

IMHO on those HP systems we should skip the (bogus?) timer override,
leave IRQ0 -> IOAPIC/INT0 and mask IOAPIC/INT2. Or at least we should
test that configuration.

I admit that I have lost track of who has provided which patches,
which patch does exactly what ((IO|L)A)PIC configuration, and who has
tested what combinations, and what the results were.

So I've just done the following (based on x86/master as of yesterday):

Booting an HP nx6325

(1) with "acpi_skip_timer_override" to avoid configuration
of IOAPIC/INT2 and to avoid masking of IOAPIC/INT0.

plus

(2) adding (hardcoded in check_timer()) some code to explicitly mask
IOAPIC/INT2 to quiesce the fan (work around the DSDT issue).

With that setup my test box boots w/o problems (no fan noise, no
throttling, no stablity problems so far). Here the relevant dmesg
parts:

Command line: root=/dev/sda2 video=radeonfb:noaccel debug apic=debug acpi_skip_timer_override
...
IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI
...
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: BIOS IRQ0 pin2 override ignored.
...
using C1E aware idle routine
...
init IO_APIC IRQs
IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (2-2 -> 0x32 -> IRQ 2 Mode:0 Active:0)
...
IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20 not connected.
IOAPIC[0]: Set routing entry (2-21 -> 0x49 -> IRQ 21 Mode:1 Active:1)
IO-APIC (apicid-pin) 2-22, 2-23 not connected.
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
CPU0: AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping 02
Using local APIC timer interrupts.
...
IO APIC #2......
...
NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
00 003 0 0 0 0 0 1 1 30
01 003 0 0 0 0 0 1 1 31
02 003 1 0 0 0 0 1 1 32


I think we have 3 alternatives to setup timer interrupt on those machines

(A) Setup timer as Virtual Wire IRQ.
(that's the way the old code used to configure it, e.g. in 2.6.25)
(B) The current approach to setup timer as ExtINT.
(C) Use IOAPIC/INT0.

Shouldn't be that hard to find the best solution here:

(A) proved to work (and even "accidentially" masked IOAPIC/INT2 which
avoided the DSDT problem)

(B) ??? who tested this, Rafael -- The stability issues, did they happen
with a kernel based on that solution?

(C) ... is my preferred solution. I tested it and I've not seen problems
with it (so far) -- maybe it's naive but I think it's the most
obvious solution (*)


Regards,

Andreas

(*) BTW, default for SB400 is to route IRQ0 to IOAPIC/INT0 and output
of PIC to IOAPIC/INT2. On my test box this wasn't configured
differently. Thus I don't understand why the BIOS provided an
IRQ0/INT2 override at all.

2008-07-08 14:51:37

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tue, 8 Jul 2008, Andreas Herrmann wrote:

> > Some HP laptops have a problem with their DSDT reporting as
> > HP/SB400/10000, which includes some code which overrides all temperature
> > trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
> > input is incorrectly designated the ISA IRQ 0 via an interrupt source
> > override even though it is wired to the output of the master 8259A and
> > INTIN0 is not connected at all. So far two models have been identified,
> > namely nx6125 and nx6325.
>
> Out of sheer curiosity. What makes you think that IRQ0 is not
> connected to INT0 of IO APIC? The IRQ0 pin2 override?

The test I prepared a couple of weeks ago which Rafael ran for me. The
test forcibly rerouted IRQ0 to INTIN0, but the timer did not work with
that setup.

> Why would we trust that BIOS information if the broken BIOS does
> weird things if INT2 of IO APIC is _not_ masked?

This is why I prepared the workaround to ignore BIOS IRQ0 configuration
triggered by DMI matching proposed by Matthew.

> IMHO on those HP systems we should skip the (bogus?) timer override,
> leave IRQ0 -> IOAPIC/INT0 and mask IOAPIC/INT2. Or at least we should
> test that configuration.

Unfortunately it did not work for Rafael's box and diagnostic messages
have been quiesced in the 64-bit variation of code so bootstrap logs
archived throughout the Internet could not be used as a reference.

> I admit that I have lost track of who has provided which patches,
> which patch does exactly what ((IO|L)A)PIC configuration, and who has
> tested what combinations, and what the results were.

I can be blamed for the vast majority. If you are unsure what does what,
then feel free to ask me.

> So I've just done the following (based on x86/master as of yesterday):
>
> Booting an HP nx6325
>
> (1) with "acpi_skip_timer_override" to avoid configuration
> of IOAPIC/INT2 and to avoid masking of IOAPIC/INT0.
>
> plus
>
> (2) adding (hardcoded in check_timer()) some code to explicitly mask
> IOAPIC/INT2 to quiesce the fan (work around the DSDT issue).

Now that you mentioned (2) a sudden insight made me realise in the
absence of an overlapping override we register ISA IRQ2 as an
edge-triggered I/O APIC interrupt, which means its input will be
implicitly unmasked. But despite the ACPI spec does not make it
explicitly clear, there is no such thing as ISA IRQ2 -- what used to be
IRQ2 up to PC/XT got remapped to IRQ9 when the IR2 line of the 8259A was
reused as a cascade to the slave and no device since has ever been able to
make use of IRQ2. Therefore I think we should actually not register this
IRQ2 thingy at all under any circumstances. I'll send a patch.

> With that setup my test box boots w/o problems (no fan noise, no
> throttling, no stablity problems so far). Here the relevant dmesg
> parts:
>
> Command line: root=/dev/sda2 video=radeonfb:noaccel debug apic=debug acpi_skip_timer_override
> ...
> IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI
> ...
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: BIOS IRQ0 pin2 override ignored.
> ...
> using C1E aware idle routine
> ...
> init IO_APIC IRQs
> IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (2-2 -> 0x32 -> IRQ 2 Mode:0 Active:0)
> ...
> IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20 not connected.
> IOAPIC[0]: Set routing entry (2-21 -> 0x49 -> IRQ 21 Mode:1 Active:1)
> IO-APIC (apicid-pin) 2-22, 2-23 not connected.
> ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
> CPU0: AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping 02
> Using local APIC timer interrupts.
> ...
> IO APIC #2......
> ...
> NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
> 00 003 0 0 0 0 0 1 1 30
> 01 003 0 0 0 0 0 1 1 31
> 02 003 1 0 0 0 0 1 1 32
>
>
> I think we have 3 alternatives to setup timer interrupt on those machines
>
> (A) Setup timer as Virtual Wire IRQ.
> (that's the way the old code used to configure it, e.g. in 2.6.25)
> (B) The current approach to setup timer as ExtINT.
> (C) Use IOAPIC/INT0.
>
> Shouldn't be that hard to find the best solution here:
>
> (A) proved to work (and even "accidentially" masked IOAPIC/INT2 which
> avoided the DSDT problem)
>
> (B) ??? who tested this, Rafael -- The stability issues, did they happen
> with a kernel based on that solution?
>
> (C) ... is my preferred solution. I tested it and I've not seen problems
> with it (so far) -- maybe it's naive but I think it's the most
> obvious solution (*)

I am assuming by "Virtual Wire IRQ" you mean a setup we are referring to
as "8259A Virtual Wire" (without this additional qualifier the term is
synonymous to ExtINTA).

Given (C) did not work a workaround has been prepared to implement (A).
As it does DMI ID matching and "knows" for this system (C) would not work
I decided there is no point in trying such a configuration.

Now you say your machine actually supports (C), which makes me wonder why
Rafael's does not, hmm...

> (*) BTW, default for SB400 is to route IRQ0 to IOAPIC/INT0 and output
> of PIC to IOAPIC/INT2. On my test box this wasn't configured

Well, this is a piece of information I could not know. Documentation for
the SB400 is not publicly available. I asked whether anybody could answer
how this piece of circuitry has been designed in the chip and you are the
first to provide some kind of an answer.

> differently. Thus I don't understand why the BIOS provided an
> IRQ0/INT2 override at all.

One or more of the following:

1. Because the 8254 is not in fact routed to INTIN0 (as seen with the
test).

2. Because almost everything else requires such an override and it was
left in place in the code base by mistake/misunderstanding/etc.

OK, to recap: given what you have written, I think we should refrain from
registering IRQ2 as suggested above and then it will be safe to ignore the
interrupt override for the two systems affected based on a DMI quirk,
which means INTIN0 will be tried first, which may or may not work. Of
course if it does not, then we have to quiesce the warning about the BIOS
having provided us with incorrect interrupt routing information, because
we ignored what it told us. ;)

Maciej

2008-07-08 15:26:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tuesday, 8 of July 2008, Andreas Herrmann wrote:
> On Tue, Jul 01, 2008 at 01:12:06AM +0100, Maciej W. Rozycki wrote:
> > From: Matthew Garrett <[email protected]>
> >
> > Some HP laptops have a problem with their DSDT reporting as
> > HP/SB400/10000, which includes some code which overrides all temperature
> > trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
> > input is incorrectly designated the ISA IRQ 0 via an interrupt source
> > override even though it is wired to the output of the master 8259A and
> > INTIN0 is not connected at all. So far two models have been identified,
> > namely nx6125 and nx6325.
>
> Out of sheer curiosity. What makes you think that IRQ0 is not
> connected to INT0 of IO APIC? The IRQ0 pin2 override?
>
> Why would we trust that BIOS information if the broken BIOS does
> weird things if INT2 of IO APIC is _not_ masked?
>
> IMHO on those HP systems we should skip the (bogus?) timer override,
> leave IRQ0 -> IOAPIC/INT0 and mask IOAPIC/INT2. Or at least we should
> test that configuration.
>
> I admit that I have lost track of who has provided which patches,
> which patch does exactly what ((IO|L)A)PIC configuration, and who has
> tested what combinations, and what the results were.
>
> So I've just done the following (based on x86/master as of yesterday):
>
> Booting an HP nx6325
>
> (1) with "acpi_skip_timer_override" to avoid configuration
> of IOAPIC/INT2 and to avoid masking of IOAPIC/INT0.
>
> plus
>
> (2) adding (hardcoded in check_timer()) some code to explicitly mask
> IOAPIC/INT2 to quiesce the fan (work around the DSDT issue).
>
> With that setup my test box boots w/o problems (no fan noise, no
> throttling, no stablity problems so far). Here the relevant dmesg
> parts:
>
> Command line: root=/dev/sda2 video=radeonfb:noaccel debug apic=debug acpi_skip_timer_override
> ...
> IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI
> ...
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: BIOS IRQ0 pin2 override ignored.
> ...
> using C1E aware idle routine
> ...
> init IO_APIC IRQs
> IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
> IOAPIC[0]: Set routing entry (2-2 -> 0x32 -> IRQ 2 Mode:0 Active:0)
> ...
> IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20 not connected.
> IOAPIC[0]: Set routing entry (2-21 -> 0x49 -> IRQ 21 Mode:1 Active:1)
> IO-APIC (apicid-pin) 2-22, 2-23 not connected.
> ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
> CPU0: AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping 02
> Using local APIC timer interrupts.
> ...
> IO APIC #2......
> ...
> NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
> 00 003 0 0 0 0 0 1 1 30
> 01 003 0 0 0 0 0 1 1 31
> 02 003 1 0 0 0 0 1 1 32
>
>
> I think we have 3 alternatives to setup timer interrupt on those machines
>
> (A) Setup timer as Virtual Wire IRQ.
> (that's the way the old code used to configure it, e.g. in 2.6.25)
> (B) The current approach to setup timer as ExtINT.
> (C) Use IOAPIC/INT0.
>
> Shouldn't be that hard to find the best solution here:
>
> (A) proved to work (and even "accidentially" masked IOAPIC/INT2 which
> avoided the DSDT problem)
>
> (B) ??? who tested this, Rafael

Yes, I tested this.

> -- The stability issues, did they happen with a kernel based on that solution?

Yes, they did.

> (C) ... is my preferred solution. I tested it and I've not seen problems
> with it (so far) -- maybe it's naive but I think it's the most
> obvious solution (*)

FWIW, it's my preferred one too. So far, I haven't seen any problems with this
approach (ie. "acpi_skip_timer_override" plus masking IOAPIC/INT2 in
check_timer()), even with the C1E aware idle routine.

Thanks,
Rafael

2008-07-08 16:26:46

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tue, 8 Jul 2008, Rafael J. Wysocki wrote:

> > (B) The current approach to setup timer as ExtINT.
[...]
> > (B) ??? who tested this, Rafael
>
> Yes, I tested this.

There is some confusion here apparently -- you certainly have not tested
the timer as an ExtINTA interrupt during the course of this discussion,
because the "8259A Virtual Wire" through the local APIC has always worked
for you. ExtINTA would be the next, fourth and final attempt before a
panic().

For a reference, here is the list of configurations of the 8254 timer
interrupt in the order they are tried:

1. Native I/O APIC interrupt.

2. "8259A Virtual Wire" through the I/O APIC.

3. "8259A Virtual Wire" through the local APIC of the bootstrap processor.

4. ExtINTA ("Virtual Wire") through the local APIC of the BSP.

The configurations #1 and #2 are only tried if the firmware has supplied
information about how the IRQ0 or 8259A have been wired. The #4 is only
reached in very rare cases where there is some special glue logic between
the 8259A and the APIC (seen before as a workaround for broken hardware;
unlock_ExtINT_logic() is needed for that) or a given implementation of the
8259A is not a full one (hypothetical).

Maciej

2008-07-08 16:52:45

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi: Disable IRQ 0 through I/O APIC for some HP systems

On Tuesday, 8 of July 2008, Maciej W. Rozycki wrote:
> On Tue, 8 Jul 2008, Rafael J. Wysocki wrote:
>
> > > (B) The current approach to setup timer as ExtINT.
> [...]
> > > (B) ??? who tested this, Rafael
> >
> > Yes, I tested this.
>
> There is some confusion here apparently -- you certainly have not tested
> the timer as an ExtINTA interrupt during the course of this discussion,
> because the "8259A Virtual Wire" through the local APIC has always worked
> for you. ExtINTA would be the next, fourth and final attempt before a
> panic().

Yes, you're right, sorry.

So in fact I didn't test B.

Thanks,
Rafael