2010-02-01 14:59:17

by Thomas Renninger

Subject: IRQ regression messes up xseries 330 SCI resulting in apic=off - bisected to commit b9c61b70075c87a861262473

Hi,

Booting the latest kernel on this machine results in:

PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter

Later, all kinds of devices fail...

I could bisect it down to this commit:
commit b9c61b70075c87a8612624736faf4a2de5b1ed30
Author: Yinghai Lu <[email protected]>
Date: Wed May 6 10:10:06 2009 -0700

x86/pci: update pirq_enable_irq() to setup io apic routing

So we can set io apic routing only when enabling the device irq.

This is advantageous for IRQ descriptor allocation affinity: if we set up
the IO-APIC entry later, we have a chance to allocate the IRQ descriptor
later and know which device it is on and can set affinity accordingly.

[ Impact: standardize/enhance irq-enabling sequence for mptable irqs ]

Signed-off-by: Yinghai Lu <[email protected]>
Acked-by: Jesse Barnes <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Andrew Morton <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>



Attached are the dmesg of an unmodified, broken 2.6.32 kernel and the
dmesg of a 2.6.32 kernel in which I reverted the above patch (apic=verbose).
The revert needed some adjusting, and I did this without understanding
the code. I also attach the backported revert of the above patch for 2.6.32,
which makes the machine work again (see dmesg attachment).
This probably cannot go in; it would be great if someone could help
find a proper patch for mainline that makes the machine work again.
(The ACPI IRQ, the SCI, is meant to be on IRQ 30, rerouted from IRQ 3 via
the APIC source override table, which is rather odd/uncommon. Hope that helps.)
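
As a minimal sketch of that remapping (plain user-space C, not kernel code;
the struct, table, and function names are hypothetical), a source override
can be pictured as a lookup from a legacy ISA IRQ to a global system
interrupt (GSI), falling back to identity mapping when no override exists:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one interrupt source override. */
struct irq_override {
	uint8_t  source_irq;	/* legacy ISA IRQ number */
	uint32_t gsi;		/* GSI the IRQ is rerouted to */
};

/* Example table modelled on the report: ISA IRQ 3 is rerouted to GSI 30. */
static const struct irq_override overrides[] = {
	{ .source_irq = 3, .gsi = 30 },
};

/* Return the GSI for a legacy IRQ; identity-mapped unless overridden. */
static uint32_t isa_irq_to_gsi(uint8_t isa_irq)
{
	size_t i;

	for (i = 0; i < sizeof(overrides) / sizeof(overrides[0]); i++) {
		if (overrides[i].source_irq == isa_irq)
			return overrides[i].gsi;
	}
	return isa_irq;
}
```

With this table, isa_irq_to_gsi(3) yields 30, while an IRQ without an
override, such as 4, maps to itself.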

Thanks,

Thomas


Attachments:
dmesg_2.6.32_boots_apic_verbose.txt (39.98 kB)
dmesg_2.6.32_does_not_boot.txt (62.58 kB)
x86_xseries_330_sci_irq.fix (9.17 kB)

2010-02-01 20:37:57

by Yinghai Lu

Subject: Re: IRQ regression messes up xseries 330 SCI resulting in apic=off - bisected to commit b9c61b70075c87a861262473

On 02/01/2010 06:59 AM, Thomas Renninger wrote:
> Hi,
>
> booting a latest kernel on this machine results in:
>
> PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ30) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
> ACPI: Unable to start the ACPI Interpreter
>
> Later all kind of devices fail...
>
...
>
>
> Attached are dmesg of an umodified broken 2.6.32 kernel and
> dmesg of a 2.6.32 kernel in which I reverted above patch (apic=verbose).
> The reverting needed some adjusting and I did this without understanding
> the code. I also attach the backported patch reverting above for 2.6.32
> which makes the machine work again (see dmesg attachment).
> This probably cannot go in, it would be great if someone could help
> finding a proper patch for mainline which makes the machine work again.
> (The ACPI irq, SCI, is meant to be on IRQ 30, rerouted from IRQ 3 via
> APIC source override table, which is rather odd/uncommon. Hope that helps)

OK, the root cause is that the SCI is on the second IO-APIC.

will have a patch for it.

Thanks

Yinghai
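
To illustrate why the second IO-APIC matters, here is a minimal user-space
sketch of resolving a GSI to an (IO-APIC, pin) pair. The GSI ranges are
assumed purely for illustration (two IO-APICs with 16 pins each) and are not
taken from the attached dmesg; the names are hypothetical and only loosely
mirror the kernel's mp_find_ioapic() and mp_find_ioapic_pin():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the GSI window served by one IO-APIC. */
struct ioapic_range {
	uint32_t gsi_base;	/* first GSI handled by this IO-APIC */
	uint32_t gsi_end;	/* last GSI handled, inclusive */
};

/* Assumed layout: IO-APIC 0 serves GSI 0-15, IO-APIC 1 serves GSI 16-31. */
static const struct ioapic_range ioapics[] = {
	{ .gsi_base = 0,  .gsi_end = 15 },
	{ .gsi_base = 16, .gsi_end = 31 },
};

/* Return the index of the IO-APIC that owns 'gsi', or -1 if none does. */
static int find_ioapic(uint32_t gsi)
{
	size_t i;

	for (i = 0; i < sizeof(ioapics) / sizeof(ioapics[0]); i++) {
		if (gsi >= ioapics[i].gsi_base && gsi <= ioapics[i].gsi_end)
			return (int)i;
	}
	return -1;
}

/* Return the pin on IO-APIC 'idx' for 'gsi'; caller must pass a valid idx. */
static int find_ioapic_pin(int idx, uint32_t gsi)
{
	return (int)(gsi - ioapics[idx].gsi_base);
}
```

Under these assumed ranges, GSI 30 resolves to pin 14 on IO-APIC 1, so
routing that only covers the first IO-APIC would never reach it.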

2010-02-02 01:17:30

by Yinghai Lu

Subject: Re: IRQ regression messes up xseries 330 SCI resulting in apic=off - bisected to commit b9c61b70075c87a861262473

On 02/01/2010 06:59 AM, Thomas Renninger wrote:
> Hi,
>
> booting a latest kernel on this machine results in:
>
> PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ30) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
> ACPI: Unable to start the ACPI Interpreter
>

please check

[PATCH] x86: fix sci on ioapic 1

Thomas Renninger <[email protected]> reported on IBM x3330

booting a latest kernel on this machine results in:

PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter

Later all kind of devices fail...

and bisect it down to this commit:
commit b9c61b70075c87a8612624736faf4a2de5b1ed30

x86/pci: update pirq_enable_irq() to setup io apic routing

it turns out we need to set irq routing for the sci on ioapic1 early.

Reported-by: Thomas Renninger <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/include/asm/io_apic.h | 2 +
arch/x86/kernel/acpi/boot.c | 6 ++++-
arch/x86/kernel/apic/io_apic.c | 49 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 56 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/include/asm/io_apic.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/io_apic.h
+++ linux-2.6/arch/x86/include/asm/io_apic.h
@@ -160,6 +160,7 @@ extern int io_apic_get_redir_entries(int
struct io_apic_irq_attr;
extern int io_apic_set_pci_routing(struct device *dev, int irq,
struct io_apic_irq_attr *irq_attr);
+void setup_IO_APIC_irq_extra(u32 gsi);
extern int (*ioapic_renumber_irq)(int ioapic, int irq);
extern void ioapic_init_mappings(void);
extern void ioapic_insert_resources(void);
@@ -197,6 +198,7 @@ static const int timer_through_8259 = 0;
static inline void ioapic_init_mappings(void) { }
static inline void ioapic_insert_resources(void) { }
static inline void probe_nr_irqs_gsi(void) { }
+static inline void setup_IO_APIC_irq_extra(u32 gsi) { }

#endif

Index: linux-2.6/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
+++ linux-2.6/arch/x86/kernel/acpi/boot.c
@@ -446,6 +446,9 @@ void __init acpi_pic_sci_set_trigger(uns
 int acpi_gsi_to_irq(u32 gsi, unsigned int *irq)
 {
 	*irq = gsi;
+
+	setup_IO_APIC_irq_extra(gsi);
+
 	return 0;
 }

@@ -473,7 +476,8 @@ int acpi_register_gsi(struct device *dev
 		plat_gsi = mp_register_gsi(dev, gsi, trigger, polarity);
 	}
 #endif
-	acpi_gsi_to_irq(plat_gsi, &irq);
+	irq = plat_gsi;
+
 	return irq;
 }

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1541,6 +1541,55 @@ static void __init setup_IO_APIC_irqs(vo
 }

 /*
+ * for the gsit that is not in first ioapic
+ * but could not use acpi_register_gsi()
+ * like some special sci in IBM x3330
+ */
+void setup_IO_APIC_irq_extra(u32 gsi)
+{
+	int apic_id = 0, pin, idx, irq;
+	int node = cpu_to_node(boot_cpu_id);
+	struct irq_desc *desc;
+	struct irq_cfg *cfg;
+
+	/*
+	 * Convert 'gsi' to 'ioapic.pin'.
+	 */
+	apic_id = mp_find_ioapic(gsi);
+	if (apic_id < 0)
+		return;
+
+	pin = mp_find_ioapic_pin(apic_id, gsi);
+	idx = find_irq_entry(apic_id, pin, mp_INT);
+	if (idx == -1)
+		return;
+
+	irq = pin_2_irq(idx, apic_id, pin);
+	desc = irq_to_desc(irq);
+	if (desc)
+		return;
+
+	desc = irq_to_desc_alloc_node(irq, node);
+	if (!desc) {
+		printk(KERN_INFO "can not get irq_desc for %d\n", irq);
+		return;
+	}
+
+	cfg = desc->chip_data;
+	add_pin_to_irq_node(cfg, node, apic_id, pin);
+
+	if (test_bit(pin, mp_ioapic_routing[apic_id].pin_programmed)) {
+		pr_debug("Pin %d-%d already programmed\n",
+			 mp_ioapics[apic_id].apicid, pin);
+		return;
+	}
+	set_bit(pin, mp_ioapic_routing[apic_id].pin_programmed);
+
+	setup_IO_APIC_irq(apic_id, pin, irq, desc,
+			  irq_trigger(idx), irq_polarity(idx));
+}
+
+/*
  * Set up the timer pin, possibly with the 8259A-master behind.
  */
 static void __init setup_timer_IRQ0_pin(unsigned int apic_id, unsigned int pin,

2010-02-02 08:05:21

by Yinghai Lu

Subject: Re: IRQ regression messes up xseries 330 SCI resulting in apic=off - bisected to commit b9c61b70075c87a861262473

On 02/01/2010 05:16 PM, Yinghai Lu wrote:
> On 02/01/2010 06:59 AM, Thomas Renninger wrote:
>> Hi,
>>
>> booting a latest kernel on this machine results in:
>>
>> PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
>> PCI: Using configuration type 1 for base access
>> bio: create slab <bio-0> at 0
>> ACPI: SCI (IRQ30) allocation failed
>> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
>> ACPI: Unable to start the ACPI Interpreter
>>
>
> please check

Subject: [PATCH -v2] x86: fix sci on ioapic 1

Thomas Renninger <[email protected]> reported on IBM x3330

booting a latest kernel on this machine results in:

PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter

Later all kind of devices fail...

and bisect it down to this commit:
commit b9c61b70075c87a8612624736faf4a2de5b1ed30

x86/pci: update pirq_enable_irq() to setup io apic routing

it turns out we need to set irq routing for the sci on ioapic1 early.

-v2: make it work without sparseirq too.

Reported-by: Thomas Renninger <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/include/asm/io_apic.h | 1
arch/x86/kernel/acpi/boot.c | 9 ++++++-
arch/x86/kernel/apic/io_apic.c | 50 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 59 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/include/asm/io_apic.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/io_apic.h
+++ linux-2.6/arch/x86/include/asm/io_apic.h
@@ -160,6 +160,7 @@ extern int io_apic_get_redir_entries(int
struct io_apic_irq_attr;
extern int io_apic_set_pci_routing(struct device *dev, int irq,
struct io_apic_irq_attr *irq_attr);
+void setup_IO_APIC_irq_extra(u32 gsi);
extern int (*ioapic_renumber_irq)(int ioapic, int irq);
extern void ioapic_init_mappings(void);
extern void ioapic_insert_resources(void);
Index: linux-2.6/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
+++ linux-2.6/arch/x86/kernel/acpi/boot.c
@@ -446,6 +446,12 @@ void __init acpi_pic_sci_set_trigger(uns
 int acpi_gsi_to_irq(u32 gsi, unsigned int *irq)
 {
 	*irq = gsi;
+
+#ifdef CONFIG_X86_IO_APIC
+	if (acpi_irq_model == ACPI_IRQ_MODEL_IOAPIC)
+		setup_IO_APIC_irq_extra(gsi);
+#endif
+
 	return 0;
 }

@@ -473,7 +479,8 @@ int acpi_register_gsi(struct device *dev
 		plat_gsi = mp_register_gsi(dev, gsi, trigger, polarity);
 	}
 #endif
-	acpi_gsi_to_irq(plat_gsi, &irq);
+	irq = plat_gsi;
+
 	return irq;
 }

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1541,6 +1541,56 @@ static void __init setup_IO_APIC_irqs(vo
 }

 /*
+ * for the gsit that is not in first ioapic
+ * but could not use acpi_register_gsi()
+ * like some special sci in IBM x3330
+ */
+void setup_IO_APIC_irq_extra(u32 gsi)
+{
+	int apic_id = 0, pin, idx, irq;
+	int node = cpu_to_node(boot_cpu_id);
+	struct irq_desc *desc;
+	struct irq_cfg *cfg;
+
+	/*
+	 * Convert 'gsi' to 'ioapic.pin'.
+	 */
+	apic_id = mp_find_ioapic(gsi);
+	if (apic_id < 0)
+		return;
+
+	pin = mp_find_ioapic_pin(apic_id, gsi);
+	idx = find_irq_entry(apic_id, pin, mp_INT);
+	if (idx == -1)
+		return;
+
+	irq = pin_2_irq(idx, apic_id, pin);
+#ifdef CONFIG_SPARSE_IRQ
+	desc = irq_to_desc(irq);
+	if (desc)
+		return;
+#endif
+	desc = irq_to_desc_alloc_node(irq, node);
+	if (!desc) {
+		printk(KERN_INFO "can not get irq_desc for %d\n", irq);
+		return;
+	}
+
+	cfg = desc->chip_data;
+	add_pin_to_irq_node(cfg, node, apic_id, pin);
+
+	if (test_bit(pin, mp_ioapic_routing[apic_id].pin_programmed)) {
+		pr_debug("Pin %d-%d already programmed\n",
+			 mp_ioapics[apic_id].apicid, pin);
+		return;
+	}
+	set_bit(pin, mp_ioapic_routing[apic_id].pin_programmed);
+
+	setup_IO_APIC_irq(apic_id, pin, irq, desc,
+			  irq_trigger(idx), irq_polarity(idx));
+}
+
+/*
  * Set up the timer pin, possibly with the 8259A-master behind.
  */
 static void __init setup_timer_IRQ0_pin(unsigned int apic_id, unsigned int pin,

2010-02-02 09:59:22

by Thomas Renninger

Subject: Re: IRQ regression messes up xseries 330 SCI resulting in apic=off - bisected to commit b9c61b70075c87a861262473

On Tuesday 02 February 2010 09:03:36 Yinghai Lu wrote:
> On 02/01/2010 05:16 PM, Yinghai Lu wrote:
> > On 02/01/2010 06:59 AM, Thomas Renninger wrote:
> >> Hi,
> >>
> >> booting a latest kernel on this machine results in:
> >>
> >> PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
> >> PCI: Using configuration type 1 for base access
> >> bio: create slab <bio-0> at 0
> >> ACPI: SCI (IRQ30) allocation failed
> >> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
> >> ACPI: Unable to start the ACPI Interpreter
> >>
> >
> > please check
>
> Subject: [PATCH -v2] x86: fix sci on ioapic 1
Works for me, thanks!
Tested-by: Thomas Renninger <[email protected]>

Is this still supposed to go into 2.6.33?
Do you consider this safe enough to CC: [email protected] and just
push/commit it there?
I can confirm that this patch applies and works fine on 2.6.32.
2.6.31 would also need this fix; the regression was introduced somewhere
between 2.6.30 and 2.6.31.

Thanks again,

Thomas

2010-02-02 18:31:45

by Yinghai Lu

Subject: [PATCH -v3] x86: fix sci on ioapic 1



Thomas Renninger <[email protected]> reported on IBM x3330

booting a latest kernel on this machine results in:

PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter

Later all kind of devices fail...

and bisect it down to this commit:
commit b9c61b70075c87a8612624736faf4a2de5b1ed30

x86/pci: update pirq_enable_irq() to setup io apic routing

it turns out we need to set irq routing for the sci on ioapic1 early.

-v2: make it work without sparseirq too.
-v3: fix checkpatch.pl warning, and cc to stable

Reported-by: Thomas Renninger <[email protected]>
Bisected-by: Thomas Renninger <[email protected]>
Tested-by: Thomas Renninger <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
Cc: [email protected]

---
arch/x86/include/asm/io_apic.h | 1
arch/x86/kernel/acpi/boot.c | 9 ++++++-
arch/x86/kernel/apic/io_apic.c | 50 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 59 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/include/asm/io_apic.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/io_apic.h
+++ linux-2.6/arch/x86/include/asm/io_apic.h
@@ -160,6 +160,7 @@ extern int io_apic_get_redir_entries(int
struct io_apic_irq_attr;
extern int io_apic_set_pci_routing(struct device *dev, int irq,
struct io_apic_irq_attr *irq_attr);
+void setup_IO_APIC_irq_extra(u32 gsi);
extern int (*ioapic_renumber_irq)(int ioapic, int irq);
extern void ioapic_init_mappings(void);
extern void ioapic_insert_resources(void);
Index: linux-2.6/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
+++ linux-2.6/arch/x86/kernel/acpi/boot.c
@@ -446,6 +446,12 @@ void __init acpi_pic_sci_set_trigger(uns
 int acpi_gsi_to_irq(u32 gsi, unsigned int *irq)
 {
 	*irq = gsi;
+
+#ifdef CONFIG_X86_IO_APIC
+	if (acpi_irq_model == ACPI_IRQ_MODEL_IOAPIC)
+		setup_IO_APIC_irq_extra(gsi);
+#endif
+
 	return 0;
 }

@@ -473,7 +479,8 @@ int acpi_register_gsi(struct device *dev
 		plat_gsi = mp_register_gsi(dev, gsi, trigger, polarity);
 	}
 #endif
-	acpi_gsi_to_irq(plat_gsi, &irq);
+	irq = plat_gsi;
+
 	return irq;
 }

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1541,6 +1541,56 @@ static void __init setup_IO_APIC_irqs(vo
 }

 /*
+ * for the gsit that is not in first ioapic
+ * but could not use acpi_register_gsi()
+ * like some special sci in IBM x3330
+ */
+void setup_IO_APIC_irq_extra(u32 gsi)
+{
+	int apic_id = 0, pin, idx, irq;
+	int node = cpu_to_node(boot_cpu_id);
+	struct irq_desc *desc;
+	struct irq_cfg *cfg;
+
+	/*
+	 * Convert 'gsi' to 'ioapic.pin'.
+	 */
+	apic_id = mp_find_ioapic(gsi);
+	if (apic_id < 0)
+		return;
+
+	pin = mp_find_ioapic_pin(apic_id, gsi);
+	idx = find_irq_entry(apic_id, pin, mp_INT);
+	if (idx == -1)
+		return;
+
+	irq = pin_2_irq(idx, apic_id, pin);
+#ifdef CONFIG_SPARSE_IRQ
+	desc = irq_to_desc(irq);
+	if (desc)
+		return;
+#endif
+	desc = irq_to_desc_alloc_node(irq, node);
+	if (!desc) {
+		printk(KERN_INFO "can not get irq_desc for %d\n", irq);
+		return;
+	}
+
+	cfg = desc->chip_data;
+	add_pin_to_irq_node(cfg, node, apic_id, pin);
+
+	if (test_bit(pin, mp_ioapic_routing[apic_id].pin_programmed)) {
+		pr_debug("Pin %d-%d already programmed\n",
+			 mp_ioapics[apic_id].apicid, pin);
+		return;
+	}
+	set_bit(pin, mp_ioapic_routing[apic_id].pin_programmed);
+
+	setup_IO_APIC_irq(apic_id, pin, irq, desc,
+			  irq_trigger(idx), irq_polarity(idx));
+}
+
+/*
  * Set up the timer pin, possibly with the 8259A-master behind.
  */
 static void __init setup_timer_IRQ0_pin(unsigned int apic_id, unsigned int pin,
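
The pin_programmed check in setup_IO_APIC_irq_extra() is a test-then-set
guard, so a pin's routing entry is written at most once even if the function
is called repeatedly for the same GSI. A minimal user-space sketch of that
pattern follows. The names are hypothetical, a counter stands in for the
actual routing write, and unlike the kernel's set_bit() this version is not
atomic:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PINS 32u

static uint32_t pin_programmed;	/* one bit per pin, 1 = already set up */
static int program_count;	/* stands in for writes to the routing entry */

/*
 * Program 'pin' only if it has not been programmed before.
 * Returns true if this call did the programming, false otherwise.
 */
static bool setup_pin_once(unsigned int pin)
{
	uint32_t mask;

	if (pin >= MAX_PINS)
		return false;

	mask = UINT32_C(1) << pin;
	if (pin_programmed & mask)
		return false;	/* already programmed, nothing to do */

	pin_programmed |= mask;
	program_count++;	/* placeholder for the real routing setup */
	return true;
}
```

Calling setup_pin_once(14) twice performs the setup only on the first call;
the second call returns false and leaves program_count unchanged.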