2007-08-09 15:04:21

by John Stoffel

Subject: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


Hi,

I'm opening this ticket as a new subject, even though it looks like it
might be related to the thread "Networking dies after random time".
Sorry for the wide CC list, but since my network hasn't died since I
rebooted into 2.6.23-rc2 (after 30+ days at 2.6.22-rc7), I'm wondering
if the problem is more than networking related.

Honestly, I haven't gone back over the previous thread in detail, so I
might be missing info here.

System details: Dell Precision 610MT, Intel 440GX chipset, dual PIII
Xeon 550MHz, 2GB RAM (upgraded from 768MB last night), a mix of IDE,
SCSI and SATA disks in the system. My poor PCI bus! Just upgraded to
2.6.23-rc2. Interrupts look like this:

> cat /proc/interrupts
           CPU0       CPU1
  0:        280          1   IO-APIC-edge     timer
  1:        788          0   IO-APIC-edge     i8042
  6:          1          4   IO-APIC-edge     floppy
  8:          0          1   IO-APIC-edge     rtc
  9:          0          0   IO-APIC-fasteoi  acpi
 11:      82410       1239   IO-APIC-edge     Cyclom-Y
 12:        279        106   IO-APIC-edge     i8042
 14:     440901       4266   IO-APIC-edge     libata
 15:          0          0   IO-APIC-edge     libata
 16:    2394727      42983   IO-APIC-fasteoi  ohci_hcd:usb3, Ensoniq AudioPCI, mga@pci:0000:01:00.0
 17:    2237362       1110   IO-APIC-fasteoi  sata_sil, ehci_hcd:usb1, eth0
 18:     126520      31978   IO-APIC-fasteoi  aic7xxx, aic7xxx, ide2, ide3, ohci1394
 19:          0          0   IO-APIC-fasteoi  ohci_hcd:usb2, uhci_hcd:usb4
NMI:          0          0
LOC:   40672484   40672246
ERR:          0
MIS:          0

I've only seen the one Warning oops, and backups and other system
processes have been running for the past 12 hours without a problem.


[ 187.747442] Probing IDE interface ide2...
[ 188.011634] hde: WDC WD1200JB-00CRA1, ATA DISK drive
[ 188.623038] WARNING: at kernel/irq/resend.c:70 check_irq_resend()
[ 188.623105] [<c0149e38>] check_irq_resend+0xa8/0xc0
[ 188.623204] [<c01499d3>] enable_irq+0xc3/0xd0
[ 188.623295] [<f8867280>] probe_hwif+0x670/0x7c0 [ide_core]
[ 188.623448] [<f8869f04>] do_ide_setup_pci_device+0x154/0x480 [ide_core]
[ 188.623571] [<f8867d6c>] probe_hwif_init_with_fixup+0xc/0x90 [ide_core]
[ 188.623690] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
[ 188.623791] [<f886a39b>] ide_setup_pci_device+0x7b/0xc0 [ide_core]
[ 188.623909] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
[ 188.624004] [<f88811ed>] hpt366_init_one+0x8d/0xa0 [hpt366]
[ 188.624095] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
[ 188.624187] [<f8881e50>] init_chipset_hpt366+0x0/0x680 [hpt366]
[ 188.624281] [<f8882680>] init_hwif_hpt366+0x0/0x380 [hpt366]
[ 188.624372] [<f8881800>] init_dma_hpt366+0x0/0xe0 [hpt366]
[ 188.624466] [<c0265fc6>] pci_device_probe+0x56/0x80
[ 188.624565] [<c02d0f8e>] driver_probe_device+0x8e/0x190
[ 188.624669] [<c02d11fe>] __driver_attach+0x9e/0xa0
[ 188.624756] [<c02d038a>] bus_for_each_dev+0x3a/0x60
[ 188.624845] [<c02d0e06>] driver_attach+0x16/0x20
[ 188.624932] [<c02d1160>] __driver_attach+0x0/0xa0
[ 188.625017] [<c02d075a>] bus_add_driver+0x8a/0x1b0
[ 188.625107] [<c0266173>] __pci_register_driver+0x53/0xa0
[ 188.625197] [<c0144d5d>] sys_init_module+0x13d/0x1820
[ 188.625315] [<f8844000>] snd_timer_find+0x0/0x90 [snd_timer]
[ 188.625424] [<c0149530>] disable_irq+0x0/0x30
[ 188.625513] [<c0108b7d>] sys_mmap2+0xcd/0xd0
[ 188.625612] [<c0104266>] syscall_call+0x7/0xb
[ 188.625701] [<c0410000>] rpc_get_inode+0x0/0x80
[ 188.625798] =======================
[ 188.625871] hde: selected mode 0x45
[ 188.626817] ide2 at 0xecf8-0xecff,0xecf2 on irq 18
[ 188.627080] Probing IDE interface ide3...
[ 188.891165] hdg: WDC WD1200JB-00EVA0, ATA DISK drive
[ 189.502580] hdg: selected mode 0x45
[ 189.503698] ide3 at 0xece0-0xece7,0xecda on irq 18


Let


2007-08-09 15:54:25

by Jarek Poplawski

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Thu, Aug 09, 2007 at 11:03:03AM -0400, John Stoffel wrote:
>
> Hi,

Hi, read below, please...

>
> I'm opening this ticket as a new subject, even though it looks like it
> might be related to the thread "Networking dies after random time".
> Sorry for the wide CC list, but since my network hasn't died since I
> rebooted into 2.6.23-rc2 (after 30+ days at 2.6.22-rc7), I'm wondering
> if the problem is more than networking related.
>
> Honestly, I haven't gone back over the previous thread in detail, so I
> might be missing info here.
>
> System details: Dell Precision 610MT, Intel 440GX chipset, Dual PIII
> Xeon, 550Mhz, 2gb RAM (upgraded from 768Mb last night), a mix of IDE,
> SCSI and SATA disks in the system. My poor PCI bus! Just upgraded to
> 2.6.23-rc2. Interrupts looks like this:
>
> > cat /proc/interrupts
> CPU0 CPU1
> 0: 280 1 IO-APIC-edge timer
> 1: 788 0 IO-APIC-edge i8042
> 6: 1 4 IO-APIC-edge floppy
> 8: 0 1 IO-APIC-edge rtc
> 9: 0 0 IO-APIC-fasteoi acpi
> 11: 82410 1239 IO-APIC-edge Cyclom-Y
> 12: 279 106 IO-APIC-edge i8042
> 14: 440901 4266 IO-APIC-edge libata
> 15: 0 0 IO-APIC-edge libata
> 16: 2394727 42983 IO-APIC-fasteoi ohci_hcd:usb3, Ensoniq
> AudioPCI, mga@pci:0000:01:00.0
> 17: 2237362 1110 IO-APIC-fasteoi sata_sil,
> ehci_hcd:usb1, eth0
> 18: 126520 31978 IO-APIC-fasteoi aic7xxx, aic7xxx, ide2,
> ide3, ohci1394
> 19: 0 0 IO-APIC-fasteoi ohci_hcd:usb2,
> uhci_hcd:usb4
> NMI: 0 0
> LOC: 40672484 40672246
> ERR: 0
> MIS: 0
>
> I've only seen the one Warning oops, and backups and other system
> processes have been running for the past 12 hours without a problem.
>
>
> [ 187.747442] Probing IDE interface ide2...
> [ 188.011634] hde: WDC WD1200JB-00CRA1, ATA DISK drive
> [ 188.623038] WARNING: at kernel/irq/resend.c:70 check_irq_resend()
> [ 188.623105] [<c0149e38>] check_irq_resend+0xa8/0xc0
> [ 188.623204] [<c01499d3>] enable_irq+0xc3/0xd0
> [ 188.623295] [<f8867280>] probe_hwif+0x670/0x7c0 [ide_core]
> [ 188.623448] [<f8869f04>] do_ide_setup_pci_device+0x154/0x480
> [ide_core]
> [ 188.623571] [<f8867d6c>] probe_hwif_init_with_fixup+0xc/0x90
> [ide_core]
> [ 188.623690] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
> [ 188.623791] [<f886a39b>] ide_setup_pci_device+0x7b/0xc0 [ide_core]
> [ 188.623909] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
> [ 188.624004] [<f88811ed>] hpt366_init_one+0x8d/0xa0 [hpt366]
> [ 188.624095] [<f88817d0>] init_setup_hpt302+0x0/0x30 [hpt366]
> [ 188.624187] [<f8881e50>] init_chipset_hpt366+0x0/0x680 [hpt366]
> [ 188.624281] [<f8882680>] init_hwif_hpt366+0x0/0x380 [hpt366]
> [ 188.624372] [<f8881800>] init_dma_hpt366+0x0/0xe0 [hpt366]
> [ 188.624466] [<c0265fc6>] pci_device_probe+0x56/0x80
> [ 188.624565] [<c02d0f8e>] driver_probe_device+0x8e/0x190
> [ 188.624669] [<c02d11fe>] __driver_attach+0x9e/0xa0
> [ 188.624756] [<c02d038a>] bus_for_each_dev+0x3a/0x60
> [ 188.624845] [<c02d0e06>] driver_attach+0x16/0x20
> [ 188.624932] [<c02d1160>] __driver_attach+0x0/0xa0
> [ 188.625017] [<c02d075a>] bus_add_driver+0x8a/0x1b0
> [ 188.625107] [<c0266173>] __pci_register_driver+0x53/0xa0
> [ 188.625197] [<c0144d5d>] sys_init_module+0x13d/0x1820
> [ 188.625315] [<f8844000>] snd_timer_find+0x0/0x90 [snd_timer]
> [ 188.625424] [<c0149530>] disable_irq+0x0/0x30
> [ 188.625513] [<c0108b7d>] sys_mmap2+0xcd/0xd0
> [ 188.625612] [<c0104266>] syscall_call+0x7/0xb
> [ 188.625701] [<c0410000>] rpc_get_inode+0x0/0x80
> [ 188.625798] =======================
> [ 188.625871] hde: selected mode 0x45
> [ 188.626817] ide2 at 0xecf8-0xecff,0xecf2 on irq 18
> [ 188.627080] Probing IDE interface ide3...
> [ 188.891165] hdg: WDC WD1200JB-00EVA0, ATA DISK drive
> [ 189.502580] hdg: selected mode 0x45
> [ 189.503698] ide3 at 0xece0-0xece7,0xecda on irq 18
>
>
> Let

I may be missing something (I'm in a bit of a hurry now), but this
warning is purely diagnostic and does not by itself mean anything is wrong!
Unless something really is wrong... Then please try to be more explicit.

If you prefer to not see this, there is my patch proposal somewhere
in this older thread:
Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
Date: Wed, 8 Aug 2007 13:00:37 +0200

On the other hand, if it works OK, it would be better to let it be
tested more like this...

Regards,
Jarek P.

2007-08-10 08:06:13

by Thomas Gleixner

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Thu, 2007-08-09 at 17:54 +0200, Jarek Poplawski wrote:
> I may be missing something (I'm in a bit of a hurry now), but this
> warning is purely diagnostic and does not by itself mean anything is wrong!
> Unless something really is wrong... Then please try to be more explicit.
>
> If you prefer to not see this, there is my patch proposal somewhere
> in this older thread:
> Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> Date: Wed, 8 Aug 2007 13:00:37 +0200
>
> On the other hand, if it works OK, it would be better to let it be
> tested more like this...

Hmm. This solution is still just papering over the real problem. The
delayed disable just re-sends level interrupts unnecessarily. I have a
fix (needs some testing) for this, which I'll send out tomorrow, when I'm
really back from vacation.

But suppressing the resend is not fixing the driver problem. The problem
can show up with spurious interrupts and with interrupts on a shared PCI
interrupt line at any time. It just might take weeks instead of minutes.

Alan,

is there anything that can be done at the driver level?

tglx
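
A rough user-space sketch of the "delayed disable" behaviour described above may help readers following along. It is only a model under stated assumptions, not kernel code: the struct, its fields and the model_* function names are all hypothetical. It assumes that disable_irq() only bumps a nesting depth, that an interrupt arriving while disabled is masked and marked pending, and that a level-triggered line which is still asserted fires again by itself once it is unmasked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line; not the kernel's irq_desc. */
struct irq_model {
	int  depth;          /* disable nesting depth, 0 means enabled */
	bool masked;         /* line masked at the interrupt controller */
	bool pending;        /* an interrupt arrived while disabled */
	bool level;          /* true: level-triggered, false: edge-triggered */
	bool line_asserted;  /* device is still holding the line active */
	int  handled;        /* how often the handler ran */
	int  resends;        /* how often enable replayed an interrupt */
};

/* The handler services the device, which drops the line. */
static void run_handler(struct irq_model *m)
{
	m->handled++;
	m->line_asserted = false;
}

/* Delayed disable: only bump the depth, leave the hardware unmasked. */
static void model_disable_irq(struct irq_model *m)
{
	m->depth++;
}

/* The device raises an interrupt. */
static void model_irq_arrives(struct irq_model *m)
{
	m->line_asserted = true;
	if (m->masked)
		return;
	if (m->depth > 0) {
		/* Arrived while disabled: mask now and remember it. */
		m->masked = true;
		m->pending = true;
		return;
	}
	run_handler(m);
}

/* Enable the line; resend_level selects whether level irqs are replayed. */
static void model_enable_irq(struct irq_model *m, bool resend_level)
{
	assert(m->depth > 0);
	if (--m->depth > 0)
		return;
	m->masked = false;
	/* A level line that is still asserted fires again on unmask. */
	if (m->level && m->line_asserted)
		run_handler(m);
	if (m->pending) {
		m->pending = false;
		if (!m->level || resend_level) {
			m->resends++;
			run_handler(m);
		}
	}
}
```

In this model an edge interrupt that arrives while the line is disabled is handled exactly once, via the replay. A level interrupt is handled twice when the replay is enabled (once by the hardware, once by the resend), which is the unnecessary extra invocation mentioned above.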


2007-08-10 08:23:46

by Jarek Poplawski

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Fri, Aug 10, 2007 at 10:05:40AM +0200, Thomas Gleixner wrote:
> On Thu, 2007-08-09 at 17:54 +0200, Jarek Poplawski wrote:
> > I may be missing something (I'm in a bit of a hurry now), but this
> > warning is purely diagnostic and does not by itself mean anything is wrong!
> > Unless something really is wrong... Then please try to be more explicit.
> >
> > If you prefer to not see this, there is my patch proposal somewhere
> > in this older thread:
> > Subject: [patch] genirq: temporary fix for level-triggered IRQ resend
> > Date: Wed, 8 Aug 2007 13:00:37 +0200
> >
> > On the other hand, if it works OK, it would be better to let it be
> > tested more like this...
>
> Hmm. This solution is still just papering over the real problem. The
> delayed disable just re-sends level interrupts unnecessarily. I have a
> fix (needs some testing) for this, which I'll send out tomorrow, when I'm
> really back from vacation.
>
> But suppressing the resend is not fixing the driver problem. The problem
> can show up with spurious interrupts and with interrupts on a shared PCI
> interrupt line at any time. It just might take weeks instead of minutes.

Doesn't it look like a little change of mind? Well, there are probably
(but need more testing) two other solutions: _SW_RESEND and disabling
without delay for levels only...

Jarek P.

2007-08-10 08:31:57

by Ingo Molnar

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


* Jarek Poplawski <[email protected]> wrote:

> > Hmm. This solution is still just papering over the real problem.
> > The delayed disable just re-sends level interrupts unnecessarily. I
> > have a fix (needs some testing) for this, which I'll send out tomorrow,
> > when I'm really back from vacation.
> >
> > But suppressing the resend is not fixing the driver problem. The
> > problem can show up with spurious interrupts and with interrupts on
> > a shared PCI interrupt line at any time. It just might take weeks
> > instead of minutes.
>
> Doesn't it look like a little change of mind? [...]

what change of mind do you mean exactly?

> [...] Well, there are probably (but need more testing) two other
> solutions: _SW_RESEND and disabling without delay for levels only...

IIRC Marcin tested software-resend and it didnt fix the hang. That
strongly points in the direction of a driver bug (or a genirq bug) being
made more prominent by the genirq change - not any hardware detail such
as the APIC vector-retrigger sequence.

While we'd like to see the suspected driver bug (or any higher level
genirq bug) fixed, we'll undo the effect of the genirq change (because
it is causing a regression). We'll also add a separate, optional
irq-debugging feature that generates high-rate interrupts on any shared
irq line. (and thus artificially stresses the robustness of the driver
and the genirq layer against spurious interrupts.)

Ingo
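
For readers unfamiliar with what "robust against spurious interrupts" means for a driver on a shared line, here is a minimal user-space sketch. Everything in it is hypothetical (the fake_dev struct, its status register and the FAKE_STATUS_* bits, the MODEL_IRQ_* return values); it only illustrates the usual convention that a handler checks its own device first and reports "not mine" when there is nothing to do, so that a flood of extra invocations changes nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the handler return values; names are hypothetical. */
enum irqreturn_model { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

#define FAKE_STATUS_RX 0x1u  /* hypothetical "receive done" bit */
#define FAKE_STATUS_TX 0x2u  /* hypothetical "transmit done" bit */

/* Hypothetical device with one interrupt status register. */
struct fake_dev {
	uint32_t status;    /* non-zero while the device wants service */
	int      serviced;  /* how many real events were processed */
};

/* Handler: claim the interrupt only if our device raised it. */
static enum irqreturn_model fake_dev_interrupt(struct fake_dev *dev)
{
	uint32_t status = dev->status;

	if (status == 0)
		return MODEL_IRQ_NONE;  /* spurious, or another device's irq */

	dev->status &= ~status;         /* acknowledge what we saw */
	dev->serviced++;
	return MODEL_IRQ_HANDLED;
}

/* Fire the shared line 'count' times; every handler runs each time. */
static int storm_shared_line(struct fake_dev *devs, int ndevs, int count)
{
	int handled = 0;

	assert(ndevs >= 0 && count >= 0);
	for (int i = 0; i < count; i++)
		for (int j = 0; j < ndevs; j++)
			if (fake_dev_interrupt(&devs[j]) == MODEL_IRQ_HANDLED)
				handled++;
	return handled;
}
```

With two devices on the line and one real event outstanding, a storm of 1000 interrupts services that event once and leaves both devices otherwise untouched. A driver that misbehaves under this kind of stress is the sort of bug the proposed debugging feature is meant to expose.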

2007-08-10 08:49:17

by Jarek Poplawski

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Fri, Aug 10, 2007 at 10:30:50AM +0200, Ingo Molnar wrote:
>
> * Jarek Poplawski <[email protected]> wrote:
>
> > > Hmm. This solution is still just papering over the real problem.
> > > The delayed disable just re-sends level interrupts unnecessarily. I
> > > have a fix (needs some testing) for this, which I'll send out tomorrow,
> > > when I'm really back from vacation.
> > >
> > > But suppressing the resend is not fixing the driver problem. The
> > > problem can show up with spurious interrupts and with interrupts on
> > > a shared PCI interrupt line at any time. It just might take weeks
> > > instead of minutes.
> >
> > Doesn't it look like a little change of mind? [...]
>
> what change of mind do you mean exactly?
>
> > [...] Well, there are probably (but need more testing) two other
> > solutions: _SW_RESEND and disabling without delay for levels only...
>
> IIRC Marcin tested software-resend and it didnt fix the hang. That
> strongly points in the direction of a driver bug (or a genirq bug) being
> made more prominent by the genirq change - not any hardware detail such
> as the APIC vector-retrigger sequence.
>
> While we'd like to see the suspected driver bug (or any higher level
> genirq bug) fixed, we'll undo the effect of the genirq change (because
> it is causing a regression). We'll also add a separate, optional
> irq-debugging feature that generates high-rate interrupts on any shared
> irq line. (and thus artificially stresses the robustness of the driver
> and the genirq layer against spurious interrupts.)

Not exactly so... I've sent a modified version of your software-resend
patch, and it seems to work OK.

Jarek P.

>From [email protected] Wed Aug 8 13:20:02 2007
From: "=?ISO-8859-2?Q?Marcin_=A6lusarz?=" <[email protected]>
...
Subject: Re: 2.6.20->2.6.21 - networking dies after random time
...
2007/8/7, Jarek Poplawski <[email protected]>:
> So, let's try one more idea: a modified version of Ingo's "x86: activate
> HARDIRQS_SW_RESEND" patch.
> (Don't forget about make oldconfig before make.)
> For testing only.
>
> Cheers,
> Jarek P.
>
> PS: alas there was not even time for "compile checking"...
>
> ---
>
> diff -Nurp 2.6.22.1-/arch/i386/Kconfig 2.6.22.1/arch/i386/Kconfig
> --- 2.6.22.1-/arch/i386/Kconfig 2007-07-09 01:32:17.000000000 +0200
> +++ 2.6.22.1/arch/i386/Kconfig 2007-08-07 13:13:03.000000000 +0200
> @@ -1252,6 +1252,10 @@ config GENERIC_PENDING_IRQ
>  	depends on GENERIC_HARDIRQS && SMP
>  	default y
>
> +config HARDIRQS_SW_RESEND
> +	bool
> +	default y
> +
>  config X86_SMP
>  	bool
>  	depends on SMP && !X86_VOYAGER
> diff -Nurp 2.6.22.1-/arch/x86_64/Kconfig 2.6.22.1/arch/x86_64/Kconfig
> --- 2.6.22.1-/arch/x86_64/Kconfig 2007-07-09 01:32:17.000000000 +0200
> +++ 2.6.22.1/arch/x86_64/Kconfig 2007-08-07 13:13:03.000000000 +0200
> @@ -690,6 +690,10 @@ config GENERIC_PENDING_IRQ
>  	depends on GENERIC_HARDIRQS && SMP
>  	default y
>
> +config HARDIRQS_SW_RESEND
> +	bool
> +	default y
> +
>  menu "Power management options"
>
>  source kernel/power/Kconfig
> diff -Nurp 2.6.22.1-/kernel/irq/manage.c 2.6.22.1/kernel/irq/manage.c
> --- 2.6.22.1-/kernel/irq/manage.c 2007-07-09 01:32:17.000000000 +0200
> +++ 2.6.22.1/kernel/irq/manage.c 2007-08-07 13:13:03.000000000 +0200
> @@ -169,6 +169,14 @@ void enable_irq(unsigned int irq)
>  		desc->depth--;
>  	}
>  	spin_unlock_irqrestore(&desc->lock, flags);
> +#ifdef CONFIG_HARDIRQS_SW_RESEND
> +	/*
> +	 * Do a bh disable/enable pair to trigger any pending
> +	 * irq resend logic:
> +	 */
> +	local_bh_disable();
> +	local_bh_enable();
> +#endif
>  }
>  EXPORT_SYMBOL(enable_irq);
>
> diff -Nurp 2.6.22.1-/kernel/irq/resend.c 2.6.22.1/kernel/irq/resend.c
> --- 2.6.22.1-/kernel/irq/resend.c 2007-07-09 01:32:17.000000000 +0200
> +++ 2.6.22.1/kernel/irq/resend.c 2007-08-07 13:57:54.000000000 +0200
> @@ -62,16 +62,24 @@ void check_irq_resend(struct irq_desc *d
>  	 */
>  	desc->chip->enable(irq);
>
> +	/*
> +	 * Temporary hack to figure out more about the problem, which
> +	 * is causing the ancient network cards to die.
> +	 */
> +
>  	if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
>  		desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;
>
> -		if (!desc->chip || !desc->chip->retrigger ||
> -					!desc->chip->retrigger(irq)) {
> +		if (desc->handle_irq == handle_edge_irq) {
> +			if (desc->chip->retrigger)
> +				desc->chip->retrigger(irq);
> +			return;
> +		}
>  #ifdef CONFIG_HARDIRQS_SW_RESEND
> -			/* Set it pending and activate the softirq: */
> -			set_bit(irq, irqs_resend);
> -			tasklet_schedule(&resend_tasklet);
> +		WARN_ON_ONCE(1);
> +		/* Set it pending and activate the softirq: */
> +		set_bit(irq, irqs_resend);
> +		tasklet_schedule(&resend_tasklet);
>  #endif
> -		}
>  	}
>  }
>
Works fine with:
WARNING: at kernel/irq/resend.c:79 check_irq_resend()

Call Trace:
[<ffffffff8025e660>] check_irq_resend+0xc0/0xd0
[<ffffffff8025e1cd>] enable_irq+0xed/0xf0
[<ffffffff8807f21d>] :8390:ei_start_xmit+0x14d/0x30c
[<ffffffff8024d055>] lock_release_non_nested+0xe5/0x190
[<ffffffff80539b78>] __qdisc_run+0x98/0x1f0
[<ffffffff80539b8e>] __qdisc_run+0xae/0x1f0
[<ffffffff8052b65e>] dev_hard_start_xmit+0x26e/0x2d0
[<ffffffff80539ba0>] __qdisc_run+0xc0/0x1f0
[<ffffffff8052dc2f>] dev_queue_xmit+0x24f/0x310
[<ffffffff805337a7>] neigh_resolve_output+0xe7/0x290
[<ffffffff8054f5c0>] dst_output+0x0/0x10
[<ffffffff80552aff>] ip_output+0x19f/0x340
[<ffffffff80551f77>] ip_queue_xmit+0x217/0x430
[<ffffffff80563b2a>] tcp_transmit_skb+0x40a/0x7c0
[<ffffffff805657bb>] __tcp_push_pending_frames+0x11b/0x940
[<ffffffff8055972a>] tcp_sendmsg+0x87a/0xc80
[<ffffffff80577735>] inet_sendmsg+0x45/0x80
[<ffffffff8051e2d4>] sock_aio_write+0x104/0x120
[<ffffffff80285fc1>] do_sync_write+0xf1/0x130
[<ffffffff80243290>] autoremove_wake_function+0x0/0x40
[<ffffffff802868e9>] vfs_write+0x159/0x170
[<ffffffff80286ef0>] sys_write+0x50/0x90
[<ffffffff802097fe>] system_call+0x7e/0x83


2007-08-10 08:56:50

by Ingo Molnar

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


* Jarek Poplawski <[email protected]> wrote:

> > > [...] Well, there are probably (but need more testing) two other
> > > solutions: _SW_RESEND and disabling without delay for levels
> > > only...
> >
> > IIRC Marcin tested software-resend and it didnt fix the hang. That
> > strongly points in the direction of a driver bug (or a genirq bug)
> > being made more prominent by the genirq change - not any hardware
> > detail such as the APIC vector-retrigger sequence.
> >
> > While we'd like to see the suspected driver bug (or any higher level
> > genirq bug) fixed, we'll undo the effect of the genirq change
> > (because it is causing a regression). We'll also add a separate,
> > optional irq-debugging feature that generates high-rate interrupts
> > on any shared irq line. (and thus artificially stresses the
> > robustness of the driver and the genirq layer against spurious
> > interrupts.)
>
> Not exactly so... I've sent a modified version of your software-resend
> patch, and it seems to work OK.

ah, i completely missed that! Thanks :-)

this changes the picture completely and makes the IO-APIC/local-APIC hw
retrigger code/logic the main suspect. I think you're right that it's quite
bogus to hw-retrigger level irqs, and that could be confusing the
IO-APIC (or the local APIC, or both).

and i think i see why my first sw-resend patch didnt do the trick:

> > - if (!desc->chip || !desc->chip->retrigger ||
> > - !desc->chip->retrigger(irq)) {
> > + if (desc->handle_irq == handle_edge_irq) {
> > + if (desc->chip->retrigger)
> > + desc->chip->retrigger(irq);
> > + return;
> > + }
> > #ifdef CONFIG_HARDIRQS_SW_RESEND

we used the hw-resend method unconditionally, right?

Ingo
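
The difference between the two versions of check_irq_resend() discussed above can be sketched as a pair of pure decision functions. This is a user-space model only: chip_model, resend_action and both function names are hypothetical, and the logic is read off the quoted hunk rather than taken from any final kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which replay path ends up being used; names are hypothetical. */
enum resend_action { RESEND_NONE, RESEND_HW, RESEND_SW };

/* Hypothetical stand-in for the relevant parts of the irq chip. */
struct chip_model {
	bool has_retrigger;  /* chip provides a retrigger hook */
	bool retrigger_ok;   /* the hook reports success */
};

/*
 * Logic before the test patch: always try the hardware retrigger first,
 * for edge and level alike; software resend is only a fallback.
 */
static enum resend_action resend_original(const struct chip_model *chip,
					  bool is_edge, bool sw_resend)
{
	(void)is_edge;  /* the trigger type is not looked at here */
	assert(chip != NULL);
	if (chip->has_retrigger && chip->retrigger_ok)
		return RESEND_HW;
	return sw_resend ? RESEND_SW : RESEND_NONE;
}

/*
 * Logic in the test patch: hardware retrigger for edge irqs only;
 * everything else goes through the software resend, if configured.
 */
static enum resend_action resend_patched(const struct chip_model *chip,
					 bool is_edge, bool sw_resend)
{
	assert(chip != NULL);
	if (is_edge)
		return chip->has_retrigger ? RESEND_HW : RESEND_NONE;
	return sw_resend ? RESEND_SW : RESEND_NONE;
}
```

For a chip whose retrigger hook works, the original logic never reaches the software path, so enabling CONFIG_HARDIRQS_SW_RESEND alone changes nothing for level irqs. Only the patched variant routes them to the software resend.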

2007-08-10 09:12:18

by Jarek Poplawski

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Fri, Aug 10, 2007 at 10:56:11AM +0200, Ingo Molnar wrote:
...
> this changes the picture completely and makes the IO-APIC/local-APIC hw
> retrigger code/logic the main suspect. I think you're right that it's quite
> bogus to hw-retrigger level irqs, and that could be confusing the
> IO-APIC (or the local APIC, or both).
>
> and i think i see why my first sw-resend patch didnt do the trick:
>
> > > - if (!desc->chip || !desc->chip->retrigger ||
> > > - !desc->chip->retrigger(irq)) {
> > > + if (desc->handle_irq == handle_edge_irq) {
> > > + if (desc->chip->retrigger)
> > > + desc->chip->retrigger(irq);
> > > + return;
> > > + }
> > > #ifdef CONFIG_HARDIRQS_SW_RESEND
>
> we used the hw-resend method unconditionally, right?

Right: unconditionally on a condition they are not edges...

But, since not resending at all seems to work so good in testing,
I thought, _SW_RESEND could be considered as an unnecessarily
complicated alternative.

Now, I'm a bit confused...

Jarek P.

2007-08-10 09:34:28

by Ingo Molnar

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


* Jarek Poplawski <[email protected]> wrote:

> > > > + }
> > > > #ifdef CONFIG_HARDIRQS_SW_RESEND
> >
> > we used the hw-resend method unconditionally, right?
>
> Right: unconditionally on a condition they are not edges...
>
> But, since not resending at all seems to work so good in testing, I
> thought, _SW_RESEND could be considered as an unnecessarily
> complicated alternative.
>
> Now, I'm a bit confused...

the idea is multi-pronged:

- Primarily, we want to fix the regression. 2.6.20 worked, 2.6.21
didnt, that has to be fixed, no matter what - end of story. But we've
got a wide selection of patches for that purpose now, so what matters
at this point is the secondary question:

- we want to know _why exactly_ the hang happens. We now have a pretty
good theory: hw-resend hangs the IO-APIC. (there is a delicate dance
between local APICs and IO-APICs for level-triggered irqs, and if we
interject via hw-resending via the local APIC, existing races, hw
bugs or weaknesses in our hw-resend implementation might be exposed)

and even though we now have a wide selection of patches we really want
to get to the bottom of the problem so that we can fix the bug that got
exposed: apparently hw resend doesnt always work with level-triggered
irqs.

Note that the hw-resend sequence can trigger _even without our original
patch that triggered the regression_, it's just much less likely to
happen, so this is a pre-existing IO-APIC/APIC code bug that could
trigger anytime, and which we want to see fixed.

To confirm this theory - does the debug-patch below fix the hang? If it
fixes the hang then the theory is confirmed and then the right solution
is to retrigger an IRQ for level-triggered irqs with the proper
trigger-type set.

Ingo

------------------>
Not-Signed-off-by: Ingo Molnar <[email protected]>

Index: linux/arch/i386/kernel/io_apic.c
===================================================================
--- linux.orig/arch/i386/kernel/io_apic.c
+++ linux/arch/i386/kernel/io_apic.c
@@ -735,7 +735,8 @@ void fastcall send_IPI_self(int vector)
 	 * Wait for idle.
 	 */
 	apic_wait_icr_idle();
-	cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL;
+	cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL |
+					APIC_INT_LEVELTRIG;
 	/*
 	 * Send the IPI. The write to APIC_ICR fires this off.
 	 */
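
To make the one-bit difference in the debug patch concrete, here is a small user-space sketch that composes the value written to the ICR with and without the level-trigger flag. The constant values are given as I recall them from the kernel's apicdef.h and should be treated as assumptions; self_ipi_icr() and icr_is_level() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* ICR field values; assumed to match the kernel's apicdef.h. */
#define APIC_DM_FIXED       0x00000u  /* delivery mode: fixed */
#define APIC_DEST_LOGICAL   0x00800u  /* destination mode: logical */
#define APIC_INT_LEVELTRIG  0x08000u  /* trigger mode: level */
#define APIC_DEST_SELF      0x40000u  /* destination shorthand: self */

/* Build the ICR word for a self-IPI, optionally marked level-triggered. */
static uint32_t self_ipi_icr(uint8_t vector, int level_triggered)
{
	uint32_t cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector |
		       APIC_DEST_LOGICAL;

	if (level_triggered)
		cfg |= APIC_INT_LEVELTRIG;
	/* The vector occupies the low 8 bits and must survive the OR. */
	assert((cfg & 0xffu) == vector);
	return cfg;
}

/* Report whether an ICR word asks for level-triggered delivery. */
static int icr_is_level(uint32_t cfg)
{
	return (cfg & APIC_INT_LEVELTRIG) != 0;
}
```

For vector 0x31 this yields 0x40831 before the patch and 0x48831 with it, so under these assumed values the two variants differ only in the trigger-mode bit.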

2007-08-10 10:04:52

by Jarek Poplawski

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Fri, Aug 10, 2007 at 11:33:53AM +0200, Ingo Molnar wrote:
>
> * Jarek Poplawski <[email protected]> wrote:
>
> > > > > + }
> > > > > #ifdef CONFIG_HARDIRQS_SW_RESEND
> > >
> > > we used the hw-resend method unconditionally, right?
> >
> > Right: unconditionally on a condition they are not edges...
> >
> > But, since not resending at all seems to work so good in testing, I
> > thought, _SW_RESEND could be considered as an unnecessarily
> > complicated alternative.
> >
> > Now, I'm a bit confused...
>
> the idea is multi-pronged:
>
> - Primarily, we want to fix the regression. 2.6.20 worked, 2.6.21
> didnt, that has to be fixed, no matter what - end of story. But we've
> got a wide selection of patches for that purpose now, so what matters
> at this point is the secondary question:
>
> - we want to know _why exactly_ the hang happens. We now have a pretty
> good theory: hw-resend hangs the IO-APIC. (there is a delicate dance
> between local APICs and IO-APICs for level-triggered irqs, and if we
> interject via hw-resending via the local APIC, existing races, hw
> bugs or weaknesses in our hw-resend implementation might be exposed)
>
> and even though we now have a wide selection of patches we really want
> to get to the bottom of the problem so that we can fix the bug that got
> exposed: apparently hw resend doesnt always work with level-triggered
> irqs.
>
> Note that the hw-resend sequence can trigger _even without our original
> patch that triggered the regression_, it's just much less likely to
> happen, so this is a pre-existing IO-APIC/APIC code bug that could
> trigger anytime, and which we want to see fixed.
>
> To confirm this theory - does the debug-patch below fix the hang? If it
> fixes the hang then the theory is confirmed and then the right solution
> is to retrigger an IRQ for level-triggered irqs with the proper
> trigger-type set.
>
> Ingo

Ingo: I think you have to do this in x86_64 as well, where send_IPI_mask
is probably used for this (but I may be missing something...).

I think Marcin will not be able to do this and report back before Monday,
but,
Jean-Baptiste: of course current Ingo's or Thomas' patches are
more urgent, so if you could break the current test and try this
(maybe after Ingo acks this yet?) with eg. clean 2.6.23-rc1 or 2.6.22?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Jarek P.

>
> ------------------>
> Not-Signed-off-by: Ingo Molnar <[email protected]>
>
> Index: linux/arch/i386/kernel/io_apic.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/io_apic.c
> +++ linux/arch/i386/kernel/io_apic.c
> @@ -735,7 +735,8 @@ void fastcall send_IPI_self(int vector)
> * Wait for idle.
> */
> apic_wait_icr_idle();
> - cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL;
> + cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL |
> + APIC_INT_LEVELTRIG;
> /*
> * Send the IPI. The write to APIC_ICR fires this off.
> */
>

2007-08-10 10:13:47

by Stephen Hemminger

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

On Fri, 10 Aug 2007 11:33:53 +0200
Ingo Molnar <[email protected]> wrote:

>
> * Jarek Poplawski <[email protected]> wrote:
>
> > > > > + }
> > > > > #ifdef CONFIG_HARDIRQS_SW_RESEND
> > >
> > > we used the hw-resend method unconditionally, right?
> >
> > Right: unconditionally on a condition they are not edges...
> >
> > But, since not resending at all seems to work so good in testing, I
> > thought, _SW_RESEND could be considered as an unnecessarily
> > complicated alternative.
> >
> > Now, I'm a bit confused...
>
> the idea is multi-pronged:
>
> - Primarily, we want to fix the regression. 2.6.20 worked, 2.6.21
> didnt, that has to be fixed, no matter what - end of story. But we've
> got a wide selection of patches for that purpose now, so what matters
> at this point is the secondary question:
>
> - we want to know _why exactly_ the hang happens. We now have a pretty
> good theory: hw-resend hangs the IO-APIC. (there is a delicate dance
> between local APICs and IO-APICs for level-triggered irqs, and if we
> interject via hw-resending via the local APIC, existing races, hw
> bugs or weaknesses in our hw-resend implementation might be exposed)
>
> and even though we now have a wide selection of patches we really want
> to get to the bottom of the problem so that we can fix the bug that got
> exposed: apparently hw resend doesnt always work with level-triggered
> irqs.
>
> Note that the hw-resend sequence can trigger _even without our original
> patch that triggered the regression_, it's just much less likely to
> happen, so this is a pre-existing IO-APIC/APIC code bug that could
> trigger anytime, and which we want to see fixed.
>
> To confirm this theory - does the debug-patch below fix the hang? If it
> fixes the hang then the theory is confirmed and then the right solution
> is to retrigger an IRQ for level-triggered irqs with the proper
> trigger-type set.
>


All this might explain some of the IRQ loss I saw with sky2 on a Mac mini.
Basically, the device would act like it missed an IRQ. The chip and PCI registers
all said "device has asserted IRQ" but the IRQ handler never got called.

Then again, the problem might be completely different since this was with
PCI-E in either MSI or INTA mode.

The workaround was to periodically call the soft IRQ handler, and that would
clear the IRQ, but it's not something I want to keep.

2007-08-10 10:17:26

by Ingo Molnar

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


* Jarek Poplawski <[email protected]> wrote:

> Ingo: I think you have to do this in x86_64 as well, where send_IPI_mask
> is probably used for this (but I may be missing something...).

indeed - full patch below.

Ingo

---
arch/i386/kernel/io_apic.c | 3 ++-
arch/x86_64/kernel/genapic.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux/arch/i386/kernel/io_apic.c
===================================================================
--- linux.orig/arch/i386/kernel/io_apic.c
+++ linux/arch/i386/kernel/io_apic.c
@@ -735,7 +735,8 @@ void fastcall send_IPI_self(int vector)
 	 * Wait for idle.
 	 */
 	apic_wait_icr_idle();
-	cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL;
+	cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL |
+					APIC_INT_LEVELTRIG;
 	/*
 	 * Send the IPI. The write to APIC_ICR fires this off.
 	 */
Index: linux/arch/x86_64/kernel/genapic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/genapic.c
+++ linux/arch/x86_64/kernel/genapic.c
@@ -62,5 +62,6 @@ void __init setup_apic_routing(void)

 void send_IPI_self(int vector)
 {
-	__send_IPI_shortcut(APIC_DEST_SELF, vector, APIC_DEST_PHYSICAL);
+	__send_IPI_shortcut(APIC_DEST_SELF, vector, APIC_DEST_PHYSICAL |
+						APIC_INT_LEVELTRIG);
 }

2007-08-10 11:36:26

by Jean-Baptiste Vignaud

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()


> Ingo: I think you have to do this in x86_64 as well, where send_IPI_mask
> is probably used for this (but I may be missing something...).
>
> I think Marcin will not be able to do this and report back before Monday,
> but,
> Jean-Baptiste: of course current Ingo's or Thomas' patches are
> more urgent, so if you could break the current test and try this
> (maybe after Ingo acks this yet?) with eg. clean 2.6.23-rc1 or 2.6.22?
>

I'm compiling 2.6.23-rc1 with http://lkml.org/lkml/diff/2007/8/10/101/1
When it's finished, I'll stop the current test (at the moment: about 100 GB of network traffic and still OK) to try it.

Jb

2007-08-13 12:39:30

by Marcin Ślusarz

Subject: Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007/8/10, Ingo Molnar <[email protected]>:
> Index: linux/arch/i386/kernel/io_apic.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/io_apic.c
> +++ linux/arch/i386/kernel/io_apic.c
> @@ -735,7 +735,8 @@ void fastcall send_IPI_self(int vector)
> * Wait for idle.
> */
> apic_wait_icr_idle();
> - cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL;
> + cfg = APIC_DM_FIXED | APIC_DEST_SELF | vector | APIC_DEST_LOGICAL |
> + APIC_INT_LEVELTRIG;
> /*
> * Send the IPI. The write to APIC_ICR fires this off.
> */
> Index: linux/arch/x86_64/kernel/genapic.c
> ===================================================================
> --- linux.orig/arch/x86_64/kernel/genapic.c
> +++ linux/arch/x86_64/kernel/genapic.c
> @@ -62,5 +62,6 @@ void __init setup_apic_routing(void)
>
> void send_IPI_self(int vector)
> {
> - __send_IPI_shortcut(APIC_DEST_SELF, vector, APIC_DEST_PHYSICAL);
> + __send_IPI_shortcut(APIC_DEST_SELF, vector, APIC_DEST_PHYSICAL |
> + APIC_INT_LEVELTRIG);
> }
>
network card timed out as usual ;)

Marcin