2004-09-09 22:39:25

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH] Fix for spurious interrupts on e100 resume

I've been having problems with spurious interrupts being raised when the
e100 driver resets the chip during a resume:
irq 11: nobody cared!
[<c0105937>] dump_stack+0x17/0x20
[<c0107457>] __report_bad_irq+0x27/0x90
[<c0107a1b>] note_interrupt+0x6b/0x1f0
[<c0108132>] do_IRQ+0x272/0x360
[<c010533c>] common_interrupt+0x18/0x20
[<c012732b>] do_softirq+0x2b/0x30
[<c01080d7>] do_IRQ+0x217/0x360
[<c010533c>] common_interrupt+0x18/0x20
[<c01f7d4c>] __delay+0xc/0x10
[<f8b4806b>] e100_hw_reset+0x6b/0x90 [e100]
[<f8b48d70>] e100_hw_init+0x10/0x1490 [e100]
[<f8b4cd9f>] e100_up+0x3f/0x2e0 [e100]
[<f8b4f11f>] e100_resume+0x8f/0xb0 [e100]
[<c01fca62>] pci_device_resume+0x22/0x30
[<c023736a>] resume_device+0x1a/0x20
[<c02373dd>] dpm_resume+0x6d/0x70
[<c02373fc>] device_resume+0x1c/0x30
[<c01150d0>] suspend+0x540/0x7c0
[<c0115e33>] do_ioctl+0x123/0x180
[<c0185397>] sys_ioctl+0xb7/0x210
[<c010497d>] sysenter_past_esp+0x52/0x71
handlers:
[<c02782c0>] (yenta_interrupt+0x0/0x30)
[<f89c3310>] (radeon_driver_irq_handler+0x0/0xb0 [radeon])
[<f88c9320>] (usb_hcd_irq+0x0/0x60 [usbcore])
Disabling IRQ #11


The interrupt seems to be generated during the [um]sleep after one of
the writel's in e100_hw_reset. To fix this, I simply moved the call to
e100_disable_irq() to be first in the function, which seems to work
well.

J

Patch:

On resume, the e100 chip seems to raise an interrupt during chip reset.
Since there's no IRQ handler registered yet, the kernel complains that
"nobody cared" about the interrupt. This change moves the call to
e100_disable_irq to be first in e100_hw_reset() so there are no spurious
interrupts.


drivers/net/e100.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff -puN drivers/net/e100.c~e100-restore-irq drivers/net/e100.c
--- local-2.6/drivers/net/e100.c~e100-restore-irq 2004-09-09 12:52:33.000000000 -0700
+++ local-2.6-jeremy/drivers/net/e100.c 2004-09-09 12:53:18.000000000 -0700
@@ -587,6 +587,10 @@ static inline void e100_disable_irq(stru

static void e100_hw_reset(struct nic *nic)
{
+ /* Mask off our interrupt line - it's unmasked after reset
+ Do it early to prevent spurious interrupts. */
+ e100_disable_irq(nic);
+
/* Put CU and RU into idle with a selective reset to get
* device off of PCI bus */
writel(selective_reset, &nic->csr->port);
@@ -605,9 +609,6 @@ static void e100_hw_reset(struct nic *ni
writeb(cuc_load_base, &nic->csr->scb.cmd_lo);
mdelay(20);
}
-
- /* Mask off our interrupt line - it's unmasked after reset */
- e100_disable_irq(nic);
}

static int e100_self_test(struct nic *nic)

_
Signed-off-by: Jeremy Fitzhardinge <[email protected]>


2004-09-09 23:57:43

by Nathan Bryant

[permalink] [raw]
Subject: Re: [PATCH] Fix for spurious interrupts on e100 resume

Jeremy Fitzhardinge wrote:

> static void e100_hw_reset(struct nic *nic)
> {
>+ /* Mask off our interrupt line - it's unmasked after reset
>+ Do it early to prevent spurious interrupts. */
>+ e100_disable_irq(nic);
>+
> /* Put CU and RU into idle with a selective reset to get
> * device off of PCI bus */
> writel(selective_reset, &nic->csr->port);
>@@ -605,9 +609,6 @@ static void e100_hw_reset(struct nic *ni
> writeb(cuc_load_base, &nic->csr->scb.cmd_lo);
> mdelay(20);
> }
>-
>- /* Mask off our interrupt line - it's unmasked after reset */
>- e100_disable_irq(nic);
> }
>
> static int e100_self_test(struct nic *nic)
>
You sure this is right? The comment seems to imply that writing the
reset command to the registers also clears the interrupt mask. So you
might need to have the e100_disable_irq() both before and after the reset.

2004-09-10 00:57:04

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix for spurious interrupts on e100 resume

On Thu, 2004-09-09 at 19:57 -0400, Nathan Bryant wrote:
> You sure this is right? The comment seems to imply that writing the
> reset command to the registers also clears the interrupt mask. So you
> might need to have the e100_disable_irq() both before and after the reset.

Good point - I'm not sure. I don't really know what would cause the
chip to raise an interrupt. Could it do it spontaneously, in which case
there's always a window between poking the selective/software_reset and
the disable_irq? Or is it that the chip raises an interrupt on reset,
which is masked by disabling the interrupt, but then it enables the
interrupt after the reset is complete? Or perhaps "it's unmasked after
reset" means "after a hardware reset (power cycle)" and not by poking
the reset registers.

Well, there was definitely a real bug there before, and it works with a
bit of testing. We'll see how it goes over the next few days (and maybe
someone who actually understands this hardware will weigh in).

J

2004-09-13 22:59:27

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix for spurious interrupts on e100 resume

On Thu, 2004-09-09 at 15:36 -0700, Jeremy Fitzhardinge wrote:
> I've been having problems with spurious interrupts being raised when the
> e100 driver resets the chip during a resume:

OK, that patch didn't really work terribly well - the interrupt still
happens. I've changed it to simply disable the interrupt during e100_up
(), which does seem to work properly.

J



On resume, the e100 chip seems to raise an interrupt during chip reset.
Since there's no IRQ handler registered yet, the kernel complains that
"nobody cared" about the interrupt. This change disables the IRQ during
e100_up(), while the hardware is being (re-)initialized.


drivers/net/e100.c | 6 ++++++
1 files changed, 6 insertions(+)

diff -puN drivers/net/e100.c~e100-restore-irq drivers/net/e100.c
--- local-2.6/drivers/net/e100.c~e100-restore-irq 2004-09-13 13:38:27.000000000 -0700
+++ local-2.6-jeremy/drivers/net/e100.c 2004-09-13 13:38:28.075416972 -0700
@@ -1678,6 +1678,9 @@ static int e100_up(struct nic *nic)

if((err = e100_rx_alloc_list(nic)))
return err;
+
+ disable_irq(nic->pdev->irq);
+
if((err = e100_alloc_cbs(nic)))
goto err_rx_clean_list;
if((err = e100_hw_init(nic)))
@@ -1689,6 +1692,7 @@ static int e100_up(struct nic *nic)
nic->netdev->name, nic->netdev)))
goto err_no_irq;
e100_enable_irq(nic);
+ enable_irq(nic->pdev->irq);
netif_wake_queue(nic->netdev);
return 0;

@@ -1698,6 +1702,8 @@ err_clean_cbs:
e100_clean_cbs(nic);
err_rx_clean_list:
e100_rx_clean_list(nic);
+
+ enable_irq(nic->pdev->irq);
return err;
}


_
Signed-off-by: Jeremy Fitzhardinge <[email protected]>

2004-09-30 16:26:55

by Venkatesan, Ganesh

[permalink] [raw]
Subject: RE: [PATCH] Fix for spurious interrupts on e100 resume

Andrew:

I propose that we remove this patch from the -mm tree. We will work on a
clean solution and send a patch soon. Please see further discussion on
this under the subject "2.6.9-rc2-mm4 e100 enable_irq unbalanced from"

Ganesh.

-----Original Message-----
From: Jeremy Fitzhardinge [mailto:[email protected]]
Sent: Monday, September 13, 2004 3:53 PM
To: Ronciak, John
Cc: Venkatesan, Ganesh; Feldman, Scott; Andrew Morton; linux-kernel
Subject: Re: [PATCH] Fix for spurious interrupts on e100 resume

On Thu, 2004-09-09 at 15:36 -0700, Jeremy Fitzhardinge wrote:
> I've been having problems with spurious interrupts being raised when
the
> e100 driver resets the chip during a resume:

OK, that patch didn't really work terribly well - the interrupt still
happens. I've changed it to simply disable the interrupt during e100_up
(), which does seem to work properly.

J



On resume, the e100 chip seems to raise an interrupt during chip reset.
Since there's no IRQ handler registered yet, the kernel complains that
"nobody cared" about the interrupt. This change disables the IRQ during
e100_up(), while the hardware is being (re-)initialized.


drivers/net/e100.c | 6 ++++++
1 files changed, 6 insertions(+)

diff -puN drivers/net/e100.c~e100-restore-irq drivers/net/e100.c
--- local-2.6/drivers/net/e100.c~e100-restore-irq 2004-09-13
13:38:27.000000000 -0700
+++ local-2.6-jeremy/drivers/net/e100.c 2004-09-13 13:38:28.075416972
-0700
@@ -1678,6 +1678,9 @@ static int e100_up(struct nic *nic)

if((err = e100_rx_alloc_list(nic)))
return err;
+
+ disable_irq(nic->pdev->irq);
+
if((err = e100_alloc_cbs(nic)))
goto err_rx_clean_list;
if((err = e100_hw_init(nic)))
@@ -1689,6 +1692,7 @@ static int e100_up(struct nic *nic)
nic->netdev->name, nic->netdev)))
goto err_no_irq;
e100_enable_irq(nic);
+ enable_irq(nic->pdev->irq);
netif_wake_queue(nic->netdev);
return 0;

@@ -1698,6 +1702,8 @@ err_clean_cbs:
e100_clean_cbs(nic);
err_rx_clean_list:
e100_rx_clean_list(nic);
+
+ enable_irq(nic->pdev->irq);
return err;
}


_
Signed-off-by: Jeremy Fitzhardinge <[email protected]>

2004-09-30 19:02:58

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: RE: [PATCH] Fix for spurious interrupts on e100 resume

On Thu, 2004-09-30 at 09:26 -0700, Venkatesan, Ganesh wrote:
> I propose that we remove this patch from the -mm tree. We will work on a
> clean solution and send a patch soon. Please see further discussion on
> this under the subject "2.6.9-rc2-mm4 e100 enable_irq unbalanced from"

Fine by me. It was just a sleazy hack to prompt a proper fix. I'm
using the eepro100 driver for a bit anyway, to see how it fares.

J

2004-09-30 19:39:55

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix for spurious interrupts on e100 resume

"Venkatesan, Ganesh" <[email protected]> wrote:
>
> I propose that we remove this patch from the -mm tree. We will work on a
> clean solution and send a patch soon. Please see further discussion on
> this under the subject "2.6.9-rc2-mm4 e100 enable_irq unbalanced from"

OK, thanks. I'll retain the current patch in -mm until the problem is
fixed for real. We wouldn't want to be forgetting about it ;)