2004-09-27 19:25:31

by Paul Fulghum

[permalink] [raw]
Subject: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

The e100 module is generating a warning:

Sep 27 13:30:29 deimos kernel: e100: Intel(R) PRO/100 Network Driver, 3.1.4-NAPI
Sep 27 13:30:29 deimos kernel: e100: Copyright(c) 1999-2004 Intel Corporation
Sep 27 13:30:29 deimos kernel: e100: eth0: e100_probe: addr 0xfecfc000, irq 16, MAC addr 00:90:27:3A:C5:E3
Sep 27 13:30:29 deimos kernel: enable_irq(16) unbalanced from ec83ff33
Sep 27 13:30:29 deimos kernel: [<c010923f>] enable_irq+0xcf/0xe0
Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]
Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]
Sep 27 13:30:29 deimos kernel: [<ec83f410>] e100_intr+0x0/0x140 [e100]
Sep 27 13:30:29 deimos kernel: [<ec841131>] e100_open+0x31/0x80 [e100]
Sep 27 13:30:29 deimos kernel: [<c0318d4c>] dev_open+0x8c/0xa0
Sep 27 13:30:29 deimos kernel: [<c031cc74>] dev_mc_upload+0x24/0x40
Sep 27 13:30:29 deimos kernel: [<c031a4ea>] dev_change_flags+0x12a/0x150
Sep 27 13:30:29 deimos kernel: [<c0318c0d>] dev_load+0x2d/0x80
Sep 27 13:30:29 deimos kernel: [<c0355b37>] devinet_ioctl+0x277/0x730

e100_up calls disable_irq, request_irq, then enable_irq
as shown below.

static int e100_up(struct nic *nic)
{
...
disable_irq(nic->pdev->irq);
...
if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ,
nic->netdev->name, nic->netdev)))
goto err_no_irq;
e100_enable_irq(nic);
enable_irq(nic->pdev->irq);
netif_wake_queue(nic->netdev);
return 0;
...
}

On this machine, the e100 is the only device on that IRQ.

request_irq calls setup_irq which clears the irq descriptor
depth member to 0 and enables the interrupt because this
is the first device to use that interrupt.
This results in the warning on the next enable_irq().

I'm not sure why the driver is calling disable_irq
IRQ before calling request_irq. You can't get that
interrupt until you call request_irq, and once you
call request_irq you can (at least when this is
the first device on that IRQ) even before the
call to enable_irq.

I suspect the correct thing is to remove
disable_irq/enable_irq from e100_up.
I don't see any purpose for these calls in e100_up.

--
Paul Fulghum
[email protected]


2004-09-27 21:18:13

by Paul Fulghum

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Mon, 2004-09-27 at 14:24, Paul Fulghum wrote:
> The e100 module is generating a warning:
>
> Sep 27 13:30:29 deimos kernel: e100: Intel(R) PRO/100 Network Driver, 3.1.4-NAPI
> Sep 27 13:30:29 deimos kernel: e100: Copyright(c) 1999-2004 Intel Corporation
> Sep 27 13:30:29 deimos kernel: e100: eth0: e100_probe: addr 0xfecfc000, irq 16, MAC addr 00:90:27:3A:C5:E3
> Sep 27 13:30:29 deimos kernel: enable_irq(16) unbalanced from ec83ff33
> Sep 27 13:30:29 deimos kernel: [<c010923f>] enable_irq+0xcf/0xe0
> Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]

The following patch works for me and removes the warning.

The disable_irq/enable_irq is not needed because
the ISR can't be called before calling request_irq,
the hardware is initialized before calling request_irq,
and request_irq itself enables the interrupt if needed.

Comments?

--
Paul Fulghum
[email protected]

--- a/drivers/net/e100.c 2004-09-27 09:57:35.000000000 -0500
+++ b/drivers/net/e100.c 2004-09-27 16:00:12.115482112 -0500
@@ -1675,9 +1675,6 @@

if((err = e100_rx_alloc_list(nic)))
return err;
-
- disable_irq(nic->pdev->irq);
-
if((err = e100_alloc_cbs(nic)))
goto err_rx_clean_list;
if((err = e100_hw_init(nic)))
@@ -1689,7 +1686,6 @@
nic->netdev->name, nic->netdev)))
goto err_no_irq;
e100_enable_irq(nic);
- enable_irq(nic->pdev->irq);
netif_wake_queue(nic->netdev);
return 0;

@@ -1700,7 +1696,6 @@
err_rx_clean_list:
e100_rx_clean_list(nic);

- enable_irq(nic->pdev->irq);
return err;
}



2004-09-27 23:00:42

by J.A. Magallon

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from


On 2004.09.27, Paul Fulghum wrote:
> The e100 module is generating a warning:
>
> Sep 27 13:30:29 deimos kernel: e100: Intel(R) PRO/100 Network Driver, 3.1.4-NAPI
> Sep 27 13:30:29 deimos kernel: e100: Copyright(c) 1999-2004 Intel Corporation
> Sep 27 13:30:29 deimos kernel: e100: eth0: e100_probe: addr 0xfecfc000, irq 16, MAC addr 00:90:27:3A:C5:E3
> Sep 27 13:30:29 deimos kernel: enable_irq(16) unbalanced from ec83ff33
> Sep 27 13:30:29 deimos kernel: [<c010923f>] enable_irq+0xcf/0xe0
> Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]
> Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]
> Sep 27 13:30:29 deimos kernel: [<ec83f410>] e100_intr+0x0/0x140 [e100]
> Sep 27 13:30:29 deimos kernel: [<ec841131>] e100_open+0x31/0x80 [e100]
> Sep 27 13:30:29 deimos kernel: [<c0318d4c>] dev_open+0x8c/0xa0
> Sep 27 13:30:29 deimos kernel: [<c031cc74>] dev_mc_upload+0x24/0x40
> Sep 27 13:30:29 deimos kernel: [<c031a4ea>] dev_change_flags+0x12a/0x150
> Sep 27 13:30:29 deimos kernel: [<c0318c0d>] dev_load+0x2d/0x80
> Sep 27 13:30:29 deimos kernel: [<c0355b37>] devinet_ioctl+0x277/0x730
>

Just a 'me-too', with a slightly different oops:

e100: Intel(R) PRO/100 Network Driver, 3.1.4-NAPI
e100: Copyright(c) 1999-2004 Intel Corporation
ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 19 (level, low) -> IRQ 185
e100: eth0: e100_probe: addr 0xf7161000, irq 185, MAC addr 00:30:48:41:22:9F
enable_irq(185) unbalanced from f89b3e25
[<c0108973>] enable_irq+0xa3/0xf0
[<f89b3e25>] e100_up+0xd5/0x1e0 [e100]
[<f89b3e25>] e100_up+0xd5/0x1e0 [e100]
[<f89b4e9a>] e100_open+0x2a/0x90 [e100]
[<c0368a24>] dev_open+0x74/0x90
[<c0369f66>] dev_change_flags+0x56/0x130
[<c03a2f42>] devinet_ioctl+0x5f2/0x6a0
[<c03a4e5f>] inet_ioctl+0xdf/0x100
[<c0360563>] sock_ioctl+0x1b3/0x270
[<c015a709>] fget+0x49/0x60
[<c016b905>] sys_ioctl+0x205/0x270
[<c0105b0d>] sysenter_past_esp+0x52/0x71
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandrakelinux release 10.1 (Community) for i586
Linux 2.6.9-rc2-mm4 (gcc 3.4.1 (Mandrakelinux (Alpha 3.4.1-3mdk)) #2


2004-09-27 23:10:59

by Paul Fulghum

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Mon, 2004-09-27 at 18:00, J.A. Magallon wrote:
> Just a 'me-too', with a slightly different oops:

Can you try the following patch please?

--
Paul Fulghum
[email protected]


--- a/drivers/net/e100.c 2004-09-27 09:57:35.000000000 -0500
+++ b/drivers/net/e100.c 2004-09-27 16:00:12.115482112 -0500
@@ -1675,9 +1675,6 @@

if((err = e100_rx_alloc_list(nic)))
return err;
-
- disable_irq(nic->pdev->irq);
-
if((err = e100_alloc_cbs(nic)))
goto err_rx_clean_list;
if((err = e100_hw_init(nic)))
@@ -1689,7 +1686,6 @@
nic->netdev->name, nic->netdev)))
goto err_no_irq;
e100_enable_irq(nic);
- enable_irq(nic->pdev->irq);
netif_wake_queue(nic->netdev);
return 0;

@@ -1700,7 +1696,6 @@
err_rx_clean_list:
e100_rx_clean_list(nic);

- enable_irq(nic->pdev->irq);
return err;
}


2004-09-28 21:04:31

by Feldman, Scott

[permalink] [raw]
Subject: RE: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from


> I suspect the correct thing is to remove
> disable_irq/enable_irq from e100_up.
> I don't see any purpose for these calls in e100_up.

I don't either! This doesn't look right to me at all.

These enable_irq/disable_irq calls got added recently to -mm, probably
in the fix-for-spurious-interrupts-on-e100-resume-2.patch. Maybe just
the disable_irq call is all that is needed to solve the spurious
interrupt case?

Ganesh, can we back out patch out of -mm and go back to the drawing
board on the original problem? This patch will cause problems.

-scott

2004-09-28 22:16:52

by J.A. Magallon

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from


On 2004.09.28, Paul Fulghum wrote:
> On Mon, 2004-09-27 at 18:00, J.A. Magallon wrote:
> > Just a 'me-too', with a slightly different oops:
>
> Can you try the following patch please?
>

It has killed the messages, and I have not seen any side effects.
Network is working fine.

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandrakelinux release 10.1 (Community) for i586
Linux 2.6.9-rc2-mm4 (gcc 3.4.1 (Mandrakelinux (Alpha 3.4.1-3mdk)) #2


2004-09-29 03:06:26

by Paul Fulghum

[permalink] [raw]
Subject: RE: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Tue, 2004-09-28 at 16:03, Feldman, Scott wrote:
> Maybe just
> the disable_irq call is all that is needed to solve the spurious
> interrupt case?

That would not work either.

request_irq() clears the enable/disable depth
only when registering the first device on that
interrupt.

If a device is sharing an interrupt with e100
and calls request_irq() before e100, then the
disable_irq() requires a matching enable_irq().

If e100 is the only or first device on
that interrupt, then the enable_irq() causes
the warning because the depth has been reset.

It is interesting behavior for request_irq.
I don't know if it was planned that way,
or is an unexpected artifact.
It makes the effect of the disable_irq call
indeterminate (to the driver) if made
before registering with the interrupt.

--
Paul Fulghum
[email protected]


2004-09-29 13:16:03

by Alan

[permalink] [raw]
Subject: RE: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Mer, 2004-09-29 at 04:06, Paul Fulghum wrote:
> It is interesting behavior for request_irq.
> I don't know if it was planned that way,
> or is an unexpected artifact.
> It makes the effect of the disable_irq call
> indeterminate (to the driver) if made
> before registering with the interrupt.

Essentially the code believes that you cannot use
disable_irq() until you've requested it. You can
however in 2.6 call request_irq with local interrupts
disabled.

We have a fundamental API design problem going back to
day one. The API IMHO should really be

struct irq *irq;
irq = allocate_irq(5, ...)
enable_irq(irq);

That would fix
- Drivers failing to load/init under freak low memory situations
- How to cleanly report irqs (because each irq can now have ->name)
- How to tell which shared irq users are disabled/enabled for the irq
poll/recovery code I posted (and is testing in -mm).

Unfortunately it would require changes to rather a lot of code.

2004-09-29 13:32:27

by Russell King

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Wed, Sep 29, 2004 at 01:13:26PM +0100, Alan Cox wrote:
> We have a fundamental API design problem going back to
> day one. The API IMHO should really be
>
> struct irq *irq;
> irq = allocate_irq(5, ...)
> enable_irq(irq);
>
> That would fix
> - Drivers failing to load/init under freak low memory situations
> - How to cleanly report irqs (because each irq can now have ->name)
> - How to tell which shared irq users are disabled/enabled for the irq
> poll/recovery code I posted (and is testing in -mm).
>
> Unfortunately it would require changes to rather a lot of code.

I suggested something like this a while back on the linux-arch list
but it didn't particularly have a good reception from the other
arch maintainers. If there's interest in it, I could dig it out.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-09-30 18:22:39

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: 2.6.9-rc2-mm4 e100 enable_irq unbalanced from

On Mon, 2004-09-27 at 16:12 -0500, Paul Fulghum wrote:
> On Mon, 2004-09-27 at 14:24, Paul Fulghum wrote:
> > The e100 module is generating a warning:
> >
> > Sep 27 13:30:29 deimos kernel: e100: Intel(R) PRO/100 Network Driver, 3.1.4-NAPI
> > Sep 27 13:30:29 deimos kernel: e100: Copyright(c) 1999-2004 Intel Corporation
> > Sep 27 13:30:29 deimos kernel: e100: eth0: e100_probe: addr 0xfecfc000, irq 16, MAC addr 00:90:27:3A:C5:E3
> > Sep 27 13:30:29 deimos kernel: enable_irq(16) unbalanced from ec83ff33
> > Sep 27 13:30:29 deimos kernel: [<c010923f>] enable_irq+0xcf/0xe0
> > Sep 27 13:30:29 deimos kernel: [<ec83ff33>] e100_up+0xf3/0x1f0 [e100]
>
> The following patch works for me and removes the warning.
>
> The disable_irq/enable_irq is not needed because
> the ISR can't be called before calling request_irq,
> the hardware is initialized before calling request_irq,
> and request_irq itself enables the interrupt if needed.
>
> Comments?

Hi,

That change was my fault, to address a problem with doing a resume from
suspend. The e100 in my laptop seems to emit a spurious interrupt on
resume, and the intention was to block interrupts until the handler had
been installed - otherwise I get a "got interrupt X and nobody cared"
message.

While this works for me (probably because IRQ11 is heavily shared), it's
apparently pretty bogus. But it did achieve the goal of getting the
problem looked at...

J