2008-11-16 12:37:37

by Alan Jenkins

[permalink] [raw]
Subject: atl2: don't mess up irq on suspend/resume (probably msi only)

diff --git a/drivers/net/atlx/atl2.c b/drivers/net/atlx/atl2.c
index f5bdc92..bfc334a 100644
--- a/drivers/net/atlx/atl2.c
+++ b/drivers/net/atlx/atl2.c
@@ -1553,6 +1553,7 @@ static int atl2_suspend(struct pci_dev *pdev, pm_message_t state)
if (netif_running(netdev)) {
WARN_ON(test_bit(__ATL2_RESETTING, &adapter->flags));
atl2_down(adapter);
+ atl2_free_irq(adapter);
}

#ifdef CONFIG_PM
@@ -1654,9 +1655,6 @@ wol_dis:
pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);

suspend_exit:
- if (netif_running(netdev))
- atl2_free_irq(adapter);
-
pci_disable_device(pdev);

pci_set_power_state(pdev, pci_choose_state(pdev, state));


Attachments:
msi (658.00 B)

2008-11-17 21:43:30

by Alan Jenkins

[permalink] [raw]
Subject: Re: atl2: don't mess up irq on suspend/resume (probably msi only)

Alan Jenkins wrote:
> According to Documentation/power/pci.txt, the correct suspend sequence is:
>
> A reference implementation
> -------------------------
> .suspend()
> {
> /* driver specific operations */
>
> /* Disable IRQ */
> free_irq();
> /* If using MSI */
> pci_disable_msi();
>
> pci_save_state();
> pci_enable_wake();
> /* Disable IO/bus master/irq router */
> pci_disable_device();
> pci_set_power_state(pci_choose_state());
> }
>
>
> Whereas atl2 calls pci_save_state() *before* free_irq() and
> pci_disable_msi(). Hilarity ensues on resume as pci_restore_state()
> restores a state with MSI enabled; the atl2 resume handler then enables
> msi for the second time. Looking at /proc/interrupts, it seems to leak
> an irq each time this happens. (I had 6 entries for "eth0" by the time
> I noticed this). It also triggers a warning.
>
> I was able to avoid the warning + leak by re-ordering the resume
> method. Patch attached - it's just a few lines, I hope that's all
> that's necessary.
>

Don't listen to me. The problem went away because when I was testing my
patch, I plugged the ethernet in and actually used it. My patch doesn't
seem to fix anything.

So I now have the steps to reproduce this warning: remove the ethernet
cable and suspend.

Also, I was speaking rubbish because the warning isn't on the resume
path. It seems to be when network-manager re-opens the device on resume.

I'll have another look for possible culprits. Hints appreciated :).

> ------------[ cut here ]------------
> WARNING: at drivers/pci/msi.c:570 pci_enable_msi+0x3a/0x1cd()
> Modules linked in: cpufreq_stats freq_table nls_iso8859_1 nls_cp437 vfat fat rfkill_input ae
> Pid: 2620, comm: NetworkManager Tainted: G W 2.6.28-rc4eeepc #136
> Call Trace:
> [<c011b89a>] warn_on_slowpath+0x40/0x59
> [<c023df15>] raw_pci_read+0x4d/0x55
> [<c023dff3>] pci_read+0x1c/0x21
> [<c01bcd81>] __pci_find_next_cap_ttl+0x44/0x70
> [<c01bce86>] __pci_find_next_cap+0x1a/0x1f
> [<c01bcef9>] pci_find_capability+0x28/0x2c
> [<c01c4144>] pci_msi_check_device+0x53/0x62
> [<c01c49c2>] pci_enable_msi+0x3a/0x1cd
> [<e019f17b>] atl2_write_phy_reg+0x40/0x5f [atl2]
> [<c01061b1>] dma_generic_alloc_coherent+0x0/0xd7
> [<e019f107>] atl2_request_irq+0x15/0x49 [atl2]
> [<e01a1481>] atl2_open+0x20b/0x297 [atl2]
> [<c024a35c>] dev_open+0x62/0x91
> [<c0248b9a>] dev_change_flags+0x93/0x141
> [<c024f308>] do_setlink+0x238/0x2d5
> [<c02501b2>] rtnl_setlink+0xa9/0xbf
> [<c0297f0c>] mutex_lock+0xb/0x19
> [<c024ffa7>] rtnl_dump_ifinfo+0x0/0x69
> [<c0250109>] rtnl_setlink+0x0/0xbf
> [<c024fe42>] rtnetlink_rcv_msg+0x185/0x19f
> [<c0240fd1>] sock_rmalloc+0x23/0x57
> [<c024fcbd>] rtnetlink_rcv_msg+0x0/0x19f
> [<c0259457>] netlink_rcv_skb+0x2d/0x71
> [<c024fcb7>] rtnetlink_rcv+0x14/0x1a
> [<c025929e>] netlink_unicast+0x184/0x1e4
> [<c025992a>] netlink_sendmsg+0x233/0x240
> [<c023f405>] sock_sendmsg+0xb7/0xd0
> [<c0129131>] autoremove_wake_function+0x0/0x2b
> [<c0129131>] autoremove_wake_function+0x0/0x2b
> [<c0147796>] mempool_alloc+0x2d/0x9e
> [<c020c923>] scsi_pool_alloc_command+0x35/0x4f
> [<c0297f0c>] mutex_lock+0xb/0x19
> [<c028e867>] unix_stream_recvmsg+0x357/0x3e2
> [<c01b81c9>] copy_from_user+0x23/0x4f
> [<c02452ea>] verify_iovec+0x3e/0x6c
> [<c023f5ab>] sys_sendmsg+0x18d/0x1f0
> [<c023ffa8>] sys_recvmsg+0x146/0x1c8
> [<c0240016>] sys_recvmsg+0x1b4/0x1c8
> [<c0118f48>] __wake_up+0xf/0x15
> [<c02586cd>] netlink_table_ungrab+0x17/0x19
> [<c01b83ba>] copy_to_user+0x25/0x3b
> [<c023fe4a>] move_addr_to_user+0x50/0x68
> [<c0240266>] sys_getsockname+0x6f/0x9a
> [<c0240280>] sys_getsockname+0x89/0x9a
> [<c015046a>] do_wp_page+0x3ae/0x41a
> [<c0151525>] handle_mm_fault+0x4c5/0x540
> [<c02405d0>] sys_socketcall+0x176/0x1b0
> [<c010376d>] sysenter_do_call+0x12/0x21
> ---[ end trace ad80134e35e20aad ]---
>
>
>
>
>

2008-11-18 00:13:55

by J. K. Cliburn

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

[adding netdev to cc list]

On Mon, 17 Nov 2008 21:43:16 +0000
Alan Jenkins <[email protected]> wrote:


> Don't listen to me. The problem went away because when I was testing
> my patch, I plugged the ethernet in and actually used it. My patch
> doesn't seem to fix anything.
>
> So I now have the steps to reproduce this warning: remove the ethernet
> cable and suspend.
>
> Also, I was speaking rubbish because the warning isn't on the resume
> path. It seems to be when network-manager re-opens the device on
> resume.
>
> I'll have another look for possible culprits. Hints appreciated :).

Does this help?

diff --git a/drivers/net/atlx/atl2.c b/drivers/net/atlx/atl2.c
index f5bdc92..5b292f0 100644
--- a/drivers/net/atlx/atl2.c
+++ b/drivers/net/atlx/atl2.c
@@ -1690,10 +1690,6 @@ static int atl2_resume(struct pci_dev *pdev)

ATL2_WRITE_REG(&adapter->hw, REG_WOL_CTRL, 0);

- err = atl2_request_irq(adapter);
- if (netif_running(netdev) && err)
- return err;
-
atl2_reset_hw(&adapter->hw);

if (netif_running(netdev))

2008-11-18 14:31:07

by Alan Jenkins

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

Jay Cliburn wrote:
> [adding netdev to cc list]
>
> On Mon, 17 Nov 2008 21:43:16 +0000
> Alan Jenkins <[email protected]> wrote:
>
>
>
>> Don't listen to me. The problem went away because when I was testing
>> my patch, I plugged the ethernet in and actually used it. My patch
>> doesn't seem to fix anything.
>>
>> So I now have the steps to reproduce this warning: remove the ethernet
>> cable and suspend.
>>
>> Also, I was speaking rubbish because the warning isn't on the resume
>> path. It seems to be when network-manager re-opens the device on
>> resume.
>>
>> I'll have another look for possible culprits. Hints appreciated :).
>>
>
> Does this help?
>

Ah! Yes, that fixes it, thanks.

> diff --git a/drivers/net/atlx/atl2.c b/drivers/net/atlx/atl2.c
> index f5bdc92..5b292f0 100644
> --- a/drivers/net/atlx/atl2.c
> +++ b/drivers/net/atlx/atl2.c
> @@ -1690,10 +1690,6 @@ static int atl2_resume(struct pci_dev *pdev)
>
> ATL2_WRITE_REG(&adapter->hw, REG_WOL_CTRL, 0);
>
> - err = atl2_request_irq(adapter);
> - if (netif_running(netdev) && err)
> - return err;
> -
> atl2_reset_hw(&adapter->hw);
>
> if (netif_running(netdev))
>

2008-11-18 17:36:20

by Alan Jenkins

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

Alan Jenkins wrote:
> Jay Cliburn wrote:
>
>> [adding netdev to cc list]
>>
>> On Mon, 17 Nov 2008 21:43:16 +0000
>> Alan Jenkins <[email protected]> wrote:
>>
>>
>>
>>
>>> Don't listen to me. The problem went away because when I was testing
>>> my patch, I plugged the ethernet in and actually used it. My patch
>>> doesn't seem to fix anything.
>>>
>>> So I now have the steps to reproduce this warning: remove the ethernet
>>> cable and suspend.
>>>
>>> Also, I was speaking rubbish because the warning isn't on the resume
>>> path. It seems to be when network-manager re-opens the device on
>>> resume.
>>>
>>> I'll have another look for possible culprits. Hints appreciated :).
>>>
>>>
>> Does this help?
>>
>>
>
> Ah! Yes, that fixes it, thanks.
>

Just noticed another problem. If I suspend with the cable plugged in, I
get a warning message about freeing the IRQ:

[ 331.023417] atl2 0000:03:00.0: restoring config space at offset 0x1
(was 0x100006, writing 0x100406)
[ 331.023453] atl2 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 331.023462] atl2 0000:03:00.0: setting latency timer to 64
[ 331.043449] ath5k_pci 0000:01:00.0: restoring config space at offset
0xf (was 0x100, writing 0x10a)
...
[ 332.681591] Restarting tasks ... done.
[ 333.617398] hpet1: lost 2 rtc interrupts
[ 334.016700] Trying to free already-free IRQ 17
[ 334.057257] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 334.082570] atl2 0000:03:00.0: irq 44 for MSI/MSI-X

2008-11-18 19:51:57

by Alan Jenkins

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

Alan Jenkins wrote:
> Alan Jenkins wrote:
>> Jay Cliburn wrote:
>>> [adding netdev to cc list]
>>>
>>> On Mon, 17 Nov 2008 21:43:16 +0000
>>> Alan Jenkins <[email protected]> wrote:
>>>
>>>
>>>> Don't listen to me. The problem went away because when I was testing
>>>> my patch, I plugged the ethernet in and actually used it. My patch
>>>> doesn't seem to fix anything.
>>>>
>>>> So I now have the steps to reproduce this warning: remove the ethernet
>>>> cable and suspend.
>>>>
>>>> Also, I was speaking rubbish because the warning isn't on the resume
>>>> path. It seems to be when network-manager re-opens the device on
>>>> resume.
>>>>
>>>> I'll have another look for possible culprits. Hints appreciated :).
>>> Does this help?
>>
>> Ah! Yes, that fixes it, thanks.
>
> Just noticed another problem. If I suspend with the cable plugged in,
> I get a warning message about freeing the IRQ
which was caused by your patch. It works better (for both cable plugged and unplugged cases) if I do this instead:


diff --git a/drivers/net/atlx/atl2.c b/drivers/net/atlx/atl2.c
index f5bdc92..8571e8c 100644
--- a/drivers/net/atlx/atl2.c
+++ b/drivers/net/atlx/atl2.c
@@ -1690,9 +1690,11 @@ static int atl2_resume(struct pci_dev *pdev)

ATL2_WRITE_REG(&adapter->hw, REG_WOL_CTRL, 0);

- err = atl2_request_irq(adapter);
- if (netif_running(netdev) && err)
- return err;
+ if (netif_running(netdev)) {
+ err = atl2_request_irq(adapter);
+ if (err)
+ return err;
+ }

atl2_reset_hw(&adapter->hw);


2008-11-20 00:35:07

by J. K. Cliburn

[permalink] [raw]
Subject: Re: atl2: don't mess up irq on suspend/resume (probably msi only)

Can someone tell before I submit this patch if the description is sane?

atl2: don't request irq on resume if netif running

If the device is suspended with the cable disconnected, then
resumed with the cable connected, dev->open is called before
resume. During resume, we request an IRQ, but the IRQ was
already assigned during dev->open, resulting in the warning
shown below.

Don't request an IRQ if the device is running.

Call Trace:
[<c011b89a>] warn_on_slowpath+0x40/0x59
[<c023df15>] raw_pci_read+0x4d/0x55
[<c023dff3>] pci_read+0x1c/0x21
[<c01bcd81>] __pci_find_next_cap_ttl+0x44/0x70
[<c01bce86>] __pci_find_next_cap+0x1a/0x1f
[<c01bcef9>] pci_find_capability+0x28/0x2c
[<c01c4144>] pci_msi_check_device+0x53/0x62
[<c01c49c2>] pci_enable_msi+0x3a/0x1cd
[<e019f17b>] atl2_write_phy_reg+0x40/0x5f [atl2]
[<c01061b1>] dma_generic_alloc_coherent+0x0/0xd7
[<e019f107>] atl2_request_irq+0x15/0x49 [atl2]
[<e01a1481>] atl2_open+0x20b/0x297 [atl2]
[<c024a35c>] dev_open+0x62/0x91
[<c0248b9a>] dev_change_flags+0x93/0x141
[<c024f308>] do_setlink+0x238/0x2d5
[<c02501b2>] rtnl_setlink+0xa9/0xbf
[<c0297f0c>] mutex_lock+0xb/0x19
[<c024ffa7>] rtnl_dump_ifinfo+0x0/0x69
[<c0250109>] rtnl_setlink+0x0/0xbf
[<c024fe42>] rtnetlink_rcv_msg+0x185/0x19f
[<c0240fd1>] sock_rmalloc+0x23/0x57
[<c024fcbd>] rtnetlink_rcv_msg+0x0/0x19f
[<c0259457>] netlink_rcv_skb+0x2d/0x71
[<c024fcb7>] rtnetlink_rcv+0x14/0x1a
[<c025929e>] netlink_unicast+0x184/0x1e4
[<c025992a>] netlink_sendmsg+0x233/0x240
[<c023f405>] sock_sendmsg+0xb7/0xd0
[<c0129131>] autoremove_wake_function+0x0/0x2b
[<c0129131>] autoremove_wake_function+0x0/0x2b
[<c0147796>] mempool_alloc+0x2d/0x9e
[<c020c923>] scsi_pool_alloc_command+0x35/0x4f
[<c0297f0c>] mutex_lock+0xb/0x19
[<c028e867>] unix_stream_recvmsg+0x357/0x3e2
[<c01b81c9>] copy_from_user+0x23/0x4f
[<c02452ea>] verify_iovec+0x3e/0x6c
[<c023f5ab>] sys_sendmsg+0x18d/0x1f0
[<c023ffa8>] sys_recvmsg+0x146/0x1c8
[<c0240016>] sys_recvmsg+0x1b4/0x1c8
[<c0118f48>] __wake_up+0xf/0x15
[<c02586cd>] netlink_table_ungrab+0x17/0x19
[<c01b83ba>] copy_to_user+0x25/0x3b
[<c023fe4a>] move_addr_to_user+0x50/0x68
[<c0240266>] sys_getsockname+0x6f/0x9a
[<c0240280>] sys_getsockname+0x89/0x9a
[<c015046a>] do_wp_page+0x3ae/0x41a
[<c0151525>] handle_mm_fault+0x4c5/0x540
[<c02405d0>] sys_socketcall+0x176/0x1b0
[<c010376d>] sysenter_do_call+0x12/0x21

Signed-off-by: Alan Jenkins <[email protected]>
Signed-off-by: Jay Cliburn <[email protected]>
---
drivers/net/atlx/atl2.c | 8 +++++---
1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/atlx/atl2.c b/drivers/net/atlx/atl2.c
index f5bdc92..8571e8c 100644
--- a/drivers/net/atlx/atl2.c
+++ b/drivers/net/atlx/atl2.c
@@ -1690,9 +1690,11 @@ static int atl2_resume(struct pci_dev *pdev)

ATL2_WRITE_REG(&adapter->hw, REG_WOL_CTRL, 0);

- err = atl2_request_irq(adapter);
- if (netif_running(netdev) && err)
- return err;
+ if (netif_running(netdev)) {
+ err = atl2_request_irq(adapter);
+ if (err)
+ return err;
+ }

atl2_reset_hw(&adapter->hw);

--
1.5.6.5

2008-11-20 09:44:34

by David Miller

[permalink] [raw]
Subject: Re: atl2: don't mess up irq on suspend/resume (probably msi only)

From: Jay Cliburn <[email protected]>
Date: Wed, 19 Nov 2008 18:34:43 -0600

> Can someone tell before I submit this patch if the description is sane?
>
> atl2: don't request irq on resume if netif running

It looks good, mind if I just apply this as-is?

Thanks.

2008-11-20 11:35:00

by J. K. Cliburn

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

On Thu, 20 Nov 2008 01:44:20 -0800 (PST)
David Miller <[email protected]> wrote:

> From: Jay Cliburn <[email protected]>
> Date: Wed, 19 Nov 2008 18:34:43 -0600
>
> > Can someone tell before I submit this patch if the description is
> > sane?
> >
> > atl2: don't request irq on resume if netif running
>
> It looks good, mind if I just apply this as-is?

Yes please. Go right ahead. Thanks.

2008-11-20 12:18:45

by David Miller

[permalink] [raw]
Subject: Re: [atl1-devel] atl2: don't mess up irq on suspend/resume (probably msi only)

From: Jay Cliburn <[email protected]>
Date: Thu, 20 Nov 2008 05:33:29 -0600

> On Thu, 20 Nov 2008 01:44:20 -0800 (PST)
> David Miller <[email protected]> wrote:
>
> > From: Jay Cliburn <[email protected]>
> > Date: Wed, 19 Nov 2008 18:34:43 -0600
> >
> > > Can someone tell before I submit this patch if the description is
> > > sane?
> > >
> > > atl2: don't request irq on resume if netif running
> >
> > It looks good, mind if I just apply this as-is?
>
> Yes please. Go right ahead. Thanks.

Done.