LinuxLists.cc - Re: 2.6.29 e100.c non-MII support status? (Re: [GIT]: Networking)

2009-03-01 10:58:03

Subject: Re: 2.6.29 e100.c non-MII support status? (Re: [GIT]: Networking)

On Sat, Feb 28, 2009 at 12:37 PM, Andreas Mohr
<[email protected]> wrote:
> Hi,
>
> On Tue, Dec 30, 2008 at 01:07:56PM +0100, Andreas Mohr wrote:
>> On Mon, Dec 29, 2008 at 03:15:37PM -0800, Jeff Kirsher wrote:
>> > He is referring to a patch to add support for devices which do not use
>> > the MII interface which use an Intel MAC. I have the patch in my
>> > local tree and have been keeping it up-to-date. I should have an
>> > update on this patch after the New Year.
>>
>> Thanks!
>>
>> One idea worth pondering might be to provide a special struct
>> mii_if_info hookup (fully emulated mdio_read()/write())
>> for the case of the non-MII hardware types of this driver,
>> thus eliminating any added penalty in the case of nicely MII-compliant
>> hardware.
>> But this should be tackled later, now let's better move on with eepro100
>> (non-)plans...
>>
>> Andreas Mohr
>
> Given that 2.6.29-rc now actually has eepro100.c removed (since -rc3 or so)
> and the current -rc6 e100.c does not contain non-MII parts, it looks like
> 2.6.29 proper will actually remove support for some PCI-based non-MII
> e100 cards.
>
> Thus I'd like to ask what should be done about this for real.
> Of course we could consciously decide to simply skip e100.c support for those
> non-MII cards entirely (hoping that they're not THAT wide-spread),
> but obviously I'd semi-hate this (nevermind the fact that _locally_
> I'd have to swap cards, too, which is somewhat more of a problem given an
> older network requiring to be served by "combo" cards).
>
> Simply spoken, it all boils down to actively deciding
> what to do in the next, about-to-be _affected_, kernel version
> about support for cards that an older, deprecated driver did support
> and the new one doesn't.
>
> Or one could just let things progress as-is and wait for
> yet unknown (non-)amounts of scorching user feedback to arrive
> after 2.6.29 has hit their machines ;)
>
>
> My description in the mail above was meant to list things to be done
> _after_ an initial non-MII support for e100.c has been added,
> just for clarification.
>
> Thanks,
>
> Andreas Mohr
> --

I am personally sorry that nothing has been done regarding this patch.
I still have this patch in my local tree. I admit that this patch
has been on the low priority list, so I will make sure the patch is
current this week and respond accordingly this week.

--
Cheers,
Jeff

2009-03-01 21:24:34

by Andreas Mohr

[permalink] [raw]

Subject: Re: 2.6.29 e100.c non-MII support status? (Re: [GIT]: Networking)

Hi,

On Sun, Mar 01, 2009 at 02:57:46AM -0800, Jeff Kirsher wrote:
> I am personally sorry that nothing has been done regarding this patch.
> I still have this patch in my local tree. I admit that this patch
> has been on the low priority list, so I will make sure the patch is
> current this week and respond accordingly this week.

The whole thing has been somewhat low-priority indeed
given the probable age of some non-MII contenders. OTOH it would be quite sad
(and locally problematic) to see support for those non-MII cards vanish
come 2.6.29.
Subsequently it would then be nice to devise non-MII support
in some structured way that enables us to support them without
much burden (code-wise) on proper standards-supporting MII cards.

Thank you,

Andreas Mohr

2009-06-02 21:49:04

by Andreas Mohr

[permalink] [raw]

Subject: [PATCH] Add non-MII PHY support to e100 (Re: 2.6.29 e100.c non-MII support status? (Re: [GIT]: Networking))

Hi,

On Sun, Mar 01, 2009 at 10:24:12PM +0100, Andreas Mohr wrote:
> The whole thing has been somewhat low-priority indeed
> given the probable age of some non-MII contenders. OTOH it would be quite sad
> (and locally problematic) to see support for those non-MII cards vanish
> come 2.6.29.
> Subsequently it would then be nice to devise non-MII support
> in some structured way that enables us to support them without
> much burden (code-wise) on proper standards-supporting MII cards.

OK, so.......

Frankly, I didn't want to see my card slowly and painfully getting unsupported
(which, conversely, results in a PC cut off from ANY kernel upgrades...).
Thus I decided to clean up my previous still-too-messy patch
by removing all dirty "are we MII?" - noooo - skip it! barriers.

ChangeLog:
Restore support for cards with MII-lacking PHYs as compared to
removed pre-2.6.29 eepro100 driver:
use full low-level MII I/O access abstraction by providing clean
PHY-specific mdio_ctrl() functions for either standard MII-compliant cards,
slightly special ones or non-MII PHY ones.

We now have another netdev_priv member for mdio_ctrl(),
thus we have some array indirection, but we save some additional opcodes
for specific phy_82552_v handling in the common case.

Patch tested on 2.6.30-rc7 (cold boot results in working network
access - this mail typed.....), checkpatch.pl'ed.

Obviously this patch needs quite some testing (-mmotm?),
since it also touches phy_82552_v functionality.

Oh, and this will be my very ultimately last attempt at this,
if there's no serious inclusion of something similar to this,
then I'll just give up.
[and install my nice WGA'd DRM'd and billed/invaded-by-the-minute Windows...]

Thanks!

Signed-off-by: Andreas Mohr <[email protected]>

--- linux-2.6.30-rc7/drivers/net/e100.c.orig 2009-06-02 19:27:19.000000000 +0200
+++ linux-2.6.30-rc7/drivers/net/e100.c 2009-06-02 23:31:25.000000000 +0200
@@ -143,6 +143,8 @@
* FIXES:
* 2005/12/02 - Michael O'Donnell <Michael.ODonnell at stratus dot com>
* - Stratus87247: protect MDI control register manipulations
+ * 2009/06/01 - Andreas Mohr <andi at lisas dot de>
+ * - add clean lowlevel I/O emulation for cards with MII-lacking PHYs
*/

#include <linux/module.h>
@@ -372,6 +374,7 @@

enum eeprom_offsets {
eeprom_cnfg_mdix = 0x03,
+ eeprom_phy_iface = 0x06,
eeprom_id = 0x0A,
eeprom_config_asf = 0x0D,
eeprom_smbus_addr = 0x90,
@@ -381,6 +384,18 @@
eeprom_mdix_enabled = 0x0080,
};

+enum eeprom_phy_iface {
+ NonSuchPhy = 0,
+ I82553AB,
+ I82553C,
+ I82503,
+ DP83840,
+ S80C240,
+ S80C24,
+ I82555,
+ DP83840A = 10,
+};
+
enum eeprom_id {
eeprom_id_wol = 0x0020,
};
@@ -545,6 +560,7 @@
u32 msg_enable ____cacheline_aligned;
struct net_device *netdev;
struct pci_dev *pdev;
+ u16 (*mdio_ctrl)(struct nic *nic, u32 addr, u32 dir, u32 reg, u16 data);

struct rx *rxs ____cacheline_aligned;
struct rx *rx_to_use;
@@ -899,7 +915,21 @@
return err;
}

-static u16 mdio_ctrl(struct nic *nic, u32 addr, u32 dir, u32 reg, u16 data)
+static int mdio_read(struct net_device *netdev, int addr, int reg)
+{
+ struct nic *nic = netdev_priv(netdev);
+ return nic->mdio_ctrl(nic, addr, mdi_read, reg, 0);
+}
+
+static void mdio_write(struct net_device *netdev, int addr, int reg, int data)
+{
+ struct nic *nic = netdev_priv(netdev);
+
+ nic->mdio_ctrl(nic, addr, mdi_write, reg, data);
+}
+
+/* the standard mdio_ctrl() function for usual MII-compliant hardware */
+static u16 mdio_ctrl_hw(struct nic *nic, u32 addr, u32 dir, u32 reg, u16 data)
{
u32 data_out = 0;
unsigned int i;
@@ -938,30 +968,83 @@
return (u16)data_out;
}

-static int mdio_read(struct net_device *netdev, int addr, int reg)
-{
- return mdio_ctrl(netdev_priv(netdev), addr, mdi_read, reg, 0);
+/* slightly tweaked mdio_ctrl() function for phy_82552_v specifics */
+static u16 mdio_ctrl_phy_82552_v(struct nic *nic,
+ u32 addr,
+ u32 dir,
+ u32 reg,
+ u16 data)
+{
+ if ((reg == MII_BMCR) && (dir == mdi_write)) {
+ if (data & (BMCR_ANRESTART | BMCR_ANENABLE)) {
+ u16 advert = mdio_read(nic->netdev, nic->mii.phy_id,
+ MII_ADVERTISE);
+
+ /*
+ * Workaround Si issue where sometimes the part will not
+ * autoneg to 100Mbps even when advertised.
+ */
+ if (advert & ADVERTISE_100FULL)
+ data |= BMCR_SPEED100 | BMCR_FULLDPLX;
+ else if (advert & ADVERTISE_100HALF)
+ data |= BMCR_SPEED100;
+ }
+ }
+ return mdio_ctrl_hw(nic, addr, dir, reg, data);
}

-static void mdio_write(struct net_device *netdev, int addr, int reg, int data)
-{
- struct nic *nic = netdev_priv(netdev);
-
- if ((nic->phy == phy_82552_v) && (reg == MII_BMCR) &&
- (data & (BMCR_ANRESTART | BMCR_ANENABLE))) {
- u16 advert = mdio_read(netdev, nic->mii.phy_id, MII_ADVERTISE);
-
- /*
- * Workaround Si issue where sometimes the part will not
- * autoneg to 100Mbps even when advertised.
- */
- if (advert & ADVERTISE_100FULL)
- data |= BMCR_SPEED100 | BMCR_FULLDPLX;
- else if (advert & ADVERTISE_100HALF)
- data |= BMCR_SPEED100;
+/* Fully software-emulated mdio_ctrl() function for cards without
+ * MII-compliant PHYs.
+ * For now, this is mainly geared towards 80c24 support; in case of further
+ * requirements for other types (i82503, ...?) either extend this mechanism
+ * or split it, whichever is cleaner.
+ */
+static u16 mdio_ctrl_phy_mii_emulated(struct nic *nic,
+ u32 addr,
+ u32 dir,
+ u32 reg,
+ u16 data)
+{
+ /* might need to allocate a netdev_priv'ed register array eventually
+ * to be able to record state changes, but for now
+ * some fully hardcoded register handling ought to be ok I guess. */
+
+ if (dir == mdi_read) {
+ switch (reg) {
+ case MII_BMCR:
+ /* Auto-negotiation, right? */
+ return BMCR_ANENABLE |
+ BMCR_FULLDPLX;
+ case MII_BMSR:
+ return BMSR_LSTATUS /* for mii_link_ok() */ |
+ BMSR_ANEGCAPABLE |
+ BMSR_10FULL;
+ case MII_ADVERTISE:
+ /* 80c24 is a "combo card" PHY, right? */
+ return ADVERTISE_10HALF |
+ ADVERTISE_10FULL;
+ default:
+ DPRINTK(HW, DEBUG,
+ "%s:addr=%d, reg=%d, data=0x%04X: unimplemented emulation!\n",
+ dir == mdi_read ? "READ" : "WRITE", addr, reg, data);
+ return 0xFFFF;
+ }
+ } else {
+ switch (reg) {
+ default:
+ DPRINTK(HW, DEBUG,
+ "%s:addr=%d, reg=%d, data=0x%04X: unimplemented emulation!\n",
+ dir == mdi_read ? "READ" : "WRITE", addr, reg, data);
+ return 0xFFFF;
+ }
}
-
- mdio_ctrl(netdev_priv(netdev), addr, mdi_write, reg, data);
+}
+static inline int e100_phy_supports_mii(struct nic *nic)
+{
+ /* for now, just check it by comparing whether we
+ are using MII software emulation.
+ */
+ return (nic->mdio_ctrl != mdio_ctrl_phy_mii_emulated);
}

static void e100_get_defaults(struct nic *nic)
@@ -1013,7 +1096,8 @@
config->standard_stat_counter = 0x1; /* 1=standard, 0=extended */
config->rx_discard_short_frames = 0x1; /* 1=discard, 0=pass */
config->tx_underrun_retry = 0x3; /* # of underrun retries */
- config->mii_mode = 0x1; /* 1=MII mode, 0=503 mode */
+ if (e100_phy_supports_mii(nic))
+ config->mii_mode = 1; /* 1=MII mode, 0=i82503 mode */
config->pad10 = 0x6;
config->no_source_addr_insertion = 0x1; /* 1=no, 0=yes */
config->preamble_length = 0x2; /* 0=1, 1=3, 2=7, 3=15 bytes */
@@ -1270,6 +1354,44 @@
offsetof(struct mem, dump_buf));
}

+static int e100_phy_check_without_mii(struct nic *nic)
+{
+ u8 phy_type;
+ int without_mii;
+
+ phy_type = (nic->eeprom[eeprom_phy_iface] >> 8) & 0x0f;
+
+ switch (phy_type) {
+ case NonSuchPhy: /* Non-MII PHY; UNTESTED! */
+ case I82503: /* Non-MII PHY; UNTESTED! */
+ case S80C24: /* Non-MII PHY; tested and working */
+ {
+ /* paragraph from the FreeBSD driver, "FXP_PHY_80C24":
+ * The Seeq 80c24 AutoDUPLEX(tm) Ethernet Interface Adapter
+ * doesn't have a programming interface of any sort. The
+ * media is sensed automatically based on how the link partner
+ * is configured. This is, in essence, manual configuration.
+ */
+ DPRINTK(PROBE, INFO,
+ "found MII-less i82503 or 80c24 or other PHY\n");
+
+ nic->mdio_ctrl = mdio_ctrl_phy_mii_emulated;
+ nic->mii.phy_id = 0; /* is this ok for an MII-less PHY? */
+
+ /* these might be needed for certain MII-less cards...
+ * nic->flags |= ich;
+ * nic->flags |= ich_10h_workaround; */
+
+ without_mii = 1;
+ }
+ break;
+ default:
+ without_mii = 0;
+ break;
+ }
+ return without_mii;
+}
+
#define NCONFIG_AUTO_SWITCH 0x0080
#define MII_NSC_CONG MII_RESV1
#define NSC_CONG_ENABLE 0x0100
@@ -1290,9 +1412,21 @@
if (!((bmcr == 0xFFFF) || ((stat == 0) && (bmcr == 0))))
break;
}
- DPRINTK(HW, DEBUG, "phy_addr = %d\n", nic->mii.phy_id);
- if (addr == 32)
- return -EAGAIN;
+ if (addr == 32) {
+ /* uhoh, no PHY detected: check whether we seem to be some
+ * weird, rare variant which is *known* to not have any MII.
+ * But do this AFTER MII checking only, since this does
+ * lookup of EEPROM values which may easily be unreliable. */
+ if (e100_phy_check_without_mii(nic))
+ return 0; /* simply return and hope for the best */
+ else {
+ /* for unknown cases log a fatal error */
+ DPRINTK(HW, ERR,
+ "Failed to locate any known PHY, aborting.\n");
+ return -EAGAIN;
+ }
+ } else
+ DPRINTK(HW, DEBUG, "phy_addr = %d\n", nic->mii.phy_id);

/* Isolate all the PHY ids */
for (addr = 0; addr < 32; addr++)
@@ -1320,6 +1454,9 @@
if (nic->phy == phy_82552_v) {
u16 advert = mdio_read(netdev, nic->mii.phy_id, MII_ADVERTISE);

+ /* assign special tweaked mdio_ctrl() function */
+ nic->mdio_ctrl = mdio_ctrl_phy_82552_v;
+
/* Workaround Si not advertising flow-control during autoneg */
advert |= ADVERTISE_PAUSE_CAP | ADVERTISE_PAUSE_ASYM;
mdio_write(netdev, nic->mii.phy_id, MII_ADVERTISE, advert);
@@ -2585,6 +2722,7 @@
nic->netdev = netdev;
nic->pdev = pdev;
nic->msg_enable = (1 << debug) - 1;
+ nic->mdio_ctrl = mdio_ctrl_hw;
pci_set_drvdata(pdev, netdev);

if ((err = pci_enable_device(pdev))) {

2009-06-03 06:01:36

by Andreas Mohr

[permalink] [raw]

Subject: e100 kills S2R on my box, plus network drops dead

Hi,

following my patch I tested -rc8 with it, everything pretty fine so far,
except for a S2R attempt:

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.02 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
Suspending console(s) (use no_console_suspend to debug)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
ACPI handle has no context!
serial 00:09: disabled
ACPI handle has no context!
r8169 0000:02:0f.0: PME# enabled
ACPI handle has no context!
ACPI handle has no context!
e100 0000:02:07.0: PCI INT A disabled
pci_legacy_suspend(): e100_suspend+0x0/0x20 [e100] returns -5
pm_op(): pci_pm_suspend+0x0/0xd7 returns -5
PM: Device 0000:02:07.0 failed to suspend: error -5
PM: Some devices failed to suspend
firewire_ohci 0000:02:0e.0: restoring config space at offset 0xf (was
0x4020100, writing 0x402010b)
firewire_ohci 0000:02:0e.0: restoring config space at offset 0x5 (was
0x0, writing 0xfddf8000)

static int e100_suspend(struct pci_dev *pdev, pm_message_t state)
{
bool wake;
__e100_shutdown(pdev, &wake);
return __e100_power_off(pdev, wake);
}

static int __e100_power_off(struct pci_dev *pdev, bool wake)
{
if (wake) {
return pci_prepare_to_sleep(pdev);
} else {
pci_wake_from_d3(pdev, false);
return pci_set_power_state(pdev, PCI_D3hot);
}
}

Well, the problem being that my card does not _have_ any PM support:

lspci -vvv:

02:07.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet
Pro 100 (rev 01)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (2000ns min, 14000ns max)
Interrupt: pin A routed to IRQ 21
Region 0: Memory at fdaff000 (32-bit, prefetchable) [size=4K]
Region 1: I/O ports at df00 [size=32]
Region 2: Memory at fdc00000 (32-bit, non-prefetchable)
[size=1M]
[virtual] Expansion ROM at fdb00000 [disabled] [size=1M]
Kernel driver in use: e100

So I'm back up to the desktop rather quicker than I would have liked.

Worse, after resume I don't have my network back, and attempting to
unload e100 or ifconfig eth0 down results in this:

INFO: task nmbd:4633 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nmbd D 00000061 0 4633 1
f56f7d14 00000082 0410aa36 00000061 00000100 f6716240 f61a7200 c0563740
c0563740 f6716000 f56f7cd0 f61f3000 f61f3284 c1f1f740 00000001 04132611
00000061 00000000 f56f7cfc c031fe90 f61a7200 40000040 f61f3284 f6716240
Call Trace:
[<c031fe90>] ? ip_push_pending_frames+0x2b6/0x2c0
[<c0336a8e>] ? udp_push_pending_frames+0x296/0x2e3
[<c0365bef>] __mutex_lock_common+0x136/0x239
[<c0365d04>] __mutex_lock_slowpath+0x12/0x15
[<c0365dbc>] ? mutex_lock+0x21/0x2e
[<c0365dbc>] mutex_lock+0x21/0x2e
[<c030b7ad>] rtnetlink_rcv+0x10/0x24
[<c0316723>] netlink_unicast+0xee/0x144
[<c0316996>] netlink_sendmsg+0x21d/0x22a
[<c02f800e>] sock_sendmsg+0xca/0xe1
[<c01352bf>] ? autoremove_wake_function+0x0/0x33
[<c01352bf>] ? autoremove_wake_function+0x0/0x33
[<c0183e88>] ? set_fd_set+0x38/0x3d
[<c011a47b>] ? __wake_up+0x31/0x3b
[<c022cb52>] ? might_fault+0x17/0x19
[<c022cb7e>] ? copy_from_user+0x2a/0x112
[<c02f825b>] sys_sendto+0xa4/0xc3
[<c02f893d>] ? move_addr_to_user+0x40/0x57
[<c02f8c63>] ? sys_getsockname+0x52/0x6f
[<c0199905>] ? inotify_d_instantiate+0x12/0x34
[<c0185f9f>] ? __d_instantiate+0x2d/0x30
[<c02f7d21>] ? sock_attach_fd+0x7e/0xab
[<c02f8ecc>] sys_socketcall+0xd5/0x16d
[<c01029f5>] syscall_call+0x7/0xb

IOW, we're deadlocking on the rtnl lock - something must have gone wrong network-wise
during suspend / emergency-resume handling.

IOW, we have _two_ issues:

- that PM suspend part here doesn't support non-PM PCI cards
- PM suspend breaks networking stuff (or is that caused by incomplete reinitialization of my card,
thus it's not network-suitable after resume and hangs on some network APIs?)

What to do?

(I should have provided some SysRq-T(?) lock traces I guess, will record that now)

Oh, and I will test whether eepro100 S2R works on that machine, and if
so what that driver does to avoid trouble.

Thanks,

Andreas Mohr

2009-06-03 06:30:37

by Andreas Mohr

[permalink] [raw]

Subject: Re: e100 kills S2R on my box, plus network drops dead

Hi,

On Wed, Jun 03, 2009 at 08:01:23AM +0200, Andreas Mohr wrote:
> IOW, we have _two_ issues:
>
> - that PM suspend part here doesn't support non-PM PCI cards
> - PM suspend breaks networking stuff (or is that caused by incomplete reinitialization of my card,
> thus it's not network-suitable after resume and hangs on some network APIs?)
>
> What to do?
>
> (I should have provided some SysRq-T(?) lock traces I guess, will record that now)

Those were not too useful methinks, but I can provide them if need be.

> Oh, and I will test whether eepro100 S2R works on that machine, and if
> so what that driver does to avoid trouble.

Well, yes, eepro100 (2.6.28.10) does achieve successful S2R,
and I _do_ have working network even after resume (which is not too astonishing
since it almost takes active measures to NOT make this card work,
given that it's fully auto-configuring on the transceiver side)

static int eepro100_suspend(struct pci_dev *pdev, pm_message_t state)
{
struct net_device *dev = pci_get_drvdata (pdev);
struct speedo_private *sp = netdev_priv(dev);
void __iomem *ioaddr = sp->regs;

pci_save_state(pdev);

if (!netif_running(dev))
return 0;

del_timer_sync(&sp->timer);

netif_device_detach(dev);
iowrite32(PortPartialReset, ioaddr + SCBPort);

/* XXX call pci_set_power_state ()? */
pci_disable_device(pdev);
pci_set_power_state (pdev, PCI_D3hot);
return 0;
}

That's pretty boring code, linearily executing stuff without checking
results, thus it's unsurprising that it does not prevent suspend
on a non-PM PCI card.

So, what to do to fix those suspend issues on the e100 side of things?
And do we perhaps have similar non-PM PCI card support issues
with other drivers?

Thanks,

Andreas Mohr

2009-06-13 19:19:43

by Andreas Mohr

[permalink] [raw]

Subject: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

Hi all,

after having added non-MII PHY card support to e100, I noticed that
the suspend handler rejects power-management non-capable PCI cards,
causing a S2R request to immediately get back up to the desktop,
losing network access in the process (rtnl mutex deadlock).

ChangeLog:
Support PCI cards which are lacking power management capability
in the e100 suspend handler.

Frankly I was unsure how to best add this to the driver in a clean way.
Usually drivers use pci_set_power_state(..., pci_choose_state(...))
in order to avoid the rejection of an open-coded
pci_set_power_state(..., PCI_D3hot) in case of a non-PM card,
however pci_choose_state() depends on the _pm-internal_ pm_message_t type,
which was doable in .suspend directly but not at the other e100
driver locations where it was used.

Next attempt was to extend __e100_power_off() with a pci_power_t parameter,
however since __e100_power_off() is called by two locations,
that meant that I'd have to use pci_choose_state() at _both_ call sites.

Thus I simply resorted to do a brute-force yet most simple
pci_find_capability() check in the __e100_power_off() function.

Tested on 2.6.30-rc8 and suspending/resuming fine, checkpatch.pl:ed.
Patch against 2.6.30-rc8 with my original non-MII support patch applied.
(should apply fine in any case, I'd think).
Intended for testing in -mmotm or so.

Thanks!

Signed-off-by: Andreas Mohr <[email protected]>

--- linux-2.6.30-rc8.e100/drivers/net/e100.c.my_patch_orig 2009-06-13 18:47:53.000000000 +0200
+++ linux-2.6.30-rc8.e100/drivers/net/e100.c 2009-06-13 20:27:46.000000000 +0200
@@ -2897,6 +2897,13 @@ static void __e100_shutdown(struct pci_d

static int __e100_power_off(struct pci_dev *pdev, bool wake)
{
+ /* some older devices don't support PCI PM
+ * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
+ * - skip those! (they most likely won't support WoL either)
+ */
+ if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
+ return 0;
+
if (wake) {
return pci_prepare_to_sleep(pdev);
} else {

2009-06-13 22:27:58

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

On Saturday 13 June 2009, Andreas Mohr wrote:
> Hi all,
>
> after having added non-MII PHY card support to e100, I noticed that
> the suspend handler rejects power-management non-capable PCI cards,

Well, that means we have a bug somewhere in the PCI PM core.

> causing a S2R request to immediately get back up to the desktop,
> losing network access in the process (rtnl mutex deadlock).

That's bad.

>
> ChangeLog:
> Support PCI cards which are lacking power management capability
> in the e100 suspend handler.
>
>
> Frankly I was unsure how to best add this to the driver in a clean way.
> Usually drivers use pci_set_power_state(..., pci_choose_state(...))
> in order to avoid the rejection of an open-coded
> pci_set_power_state(..., PCI_D3hot) in case of a non-PM card,
> however pci_choose_state() depends on the _pm-internal_ pm_message_t type,
> which was doable in .suspend directly but not at the other e100
> driver locations where it was used.
>
> Next attempt was to extend __e100_power_off() with a pci_power_t parameter,
> however since __e100_power_off() is called by two locations,
> that meant that I'd have to use pci_choose_state() at _both_ call sites.
>
> Thus I simply resorted to do a brute-force yet most simple
> pci_find_capability() check in the __e100_power_off() function.
>
>
> Tested on 2.6.30-rc8 and suspending/resuming fine, checkpatch.pl:ed.
> Patch against 2.6.30-rc8 with my original non-MII support patch applied.
> (should apply fine in any case, I'd think).
> Intended for testing in -mmotm or so.
>
> Thanks!
>
> Signed-off-by: Andreas Mohr <[email protected]>
>
>
> --- linux-2.6.30-rc8.e100/drivers/net/e100.c.my_patch_orig 2009-06-13 18:47:53.000000000 +0200
> +++ linux-2.6.30-rc8.e100/drivers/net/e100.c 2009-06-13 20:27:46.000000000 +0200
> @@ -2897,6 +2897,13 @@ static void __e100_shutdown(struct pci_d
>
> static int __e100_power_off(struct pci_dev *pdev, bool wake)
> {
> + /* some older devices don't support PCI PM
> + * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
> + * - skip those! (they most likely won't support WoL either)
> + */
> + if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
> + return 0;

Devices without PCI_CAP_ID_PM may still be power-manageable by ACPI, so
returning 0 at this point is not a general solution.

> +
> if (wake) {
> return pci_prepare_to_sleep(pdev);

pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
look at it.

Best,
Rafael

2009-06-13 22:45:22

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

On Sunday 14 June 2009, Rafael J. Wysocki wrote:
> On Saturday 13 June 2009, Andreas Mohr wrote:
> > Hi all,
> >
> > after having added non-MII PHY card support to e100, I noticed that
> > the suspend handler rejects power-management non-capable PCI cards,
>
> Well, that means we have a bug somewhere in the PCI PM core.
>
> > causing a S2R request to immediately get back up to the desktop,
> > losing network access in the process (rtnl mutex deadlock).
>
> That's bad.
>
> >
> > ChangeLog:
> > Support PCI cards which are lacking power management capability
> > in the e100 suspend handler.
> >
> >
> > Frankly I was unsure how to best add this to the driver in a clean way.
> > Usually drivers use pci_set_power_state(..., pci_choose_state(...))
> > in order to avoid the rejection of an open-coded
> > pci_set_power_state(..., PCI_D3hot) in case of a non-PM card,
> > however pci_choose_state() depends on the _pm-internal_ pm_message_t type,
> > which was doable in .suspend directly but not at the other e100
> > driver locations where it was used.
> >
> > Next attempt was to extend __e100_power_off() with a pci_power_t parameter,
> > however since __e100_power_off() is called by two locations,
> > that meant that I'd have to use pci_choose_state() at _both_ call sites.
> >
> > Thus I simply resorted to do a brute-force yet most simple
> > pci_find_capability() check in the __e100_power_off() function.
> >
> >
> > Tested on 2.6.30-rc8 and suspending/resuming fine, checkpatch.pl:ed.
> > Patch against 2.6.30-rc8 with my original non-MII support patch applied.
> > (should apply fine in any case, I'd think).
> > Intended for testing in -mmotm or so.
> >
> > Thanks!
> >
> > Signed-off-by: Andreas Mohr <[email protected]>
> >
> >
> > --- linux-2.6.30-rc8.e100/drivers/net/e100.c.my_patch_orig 2009-06-13 18:47:53.000000000 +0200
> > +++ linux-2.6.30-rc8.e100/drivers/net/e100.c 2009-06-13 20:27:46.000000000 +0200
> > @@ -2897,6 +2897,13 @@ static void __e100_shutdown(struct pci_d
> >
> > static int __e100_power_off(struct pci_dev *pdev, bool wake)
> > {
> > + /* some older devices don't support PCI PM
> > + * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
> > + * - skip those! (they most likely won't support WoL either)
> > + */
> > + if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
> > + return 0;
>
> Devices without PCI_CAP_ID_PM may still be power-manageable by ACPI, so
> returning 0 at this point is not a general solution.
>
> > +
> > if (wake) {
> > return pci_prepare_to_sleep(pdev);
>
> pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> look at it.

Please check if the appended patch helps.

Best,
Rafael

---
drivers/pci/pci.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1284,15 +1284,14 @@ pci_power_t pci_target_state(struct pci_
default:
target_state = state;
}
+ } else if (!dev->pm_cap) {
+ target_state = PCI_D0;
} else if (device_may_wakeup(&dev->dev)) {
/*
* Find the deepest state from which the device can generate
* wake-up events, make it the target state and enable device
* to generate PME#.
*/
- if (!dev->pm_cap)
- return PCI_POWER_ERROR;
-
if (dev->pme_support) {
while (target_state
&& !(dev->pme_support & (1 << target_state)))

2009-06-14 12:51:39

by Andreas Mohr

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

Hi,

On Sun, Jun 14, 2009 at 12:28:15AM +0200, Rafael J. Wysocki wrote:
> On Saturday 13 June 2009, Andreas Mohr wrote:
> > Hi all,
> >
> > after having added non-MII PHY card support to e100, I noticed that
> > the suspend handler rejects power-management non-capable PCI cards,
>
> Well, that means we have a bug somewhere in the PCI PM core.

I don't know. I had shortly investigated the same thing,
but it very much seemed that this is by design, pci_set_power_state()
is documented to reject non-PM cards (in power/pci.txt, and in
pci.c, too). Thus I didn't work in this area.

And from a cleanliness point of view pci_set_power_state()
acting on a non-PM card with no special non-PM ACPI support _should_
return an error status I guess.
(especially since docs say that pci_set_power_state() should
be used for PM cards)

> > causing a S2R request to immediately get back up to the desktop,
> > losing network access in the process (rtnl mutex deadlock).
>
> That's bad.

Indeed, and I have no idea what the problem was.
rtnl_is_locked() always was false within suspend/resume,
thus it had to be a userspace-triggered effect sometime
_after_ (non-)resume happened
(probably due to the network controller being down and thus inoperable
after .suspend).

BTW, after that failed .suspend, .resume was not called. I assume this to
be correct behaviour.

> > static int __e100_power_off(struct pci_dev *pdev, bool wake)
> > {
> > + /* some older devices don't support PCI PM
> > + * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
> > + * - skip those! (they most likely won't support WoL either)
> > + */
> > + if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
> > + return 0;
>
> Devices without PCI_CAP_ID_PM may still be power-manageable by ACPI, so
> returning 0 at this point is not a general solution.

Oh, interesting.

BTW, any idea why we have several drivers doing some seemingly useless

/* Find power-management capability. */
pm_cap = pci_find_capability(pdev, PCI_CAP_ID_PM);
if (pm_cap == 0) {
printk(KERN_ERR PFX "Cannot find PowerManagement capability, "
"aborting.\n");
err = -EIO;
goto err_out_free_res;
}

?

- it's code bloat
- it needlessly rejects non-PM cards
- it annoys the hell out of users of a non-PM card

> > +
> > if (wake) {
> > return pci_prepare_to_sleep(pdev);
>
> pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> look at it.

No, wake is false for my card, thus that's not the branch to
investigate.

And that's probably the reason why the patch in the next mail
didn't work.
(I #if 0'ed my e100 changes, and then patched pci.c,
and it did properly contain the patched parts,
and the kernel rebuild then failed to suspend)

pci_legacy_suspend(): e100_suspend+0x0/0x20 [e100] returns -5
pm_op(): pci_pm_suspend+0x0/0xd7 returns -5
PM: Device 0000:02:07.0 failed to suspend: error -5
PM: Some devices failed to suspend

Most likely your patch didn't change pci_set_power_state() processing,
I'd think.

Now what to do next?
Note that I won't be able to test anything for about a week now,
unfortunately.

Thanks for your info!

Andreas Mohr

2009-06-14 14:06:28

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

On Sunday 14 June 2009, Andreas Mohr wrote:
> Hi,
>
> On Sun, Jun 14, 2009 at 12:28:15AM +0200, Rafael J. Wysocki wrote:
> > On Saturday 13 June 2009, Andreas Mohr wrote:
> > > Hi all,
> > >
> > > after having added non-MII PHY card support to e100, I noticed that
> > > the suspend handler rejects power-management non-capable PCI cards,
> >
> > Well, that means we have a bug somewhere in the PCI PM core.
>
> I don't know. I had shortly investigated the same thing,
> but it very much seemed that this is by design, pci_set_power_state()
> is documented to reject non-PM cards (in power/pci.txt, and in
> pci.c, too). Thus I didn't work in this area.
>
> And from a cleanliness point of view pci_set_power_state()
> acting on a non-PM card with no special non-PM ACPI support _should_
> return an error status I guess.
> (especially since docs say that pci_set_power_state() should
> be used for PM cards)
>
> > > causing a S2R request to immediately get back up to the desktop,
> > > losing network access in the process (rtnl mutex deadlock).
> >
> > That's bad.
>
> Indeed, and I have no idea what the problem was.
> rtnl_is_locked() always was false within suspend/resume,
> thus it had to be a userspace-triggered effect sometime
> _after_ (non-)resume happened
> (probably due to the network controller being down and thus inoperable
> after .suspend).
>
> BTW, after that failed .suspend, .resume was not called. I assume this to
> be correct behaviour.
>
> > > static int __e100_power_off(struct pci_dev *pdev, bool wake)
> > > {
> > > + /* some older devices don't support PCI PM
> > > + * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
> > > + * - skip those! (they most likely won't support WoL either)
> > > + */
> > > + if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
> > > + return 0;
> >
> > Devices without PCI_CAP_ID_PM may still be power-manageable by ACPI, so
> > returning 0 at this point is not a general solution.
>
> Oh, interesting.
>
> BTW, any idea why we have several drivers doing some seemingly useless
>
> /* Find power-management capability. */
> pm_cap = pci_find_capability(pdev, PCI_CAP_ID_PM);
> if (pm_cap == 0) {
> printk(KERN_ERR PFX "Cannot find PowerManagement capability, "
> "aborting.\n");
> err = -EIO;
> goto err_out_free_res;
> }
>
> ?
>
> - it's code bloat
> - it needlessly rejects non-PM cards
> - it annoys the hell out of users of a non-PM card

No idea.

> > > +
> > > if (wake) {
> > > return pci_prepare_to_sleep(pdev);
> >
> > pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> > look at it.
>
> No, wake is false for my card, thus that's not the branch to
> investigate.

Ah. The problem is, then, that we try to put the device into D3, which it
cannot do and error code is correctly returned from pci_set_power_state().

I would use the appended patch in that case and the patch sent previously
is necessary for the 'wake = true' case.

Thanks,
Rafael

---
drivers/net/e100.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/net/e100.c
===================================================================
--- linux-2.6.orig/drivers/net/e100.c
+++ linux-2.6/drivers/net/e100.c
@@ -2763,8 +2763,9 @@ static int __e100_power_off(struct pci_d
return pci_prepare_to_sleep(pdev);
} else {
pci_wake_from_d3(pdev, false);
- return pci_set_power_state(pdev, PCI_D3hot);
+ pci_set_power_state(pdev, PCI_D3hot);
}
+ return 0;
}

#ifdef CONFIG_PM

2009-06-14 16:30:37

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

On Sunday 14 June 2009, Rafael J. Wysocki wrote:
> On Sunday 14 June 2009, Andreas Mohr wrote:
> > Hi,
> >
> > On Sun, Jun 14, 2009 at 12:28:15AM +0200, Rafael J. Wysocki wrote:
> > > On Saturday 13 June 2009, Andreas Mohr wrote:
> > > > Hi all,
> > > >
> > > > after having added non-MII PHY card support to e100, I noticed that
> > > > the suspend handler rejects power-management non-capable PCI cards,
> > >
> > > Well, that means we have a bug somewhere in the PCI PM core.
> >
> > I don't know. I had shortly investigated the same thing,
> > but it very much seemed that this is by design, pci_set_power_state()
> > is documented to reject non-PM cards (in power/pci.txt, and in
> > pci.c, too). Thus I didn't work in this area.
> >
> > And from a cleanliness point of view pci_set_power_state()
> > acting on a non-PM card with no special non-PM ACPI support _should_
> > return an error status I guess.
> > (especially since docs say that pci_set_power_state() should
> > be used for PM cards)
> >
> > > > causing a S2R request to immediately get back up to the desktop,
> > > > losing network access in the process (rtnl mutex deadlock).
> > >
> > > That's bad.
> >
> > Indeed, and I have no idea what the problem was.
> > rtnl_is_locked() always was false within suspend/resume,
> > thus it had to be a userspace-triggered effect sometime
> > _after_ (non-)resume happened
> > (probably due to the network controller being down and thus inoperable
> > after .suspend).
> >
> > BTW, after that failed .suspend, .resume was not called. I assume this to
> > be correct behaviour.
> >
> > > > static int __e100_power_off(struct pci_dev *pdev, bool wake)
> > > > {
> > > > + /* some older devices don't support PCI PM
> > > > + * (e.g. mac_82557_D100_B combo card with 80c24 PHY)
> > > > + * - skip those! (they most likely won't support WoL either)
> > > > + */
> > > > + if (!pci_find_capability(pdev, PCI_CAP_ID_PM))
> > > > + return 0;
> > >
> > > Devices without PCI_CAP_ID_PM may still be power-manageable by ACPI, so
> > > returning 0 at this point is not a general solution.
> >
> > Oh, interesting.
> >
> > BTW, any idea why we have several drivers doing some seemingly useless
> >
> > /* Find power-management capability. */
> > pm_cap = pci_find_capability(pdev, PCI_CAP_ID_PM);
> > if (pm_cap == 0) {
> > printk(KERN_ERR PFX "Cannot find PowerManagement capability, "
> > "aborting.\n");
> > err = -EIO;
> > goto err_out_free_res;
> > }
> >
> > ?
> >
> > - it's code bloat
> > - it needlessly rejects non-PM cards
> > - it annoys the hell out of users of a non-PM card
>
> No idea.
>
> > > > +
> > > > if (wake) {
> > > > return pci_prepare_to_sleep(pdev);
> > >
> > > pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> > > look at it.
> >
> > No, wake is false for my card, thus that's not the branch to
> > investigate.
>
> Ah. The problem is, then, that we try to put the device into D3, which it
> cannot do and error code is correctly returned from pci_set_power_state().
>
> I would use the appended patch.

Or perhaps better this one (functionally equivalent).

---
drivers/net/e100.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/e100.c
===================================================================
--- linux-2.6.orig/drivers/net/e100.c
+++ linux-2.6/drivers/net/e100.c
@@ -2759,12 +2759,13 @@ static void __e100_shutdown(struct pci_d

static int __e100_power_off(struct pci_dev *pdev, bool wake)
{
- if (wake) {
+ if (wake)
return pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- return pci_set_power_state(pdev, PCI_D3hot);
- }
+
+ pci_wake_from_d3(pdev, false);
+ pci_set_power_state(pdev, PCI_D3hot);
+
+ return 0;
}

#ifdef CONFIG_PM

2009-06-14 16:46:21

by Andreas Mohr

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

Hi,

On Sun, Jun 14, 2009 at 04:06:29PM +0200, Rafael J. Wysocki wrote:
> On Sunday 14 June 2009, Andreas Mohr wrote:
> > Hi,
> >
> > On Sun, Jun 14, 2009 at 12:28:15AM +0200, Rafael J. Wysocki wrote:
> > > On Saturday 13 June 2009, Andreas Mohr wrote:
> > > > +
> > > > if (wake) {
> > > > return pci_prepare_to_sleep(pdev);
> > >
> > > pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> > > look at it.
> >
> > No, wake is false for my card, thus that's not the branch to
> > investigate.
>
> Ah. The problem is, then, that we try to put the device into D3, which it
> cannot do and error code is correctly returned from pci_set_power_state().
>
> I would use the appended patch in that case and the patch sent previously
> is necessary for the 'wake = true' case.

OK, as said I cannot test this right now, but I'm _damn_ sure it would
work. Thus I'd say your equivalent patch posted a bit later can be
committed already.

But what about the wake = true case?
I'm not affected by this since my card doesn't have any wake capa,
thus it's your call of whether that pci core code part really was broken
and needed fixing.

So, for the patch in your next mail:
Acked-by: Andreas Mohr <[email protected]>

BTW, that patch was (pasted):

static int __e100_power_off(struct pci_dev *pdev, bool wake)
{
- if (wake) {
+ if (wake)
return pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- return pci_set_power_state(pdev, PCI_D3hot);
- }
+
+ pci_wake_from_d3(pdev, false);
+ pci_set_power_state(pdev, PCI_D3hot);
+
+ return 0;
}

Couple questions still:
- why do we call pci_wake_from_d3(...false) only!?
Wouldn't this break WoL after one iteration back and forth,
due to missing 'true' case?
- why do we call netif_device_detach() _after_ doing hardware shutdown
of the network controller? I'd guess this can cause huge issues?
Someone told me he had rtnl lock issues upon S2D with e100
(very similar to my rtnl issues during aborted .suspend),
and that might possibly be the reason?

So few lines of code, so many questions...

Thanks,

Andreas Mohr

2009-06-14 17:09:22

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

On Sunday 14 June 2009, Andreas Mohr wrote:
> Hi,
>
> On Sun, Jun 14, 2009 at 04:06:29PM +0200, Rafael J. Wysocki wrote:
> > On Sunday 14 June 2009, Andreas Mohr wrote:
> > > Hi,
> > >
> > > On Sun, Jun 14, 2009 at 12:28:15AM +0200, Rafael J. Wysocki wrote:
> > > > On Saturday 13 June 2009, Andreas Mohr wrote:
> > > > > +
> > > > > if (wake) {
> > > > > return pci_prepare_to_sleep(pdev);
> > > >
> > > > pci_prepare_to_sleep() is supposed to return 0 for your device. I'll have a
> > > > look at it.
> > >
> > > No, wake is false for my card, thus that's not the branch to
> > > investigate.
> >
> > Ah. The problem is, then, that we try to put the device into D3, which it
> > cannot do and error code is correctly returned from pci_set_power_state().
> >
> > I would use the appended patch in that case and the patch sent previously
> > is necessary for the 'wake = true' case.
>
> OK, as said I cannot test this right now, but I'm _damn_ sure it would
> work. Thus I'd say your equivalent patch posted a bit later can be
> committed already.
>
> But what about the wake = true case?
> I'm not affected by this since my card doesn't have any wake capa,
> thus it's your call of whether that pci core code part really was broken
> and needed fixing.

I think it needs fixing.

> So, for the patch in your next mail:
> Acked-by: Andreas Mohr <[email protected]>
>
>
> BTW, that patch was (pasted):
>
> static int __e100_power_off(struct pci_dev *pdev, bool wake)
> {
> - if (wake) {
> + if (wake)
> return pci_prepare_to_sleep(pdev);
> - } else {
> - pci_wake_from_d3(pdev, false);
> - return pci_set_power_state(pdev, PCI_D3hot);
> - }
> +
> + pci_wake_from_d3(pdev, false);
> + pci_set_power_state(pdev, PCI_D3hot);
> +
> + return 0;
> }
>
>
> Couple questions still:
> - why do we call pci_wake_from_d3(...false) only!?
> Wouldn't this break WoL after one iteration back and forth,
> due to missing 'true' case?

The 'true' case is the 'wake = true' one.

> - why do we call netif_device_detach() _after_ doing hardware shutdown
> of the network controller? I'd guess this can cause huge issues?
> Someone told me he had rtnl lock issues upon S2D with e100
> (very similar to my rtnl issues during aborted .suspend),
> and that might possibly be the reason?

I think you're right, but I'm not a network driver expert.

Perhaps you can change the ordering and see if that fixes the rtnl issue
(since you're able to reproduce it without my patch, that should be easy to
verify).

Best,
Rafael

2009-06-14 17:20:32

by Andreas Mohr

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

Hi,

On Sun, Jun 14, 2009 at 07:09:45PM +0200, Rafael J. Wysocki wrote:
> On Sunday 14 June 2009, Andreas Mohr wrote:
> > Couple questions still:
> > - why do we call pci_wake_from_d3(...false) only!?
> > Wouldn't this break WoL after one iteration back and forth,
> > due to missing 'true' case?
>
> The 'true' case is the 'wake = true' one.

OK, so it wasn't an explicit pci_wake_from_d3(...true), but the
operations done there are the equivalent of it probably.

> > - why do we call netif_device_detach() _after_ doing hardware shutdown
> > of the network controller? I'd guess this can cause huge issues?
> > Someone told me he had rtnl lock issues upon S2D with e100
> > (very similar to my rtnl issues during aborted .suspend),
> > and that might possibly be the reason?
>
> I think you're right, but I'm not a network driver expert.
>
> Perhaps you can change the ordering and see if that fixes the rtnl issue
> (since you're able to reproduce it without my patch, that should be easy to
> verify).

I'll test this - later.

Thanks a lot,

Andreas Mohr

2009-06-14 19:46:44

by Rafael J. Wysocki

[permalink] [raw]

Subject: [PATCH] Net / e100: Fix suspend of devices that cannot be power managed

From: Rafael J. Wysocki <[email protected]>

If the adapter is not power-manageable using either ACPI, or the
native PCI PM interface, __e100_power_off() returns error code, which
causes every attempt to suspend to fail, although it should return 0
in such a case. Fix this problem by ignoring the return value of
pci_set_power_state() in __e100_power_off().

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Andreas Mohr <[email protected]>
---
drivers/net/e100.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/e100.c
===================================================================
--- linux-2.6.orig/drivers/net/e100.c
+++ linux-2.6/drivers/net/e100.c
@@ -2759,12 +2759,13 @@ static void __e100_shutdown(struct pci_d

static int __e100_power_off(struct pci_dev *pdev, bool wake)
{
- if (wake) {
+ if (wake)
return pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- return pci_set_power_state(pdev, PCI_D3hot);
- }
+
+ pci_wake_from_d3(pdev, false);
+ pci_set_power_state(pdev, PCI_D3hot);
+
+ return 0;
}

#ifdef CONFIG_PM

2009-06-18 02:03:46

by David Miller

[permalink] [raw]

Subject: Re: [PATCH] Net / e100: Fix suspend of devices that cannot be power managed

From: "Rafael J. Wysocki" <[email protected]>
Date: Sun, 14 Jun 2009 21:46:46 +0200

> From: Rafael J. Wysocki <[email protected]>
>
> If the adapter is not power-manageable using either ACPI, or the
> native PCI PM interface, __e100_power_off() returns error code, which
> causes every attempt to suspend to fail, although it should return 0
> in such a case. Fix this problem by ignoring the return value of
> pci_set_power_state() in __e100_power_off().
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> Acked-by: Andreas Mohr <[email protected]>

Applied, thanks!

2009-06-19 08:00:59

by Andreas Mohr

[permalink] [raw]

Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability

Hi,

On Sun, Jun 14, 2009 at 07:09:45PM +0200, Rafael J. Wysocki wrote:
> On Sunday 14 June 2009, Andreas Mohr wrote:
> > - why do we call netif_device_detach() _after_ doing hardware shutdown
> > of the network controller? I'd guess this can cause huge issues?
> > Someone told me he had rtnl lock issues upon S2D with e100
> > (very similar to my rtnl issues during aborted .suspend),
> > and that might possibly be the reason?
>
> I think you're right, but I'm not a network driver expert.
>
> Perhaps you can change the ordering and see if that fixes the rtnl issue
> (since you're able to reproduce it without my patch, that should be easy to
> verify).

Well, I just moved netif_device_detach() above netif_running() check,
but this didn't fix my network issues in case of a rejecting .suspend
handler: after resume when unloading e100, that hangs, and I get tons of
rtnl timeouts and locked rtnl mutex.
This is most likely because upon e100 unload, a backtrace showed that I
was hanging in e100_down -> msleep (somewhere at the very beginning of e100_down),
which is most definitely the inlined napi_disable() call there:

static inline void napi_disable(struct napi_struct *n)
{
set_bit(NAPI_STATE_DISABLE, &n->state);
while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
msleep(1);
clear_bit(NAPI_STATE_DISABLE, &n->state);
}

IOW the .suspend seems to keep NAPI layer active, yet due to .suspend failure
there's no .resume called, thus card is in an _inoperable_ state and
NAPI cannot be processed any further, thus napi_disable() on driver unload
locks up.

BTW, in include/linux/napi.h, shouldn't napi_disable() make use of
napi_synchronize() instead of C&P?
(simply move napi_synchronize() above napi_disable() and use it there)
Oh wait, there's the CONFIG_SMP complication:
napi_synchronize() is implemented for SMP only, whereas napi_disable()
checks the same thing _always_.
(or is it a BUG that napi_disable() does the same check for non-SMP,
too??)

Thanks,

Andreas Mohr