2008-10-02 23:33:33

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 0/6] e1000e mutex protection

This series of patches fixes several bugs found by putting a
mutex inside e1000_acquire_swfw_flag, and then the patch from
Thomas actually adds this mutex to help find any further bugs.

Like Jesse Barnes said, "totally up to you (obviously) whether we
stuff this into 2.6.27 or hold on it until 2.6.28."

Linus, since you mentioned this exact same issue that Thomas (and
Dave Airlie) spotted I figured I should just send these.
Testing will continue over the next couple of days and I'll let
you know immediately if we find something.

The patches are pretty straight forward and each one fixed a
separate bug we found using the mutex patch. It is possible that
these fix the actual corruption issue but we haven't verified
that yet.

---

Jesse Brandeburg (5):
e1000e: update version from k4 to k6
e1000e: drop stats lock
e1000e: remove phy read from inside spinlock
e1000e: do not ever sleep in interrupt context
e1000e: reset swflag after resetting hardware

Thomas Gleixner (1):
e1000e: debug contention on NVM SWFLAG


drivers/net/e1000e/e1000.h | 3 +-
drivers/net/e1000e/ethtool.c | 6 +++-
drivers/net/e1000e/ich8lan.c | 20 +++++++++++++
drivers/net/e1000e/netdev.c | 64 +++++++++++++++++++-----------------------
4 files changed, 56 insertions(+), 37 deletions(-)

--
Jesse Brandeburg


2008-10-02 23:33:54

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 1/6] e1000e: reset swflag after resetting hardware

in the process of debugging things, noticed that the swflag is not reset
by the driver after reset, and the swflag is probably not reset unless
management firmware clears it after 100ms.

Signed-off-by: Jesse Brandeburg <[email protected]>
---

drivers/net/e1000e/ich8lan.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index d8efba8..f4b6744 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -1778,6 +1778,9 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw)
ew32(CTRL, (ctrl | E1000_CTRL_RST));
msleep(20);

+ /* release the swflag because it is not reset by hardware reset */
+ e1000_release_swflag_ich8lan(hw);
+
ret_val = e1000e_get_auto_rd_done(hw);
if (ret_val) {
/*

2008-10-02 23:34:21

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 2/6] e1000e: do not ever sleep in interrupt context

e1000e was apparently calling two functions that attempted to reserve
the SWFLAG bit for exclusive (to hardware and firmware) access to
the PHY and NVM (aka eeprom). These accesses could possibly call
msleep to wait for the resource which is not allowed from interrupt
context.

Signed-off-by: Jesse Brandeburg <[email protected]>
CC: Thomas Gleixner <[email protected]>
---

drivers/net/e1000e/e1000.h | 2 ++
drivers/net/e1000e/netdev.c | 31 ++++++++++++++++++++++++++++---
2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
index f0c48a2..8087bda 100644
--- a/drivers/net/e1000e/e1000.h
+++ b/drivers/net/e1000e/e1000.h
@@ -284,6 +284,8 @@ struct e1000_adapter {
unsigned long led_status;

unsigned int flags;
+ struct work_struct downshift_task;
+ struct work_struct update_phy_task;
};

struct e1000_info {
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 1f767fe..803545b 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -1115,6 +1115,14 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter)
writel(0, adapter->hw.hw_addr + rx_ring->tail);
}

+static void e1000e_downshift_workaround(struct work_struct *work)
+{
+ struct e1000_adapter *adapter = container_of(work,
+ struct e1000_adapter, downshift_task);
+
+ e1000e_gig_downshift_workaround_ich8lan(&adapter->hw);
+}
+
/**
* e1000_intr_msi - Interrupt Handler
* @irq: interrupt number
@@ -1139,7 +1147,7 @@ static irqreturn_t e1000_intr_msi(int irq, void *data)
*/
if ((adapter->flags & FLAG_LSC_GIG_SPEED_DROP) &&
(!(er32(STATUS) & E1000_STATUS_LU)))
- e1000e_gig_downshift_workaround_ich8lan(hw);
+ schedule_work(&adapter->downshift_task);

/*
* 80003ES2LAN workaround-- For packet buffer work-around on
@@ -1205,7 +1213,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
*/
if ((adapter->flags & FLAG_LSC_GIG_SPEED_DROP) &&
(!(er32(STATUS) & E1000_STATUS_LU)))
- e1000e_gig_downshift_workaround_ich8lan(hw);
+ schedule_work(&adapter->downshift_task);

/*
* 80003ES2LAN workaround--
@@ -2912,6 +2920,21 @@ static int e1000_set_mac(struct net_device *netdev, void *p)
return 0;
}

+/**
+ * e1000e_update_phy_task - work thread to update phy
+ * @work: pointer to our work struct
+ *
+ * this worker thread exists because we must acquire a
+ * semaphore to read the phy, which we could msleep while
+ * waiting for it, and we can't msleep in a timer.
+ **/
+static void e1000e_update_phy_task(struct work_struct *work)
+{
+ struct e1000_adapter *adapter = container_of(work,
+ struct e1000_adapter, update_phy_task);
+ e1000_get_phy_info(&adapter->hw);
+}
+
/*
* Need to wait a few seconds after link up to get diagnostic information from
* the phy
@@ -2919,7 +2942,7 @@ static int e1000_set_mac(struct net_device *netdev, void *p)
static void e1000_update_phy_info(unsigned long data)
{
struct e1000_adapter *adapter = (struct e1000_adapter *) data;
- e1000_get_phy_info(&adapter->hw);
+ schedule_work(&adapter->update_phy_task);
}

/**
@@ -4578,6 +4601,8 @@ static int __devinit e1000_probe(struct pci_dev *pdev,

INIT_WORK(&adapter->reset_task, e1000_reset_task);
INIT_WORK(&adapter->watchdog_task, e1000_watchdog_task);
+ INIT_WORK(&adapter->downshift_task, e1000e_downshift_workaround);
+ INIT_WORK(&adapter->update_phy_task, e1000e_update_phy_task);

/* Initialize link parameters. User can change them with ethtool */
adapter->hw.mac.autoneg = 1;

2008-10-02 23:34:38

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 4/6] e1000e: drop stats lock

the stats lock is left over from e1000, e1000e no longer
has the adjust tbi stats function that required the addition
of the stats lock to begin with.

adding a mutex to acquire_swflag helped catch this one too.

Signed-off-by: Jesse Brandeburg <[email protected]>
CC: Thomas Gleixner <[email protected]>
---

drivers/net/e1000e/e1000.h | 1 -
drivers/net/e1000e/netdev.c | 18 ------------------
2 files changed, 0 insertions(+), 19 deletions(-)

diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
index 8087bda..5ea6b60 100644
--- a/drivers/net/e1000e/e1000.h
+++ b/drivers/net/e1000e/e1000.h
@@ -257,7 +257,6 @@ struct e1000_adapter {
struct net_device *netdev;
struct pci_dev *pdev;
struct net_device_stats net_stats;
- spinlock_t stats_lock; /* prevent concurrent stats updates */

/* structs defined in e1000_hw.h */
struct e1000_hw hw;
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 835b692..01e9558 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -2600,8 +2600,6 @@ static int __devinit e1000_sw_init(struct e1000_adapter *adapter)
/* Explicitly disable IRQ since the NIC can be in any state. */
e1000_irq_disable(adapter);

- spin_lock_init(&adapter->stats_lock);
-
set_bit(__E1000_DOWN, &adapter->state);
return 0;

@@ -2953,7 +2951,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
{
struct e1000_hw *hw = &adapter->hw;
struct pci_dev *pdev = adapter->pdev;
- unsigned long irq_flags;

/*
* Prevent stats update while adapter is being reset, or if the pci
@@ -2964,14 +2961,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
if (pci_channel_offline(pdev))
return;

- spin_lock_irqsave(&adapter->stats_lock, irq_flags);
-
- /*
- * these counters are modified from e1000_adjust_tbi_stats,
- * called from the interrupt context, so they must only
- * be written while holding adapter->stats_lock
- */
-
adapter->stats.crcerrs += er32(CRCERRS);
adapter->stats.gprc += er32(GPRC);
adapter->stats.gorc += er32(GORCL);
@@ -3046,8 +3035,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
adapter->stats.mgptc += er32(MGTPTC);
adapter->stats.mgprc += er32(MGTPRC);
adapter->stats.mgpdc += er32(MGTPDC);
-
- spin_unlock_irqrestore(&adapter->stats_lock, irq_flags);
}

/**
@@ -3059,9 +3046,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
struct e1000_hw *hw = &adapter->hw;
struct e1000_phy_regs *phy = &adapter->phy_regs;
int ret_val;
- unsigned long irq_flags;
-
- spin_lock_irqsave(&adapter->stats_lock, irq_flags);

if ((er32(STATUS) & E1000_STATUS_LU) &&
(adapter->hw.phy.media_type == e1000_media_type_copper)) {
@@ -3092,8 +3076,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
phy->stat1000 = 0;
phy->estatus = (ESTATUS_1000_TFULL | ESTATUS_1000_THALF);
}
-
- spin_unlock_irqrestore(&adapter->stats_lock, irq_flags);
}

static void e1000_print_link_info(struct e1000_adapter *adapter)

2008-10-02 23:34:54

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 5/6] e1000e: debug contention on NVM SWFLAG

From: Thomas Gleixner <[email protected]>

This patch adds a mutex to the e1000e driver that would help
catch any collisions of two e1000e threads accessing hardware
at the same time.

description and patch updated by Jesse

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
---

drivers/net/e1000e/ich8lan.c | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index f4b6744..0b6095b 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -380,6 +380,9 @@ static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
return 0;
}

+static DEFINE_MUTEX(nvm_mutex);
+static pid_t nvm_owner = -1;
+
/**
* e1000_acquire_swflag_ich8lan - Acquire software control flag
* @hw: pointer to the HW structure
@@ -393,6 +396,15 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
u32 extcnf_ctrl;
u32 timeout = PHY_CFG_TIMEOUT;

+ WARN_ON(preempt_count());
+
+ if (!mutex_trylock(&nvm_mutex)) {
+ WARN(1, KERN_ERR "e1000e mutex contention. Owned by pid %d\n",
+ nvm_owner);
+ mutex_lock(&nvm_mutex);
+ }
+ nvm_owner = current->pid;
+
while (timeout) {
extcnf_ctrl = er32(EXTCNF_CTRL);
extcnf_ctrl |= E1000_EXTCNF_CTRL_SWFLAG;
@@ -407,6 +419,8 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)

if (!timeout) {
hw_dbg(hw, "FW or HW has locked the resource for too long.\n");
+ nvm_owner = -1;
+ mutex_unlock(&nvm_mutex);
return -E1000_ERR_CONFIG;
}

@@ -428,6 +442,9 @@ static void e1000_release_swflag_ich8lan(struct e1000_hw *hw)
extcnf_ctrl = er32(EXTCNF_CTRL);
extcnf_ctrl &= ~E1000_EXTCNF_CTRL_SWFLAG;
ew32(EXTCNF_CTRL, extcnf_ctrl);
+
+ nvm_owner = -1;
+ mutex_unlock(&nvm_mutex);
}

/**

2008-10-02 23:35:18

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 3/6] e1000e: remove phy read from inside spinlock

thanks to tglx, we're finding some interesting reentrancy issues.
this patch removes the phy read from inside a spinlock, paving
the way for removing the spinlock completely. The phy read was
only feeding a statistic that wasn't used.

Signed-off-by: Jesse Brandeburg <[email protected]>
CC: Thomas Gleixner <[email protected]>
---

drivers/net/e1000e/ethtool.c | 6 +++++-
drivers/net/e1000e/netdev.c | 13 -------------
2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index 5ed8e13..33a3ff1 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -432,6 +432,10 @@ static void e1000_get_regs(struct net_device *netdev,
regs_buff[11] = er32(TIDV);

regs_buff[12] = adapter->hw.phy.type; /* PHY type (IGP=1, M88=0) */
+
+ /* ethtool doesn't use anything past this point, so all this
+ * code is likely legacy junk for apps that may or may not
+ * exist */
if (hw->phy.type == e1000_phy_m88) {
e1e_rphy(hw, M88E1000_PHY_SPEC_STATUS, &phy_data);
regs_buff[13] = (u32)phy_data; /* cable length */
@@ -447,7 +451,7 @@ static void e1000_get_regs(struct net_device *netdev,
regs_buff[22] = adapter->phy_stats.receive_errors;
regs_buff[23] = regs_buff[13]; /* mdix mode */
}
- regs_buff[21] = adapter->phy_stats.idle_errors; /* phy idle errors */
+ regs_buff[21] = 0; /* was idle_errors */
e1e_rphy(hw, PHY_1000T_STATUS, &phy_data);
regs_buff[24] = (u32)phy_data; /* phy local receiver status */
regs_buff[25] = regs_buff[24]; /* phy remote receiver status */
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 803545b..835b692 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -2954,9 +2954,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
struct e1000_hw *hw = &adapter->hw;
struct pci_dev *pdev = adapter->pdev;
unsigned long irq_flags;
- u16 phy_tmp;
-
-#define PHY_IDLE_ERROR_COUNT_MASK 0x00FF

/*
* Prevent stats update while adapter is being reset, or if the pci
@@ -3045,15 +3042,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)

/* Tx Dropped needs to be maintained elsewhere */

- /* Phy Stats */
- if (hw->phy.media_type == e1000_media_type_copper) {
- if ((adapter->link_speed == SPEED_1000) &&
- (!e1e_rphy(hw, PHY_1000T_STATUS, &phy_tmp))) {
- phy_tmp &= PHY_IDLE_ERROR_COUNT_MASK;
- adapter->phy_stats.idle_errors += phy_tmp;
- }
- }
-
/* Management Stats */
adapter->stats.mgptc += er32(MGTPTC);
adapter->stats.mgprc += er32(MGTPRC);
@@ -3073,7 +3061,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
int ret_val;
unsigned long irq_flags;

-
spin_lock_irqsave(&adapter->stats_lock, irq_flags);

if ((er32(STATUS) & E1000_STATUS_LU) &&

2008-10-02 23:35:45

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH 2.6.27-rc8 6/6] e1000e: update version from k4 to k6

Signed-off-by: Jesse Brandeburg <[email protected]>
---

drivers/net/e1000e/netdev.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 01e9558..b81c423 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -47,7 +47,7 @@

#include "e1000.h"

-#define DRV_VERSION "0.3.3.3-k4"
+#define DRV_VERSION "0.3.3.3-k6"
char e1000e_driver_name[] = "e1000e";
const char e1000e_driver_version[] = DRV_VERSION;

2008-10-03 00:37:01

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 2/6] e1000e: do not ever sleep in interrupt context



On Thu, 2 Oct 2008, Jesse Brandeburg wrote:

> e1000e was apparently calling two functions that attempted to reserve
> the SWFLAG bit for exclusive (to hardware and firmware) access to
> the PHY and NVM (aka eeprom). These accesses could possibly call
> msleep to wait for the resource which is not allowed from interrupt
> context.
>
> Signed-off-by: Jesse Brandeburg <[email protected]>
> CC: Thomas Gleixner <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>
Tested-by: Thomas Gleixner <[email protected]>

---
>
> drivers/net/e1000e/e1000.h | 2 ++
> drivers/net/e1000e/netdev.c | 31 ++++++++++++++++++++++++++++---
> 2 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
> index f0c48a2..8087bda 100644
> --- a/drivers/net/e1000e/e1000.h
> +++ b/drivers/net/e1000e/e1000.h
> @@ -284,6 +284,8 @@ struct e1000_adapter {
> unsigned long led_status;
>
> unsigned int flags;
> + struct work_struct downshift_task;
> + struct work_struct update_phy_task;
> };
>
> struct e1000_info {
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index 1f767fe..803545b 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -1115,6 +1115,14 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter)
> writel(0, adapter->hw.hw_addr + rx_ring->tail);
> }
>
> +static void e1000e_downshift_workaround(struct work_struct *work)
> +{
> + struct e1000_adapter *adapter = container_of(work,
> + struct e1000_adapter, downshift_task);
> +
> + e1000e_gig_downshift_workaround_ich8lan(&adapter->hw);
> +}
> +
> /**
> * e1000_intr_msi - Interrupt Handler
> * @irq: interrupt number
> @@ -1139,7 +1147,7 @@ static irqreturn_t e1000_intr_msi(int irq, void *data)
> */
> if ((adapter->flags & FLAG_LSC_GIG_SPEED_DROP) &&
> (!(er32(STATUS) & E1000_STATUS_LU)))
> - e1000e_gig_downshift_workaround_ich8lan(hw);
> + schedule_work(&adapter->downshift_task);
>
> /*
> * 80003ES2LAN workaround-- For packet buffer work-around on
> @@ -1205,7 +1213,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
> */
> if ((adapter->flags & FLAG_LSC_GIG_SPEED_DROP) &&
> (!(er32(STATUS) & E1000_STATUS_LU)))
> - e1000e_gig_downshift_workaround_ich8lan(hw);
> + schedule_work(&adapter->downshift_task);
>
> /*
> * 80003ES2LAN workaround--
> @@ -2912,6 +2920,21 @@ static int e1000_set_mac(struct net_device *netdev, void *p)
> return 0;
> }
>
> +/**
> + * e1000e_update_phy_task - work thread to update phy
> + * @work: pointer to our work struct
> + *
> + * this worker thread exists because we must acquire a
> + * semaphore to read the phy, which we could msleep while
> + * waiting for it, and we can't msleep in a timer.
> + **/
> +static void e1000e_update_phy_task(struct work_struct *work)
> +{
> + struct e1000_adapter *adapter = container_of(work,
> + struct e1000_adapter, update_phy_task);
> + e1000_get_phy_info(&adapter->hw);
> +}
> +
> /*
> * Need to wait a few seconds after link up to get diagnostic information from
> * the phy
> @@ -2919,7 +2942,7 @@ static int e1000_set_mac(struct net_device *netdev, void *p)
> static void e1000_update_phy_info(unsigned long data)
> {
> struct e1000_adapter *adapter = (struct e1000_adapter *) data;
> - e1000_get_phy_info(&adapter->hw);
> + schedule_work(&adapter->update_phy_task);
> }
>
> /**
> @@ -4578,6 +4601,8 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
>
> INIT_WORK(&adapter->reset_task, e1000_reset_task);
> INIT_WORK(&adapter->watchdog_task, e1000_watchdog_task);
> + INIT_WORK(&adapter->downshift_task, e1000e_downshift_workaround);
> + INIT_WORK(&adapter->update_phy_task, e1000e_update_phy_task);
>
> /* Initialize link parameters. User can change them with ethtool */
> adapter->hw.mac.autoneg = 1;
>

2008-10-03 00:38:01

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 3/6] e1000e: remove phy read from inside spinlock

On Thu, 2 Oct 2008, Jesse Brandeburg wrote:

> thanks to tglx, we're finding some interesting reentrancy issues.
> this patch removes the phy read from inside a spinlock, paving
> the way for removing the spinlock completely. The phy read was
> only feeding a statistic that wasn't used.
>
> Signed-off-by: Jesse Brandeburg <[email protected]>
> CC: Thomas Gleixner <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

> ---
>
> drivers/net/e1000e/ethtool.c | 6 +++++-
> drivers/net/e1000e/netdev.c | 13 -------------
> 2 files changed, 5 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
> index 5ed8e13..33a3ff1 100644
> --- a/drivers/net/e1000e/ethtool.c
> +++ b/drivers/net/e1000e/ethtool.c
> @@ -432,6 +432,10 @@ static void e1000_get_regs(struct net_device *netdev,
> regs_buff[11] = er32(TIDV);
>
> regs_buff[12] = adapter->hw.phy.type; /* PHY type (IGP=1, M88=0) */
> +
> + /* ethtool doesn't use anything past this point, so all this
> + * code is likely legacy junk for apps that may or may not
> + * exist */
> if (hw->phy.type == e1000_phy_m88) {
> e1e_rphy(hw, M88E1000_PHY_SPEC_STATUS, &phy_data);
> regs_buff[13] = (u32)phy_data; /* cable length */
> @@ -447,7 +451,7 @@ static void e1000_get_regs(struct net_device *netdev,
> regs_buff[22] = adapter->phy_stats.receive_errors;
> regs_buff[23] = regs_buff[13]; /* mdix mode */
> }
> - regs_buff[21] = adapter->phy_stats.idle_errors; /* phy idle errors */
> + regs_buff[21] = 0; /* was idle_errors */
> e1e_rphy(hw, PHY_1000T_STATUS, &phy_data);
> regs_buff[24] = (u32)phy_data; /* phy local receiver status */
> regs_buff[25] = regs_buff[24]; /* phy remote receiver status */
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index 803545b..835b692 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -2954,9 +2954,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
> struct e1000_hw *hw = &adapter->hw;
> struct pci_dev *pdev = adapter->pdev;
> unsigned long irq_flags;
> - u16 phy_tmp;
> -
> -#define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
>
> /*
> * Prevent stats update while adapter is being reset, or if the pci
> @@ -3045,15 +3042,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
>
> /* Tx Dropped needs to be maintained elsewhere */
>
> - /* Phy Stats */
> - if (hw->phy.media_type == e1000_media_type_copper) {
> - if ((adapter->link_speed == SPEED_1000) &&
> - (!e1e_rphy(hw, PHY_1000T_STATUS, &phy_tmp))) {
> - phy_tmp &= PHY_IDLE_ERROR_COUNT_MASK;
> - adapter->phy_stats.idle_errors += phy_tmp;
> - }
> - }
> -
> /* Management Stats */
> adapter->stats.mgptc += er32(MGTPTC);
> adapter->stats.mgprc += er32(MGTPRC);
> @@ -3073,7 +3061,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
> int ret_val;
> unsigned long irq_flags;
>
> -
> spin_lock_irqsave(&adapter->stats_lock, irq_flags);
>
> if ((er32(STATUS) & E1000_STATUS_LU) &&
>

2008-10-03 00:38:52

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 4/6] e1000e: drop stats lock



On Thu, 2 Oct 2008, Jesse Brandeburg wrote:

> the stats lock is left over from e1000, e1000e no longer
> has the adjust tbi stats function that required the addition
> of the stats lock to begin with.
>
> adding a mutex to acquire_swflag helped catch this one too.
>
> Signed-off-by: Jesse Brandeburg <[email protected]>
> CC: Thomas Gleixner <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>


> ---
>
> drivers/net/e1000e/e1000.h | 1 -
> drivers/net/e1000e/netdev.c | 18 ------------------
> 2 files changed, 0 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
> index 8087bda..5ea6b60 100644
> --- a/drivers/net/e1000e/e1000.h
> +++ b/drivers/net/e1000e/e1000.h
> @@ -257,7 +257,6 @@ struct e1000_adapter {
> struct net_device *netdev;
> struct pci_dev *pdev;
> struct net_device_stats net_stats;
> - spinlock_t stats_lock; /* prevent concurrent stats updates */
>
> /* structs defined in e1000_hw.h */
> struct e1000_hw hw;
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index 835b692..01e9558 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -2600,8 +2600,6 @@ static int __devinit e1000_sw_init(struct e1000_adapter *adapter)
> /* Explicitly disable IRQ since the NIC can be in any state. */
> e1000_irq_disable(adapter);
>
> - spin_lock_init(&adapter->stats_lock);
> -
> set_bit(__E1000_DOWN, &adapter->state);
> return 0;
>
> @@ -2953,7 +2951,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
> {
> struct e1000_hw *hw = &adapter->hw;
> struct pci_dev *pdev = adapter->pdev;
> - unsigned long irq_flags;
>
> /*
> * Prevent stats update while adapter is being reset, or if the pci
> @@ -2964,14 +2961,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
> if (pci_channel_offline(pdev))
> return;
>
> - spin_lock_irqsave(&adapter->stats_lock, irq_flags);
> -
> - /*
> - * these counters are modified from e1000_adjust_tbi_stats,
> - * called from the interrupt context, so they must only
> - * be written while holding adapter->stats_lock
> - */
> -
> adapter->stats.crcerrs += er32(CRCERRS);
> adapter->stats.gprc += er32(GPRC);
> adapter->stats.gorc += er32(GORCL);
> @@ -3046,8 +3035,6 @@ void e1000e_update_stats(struct e1000_adapter *adapter)
> adapter->stats.mgptc += er32(MGTPTC);
> adapter->stats.mgprc += er32(MGTPRC);
> adapter->stats.mgpdc += er32(MGTPDC);
> -
> - spin_unlock_irqrestore(&adapter->stats_lock, irq_flags);
> }
>
> /**
> @@ -3059,9 +3046,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
> struct e1000_hw *hw = &adapter->hw;
> struct e1000_phy_regs *phy = &adapter->phy_regs;
> int ret_val;
> - unsigned long irq_flags;
> -
> - spin_lock_irqsave(&adapter->stats_lock, irq_flags);
>
> if ((er32(STATUS) & E1000_STATUS_LU) &&
> (adapter->hw.phy.media_type == e1000_media_type_copper)) {
> @@ -3092,8 +3076,6 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
> phy->stat1000 = 0;
> phy->estatus = (ESTATUS_1000_TFULL | ESTATUS_1000_THALF);
> }
> -
> - spin_unlock_irqrestore(&adapter->stats_lock, irq_flags);
> }
>
> static void e1000_print_link_info(struct e1000_adapter *adapter)
>

2008-10-03 11:48:17

by Jiri Kosina

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 5/6] e1000e: debug contention on NVM SWFLAG

On Thu, 2 Oct 2008, Jesse Brandeburg wrote:

> From: Thomas Gleixner <[email protected]>
>
> This patch adds a mutex to the e1000e driver that would help
> catch any collisions of two e1000e threads accessing hardware
> at the same time.

Apparently this has some bug wrt suspend, see
https://bugzilla.novell.com/show_bug.cgi?id=431914

WARNING: at drivers/net/e1000e/ich8lan.c:424
e1000_acquire_swflag_ich8lan+0x56/0xcb [e1000e]()
e1000e mutex contention. Owned by pid -1
Modules linked in: xt_tcpudp xt_pkttype tun ipt_ULOG xt_limit aes_i586
aes_generic i915 drm af_packet snd_pcm_oss snd_mixer_oss snd_seq
snd_seq_device ipt_REJECT xt_state cpufreq_conservative iptable_mangle
cpufreq_userspace iptable_nat cpufreq_powersave nf_nat acpi_cpufreq
iptable_filter speedstep_lib nf_conntrack_ipv4 nf_conntrack ip_tables
ip6_tables x_tables microcode loop arc4 ecb crypto_blkcipher snd_hda_intel
iwl3945 pcmcia thinkpad_acpi snd_pcm snd_timer sdhci_pci snd_page_alloc
rfkill sdhci snd_hwdep mac80211 yenta_socket rsrc_nonstatic video rtc_cmos
led_class ohci1394 snd ieee1394 output mmc_core i2c_i801 ac battery
pcmcia_core iTCO_wdt button intel_agp rtc_core cfg80211 nvram soundcore
iTCO_vendor_support agpgart e1000e rtc_lib i2c_core pcspkr uinput sg
sd_mod ehci_hcd uhci_hcd crc_t10dif usbcore edd ext3 mbcache jbd fan
ata_piix ahci libata scsi_mod dock thermal processor
Pid: 23153, comm: events/1 Tainted: G W
2.6.27-rc8-HEAD_20081001123454-pae #1
[<c0105590>] dump_trace+0x6b/0x249
[<c01060c5>] show_trace+0x20/0x39
[<c033b52d>] dump_stack+0x71/0x76
[<c012ba12>] warn_slowpath+0x6f/0x90
[<f91cfb9b>] e1000_acquire_swflag_ich8lan+0x56/0xcb [e1000e]
[<f91d3db4>] e1000e_read_phy_reg_igp+0x10/0x4f [e1000e]
[<f91d3f83>] e1000e_phy_has_link_generic+0x32/0x99 [e1000e]
[<f91d2e35>] e1000e_check_for_copper_link+0x26/0x80 [e1000e]
[<f91d8f3a>] e1000_watchdog_task+0x5b/0x5eb [e1000e]
[<c013a1b4>] run_workqueue+0x9f/0x13e
[<c013a309>] worker_thread+0xb6/0xc2
[<c013cdff>] kthread+0x38/0x5d
[<c01050e7>] kernel_thread_helper+0x7/0x10

> + WARN_ON(preempt_count());
> +
> + if (!mutex_trylock(&nvm_mutex)) {
> + WARN(1, KERN_ERR "e1000e mutex contention. Owned by pid %d\n",
> + nvm_owner);
> + mutex_lock(&nvm_mutex);
> + }
> + nvm_owner = current->pid;
> +
> while (timeout) {
> extcnf_ctrl = er32(EXTCNF_CTRL);
> extcnf_ctrl |= E1000_EXTCNF_CTRL_SWFLAG;
> @@ -407,6 +419,8 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
>
> if (!timeout) {
> hw_dbg(hw, "FW or HW has locked the resource for too long.\n");
> + nvm_owner = -1;
> + mutex_unlock(&nvm_mutex);
> return -E1000_ERR_CONFIG;
> }
>
> @@ -428,6 +442,9 @@ static void e1000_release_swflag_ich8lan(struct e1000_hw *hw)
> extcnf_ctrl = er32(EXTCNF_CTRL);
> extcnf_ctrl &= ~E1000_EXTCNF_CTRL_SWFLAG;
> ew32(EXTCNF_CTRL, extcnf_ctrl);
> +
> + nvm_owner = -1;
> + mutex_unlock(&nvm_mutex);
> }

The debugging message is racy anyway with respect to accessing nvm_owner,
right? It should be done after the mutex has been succesfully acquired.

--
Jiri Kosina
SUSE Labs

2008-10-03 15:17:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 5/6] e1000e: debug contention on NVM SWFLAG



On Fri, 3 Oct 2008, Jiri Kosina wrote:
>
> The debugging message is racy anyway with respect to accessing nvm_owner,
> right? It should be done after the mutex has been succesfully acquired.

It's done that way on purpose - to see who _could_ be racing.

After the mutex, it could never trigger, because the mutex is the thing
that guarantees non-racy-ness.

IOW, it's a debugging message just to see that the old bug (the "before
the fix") really did happen. We can/will remove it, but I think people are
still looking at the e1000e driver and probably want to see the paths that
can cause problems.

Of course, it's entirely possible that we should remove it in mainline
already, and just let the people inside intel/suse/xyzzy who are trying to
reproduce it have it.

Jesse? Thomas?

Linus

2008-10-03 15:39:31

by Jiri Kosina

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 5/6] e1000e: debug contention on NVM SWFLAG

On Fri, 3 Oct 2008, Linus Torvalds wrote:

> > The debugging message is racy anyway with respect to accessing
> > nvm_owner, right? It should be done after the mutex has been
> > succesfully acquired.
> It's done that way on purpose - to see who _could_ be racing.

I know. I just wanted to point out that we probably don't want the patch
in 2.6.27 in this form, users wouldn't like to have warning in their logs
every time mutex is not acquired on a first attempt :)

We are currently running reproduction tests on affected machines to verify
that this patch really fixes the root cause of this whole e1000e saga.

--
Jiri Kosina
SUSE Labs

2008-10-03 19:39:26

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2.6.27-rc8 5/6] e1000e: debug contention on NVM SWFLAG

On Fri, 3 Oct 2008, Jiri Kosina wrote:
> On Fri, 3 Oct 2008, Linus Torvalds wrote:
>
> > > The debugging message is racy anyway with respect to accessing
> > > nvm_owner, right? It should be done after the mutex has been
> > > succesfully acquired.
> > It's done that way on purpose - to see who _could_ be racing.
>
> I know. I just wanted to point out that we probably don't want the patch
> in 2.6.27 in this form, users wouldn't like to have warning in their logs
> every time mutex is not acquired on a first attempt :)

Yeah, it should go before .27 final. It now just gathers data as
people actually care about warn_ons or they are even caught
automatically via kernel oops.

Thanks,

tglx