2006-09-22 13:20:29

by Larry Finger

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

When we found the cause of NETDEV watchdog timeouts in the wireless-2.6 code, I knew that the 2.6.18
release code would cause a serious regression. I immediately prepared and tested a patch to fix
this; however, I was unable to get it pushed into the stable code before release of 2.6.18. The
following submission text and patch has been sent to the stable maintainers, which I hope is
included in 2.6.18.1.

For people having these lockups with 2.6.18, please try this patch and inform me of the results,
both positive and negative. The latter are particularly important.

Larry
==================================================================

In 2.6.18, the section that prepares for preemptible periodic work
has two bugs. The first is due to an improper locking sequence that leads
to frozen systems. The second causes netdev watchdog timeouts. Both
lead to serious regressions as compared with 2.6.17.

Signed-off-by: Larry Finger <[email protected]>
---

This patch, which was originally sent to John Linville on 9/14/06,
has been taken against 2.6.18. It changes more lines
than would be absolutely necessary to affect the fix; however, it
ends up with this section looking exactly like the current code (with
pending patches) that is in wireless-2.6.

Larry

Index: linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- linux-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3182,39 +3182,37 @@ static void bcm43xx_periodic_work_handle
/* Periodic work will take a long time, so we want it to
* be preemtible.
*/
- bcm43xx_lock_irqonly(bcm, flags);
- netif_stop_queue(bcm->net_dev);
+ mutex_lock(&bcm->mutex);
+ netif_tx_disable(bcm->net_dev);
+ spin_lock_irqsave(&bcm->irq_lock, flags);
+ bcm43xx_mac_suspend(bcm);
if (bcm43xx_using_pio(bcm))
bcm43xx_pio_freeze_txqueues(bcm);
savedirqs = bcm43xx_interrupt_disable(bcm, BCM43xx_IRQ_ALL);
- bcm43xx_unlock_irqonly(bcm, flags);
- bcm43xx_lock_noirq(bcm);
+ spin_unlock_irqrestore(&bcm->irq_lock, flags);
bcm43xx_synchronize_irq(bcm);
} else {
/* Periodic work should take short time, so we want low
* locking overhead.
*/
- bcm43xx_lock_irqsafe(bcm, flags);
+ mutex_lock(&bcm->mutex);
+ spin_lock_irqsave(&bcm->irq_lock, flags);
}

do_periodic_work(bcm);

if (badness > BADNESS_LIMIT) {
- bcm43xx_lock_irqonly(bcm, flags);
- if (likely(bcm43xx_status(bcm) == BCM43xx_STAT_INITIALIZED)) {
- tasklet_enable(&bcm->isr_tasklet);
- bcm43xx_interrupt_enable(bcm, savedirqs);
- if (bcm43xx_using_pio(bcm))
- bcm43xx_pio_thaw_txqueues(bcm);
- }
+ spin_lock_irqsave(&bcm->irq_lock, flags);
+ tasklet_enable(&bcm->isr_tasklet);
+ bcm43xx_interrupt_enable(bcm, savedirqs);
+ if (bcm43xx_using_pio(bcm))
+ bcm43xx_pio_thaw_txqueues(bcm);
+ bcm43xx_mac_enable(bcm);
netif_wake_queue(bcm->net_dev);
- mmiowb();
- bcm43xx_unlock_irqonly(bcm, flags);
- bcm43xx_unlock_noirq(bcm);
- } else {
- mmiowb();
- bcm43xx_unlock_irqsafe(bcm, flags);
}
+ mmiowb();
+ spin_unlock_irqrestore(&bcm->irq_lock, flags);
+ mutex_unlock(&bcm->mutex);
}

static void bcm43xx_periodic_tasks_delete(struct bcm43xx_private *bcm)



_______________________________________________
Bcm43xx-dev mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev




2006-09-22 13:53:55

by Erik Mouw

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

On Fri, Sep 22, 2006 at 08:20:08AM -0500, Larry Finger wrote:
> This patch, which was originally sent to John Linville on 9/14/06,
> has been taken against 2.6.18. It changes more lines
> than would be absolutely necessary to affect the fix; however, it
> ends up with this section looking exactly like the current code (with
> pending patches) that is in wireless-2.6.

Sorry, but your patch doesn't apply cleanly against 2.6.18:

erik@arthur:~/git/linux-2.6 > patch -p1 --dry-run < ../bcm43xx-2.6.18.diff
patching file drivers/net/wireless/bcm43xx/bcm43xx_main.c
Hunk #1 FAILED at 3182.
1 out of 1 hunk FAILED -- saving rejects to file drivers/net/wireless/bcm43xx/bcm43xx_main.c.rej


Erik

--
+-- Erik Mouw -- http://www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

2006-09-22 14:31:29

by Larry Finger

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

Index: linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- linux-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3182,39 +3182,37 @@ static void bcm43xx_periodic_work_handle
/* Periodic work will take a long time, so we want it to
* be preemtible.
*/
- bcm43xx_lock_irqonly(bcm, flags);
- netif_stop_queue(bcm->net_dev);
+ mutex_lock(&bcm->mutex);
+ netif_tx_disable(bcm->net_dev);
+ spin_lock_irqsave(&bcm->irq_lock, flags);
+ bcm43xx_mac_suspend(bcm);
if (bcm43xx_using_pio(bcm))
bcm43xx_pio_freeze_txqueues(bcm);
savedirqs = bcm43xx_interrupt_disable(bcm, BCM43xx_IRQ_ALL);
- bcm43xx_unlock_irqonly(bcm, flags);
- bcm43xx_lock_noirq(bcm);
+ spin_unlock_irqrestore(&bcm->irq_lock, flags);
bcm43xx_synchronize_irq(bcm);
} else {
/* Periodic work should take short time, so we want low
* locking overhead.
*/
- bcm43xx_lock_irqsafe(bcm, flags);
+ mutex_lock(&bcm->mutex);
+ spin_lock_irqsave(&bcm->irq_lock, flags);
}

do_periodic_work(bcm);

if (badness > BADNESS_LIMIT) {
- bcm43xx_lock_irqonly(bcm, flags);
- if (likely(bcm43xx_status(bcm) == BCM43xx_STAT_INITIALIZED)) {
- tasklet_enable(&bcm->isr_tasklet);
- bcm43xx_interrupt_enable(bcm, savedirqs);
- if (bcm43xx_using_pio(bcm))
- bcm43xx_pio_thaw_txqueues(bcm);
- }
+ spin_lock_irqsave(&bcm->irq_lock, flags);
+ tasklet_enable(&bcm->isr_tasklet);
+ bcm43xx_interrupt_enable(bcm, savedirqs);
+ if (bcm43xx_using_pio(bcm))
+ bcm43xx_pio_thaw_txqueues(bcm);
+ bcm43xx_mac_enable(bcm);
netif_wake_queue(bcm->net_dev);
- mmiowb();
- bcm43xx_unlock_irqonly(bcm, flags);
- bcm43xx_unlock_noirq(bcm);
- } else {
- mmiowb();
- bcm43xx_unlock_irqsafe(bcm, flags);
}
+ mmiowb();
+ spin_unlock_irqrestore(&bcm->irq_lock, flags);
+ mutex_unlock(&bcm->mutex);
}

static void bcm43xx_periodic_tasks_delete(struct bcm43xx_private *bcm)


Attachments:
patch_preempt (1.96 kB)

2006-09-23 06:03:32

by Ray Lee

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

On 9/22/06, Larry Finger <[email protected]> wrote:
> When we found the cause of NETDEV watchdog timeouts in the wireless-2.6 code,
> I knew that the 2.6.18 release code would cause a serious regression.

I don't know if this is the lockup you're trying to address, but
2.6.18's bcm43xx has definitely regressed for me versus 2.6.17.x.

2.6.18 vanilla and 2.6.18 with your patch both lock my system hard
with bcm43xx. I've got an HP/Compaq nx6125 laptop. Symptoms are that
it will associate fine on its own and send traffic to/fro upon ifup,
but when I do an iwconfig, ifdown, ifup to change the access point,
the system locks (somewhat randomly) during one of those operations.
Well, the iwconfig or the ifup, actually.

lspci -v:

02:02.0 Network controller: Broadcom Corporation BCM4309 802.11a/b/g (rev 03)
Subsystem: Hewlett-Packard Company Unknown device 12f9
Flags: bus master, fast devsel, latency 64, IRQ 11
Memory at d0010000 (32-bit, non-prefetchable) [size=8K]

./bcm43xx-fwcutter -i BCMWL5.SYS
filename : bcmwl5.sys
version : 4.10.40.1
MD5 : 69f940672be0ecee5bd1e905706ba8ce

Wireless tools are Version: 28-1ubuntu2.

I've got multiple access points in view of the laptop, a g (54Mb), and
a b (11Mb). Neither with encryption enabled, if that makes a
difference (we live in the boonies).

It's 2.6.18 + your patch, compiled for x86_64, ubuntu devel.

Any suggestions or requests for tests?

Ray

2006-09-23 09:38:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

On Saturday, 23 September 2006 08:03, Ray Lee wrote:
> On 9/22/06, Larry Finger <[email protected]> wrote:
> > When we found the cause of NETDEV watchdog timeouts in the wireless-2.6 code,
> > I knew that the 2.6.18 release code would cause a serious regression.
>
> I don't know if this is the lockup you're trying to address, but
> 2.6.18's bcm43xx has definitely regressed for me versus 2.6.17.x.
>
> 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard
> with bcm43xx. I've got an HP/Compaq nx6125 laptop. Symptoms are that
> it will associate fine on its own and send traffic to/fro upon ifup,
> but when I do an iwconfig, ifdown, ifup to change the access point,
> the system locks (somewhat randomly) during one of those operations.
> Well, the iwconfig or the ifup, actually.

I have observed similar symptoms on HPC nx6325, although I haven't managed
to get the adapter associate with an AP.

This is a PCI-E card so I need some additional patches to make the driver
detect it, and I use the firmware cut from wl_apsta.o. The kernel is also
64-bit, 2.6.18-rc6-mm2.

lspci -v:

30:00.0 Network controller: Broadcom Corporation BCM4310 UART (rev 01)
Subsystem: Hewlett-Packard Company Unknown device 1361
Flags: bus master, fast devsel, latency 0, IRQ 10
Memory at c8000000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 2
Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
Capabilities: [d0] Express Legacy Endpoint IRQ 0

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-09-23 16:52:09

by Ray Lee

[permalink] [raw]
Subject: Re: Bcm43xx softMac Driver in 2.6.18

Rafael J. Wysocki wrote:
>> 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard
>> with bcm43xx. I've got an HP/Compaq nx6125 laptop. Symptoms are that
>> it will associate fine on its own and send traffic to/fro upon ifup,
>> but when I do an iwconfig, ifdown, ifup to change the access point,
>> the system locks (somewhat randomly) during one of those operations.
>> Well, the iwconfig or the ifup, actually.
>
> I have observed similar symptoms on HPC nx6325, although I haven't managed
> to get the adapter associate with an AP.

Yeah, I'm having the same troubles. Carefully watching the iwconfig
results showed me that only half of the time did my `iwconfig eth1 essid
AccessPointName` actually take. (It listed the essid of the ap I told it
to associate with, but then showed "Access Point: Invalid" or words to
that effect, until I issued the exact same iwconfig again.)

So, try it twice, double check the iwconfig output, then try bringing up
the interface. Though that seems awfully difficult to do as well (DHCP
is just sending out stuff with nothing coming back).

When I switch consoles while DHCP is plaintively asking for an IP, and
issue *another* iwconfig with the same essid, then it seems to kick
something in the driver and DHCP immediately associates. Happened twice
for me so far, though that could merely be a coincidence.

Ray