2009-10-26 18:53:14

by Rafael J. Wysocki

Subject: 2.6.32-rc5-git3: Reported regressions from 2.6.31

This message contains a list of some regressions from 2.6.31, for which there
are no fixes in the mainline I know of. If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.31, please let me know
as well and I'll add them to the list. Also, please let me know if any of the
entries below are invalid.

Each entry on the list will also be sent in an automatic reply to this
message, with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

Date          Total  Pending  Unresolved
----------------------------------------
2009-10-26       66       42          37
2009-10-12       48       31          27
2009-10-02       22       15           9


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14485
Subject : System lockup running "cat /sys/kernel/debug/dri/0/i915_regs"
Submitter : Miles Lane <[email protected]>
Date : 2009-10-26 4:00 (1 days old)
References : http://marc.info/?l=linux-kernel&m=125652968117713&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14484
Subject : no video output after suspend
Submitter : Riccardo Magliocchetti <[email protected]>
Date : 2009-10-25 20:57 (2 days old)
References : http://marc.info/?l=linux-kernel&m=125650430123713&w=4
Handled-By : Jesse Barnes <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14483
Subject : WARNING: at drivers/base/sys.c:353 __sysdev_resume+0x54/0xca()
Submitter : Justin Mattock <[email protected]>
Date : 2009-10-25 19:58 (2 days old)
References : http://marc.info/?l=linux-kernel&m=125650070420168&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
Submitter : Alexander Clouter <[email protected]>
Date : 2009-10-23 10:30 (4 days old)
References : http://lkml.org/lkml/2009/10/23/50


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14481
Subject : umount blocked for more than 120 seconds after USB drive removal
Submitter : Robert Hancock <[email protected]>
Date : 2009-10-21 5:26 (6 days old)
References : http://marc.info/?l=linux-kernel&m=125610280532245&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14479
Subject : nfs oops
Submitter : Egon Alter <[email protected]>
Date : 2009-10-19 16:03 (8 days old)
References : http://marc.info/?l=linux-kernel&m=125596822630410&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14477
Subject : possible circular locking dependency in ISDN PPP
Submitter : Tilman Schmidt <[email protected]>
Date : 2009-10-18 22:16 (9 days old)
References : http://marc.info/?l=linux-kernel&m=125590423416087&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14473
Subject : ATA related kernel warning after resume
Submitter : Tino Keitel <[email protected]>
Date : 2009-10-14 6:55 (13 days old)
References : http://marc.info/?l=linux-kernel&m=125550466624678&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14472
Subject : EXT4 corruption
Submitter : Shawn Starr <[email protected]>
Date : 2009-10-13 2:07 (14 days old)
References : http://marc.info/?l=linux-kernel&m=125539997508256&w=4
Handled-By : Theodore Tso <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14467
Subject : Linker errors on ia64 with NR_CPUS=4096
Submitter : Jeff Mahoney <[email protected]>
Date : 2009-10-18 22:28 (9 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=34d76c41554a05425613d16efebb3069c4c545f0
References : http://marc.info/?l=linux-kernel&m=125590493116720&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14466
Subject : EFI boot on x86 fails in .32
Submitter : Matthew Garrett <[email protected]>
Date : 2009-10-20 0:34 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7bd867dfb4e0357e06a3211ab2bd0e714110def3
References : http://marc.info/?l=linux-kernel&m=125599887314290&w=4
Handled-By : Feng Tang <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14442
Subject : resume after hibernate: /dev/sdb drops and returns as /dev/sde
Submitter : Duncan <[email protected]>
Date : 2009-10-20 01:52 (7 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14430
Subject : sync() hangs in bdi_sched_wait
Submitter : Petr Vandrovec <[email protected]>
Date : 2009-10-17 19:14 (10 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14415
Subject : Reboot on kernel load
Submitter : Brian Beardall <[email protected]>
Date : 2009-10-15 23:57 (12 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14408
Subject : sysctl check failed
Submitter : Peter Teoh <[email protected]>
Date : 2009-10-14 22:59 (13 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14406
Subject : uvcvideo stopped work on Toshiba
Submitter : okias <[email protected]>
Date : 2009-10-14 19:08 (13 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14390
Subject : "bind" a device to a driver doesn't not work anymore
Submitter : Éric Piel <[email protected]>
Date : 2009-10-11 0:04 (16 days old)
References : http://marc.info/?l=linux-kernel&m=125521979921241&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14389
Subject : Build system issue
Submitter : Peter Zijlstra <[email protected]>
Date : 2009-10-09 8:58 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=575543347b5baed0ca927cb90ba8807396fe9cc9
References : http://marc.info/?l=linux-kernel&m=125507914909152&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14387
Subject : deadlock with fallocate
Submitter : Thomas Neumann <[email protected]>
Date : 2009-10-07 3:00 (20 days old)
References : http://marc.info/?l=linux-kernel&m=125488495526471&w=4
Handled-By : Christoph Hellwig <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14384
Subject : tbench regression with 2.6.32-rc1
Submitter : Zhang, Yanmin <[email protected]>
Date : 2009-10-09 9:51 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=59abf02644c45f1591e1374ee7bb45dc757fcb88
References : http://marc.info/?l=linux-kernel&m=125508216713138&w=4
Handled-By : Peter Zijlstra <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14383
Subject : hackbench regression with kernel 2.6.32-rc1
Submitter : Zhang, Yanmin <[email protected]>
Date : 2009-10-09 9:19 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=29cd8bae396583a2ee9a3340db8c5102acf9f6fd
References : http://marc.info/?l=linux-kernel&m=125508007510274&w=4
Handled-By : Peter Zijlstra <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14381
Subject : iwlagn lost connection after s2ram (with warnings)
Submitter : Carlos R. Mafra <[email protected]>
Date : 2009-10-07 14:20 (20 days old)
References : http://marc.info/?l=linux-kernel&m=125492569119947&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14378
Subject : Problems with net/core/skbuff.c
Submitter : Massimo Cetra <[email protected]>
Date : 2009-10-08 14:51 (19 days old)
References : http://marc.info/?l=linux-kernel&m=125501488220358&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14376
Subject : Kernel NULL pointer dereference/ kvm subsystem
Submitter : Don Dupuis <[email protected]>
Date : 2009-10-06 14:38 (21 days old)
References : http://marc.info/?l=linux-kernel&m=125484025021737&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14373
Subject : Task blocked for more than 120 seconds
Submitter : Zeno Davatz <[email protected]>
Date : 2009-10-02 10:16 (25 days old)
References : http://marc.info/?l=linux-kernel&m=125447858618412&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14372
Subject : ath5k wireless not working after suspend-resume - eeepc
Submitter : Fabio Comolli <[email protected]>
Date : 2009-10-03 15:36 (24 days old)
References : http://lkml.org/lkml/2009/10/3/91


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14355
Subject : USB serial regression after 2.6.31.1 with Huawei E169 GSM modem
Submitter : Benjamin Herrenschmidt <[email protected]>
Date : 2009-10-10 03:07 (17 days old)
References : http://marc.info/?l=linux-kernel&m=125513456327542&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14354
Subject : Bad corruption with 2.6.32-rc1 and upwards
Submitter : Holger Freyther <[email protected]>
Date : 2009-10-09 15:42 (18 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14353
Subject : BUG: sleeping function called from invalid context at kernel/mutex.c:280
Submitter : Miles Lane <[email protected]>
Date : 2009-10-05 3:39 (22 days old)
References : http://marc.info/?l=linux-kernel&m=125471432208671&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14352
Subject : WARNING: at net/mac80211/scan.c:267
Submitter : Maciej Rutecki <[email protected]>
Date : 2009-10-08 00:30 (19 days old)
References : http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2089#c7


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14334
Subject : pcmcia suspend regression from 2.6.31.1 to 2.6.31.2 - Dell Inspiron 600m
Submitter : Jose Marino <[email protected]>
Date : 2009-10-06 15:44 (21 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14331
Subject : Radeon XPRESS 200M: System hang with radeon DRI and Fedora 10 userspace unless DRI=off
Submitter : Alex Villacis Lasso <[email protected]>
Date : 2009-10-06 00:29 (21 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14299
Subject : oops in wireless, iwl3945 related?
Submitter : Pavel Machek <[email protected]>
Date : 2009-09-29 17:12 (28 days old)
References : http://marc.info/?l=linux-kernel&m=125424439725743&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject : warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter : Pavel Machek <[email protected]>
Date : 2009-09-30 20:07 (27 days old)
References : http://marc.info/?l=linux-kernel&m=125434130703538&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject : console resume broken since ba15ab0e8d
Submitter : Sascha Hauer <[email protected]>
Date : 2009-09-30 15:11 (27 days old)
References : http://marc.info/?l=linux-kernel&m=125432349404060&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14296
Subject : spitz boots but suspend/resume is broken
Submitter : Pavel Machek <[email protected]>
Date : 2009-09-30 12:06 (27 days old)
References : http://marc.info/?l=linux-kernel&m=125431244516449&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14277
Subject : Caught 8-bit read from freed memory in b43 driver at association
Submitter : Christian Casteyde <[email protected]>
Date : 2009-09-30 18:06 (27 days old)


Regressions with patches
------------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14480
Subject : 2 locks held by cat -- running "find /sys | head -c 4" --> system hang
Submitter : Miles Lane <[email protected]>
Date : 2009-10-20 16:11 (7 days old)
References : http://marc.info/?l=linux-kernel&m=125605511728088&w=4
Handled-By : Chris Wilson <[email protected]>
Patch : http://patchwork.kernel.org/patch/54974/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14380
Subject : Video tearing/glitching with T400 laptops
Submitter : Theodore Ts'o <[email protected]>
Date : 2009-10-02 22:40 (25 days old)
References : http://marc.info/?l=linux-kernel&m=125452324520623&w=4
Handled-By : Jesse Barnes <[email protected]>
Patch : http://marc.info/?l=linux-kernel&m=125591495325000&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14379
Subject : ACPI Warning for _SB_.BAT0._BIF: Converted Buffer to expected String
Submitter : Justin Mattock <[email protected]>
Date : 2009-10-08 21:46 (19 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d9adc2e031bd22d5d9607a53a8d3b30e0b675f39
References : http://marc.info/?l=linux-kernel&m=125504031328941&w=4
Handled-By : Alexey Starikovskiy <[email protected]>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=23347


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14375
Subject : Intel(R) I/OAT DMA Engine init failed
Submitter : Alexander Beregalov <[email protected]>
Date : 2009-10-02 9:46 (25 days old)
References : http://marc.info/?l=linux-kernel&m=125447680016160&w=4
Handled-By : Dan Williams <[email protected]>
Patch : http://patchwork.kernel.org/patch/51808/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14302
Subject : Kernel panic on i386 machine when booting with profile=2
Submitter : Shi, Alex <[email protected]>
Date : 2009-10-01 3:23 (26 days old)
References : http://marc.info/?l=linux-kernel&m=125436749607199&w=4
Handled-By : Alex Shi <[email protected]>
Patch : http://patchwork.kernel.org/patch/50813/


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.31,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14230

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael



2009-10-28 19:16:19

by John W. Linville

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

Christian,

Will you be able to test this patch for us?

Thanks,

John

On Mon, Oct 26, 2009 at 09:38:28PM +0100, Michael Buesch wrote:
> On Monday 26 October 2009 20:37:33 Michael Buesch wrote:
> > Ok, it just turns out this actually is a driver bug.
> > Thanks to Johannes Berg for tracking it down.
> >
> > I think it's caused by the DMA bouncebuffer stuff that does not copy the skb->cb
> > and does not adjust the "tx-info" pointer.
> > I wonder why this didn't blow up earlier, because this bug has been there since mac80211
> > switched to using the CB.
> >
> > Here's a completely untested patch.
>
> Here's a new version of the patch that also fixes queue mapping bugs:
>
> ---
> drivers/net/wireless/b43/dma.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> --- wireless-testing.orig/drivers/net/wireless/b43/dma.c
> +++ wireless-testing/drivers/net/wireless/b43/dma.c
> @@ -1157,8 +1157,9 @@ struct b43_dmaring *parse_cookie(struct
> }
>
> static int dma_tx_fragment(struct b43_dmaring *ring,
> - struct sk_buff *skb)
> + struct sk_buff **in_skb)
> {
> + struct sk_buff *skb = *in_skb;
> const struct b43_dma_ops *ops = ring->ops;
> struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
> u8 *header;
> @@ -1224,8 +1225,14 @@ static int dma_tx_fragment(struct b43_dm
> }
>
> memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
> + memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
> + bounce_skb->dev = skb->dev;
> + skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
> + info = IEEE80211_SKB_CB(bounce_skb);
> +
> dev_kfree_skb_any(skb);
> skb = bounce_skb;
> + *in_skb = bounce_skb;
> meta->skb = skb;
> meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
> if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> @@ -1355,7 +1362,11 @@ int b43_dma_tx(struct b43_wldev *dev, st
> * static, so we don't need to store it per frame. */
> ring->queue_prio = skb_get_queue_mapping(skb);
>
> - err = dma_tx_fragment(ring, skb);
> + /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
> + * into the skb data or cb now. */
> + hdr = NULL;
> + info = NULL;
> + err = dma_tx_fragment(ring, &skb);
> if (unlikely(err == -ENOKEY)) {
> /* Drop this packet, as we don't have the encryption key
> * anymore and must not transmit it unencrypted. */
>
>
>
> --
> Greetings, Michael.
>

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2009-10-26 19:11:26

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Monday 26 October 2009 19:59:02 John W. Linville wrote:
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14277
> > Subject : Caught 8-bit read from freed memory in b43 driver at association
> > Submitter : Christian Casteyde <[email protected]>
> > Date : 2009-09-30 18:06 (27 days old)

Does this still trigger with a recent kernel (and thus recent memory debugging)?
I'm still not convinced that this is a wireless bug.

--
Greetings, Michael.

2009-10-26 19:01:14

by John W. Linville

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

Wireless ones...

On Mon, Oct 26, 2009 at 07:45:48PM +0100, Rafael J. Wysocki wrote:

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14381
> Subject : iwlagn lost connection after s2ram (with warnings)
> Submitter : Carlos R. Mafra <[email protected]>
> Date : 2009-10-07 14:20 (20 days old)
> References : http://marc.info/?l=linux-kernel&m=125492569119947&w=4

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14372
> Subject : ath5k wireless not working after suspend-resume - eeepc
> Submitter : Fabio Comolli <[email protected]>
> Date : 2009-10-03 15:36 (24 days old)
> References : http://lkml.org/lkml/2009/10/3/91

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14352
> Subject : WARNING: at net/mac80211/scan.c:267
> Submitter : Maciej Rutecki <[email protected]>
> Date : 2009-10-08 00:30 (19 days old)
> References : http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2089#c7

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14299
> Subject : oops in wireless, iwl3945 related?
> Submitter : Pavel Machek <[email protected]>
> Date : 2009-09-29 17:12 (28 days old)
> References : http://marc.info/?l=linux-kernel&m=125424439725743&w=4

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14277
> Subject : Caught 8-bit read from freed memory in b43 driver at association
> Submitter : Christian Casteyde <[email protected]>
> Date : 2009-09-30 18:06 (27 days old)

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2009-10-28 20:57:43

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Wednesday 28 October 2009 21:38:37 Christian Casteyde wrote:
> I've just tested the patch posted in bugzilla: it works.
> That is, I do not manage to get the warning anymore in 3 boots in a row.

Thanks a lot for testing. I'll resend with proper signoff.

> CC
> On Wednesday 28 October 2009 20:05:47, John W. Linville wrote:
> > Christian,
> >
> > Will you be able to test this patch for us?
> >
> > Thanks,
> >
> > John
> >
> > On Mon, Oct 26, 2009 at 09:38:28PM +0100, Michael Buesch wrote:
> > > On Monday 26 October 2009 20:37:33 Michael Buesch wrote:
> > > > Ok, it just turns out this actually is a driver bug.
> > > > Thanks to Johannes Berg for tracking it down.
> > > >
> > > > I think it's caused by the DMA bouncebuffer stuff that does not copy
> > > > the skb->cb and does not adjust the "tx-info" pointer.
> > > > > I wonder why this didn't blow up earlier, because this bug has been there
> > > > > since mac80211 switched to using the CB.
> > > >
> > > > Here's a completely untested patch.
> > >
> > > Here's a new version of the patch that also fixes queue mapping bugs:
> > >
> > > ---
> > > drivers/net/wireless/b43/dma.c | 15 +++++++++++++--
> > > 1 file changed, 13 insertions(+), 2 deletions(-)
> > >
> > > --- wireless-testing.orig/drivers/net/wireless/b43/dma.c
> > > +++ wireless-testing/drivers/net/wireless/b43/dma.c
> > > @@ -1157,8 +1157,9 @@ struct b43_dmaring *parse_cookie(struct
> > > }
> > >
> > > static int dma_tx_fragment(struct b43_dmaring *ring,
> > > - struct sk_buff *skb)
> > > + struct sk_buff **in_skb)
> > > {
> > > + struct sk_buff *skb = *in_skb;
> > > const struct b43_dma_ops *ops = ring->ops;
> > > struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
> > > u8 *header;
> > > @@ -1224,8 +1225,14 @@ static int dma_tx_fragment(struct b43_dm
> > > }
> > >
> > > memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
> > > + memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
> > > + bounce_skb->dev = skb->dev;
> > > + skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
> > > + info = IEEE80211_SKB_CB(bounce_skb);
> > > +
> > > dev_kfree_skb_any(skb);
> > > skb = bounce_skb;
> > > + *in_skb = bounce_skb;
> > > meta->skb = skb;
> > > meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
> > > if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> > > @@ -1355,7 +1362,11 @@ int b43_dma_tx(struct b43_wldev *dev, st
> > > * static, so we don't need to store it per frame. */
> > > ring->queue_prio = skb_get_queue_mapping(skb);
> > >
> > > - err = dma_tx_fragment(ring, skb);
> > > > + /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
> > > > + * into the skb data or cb now. */
> > > + hdr = NULL;
> > > + info = NULL;
> > > + err = dma_tx_fragment(ring, &skb);
> > > if (unlikely(err == -ENOKEY)) {
> > > /* Drop this packet, as we don't have the encryption key
> > > * anymore and must not transmit it unencrypted. */
> >
>
>



--
Greetings, Michael.

2009-10-26 20:39:53

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Monday 26 October 2009 20:37:33 Michael Buesch wrote:
> Ok, it just turns out this actually is a driver bug.
> Thanks to Johannes Berg for tracking it down.
>
> I think it's caused by the DMA bouncebuffer stuff that does not copy the skb->cb
> and does not adjust the "tx-info" pointer.
> I wonder why this didn't blow up earlier, because this bug has been there since mac80211
> switched to using the CB.
>
> Here's a completely untested patch.

Here's a new version of the patch that also fixes queue mapping bugs:

---
drivers/net/wireless/b43/dma.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

--- wireless-testing.orig/drivers/net/wireless/b43/dma.c
+++ wireless-testing/drivers/net/wireless/b43/dma.c
@@ -1157,8 +1157,9 @@ struct b43_dmaring *parse_cookie(struct
}

static int dma_tx_fragment(struct b43_dmaring *ring,
- struct sk_buff *skb)
+ struct sk_buff **in_skb)
{
+ struct sk_buff *skb = *in_skb;
const struct b43_dma_ops *ops = ring->ops;
struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
u8 *header;
@@ -1224,8 +1225,14 @@ static int dma_tx_fragment(struct b43_dm
}

memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
+ memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
+ bounce_skb->dev = skb->dev;
+ skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
+ info = IEEE80211_SKB_CB(bounce_skb);
+
dev_kfree_skb_any(skb);
skb = bounce_skb;
+ *in_skb = bounce_skb;
meta->skb = skb;
meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
@@ -1355,7 +1362,11 @@ int b43_dma_tx(struct b43_wldev *dev, st
* static, so we don't need to store it per frame. */
ring->queue_prio = skb_get_queue_mapping(skb);

- err = dma_tx_fragment(ring, skb);
+ /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
+ * into the skb data or cb now. */
+ hdr = NULL;
+ info = NULL;
+ err = dma_tx_fragment(ring, &skb);
if (unlikely(err == -ENOKEY)) {
/* Drop this packet, as we don't have the encryption key
* anymore and must not transmit it unencrypted. */
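
The pointer-to-pointer change in this version is a common C idiom for a callee that may replace the caller's buffer. A minimal userspace sketch of that idiom follows. The names pkt, pkt_new, pkt_free and pkt_tx are hypothetical and do not exist in the driver, and the need_bounce flag merely stands in for a failed DMA mapping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: payload plus a queue number. */
struct pkt {
    size_t len;
    unsigned char *data;
    int queue;
};

static struct pkt *pkt_new(const void *src, size_t len, int queue)
{
    struct pkt *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->data = malloc(len ? len : 1);
    if (!p->data) {
        free(p);
        return NULL;
    }
    if (len)
        memcpy(p->data, src, len);
    p->len = len;
    p->queue = queue;
    return p;
}

static void pkt_free(struct pkt *p)
{
    if (p) {
        free(p->data);
        free(p);
    }
}

/*
 * May replace *in_pkt with a freshly allocated copy. On success the
 * caller must keep using *in_pkt, because the packet it passed in may
 * have been freed. Returns 0 on success and -1 on allocation failure,
 * in which case *in_pkt is left untouched.
 */
static int pkt_tx(struct pkt **in_pkt, int need_bounce)
{
    struct pkt *p;

    assert(in_pkt && *in_pkt);
    p = *in_pkt;

    if (need_bounce) {
        /* Copy payload and per-packet metadata (here: the queue). */
        struct pkt *bounce = pkt_new(p->data, p->len, p->queue);

        if (!bounce)
            return -1;
        pkt_free(p);
        *in_pkt = bounce;   /* hand the replacement back to the caller */
    }
    return 0;
}
```

A caller would pass &p and afterwards read only through p, never through pointers it derived from the packet before the call, which is why the patch sets hdr and info to NULL first.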



--
Greetings, Michael.

2009-10-26 19:38:08

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Monday 26 October 2009 20:11:20 Michael Buesch wrote:
> On Monday 26 October 2009 19:59:02 John W. Linville wrote:
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14277
> > > Subject : Caught 8-bit read from freed memory in b43 driver at association
> > > Submitter : Christian Casteyde <[email protected]>
> > > Date : 2009-09-30 18:06 (27 days old)
>
> Does this still trigger with a recent kernel (and thus recent memory debugging)?
> I'm still not convinced that this is a wireless bug.
>

Ok, it just turns out this actually is a driver bug.
Thanks to Johannes Berg for tracking it down.

I think it's caused by the DMA bouncebuffer stuff that does not copy the skb->cb
and does not adjust the "tx-info" pointer.
I wonder why this didn't blow up earlier, because this bug has been there since mac80211
switched to using the CB.

Here's a completely untested patch.

---
drivers/net/wireless/b43/dma.c | 2 ++
1 file changed, 2 insertions(+)

--- wireless-testing.orig/drivers/net/wireless/b43/dma.c
+++ wireless-testing/drivers/net/wireless/b43/dma.c
@@ -1224,6 +1224,8 @@ static int dma_tx_fragment(struct b43_dm
}

memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
+ memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
+ info = IEEE80211_SKB_CB(bounce_skb);
dev_kfree_skb_any(skb);
skb = bounce_skb;
meta->skb = skb;
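
For anyone following the thread without the driver source at hand, the failure mode described above can be sketched in plain userspace C. None of this is b43 code: toy_buf, toy_tx_info, toy_info, toy_alloc, toy_free and toy_bounce are made-up names standing in for the skb, its control block (cb) and the bounce-buffer copy. The sketch only shows that a copy which skips the control block loses the per-packet info, and that a pointer into the old buffer has to be re-derived from the new one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ieee80211_tx_info: per-packet driver state. */
struct toy_tx_info {
    int flags;
};

/* Hypothetical stand-in for an skb: control block plus payload. */
struct toy_buf {
    union {
        unsigned char raw[48];      /* opaque scratch area, like skb->cb */
        struct toy_tx_info info;    /* typed view of the same bytes */
    } cb;
    size_t len;
    unsigned char *data;
};

/* Always derive the info pointer from the buffer currently in use. */
static struct toy_tx_info *toy_info(struct toy_buf *b)
{
    return &b->cb.info;
}

static struct toy_buf *toy_alloc(size_t len)
{
    struct toy_buf *b = calloc(1, sizeof(*b));

    if (!b)
        return NULL;
    b->data = calloc(1, len ? len : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->len = len;
    return b;
}

static void toy_free(struct toy_buf *b)
{
    if (b) {
        free(b->data);
        free(b);
    }
}

/*
 * Copy old into a new buffer and free old. With copy_cb == 0 this mimics
 * the bug: the payload is copied but the control block is not. Returns
 * the new buffer, or NULL on allocation failure (old is then untouched).
 */
static struct toy_buf *toy_bounce(struct toy_buf *old, int copy_cb)
{
    struct toy_buf *nb;

    assert(old != NULL);
    nb = toy_alloc(old->len);
    if (!nb)
        return NULL;
    memcpy(nb->data, old->data, old->len);
    if (copy_cb)
        memcpy(&nb->cb, &old->cb, sizeof(nb->cb));
    toy_free(old);   /* pointers into old are dangling from here on */
    return nb;
}
```

With copy_cb set to 0 the returned buffer has its flags zeroed again, while copy_cb set to 1 preserves them. In both cases the caller has to call toy_info() on the returned buffer rather than reuse a pointer obtained before the copy.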


--
Greetings, Michael.

2009-10-28 20:38:07

by Christian Casteyde

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

I've just tested the patch posted in bugzilla: it works.
That is, I do not manage to get the warning anymore in 3 boots in a row.

CC
On Wednesday 28 October 2009 20:05:47, John W. Linville wrote:
> Christian,
>
> Will you be able to test this patch for us?
>
> Thanks,
>
> John
>
> On Mon, Oct 26, 2009 at 09:38:28PM +0100, Michael Buesch wrote:
> > On Monday 26 October 2009 20:37:33 Michael Buesch wrote:
> > > Ok, it just turns out this actually is a driver bug.
> > > Thanks to Johannes Berg for tracking it down.
> > >
> > > I think it's caused by the DMA bouncebuffer stuff that does not copy
> > > the skb->cb and does not adjust the "tx-info" pointer.
> > > > I wonder why this didn't blow up earlier, because this bug has been there
> > > > since mac80211 switched to using the CB.
> > >
> > > Here's a completely untested patch.
> >
> > Here's a new version of the patch that also fixes queue mapping bugs:
> >
> > ---
> > drivers/net/wireless/b43/dma.c | 15 +++++++++++++--
> > 1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > --- wireless-testing.orig/drivers/net/wireless/b43/dma.c
> > +++ wireless-testing/drivers/net/wireless/b43/dma.c
> > @@ -1157,8 +1157,9 @@ struct b43_dmaring *parse_cookie(struct
> > }
> >
> > static int dma_tx_fragment(struct b43_dmaring *ring,
> > - struct sk_buff *skb)
> > + struct sk_buff **in_skb)
> > {
> > + struct sk_buff *skb = *in_skb;
> > const struct b43_dma_ops *ops = ring->ops;
> > struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
> > u8 *header;
> > @@ -1224,8 +1225,14 @@ static int dma_tx_fragment(struct b43_dm
> > }
> >
> > memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
> > + memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
> > + bounce_skb->dev = skb->dev;
> > + skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
> > + info = IEEE80211_SKB_CB(bounce_skb);
> > +
> > dev_kfree_skb_any(skb);
> > skb = bounce_skb;
> > + *in_skb = bounce_skb;
> > meta->skb = skb;
> > meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
> > if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> > @@ -1355,7 +1362,11 @@ int b43_dma_tx(struct b43_wldev *dev, st
> > * static, so we don't need to store it per frame. */
> > ring->queue_prio = skb_get_queue_mapping(skb);
> >
> > - err = dma_tx_fragment(ring, skb);
> > > + /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
> > > + * into the skb data or cb now. */
> > + hdr = NULL;
> > + info = NULL;
> > + err = dma_tx_fragment(ring, &skb);
> > if (unlikely(err == -ENOKEY)) {
> > /* Drop this packet, as we don't have the encryption key
> > * anymore and must not transmit it unencrypted. */
>

2009-11-02 23:42:50

by Christian Casteyde

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

Nothing to mention: it seems to work, at least on my hardware.
I got the "Allocated bounce buffer" log message, and managed to boot without any error,
associate, browse the web and ssh to another computer (and didn't get any
kmemcheck error, of course).
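
The approach tested here, going by the description in the quoted mail, keeps the original packet and hangs the bounce buffer off per-packet private data instead of swapping the packet. A rough userspace sketch of that ownership pattern follows. All names (toy_pkt, toy_priv, toy_tx_info, toy_tx, toy_tx_done) are invented for illustration, and the mapping_fails flag stands in for the DMA mapping check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver-private per-packet data. */
struct toy_priv {
    void *bouncebuffer;             /* NULL when no bounce buffer is in use */
};

/* Hypothetical stand-in for the per-packet TX info. */
struct toy_tx_info {
    int flags;
    union {
        void *driver_data[2];       /* generic scratch space for the driver */
        struct toy_priv priv;       /* this sketch's typed view of it */
    } u;
};

struct toy_pkt {
    struct toy_tx_info info;
    size_t len;
    unsigned char *data;
};

/*
 * "Transmit" p. If the payload cannot be mapped, copy it into a bounce
 * buffer owned by the packet's private data. The packet itself is never
 * replaced, so pointers the caller holds stay valid. *dma_src receives
 * the address the hardware would read from. Returns 0, or -1 on
 * allocation failure.
 */
static int toy_tx(struct toy_pkt *p, int mapping_fails, const void **dma_src)
{
    struct toy_priv *priv;

    assert(p && dma_src);
    priv = &p->info.u.priv;
    priv->bouncebuffer = NULL;
    *dma_src = p->data;

    if (mapping_fails) {
        priv->bouncebuffer = malloc(p->len ? p->len : 1);
        if (!priv->bouncebuffer)
            return -1;
        memcpy(priv->bouncebuffer, p->data, p->len);
        *dma_src = priv->bouncebuffer;
    }
    return 0;
}

/* Completion path: release the bounce buffer, if there was one. */
static void toy_tx_done(struct toy_pkt *p)
{
    struct toy_priv *priv;

    assert(p);
    priv = &p->info.u.priv;
    free(priv->bouncebuffer);       /* free(NULL) is a no-op */
    priv->bouncebuffer = NULL;
}
```

The design trade-off is that the completion handler now has to free the side buffer, but the caller no longer needs to worry about its packet pointer changing underneath it.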

CC

On Sunday 01 November 2009 16:28:34, Michael Buesch wrote:
> On Wednesday 28 October 2009 21:38:37 Christian Casteyde wrote:
> > I've just tested the patch posted in bugzilla: it works.
> > That is, I do not manage to get the warning anymore in 3 boots in a row.
>
> Can you try this patch (on top of the previous one), please?
> It should fix the issue correctly by removing the skb copying.
> While testing make sure the debugging message
> "Allocated bounce buffer"
> shows up in the kernel log.
>
>
>
> Index: wireless-testing/drivers/net/wireless/b43/dma.c
> ===================================================================
> --- wireless-testing.orig/drivers/net/wireless/b43/dma.c 2009-11-01 15:10:48.000000000 +0100
> +++ wireless-testing/drivers/net/wireless/b43/dma.c 2009-11-01 16:26:00.000000000 +0100
> @@ -1157,18 +1157,17 @@ struct b43_dmaring *parse_cookie(struct
> }
>
> static int dma_tx_fragment(struct b43_dmaring *ring,
> - struct sk_buff **in_skb)
> + struct sk_buff *skb)
> {
> - struct sk_buff *skb = *in_skb;
> const struct b43_dma_ops *ops = ring->ops;
> struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
> + struct b43_private_tx_info *priv_info = b43_get_priv_tx_info(info);
> u8 *header;
> int slot, old_top_slot, old_used_slots;
> int err;
> struct b43_dmadesc_generic *desc;
> struct b43_dmadesc_meta *meta;
> struct b43_dmadesc_meta *meta_hdr;
> - struct sk_buff *bounce_skb;
> u16 cookie;
> size_t hdrsize = b43_txhdr_size(ring->dev);
>
> @@ -1212,34 +1211,34 @@ static int dma_tx_fragment(struct b43_dm
>
> meta->skb = skb;
> meta->is_last_fragment = 1;
> + priv_info->bouncebuffer = NULL;
>
> meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
> /* create a bounce buffer in zone_dma on mapping failure. */
> if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> - bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
> - if (!bounce_skb) {
> +
> +{
> +static unsigned int count;
> +if (count++ < 10)
> + printk(KERN_DEBUG "Allocated bounce buffer\n");
> +}
> + priv_info->bouncebuffer = kmalloc(skb->len, GFP_ATOMIC | GFP_DMA);
> + if (!priv_info->bouncebuffer) {
> ring->current_slot = old_top_slot;
> ring->used_slots = old_used_slots;
> err = -ENOMEM;
> goto out_unmap_hdr;
> }
> + memcpy(priv_info->bouncebuffer, skb->data, skb->len);
>
> - memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
> - memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
> - bounce_skb->dev = skb->dev;
> - skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
> - info = IEEE80211_SKB_CB(bounce_skb);
> -
> - dev_kfree_skb_any(skb);
> - skb = bounce_skb;
> - *in_skb = bounce_skb;
> - meta->skb = skb;
> - meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
> + meta->dmaaddr = map_descbuffer(ring, priv_info->bouncebuffer, skb->len, 1);
> if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> + kfree(priv_info->bouncebuffer);
> + priv_info->bouncebuffer = NULL;
> ring->current_slot = old_top_slot;
> ring->used_slots = old_used_slots;
> err = -EIO;
> - goto out_free_bounce;
> + goto out_unmap_hdr;
> }
> }
>
> @@ -1256,8 +1255,6 @@ static int dma_tx_fragment(struct b43_dm
> ops->poke_tx(ring, next_slot(ring, slot));
> return 0;
>
> -out_free_bounce:
> - dev_kfree_skb_any(skb);
> out_unmap_hdr:
> unmap_descbuffer(ring, meta_hdr->dmaaddr,
> hdrsize, 1);
> @@ -1362,11 +1359,7 @@ int b43_dma_tx(struct b43_wldev *dev, st
> * static, so we don't need to store it per frame. */
> ring->queue_prio = skb_get_queue_mapping(skb);
>
> - /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
> - * into the skb data or cb now. */
> - hdr = NULL;
> - info = NULL;
> - err = dma_tx_fragment(ring, &skb);
> + err = dma_tx_fragment(ring, skb);
> if (unlikely(err == -ENOKEY)) {
> /* Drop this packet, as we don't have the encryption key
> * anymore and must not transmit it unencrypted. */
> @@ -1413,12 +1406,17 @@ void b43_dma_handle_txstatus(struct b43_
> B43_WARN_ON(!(slot >= 0 && slot < ring->nr_slots));
> desc = ops->idx2desc(ring, slot, &meta);
>
> - if (meta->skb)
> - unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len,
> - 1);
> - else
> + if (meta->skb) {
> + struct b43_private_tx_info *priv_info =
> + b43_get_priv_tx_info(IEEE80211_SKB_CB(meta->skb));
> +
> + unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len, 1);
> + kfree(priv_info->bouncebuffer);
> + priv_info->bouncebuffer = NULL;
> + } else {
> unmap_descbuffer(ring, meta->dmaaddr,
> b43_txhdr_size(dev), 1);
> + }
>
> if (meta->is_last_fragment) {
> struct ieee80211_tx_info *info;
> Index: wireless-testing/drivers/net/wireless/b43/xmit.h
> ===================================================================
> --- wireless-testing.orig/drivers/net/wireless/b43/xmit.h 2009-10-09 19:50:15.000000000 +0200
> +++ wireless-testing/drivers/net/wireless/b43/xmit.h 2009-11-01 16:05:46.000000000 +0100
> @@ -2,6 +2,8 @@
> #define B43_XMIT_H_
>
> #include "main.h"
> +#include <net/mac80211.h>
> +
>
> #define _b43_declare_plcp_hdr(size) \
> struct b43_plcp_hdr##size { \
> @@ -332,4 +334,21 @@ static inline u8 b43_kidx_to_raw(struct
> return raw_kidx;
> }
>
> +/* struct b43_private_tx_info - TX info private to b43.
> + * The structure is placed in (struct ieee80211_tx_info *)->rate_driver_data
> + *
> + * @bouncebuffer: DMA Bouncebuffer (if used)
> + */
> +struct b43_private_tx_info {
> + void *bouncebuffer;
> +};
> +
> +static inline struct b43_private_tx_info *
> +b43_get_priv_tx_info(struct ieee80211_tx_info *info)
> +{
> + BUILD_BUG_ON(sizeof(struct b43_private_tx_info) >
> + sizeof(info->rate_driver_data));
> + return (struct b43_private_tx_info *)info->rate_driver_data;
> +}
> +
> #endif /* B43_XMIT_H_ */
>

2009-11-03 14:10:33

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Tuesday 03 November 2009 00:43:26 Christian Casteyde wrote:
> Nothing to mention: it seems to work, at least on my hardware.
> I got the "Allocated bounce buffer" log message, and managed to boot without
> any error, associate, and access the web and ssh to another computer (and
> didn't get any kmemcheck error, of course).

Ok, cool. Thanks a lot for testing. I'll resubmit this patch for inclusion later.

--
Greetings, Michael.

2009-11-01 15:29:59

by Michael Büsch

Subject: Re: 2.6.32-rc5-git3: Reported regressions from 2.6.31

On Wednesday 28 October 2009 21:38:37 Christian Casteyde wrote:
> I've just tested the patch posted in bugzilla: it works.
> That is, I could no longer reproduce the warning in 3 boots in a row.

Can you try this patch (on top of the previous one), please?
It should fix the issue correctly by removing the skb copying.
While testing, make sure the debugging message
"Allocated bounce buffer"
shows up in the kernel log.
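
As a rough userspace illustration of the idea (not the driver code itself): the
frame keeps its original data pointer, and a private copy is made only when the
device cannot reach that data. Everything named here (struct frame,
reachable_fn, tx_prepare, tx_complete, the TX_ERR_* codes) is hypothetical, and
malloc/free merely stand in for kmalloc(..., GFP_DMA)/kfree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a TX frame; not the b43 or skb types. */
struct frame {
    unsigned char *data;  /* payload owned by the caller, never replaced */
    size_t len;
    void *bouncebuffer;   /* private copy, NULL when no bounce was needed */
};

/* Illustrative error codes (assumption: the real driver uses errno values). */
enum { TX_ERR_NOMEM = -1, TX_ERR_IO = -2 };

/* Returns nonzero if the pretend device can reach buf. */
typedef int (*reachable_fn)(const void *buf);

/* Test helper: exactly one buffer can be marked as unreachable. */
const void *unreachable_buf;

int reachable_unless_marked(const void *buf)
{
    return buf != unreachable_buf;
}

/* Choose the buffer the device should read from. The frame itself is
 * left untouched; only a side buffer is allocated on mapping failure. */
int tx_prepare(struct frame *f, reachable_fn reachable, const void **dma_src)
{
    assert(f && f->data && reachable && dma_src);

    f->bouncebuffer = NULL;
    *dma_src = NULL;

    if (reachable(f->data)) {
        *dma_src = f->data;
        return 0;
    }

    /* Mapping failed: copy the payload into a freshly allocated buffer. */
    f->bouncebuffer = malloc(f->len);
    if (!f->bouncebuffer)
        return TX_ERR_NOMEM;
    memcpy(f->bouncebuffer, f->data, f->len);

    if (!reachable(f->bouncebuffer)) {
        /* The copy is unusable too: release it and report the error. */
        free(f->bouncebuffer);
        f->bouncebuffer = NULL;
        return TX_ERR_IO;
    }

    *dma_src = f->bouncebuffer;
    return 0;
}

/* Completion path: drop the side buffer, if there was one. */
void tx_complete(struct frame *f)
{
    free(f->bouncebuffer);  /* free(NULL) is a no-op */
    f->bouncebuffer = NULL;
}
```

Because the caller's frame pointer never changes in this scheme, the caller has
no stale pointers to invalidate after the call.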



Index: wireless-testing/drivers/net/wireless/b43/dma.c
===================================================================
--- wireless-testing.orig/drivers/net/wireless/b43/dma.c 2009-11-01 15:10:48.000000000 +0100
+++ wireless-testing/drivers/net/wireless/b43/dma.c 2009-11-01 16:26:00.000000000 +0100
@@ -1157,18 +1157,17 @@ struct b43_dmaring *parse_cookie(struct
}

static int dma_tx_fragment(struct b43_dmaring *ring,
- struct sk_buff **in_skb)
+ struct sk_buff *skb)
{
- struct sk_buff *skb = *in_skb;
const struct b43_dma_ops *ops = ring->ops;
struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+ struct b43_private_tx_info *priv_info = b43_get_priv_tx_info(info);
u8 *header;
int slot, old_top_slot, old_used_slots;
int err;
struct b43_dmadesc_generic *desc;
struct b43_dmadesc_meta *meta;
struct b43_dmadesc_meta *meta_hdr;
- struct sk_buff *bounce_skb;
u16 cookie;
size_t hdrsize = b43_txhdr_size(ring->dev);

@@ -1212,34 +1211,34 @@ static int dma_tx_fragment(struct b43_dm

meta->skb = skb;
meta->is_last_fragment = 1;
+ priv_info->bouncebuffer = NULL;

meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
/* create a bounce buffer in zone_dma on mapping failure. */
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
- bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
- if (!bounce_skb) {
+
+{
+static unsigned int count;
+if (count++ < 10)
+ printk(KERN_DEBUG "Allocated bounce buffer\n");
+}
+ priv_info->bouncebuffer = kmalloc(skb->len, GFP_ATOMIC | GFP_DMA);
+ if (!priv_info->bouncebuffer) {
ring->current_slot = old_top_slot;
ring->used_slots = old_used_slots;
err = -ENOMEM;
goto out_unmap_hdr;
}
+ memcpy(priv_info->bouncebuffer, skb->data, skb->len);

- memcpy(skb_put(bounce_skb, skb->len), skb->data, skb->len);
- memcpy(bounce_skb->cb, skb->cb, sizeof(skb->cb));
- bounce_skb->dev = skb->dev;
- skb_set_queue_mapping(bounce_skb, skb_get_queue_mapping(skb));
- info = IEEE80211_SKB_CB(bounce_skb);
-
- dev_kfree_skb_any(skb);
- skb = bounce_skb;
- *in_skb = bounce_skb;
- meta->skb = skb;
- meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
+ meta->dmaaddr = map_descbuffer(ring, priv_info->bouncebuffer, skb->len, 1);
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
+ kfree(priv_info->bouncebuffer);
+ priv_info->bouncebuffer = NULL;
ring->current_slot = old_top_slot;
ring->used_slots = old_used_slots;
err = -EIO;
- goto out_free_bounce;
+ goto out_unmap_hdr;
}
}

@@ -1256,8 +1255,6 @@ static int dma_tx_fragment(struct b43_dm
ops->poke_tx(ring, next_slot(ring, slot));
return 0;

-out_free_bounce:
- dev_kfree_skb_any(skb);
out_unmap_hdr:
unmap_descbuffer(ring, meta_hdr->dmaaddr,
hdrsize, 1);
@@ -1362,11 +1359,7 @@ int b43_dma_tx(struct b43_wldev *dev, st
* static, so we don't need to store it per frame. */
ring->queue_prio = skb_get_queue_mapping(skb);

- /* dma_tx_fragment might reallocate the skb, so invalidate pointers pointing
- * into the skb data or cb now. */
- hdr = NULL;
- info = NULL;
- err = dma_tx_fragment(ring, &skb);
+ err = dma_tx_fragment(ring, skb);
if (unlikely(err == -ENOKEY)) {
/* Drop this packet, as we don't have the encryption key
* anymore and must not transmit it unencrypted. */
@@ -1413,12 +1406,17 @@ void b43_dma_handle_txstatus(struct b43_
B43_WARN_ON(!(slot >= 0 && slot < ring->nr_slots));
desc = ops->idx2desc(ring, slot, &meta);

- if (meta->skb)
- unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len,
- 1);
- else
+ if (meta->skb) {
+ struct b43_private_tx_info *priv_info =
+ b43_get_priv_tx_info(IEEE80211_SKB_CB(meta->skb));
+
+ unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len, 1);
+ kfree(priv_info->bouncebuffer);
+ priv_info->bouncebuffer = NULL;
+ } else {
unmap_descbuffer(ring, meta->dmaaddr,
b43_txhdr_size(dev), 1);
+ }

if (meta->is_last_fragment) {
struct ieee80211_tx_info *info;
Index: wireless-testing/drivers/net/wireless/b43/xmit.h
===================================================================
--- wireless-testing.orig/drivers/net/wireless/b43/xmit.h 2009-10-09 19:50:15.000000000 +0200
+++ wireless-testing/drivers/net/wireless/b43/xmit.h 2009-11-01 16:05:46.000000000 +0100
@@ -2,6 +2,8 @@
#define B43_XMIT_H_

#include "main.h"
+#include <net/mac80211.h>
+

#define _b43_declare_plcp_hdr(size) \
struct b43_plcp_hdr##size { \
@@ -332,4 +334,21 @@ static inline u8 b43_kidx_to_raw(struct
return raw_kidx;
}

+/* struct b43_private_tx_info - TX info private to b43.
+ * The structure is placed in (struct ieee80211_tx_info *)->rate_driver_data
+ *
+ * @bouncebuffer: DMA Bouncebuffer (if used)
+ */
+struct b43_private_tx_info {
+ void *bouncebuffer;
+};
+
+static inline struct b43_private_tx_info *
+b43_get_priv_tx_info(struct ieee80211_tx_info *info)
+{
+ BUILD_BUG_ON(sizeof(struct b43_private_tx_info) >
+ sizeof(info->rate_driver_data));
+ return (struct b43_private_tx_info *)info->rate_driver_data;
+}
+
#endif /* B43_XMIT_H_ */
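
For readers unfamiliar with the pattern used by b43_get_priv_tx_info() above,
here is a minimal userspace sketch of overlaying a driver-private structure on
a scratch area reserved inside a larger per-packet structure, with a
compile-time size check. The names struct tx_info, driver_data and
get_priv_tx_info are hypothetical stand-ins, the array size of 4 is an
arbitrary assumption, and C11 static_assert is used here in place of the
kernel's BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-packet info block owned by the stack. It reserves a
 * few pointer-sized slots for the driver (the size is an assumption). */
struct tx_info {
    unsigned int flags;
    void *driver_data[4];
};

/* Driver-private view of the scratch area. */
struct priv_tx_info {
    void *bouncebuffer;  /* side buffer, NULL when unused */
};

/* Refuse to compile if the private struct outgrows the scratch area. */
static_assert(sizeof(struct priv_tx_info) <=
              sizeof(((struct tx_info *)0)->driver_data),
              "priv_tx_info does not fit into driver_data");

/* Return the private view; no allocation, just a cast of the scratch area. */
struct priv_tx_info *get_priv_tx_info(struct tx_info *info)
{
    return (struct priv_tx_info *)info->driver_data;
}
```

The private data travels with the packet info, so the completion path can find
the side buffer again without any extra bookkeeping.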


--
Greetings, Michael.