2010-08-01 13:52:59

by Rafael J. Wysocki

Subject: 2.6.35-rc6-git6: Reported regressions from 2.6.34

This message contains a list of some regressions from 2.6.34,
for which there are no fixes in the mainline known to the tracking team.
If any of them have been fixed already, please let us know.

If you know of any other unresolved regressions from 2.6.34, please let us
know as well and we'll add them to the list. Also, please let us know
if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply
to this message with CCs to the people involved in reporting and handling
the issue.


Listed regressions statistics:

Date          Total  Pending  Unresolved
----------------------------------------
2010-08-01      100       27          23
2010-07-23       94       33          25
2010-07-09       79       45          37
2010-06-21       46       37          26
2010-06-09       15       13          10


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16462
Subject : unable to connect to AP on legal channels 12/13
Submitter : Daniel J Blueman <[email protected]>
Date : 2010-07-25 17:06 (8 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16458
Subject : Bluetooth disabled after resume
Submitter : AttilaN <[email protected]>
Date : 2010-07-25 09:33 (8 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16450
Subject : MTD drivers cannot be unloaded
Submitter : Ben Hutchings <[email protected]>
Date : 2010-07-24 00:17 (9 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16448
Subject : 2.6.35-rc5 panic at __br_deliver+0x64/0xe0 with kvm bridge networking
Submitter : [email protected]
Date : 2010-07-23 3:25 (10 days old)
Message-ID : <198123598.1050221279855515402.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
References : http://marc.info/?l=linux-kernel&m=127985553916052&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16423
Subject : netfilter/iptables stopped logging 2.6.35-rc
Submitter : [email protected]
Date : 2010-07-17 10:20 (16 days old)
Message-ID : <[email protected]>
References : http://lkml.indiana.edu/hypermail/linux/kernel/1007.2/00440.html


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16406
Subject : Badness with the kernel version 2.6.35-rc1-git1 running on P6 box
Submitter : divya <[email protected]>
Date : 2010-07-16 8:50 (17 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127927024906085&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16400
Subject : 2.6.35-rc5 inconsistent lock state
Submitter : Martin Pirker <[email protected]>
Date : 2010-07-14 20:33 (19 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127913961025267&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16399
Subject : perf failed with kernel 2.6.35-rc
Submitter : Zhang, Yanmin <[email protected]>
Date : 2010-07-13 8:14 (20 days old)
First-Bad-Commit: http://git.kernel.org/linus/1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127900880212470&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16396
Subject : [bisected] resume from suspend freezes system
Submitter : tomas m <[email protected]>
Date : 2010-07-15 02:32 (18 days old)
Handled-By : Rafael J. Wysocki <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16393
Subject : kernel BUG at fs/block_dev.c:765!
Submitter : Markus Trippelsdorf <[email protected]>
Date : 2010-07-14 13:52 (19 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127911564213748&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16383
Subject : Regression with e1000e from 2.6.34.1 to 2.6.35-rc5
Submitter : Stefan Behte <[email protected]>
Date : 2010-07-14 00:44 (19 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16380
Subject : Loop devices act strangely in 2.6.35
Submitter : Artem S. Tashkinov <[email protected]>
Date : 2010-07-13 23:21 (20 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16369
Subject : Yet another 2.6.35 regression (AGP)?
Submitter : Woody Suwalski <[email protected]>
Date : 2010-07-09 14:21 (24 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127868797119254&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16365
Subject : kernel BUG at fs/btrfs/extent-tree.c:1353
Submitter : Johannes Hirte <[email protected]>
Date : 2010-07-08 14:27 (25 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127859960725931&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16337
Subject : general protection fault: 0000 [#1] SMP
Submitter : Justin P. Mattock <[email protected]>
Date : 2010-07-03 22:59 (30 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127819798215589&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16322
Subject : WARNING: at /arch/x86/include/asm/processor.h:1005 read_measured_perf_ctrs+0x5a/0x70()
Submitter : boris64 <[email protected]>
Date : 2010-07-01 13:54 (32 days old)
Handled-By : H. Peter Anvin <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16307
Subject : i915 in kernel 2.6.35-rc3, high number of wakeups
Submitter : Enrico Bandiello <[email protected]>
Date : 2010-06-26 16:57 (37 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127757403404259&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16265
Subject : Why is kslowd accumulating so much CPU time?
Submitter : Theodore Ts'o <[email protected]>
Date : 2010-06-09 18:36 (54 days old)
First-Bad-Commit: http://git.kernel.org/linus/fbf81762e385d3d45acad057b654d56972acf58c
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127610857819033&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16228
Subject : BUG/boot failure on Dell Precision T3500 (pci/ahci_stop_engine)
Submitter : Brian Bloniarz <[email protected]>
Date : 2010-06-16 17:57 (47 days old)
Handled-By : Bjorn Helgaas <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16221
Subject : 2.6.35-rc2-git5 -- [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
Submitter : Miles Lane <[email protected]>
Date : 2010-06-11 20:31 (52 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127628828119623&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16215
Subject : sysfs: cannot create duplicate filename '/class/net/bnep0'
Submitter : Janusz Krzysztofik <[email protected]>
Date : 2010-06-15 14:55 (48 days old)
Handled-By : Eric W. Biederman <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16184
Subject : Container, X86-64, i386, iptables rule
Submitter : Jean-Marc Pigeon <[email protected]>
Date : 2010-06-12 04:17 (51 days old)
Handled-By : Patrick McHardy <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16173
Subject : After uncompressing the kernel, at boot time, the server hangs.
Submitter : David Hill <[email protected]>
Date : 2010-06-09 23:25 (54 days old)
First-Bad-Commit: http://git.kernel.org/linus/cf7500c0ea133d66f8449d86392d83f840102632
Handled-By : Eric W. Biederman <[email protected]>


Regressions with patches
------------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16405
Subject : Brightness Adjustment on Toshiba nb305 Netbooks is non-functional.
Submitter : John Mesmon <[email protected]>
Date : 2010-07-15 23:40 (18 days old)
First-Bad-Commit: http://git.kernel.org/linus/74a365b3f354fafc537efa5867deb7a9fadbfe27
Handled-By : Matthew Garrett <[email protected]>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=27236


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16312
Subject : WARNING: at fs/fs-writeback.c:1127 __mark_inode_dirty
Submitter : Zdenek Kabelac <[email protected]>
Date : 2010-06-28 9:40 (35 days old)
Message-ID : <[email protected]>
References : http://marc.info/?l=linux-kernel&m=127771804806465&w=2
Handled-By : Jan Kara <[email protected]>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=27272


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16310
Subject : arm omap invalid module format
Submitter : Robert Nelson <[email protected]>
Date : 2010-06-28 17:30 (35 days old)
First-Bad-Commit: http://git.kernel.org/linus/d0679c730395d0bde9a46939e7ba255b4ba7dd7c
Handled-By : Michal Marek <[email protected]>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=26999


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16278
Subject : lvm snapshot causes deadlock in 2.6.35
Submitter : Phillip Susi <[email protected]>
Date : 2010-06-23 16:55 (40 days old)
Handled-By : Eric Sandeen <[email protected]>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=26933


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.34,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=16055

Please let the tracking team know if there are any Bugzilla entries that
should be added to the list in there.

Thanks!



2010-08-04 15:59:22

by Tejun Heo

Subject: [PATCH RESEND block#for-2.6.36] block_dev: always serialize exclusive open attempts

bd_prepare_to_claim() incorrectly allowed multiple attempts for
exclusive open to progress in parallel if the attempting holders are
identical. This triggered BUG_ON() as reported in the following bug.

https://bugzilla.kernel.org/show_bug.cgi?id=16393

__bd_abort_claiming() is used to finish claiming blocks and doesn't
work if multiple openers are inside a claiming block. Allowing
multiple parallel open attempts to continue doesn't gain anything as
those are serialized down in the call chain anyway. Fix it by always
allowing only single open attempt in a claiming block.

This problem can easily be reproduced by adding a delay after
bd_prepare_to_claim() and attempting to mount two partitions of a
disk.

stable: only applicable to v2.6.35

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Markus Trippelsdorf <[email protected]>
Cc: [email protected]
---
Oops, had the wrong reported-by credit. Updated.

Thanks.

fs/block_dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 99d6af8..b3171fb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -681,8 +681,8 @@ retry:
 	if (!bd_may_claim(bdev, whole, holder))
 		return -EBUSY;
 
-	/* if someone else is claiming, wait for it to finish */
-	if (whole->bd_claiming && whole->bd_claiming != holder) {
+	/* if claiming is already in progress, wait for it to finish */
+	if (whole->bd_claiming) {
 		wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
 		DEFINE_WAIT(wait);


2010-08-05 09:23:47

by Markus Trippelsdorf

Subject: Re: [PATCH RESEND block#for-2.6.36] block_dev: always serialize exclusive open attempts

On Thu, Aug 05, 2010 at 11:02:43AM +0200, Jens Axboe wrote:
> On 2010-08-04 17:59, Tejun Heo wrote:
> > bd_prepare_to_claim() incorrectly allowed multiple attempts for
> > exclusive open to progress in parallel if the attempting holders are
> > identical. This triggered BUG_ON() as reported in the following bug.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=16393
> >
> > __bd_abort_claiming() is used to finish claiming blocks and doesn't
> > work if multiple openers are inside a claiming block. Allowing
> > multiple parallel open attempts to continue doesn't gain anything as
> > those are serialized down in the call chain anyway. Fix it by always
> > allowing only single open attempt in a claiming block.
> >
> > This problem can easily be reproduced by adding a delay after
> > bd_prepare_to_claim() and attempting to mount two partitions of a
> > disk.
> >
> > stable: only applicable to v2.6.35
> >
> > Signed-off-by: Tejun Heo <[email protected]>
> > Reported-by: Markus Trippelsdorf <[email protected]>
> > Cc: [email protected]
>
> Thanks Tejun, applied.

It's already in mainline:
e75aa85892b2ee78c79edac720868cbef16e62eb

--
"A man who doesn't know he is in prison can never escape."
William S. Burroughs

2010-08-01 21:38:56

by Rafael J. Wysocki

Subject: Re: 2.6.35-rc6-git6: Reported regressions from 2.6.34

On Sunday, August 01, 2010, Linus Torvalds wrote:
> On Sun, Aug 1, 2010 at 6:46 AM, Rafael J. Wysocki <[email protected]> wrote:
...
>
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16369
> > Subject : Yet another 2.6.35 regression (AGP)?
> > Submitter : Woody Suwalski <[email protected]>
> > Date : 2010-07-09 14:21 (24 days old)
> > Message-ID : <[email protected]>
> > References : http://marc.info/?l=linux-kernel&m=127868797119254&w=2
>
> Should hopefully be fixed by commit e7b96f28c58c ("agp/intel: Use the
> correct mask to detect i830 aperture size.")

Closed.

> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16365
> > Subject : kernel BUG at fs/btrfs/extent-tree.c:1353
> > Submitter : Johannes Hirte <[email protected]>
> > Date : 2010-07-08 14:27 (25 days old)
> > Message-ID : <[email protected]>
> > References : http://marc.info/?l=linux-kernel&m=127859960725931&w=2
>
> This one is reportedly fixed by commit 83ba7b071f30 ("writeback:
> simplify the write back thread queue")

Closed.

> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16215
> > Subject : sysfs: cannot create duplicate filename '/class/net/bnep0'
> > Submitter : Janusz Krzysztofik <[email protected]>
> > Date : 2010-06-15 14:55 (48 days old)
> > Handled-By : Eric W. Biederman <[email protected]>
>
> Fixed by commit 24b1442d01ae155ea716dfb94ed21605541c317d.

Closed.

Thanks,
Rafael

2010-08-01 19:38:43

by Larry Finger

Subject: Re: 2.6.35-rc6-git6: Reported regressions from 2.6.34

On 08/01/2010 08:46 AM, Rafael J. Wysocki wrote:
> This message contains a list of some regressions from 2.6.34,
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 2.6.34, please let us
> know as well and we'll add them to the list. Also, please let us know
> if any of the entries below are invalid.
>
> Each entry from the list will be sent additionally in an automatic reply
> to this message with CCs to the people involved in reporting and handling
> the issue.

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16312
> Subject : WARNING: at fs/fs-writeback.c:1127 __mark_inode_dirty
> Submitter : Zdenek Kabelac <[email protected]>
> Date : 2010-06-28 9:40 (35 days old)
> Message-ID : <[email protected]>
> References : http://marc.info/?l=linux-kernel&m=127771804806465&w=2
> Handled-By : Jan Kara <[email protected]>
> Patch : https://bugzilla.kernel.org/attachment.cgi?id=27272

I am beginning to think that Bug 16312 is not the same as Bug 16122. Even with
the patches from 16312, I still get warnings as below:

[ 11.728776] ------------[ cut here ]------------
[ 11.728787] WARNING: at fs/fs-writeback.c:964 __mark_inode_dirty+0x10f/0x1a0()
[ 11.728790] Hardware name: HP Pavilion dv2700 Notebook PC
[ 11.728792] Modules linked in: loop(+) dm_mod ide_cd_mod cdrom
snd_hda_codec_conexant ide_pci_generic arc4 ecb b43 rng_core mac80211
snd_hda_intel r8712u(C) cfg80211 snd_hda_codec amd74xx snd_pcm sg ide_core
rfkill led_class snd_timer ssb mmc_core pcmcia snd joydev k8temp hwmon
i2c_nforce2 pcmcia_core forcedeth serio_raw snd_page_alloc i2c_core battery ac
button ext4 mbcache jbd2 crc16 ohci_hcd sd_mod ehci_hcd usbcore fan processor
ahci libahci libata scsi_mod thermal
[ 11.728854] Pid: 2449, comm: udisks-part-id Tainted: G C
2.6.35-rc6-realtek+ #15
[ 11.728857] Call Trace:
[ 11.728865] [<ffffffff8104608a>] warn_slowpath_common+0x7a/0xb0
[ 11.728869] [<ffffffff810460d5>] warn_slowpath_null+0x15/0x20
[ 11.728874] [<ffffffff81129d5f>] __mark_inode_dirty+0x10f/0x1a0
[ 11.728879] [<ffffffff8111e07d>] touch_atime+0x12d/0x170
[ 11.728885] [<ffffffff810cab91>] generic_file_aio_read+0x5c1/0x720
[ 11.728890] [<ffffffff81107ca2>] do_sync_read+0xd2/0x110
[ 11.728896] [<ffffffff81077e7d>] ? trace_hardirqs_on+0xd/0x10
[ 11.728900] [<ffffffff811083c3>] vfs_read+0xb3/0x170
[ 11.728906] [<ffffffff81002d1c>] ? sysret_check+0x27/0x62
[ 11.728909] [<ffffffff811084cc>] sys_read+0x4c/0x80
[ 11.728914] [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
[ 11.728917] ---[ end trace 32e16cacad33229f ]---
[ 11.728919] bdi-block not registered

The warnings do not occur with every boot and appear to be some kind of race
condition.

Larry

2010-08-02 16:33:29

by Tejun Heo

Subject: Re: 2.6.35-rc6-git6: Reported regressions from 2.6.34

Hello, Linus.

On 08/01/2010 08:01 PM, Linus Torvalds wrote:
> This has a proposed patch. I don't know what the status of it is, though. Jens?
>
> http://marc.info/?l=linux-kernel&m=127950018204029&w=2
>
>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16393
>> Subject : kernel BUG at fs/block_dev.c:765!
>> Submitter : Markus Trippelsdorf <[email protected]>
>> Date : 2010-07-14 13:52 (19 days old)
>> Message-ID : <[email protected]>
>> References : http://marc.info/?l=linux-kernel&m=127911564213748&w=2
>
> This one is interesting. And I think I perhaps see where it's coming from.
>
> bd_start_claiming() (through bd_prepare_to_claim()) has two separate
> success cases: either there was no holder (bd_claiming is NULL) or the
> new holder was already claiming it (bd_claiming == holder).
>
> Note in particular the case of the holder _already_ holding it. What happens is:
>
> - bd_start_claiming() succeeds because we had _already_ claimed it
> with the same holder
>
> - then some error happens, and we call bd_abort_claiming(), which
> does whole->bd_claiming = NULL;
>
> - the original holder thinks it still holds the bd, but it has been released!
>
> - a new claimer comes in, and succeeds because bd_claiming is now NULL.
>
> - we now have two "owners" of the bd, but bd_claiming only points to
> the second one.
>
> I think bd_start_claiming() needs to do some kind of refcount for the
> nested holder case, and bd_abort_claiming() needs to decrement the
> refcount and only clear the bd_claiming field when it goes down to
> zero.
>
> I dunno. Maybe there's something else going on, but it does look
> suspicious, and the above would explain the BUG_ON().

Yeah, that definitely sounds plausible. I think the condition check
in bd_prepare_to_claim() should have been "if (whole->bd_claiming)"
instead of "if (whole->bd_claiming && whole->bd_claiming != holder)".
It doesn't make much sense to allow multiple parallel claiming
operations anyway and the comment above already says - "This function
fails if @bdev is already claimed by another holder and waits if
another claiming is in progress."
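
The difference between the two conditions can be sketched in plain
userspace C. This is only an illustration: must_wait_old() and
must_wait_new() are hypothetical helper names, not kernel functions, and
the arguments stand in for whole->bd_claiming and the holder of the new
attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helpers, for illustration only. Each returns true when a
 * new claim attempt would have to wait for a claim already in progress.
 */

/* Old check: wait only if a *different* holder is currently claiming. */
bool must_wait_old(const void *bd_claiming, const void *holder)
{
	return bd_claiming && bd_claiming != holder;
}

/* Proposed check: wait whenever any claim is in progress. */
bool must_wait_new(const void *bd_claiming, const void *holder)
{
	(void)holder;	/* the holder's identity no longer matters */
	return bd_claiming != NULL;
}
```

The two helpers disagree only when bd_claiming is non-NULL and equal to
holder, which is the case where the old check let a second attempt
proceed in parallel.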

I'll try to build a test case and verify it.

Thank you.

--
tejun

2010-08-05 09:21:04

by Jens Axboe

Subject: Re: [PATCH RESEND block#for-2.6.36] block_dev: always serialize exclusive open attempts

On 2010-08-05 11:17, Markus Trippelsdorf wrote:
> On Thu, Aug 05, 2010 at 11:02:43AM +0200, Jens Axboe wrote:
>> On 2010-08-04 17:59, Tejun Heo wrote:
>>> bd_prepare_to_claim() incorrectly allowed multiple attempts for
>>> exclusive open to progress in parallel if the attempting holders are
>>> identical. This triggered BUG_ON() as reported in the following bug.
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=16393
>>>
>>> __bd_abort_claiming() is used to finish claiming blocks and doesn't
>>> work if multiple openers are inside a claiming block. Allowing
>>> multiple parallel open attempts to continue doesn't gain anything as
>>> those are serialized down in the call chain anyway. Fix it by always
>>> allowing only single open attempt in a claiming block.
>>>
>>> This problem can easily be reproduced by adding a delay after
>>> bd_prepare_to_claim() and attempting to mount two partitions of a
>>> disk.
>>>
>>> stable: only applicable to v2.6.35
>>>
>>> Signed-off-by: Tejun Heo <[email protected]>
>>> Reported-by: Markus Trippelsdorf <[email protected]>
>>> Cc: [email protected]
>>
>> Thanks Tejun, applied.
>
> It's already in mainline:
> e75aa85892b2ee78c79edac720868cbef16e62eb

Irk, had not noticed yet, my for-2.6.36 branch isn't fully merged
up yet. Thanks for the heads-up.

--
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.

2010-08-01 18:01:39

by Linus Torvalds

Subject: Re: 2.6.35-rc6-git6: Reported regressions from 2.6.34

On Sun, Aug 1, 2010 at 6:46 AM, Rafael J. Wysocki <[email protected]> wrote:
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16400
> Subject : 2.6.35-rc5 inconsistent lock state
> Submitter : Martin Pirker <[email protected]>
> Date : 2010-07-14 20:33 (19 days old)
> Message-ID : <[email protected]>
> References : http://marc.info/?l=linux-kernel&m=127913961025267&w=2

This has a proposed patch. I don't know what the status of it is, though. Jens?

http://marc.info/?l=linux-kernel&m=127950018204029&w=2

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16393
> Subject : kernel BUG at fs/block_dev.c:765!
> Submitter : Markus Trippelsdorf <[email protected]>
> Date : 2010-07-14 13:52 (19 days old)
> Message-ID : <[email protected]>
> References : http://marc.info/?l=linux-kernel&m=127911564213748&w=2

This one is interesting. And I think I perhaps see where it's coming from.

bd_start_claiming() (through bd_prepare_to_claim()) has two separate
success cases: either there was no holder (bd_claiming is NULL) or the
new holder was already claiming it (bd_claiming == holder).

Note in particular the case of the holder _already_ holding it. What happens is:

- bd_start_claiming() succeeds because we had _already_ claimed it
with the same holder

- then some error happens, and we call bd_abort_claiming(), which
does whole->bd_claiming = NULL;

- the original holder thinks it still holds the bd, but it has been released!

- a new claimer comes in, and succeeds because bd_claiming is now NULL.

- we now have two "owners" of the bd, but bd_claiming only points to
the second one.
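
The sequence above can be replayed in a small userspace model. This is a
sketch under assumptions, not kernel code: struct model_bdev,
old_start_claiming(), model_abort_claiming() and replay_old_sequence()
are hypothetical names, there is no locking or sleeping, and "waiting"
is reduced to returning false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field of interest. */
struct model_bdev {
	const void *bd_claiming;	/* holder inside a claiming block */
};

/* Old rule: succeed if nobody is claiming or the same holder is. */
static bool old_start_claiming(struct model_bdev *whole, const void *holder)
{
	if (whole->bd_claiming && whole->bd_claiming != holder)
		return false;	/* the real code would sleep and retry */
	whole->bd_claiming = holder;
	return true;
}

/* Abort clears the marker unconditionally, as in the second step. */
static void model_abort_claiming(struct model_bdev *whole)
{
	whole->bd_claiming = NULL;
}

/* Replays the steps; returns how many claimers believe they hold it. */
int replay_old_sequence(void)
{
	struct model_bdev whole = { .bd_claiming = NULL };
	int fs_a, fs_b;		/* only their addresses are used, as holders */
	int in_flight = 0;

	in_flight += old_start_claiming(&whole, &fs_a);	/* original claimer */
	in_flight += old_start_claiming(&whole, &fs_a);	/* same holder again */

	model_abort_claiming(&whole);	/* second attempt hits an error */
	in_flight--;

	in_flight += old_start_claiming(&whole, &fs_b);	/* new claimer gets in */
	return in_flight;
}
```

In this model replay_old_sequence() ends with two claimers in flight
while bd_claiming records only the last one.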

I think bd_start_claiming() needs to do some kind of refcount for the
nested holder case, and bd_abort_claiming() needs to decrement the
refcount and only clear the bd_claiming field when it goes down to
zero.
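
A rough userspace sketch of that refcount idea follows. It is purely
illustrative and is not the fix that was later posted in this thread,
which serializes claim attempts instead. The claim_depth field and all
rc_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical model: the claiming marker plus a nesting count. */
struct model_bdev_rc {
	const void *bd_claiming;	/* holder inside a claiming block */
	unsigned int claim_depth;	/* nested claims by that holder */
};

/* Returns 0 on success, -1 where the real code would have to wait. */
int rc_start_claiming(struct model_bdev_rc *whole, const void *holder)
{
	if (whole->bd_claiming && whole->bd_claiming != holder)
		return -1;
	whole->bd_claiming = holder;
	whole->claim_depth++;
	return 0;
}

/* Drop one level; clear the marker only when the count reaches zero. */
void rc_abort_claiming(struct model_bdev_rc *whole)
{
	if (whole->claim_depth == 0)
		return;		/* nothing to abort */
	if (--whole->claim_depth == 0)
		whole->bd_claiming = NULL;
}

/* Returns 1 if another holder is still kept out after a nested abort. */
int rc_nested_abort_keeps_claim(void)
{
	struct model_bdev_rc whole = { NULL, 0 };
	int fs_a, fs_b;		/* only their addresses are used, as holders */

	rc_start_claiming(&whole, &fs_a);	/* original claimer */
	rc_start_claiming(&whole, &fs_a);	/* nested, same holder */
	rc_abort_claiming(&whole);		/* nested attempt fails */

	return rc_start_claiming(&whole, &fs_b) == -1 &&
	       whole.bd_claiming == &fs_a;
}
```

With the count in place, the abort of the nested attempt leaves the
original claim visible, so a different holder is still turned away.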

I dunno. Maybe there's something else going on, but it does look
suspicious, and the above would explain the BUG_ON().

Tejun, Jens?

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16369
> Subject : Yet another 2.6.35 regression (AGP)?
> Submitter : Woody Suwalski <[email protected]>
> Date : 2010-07-09 14:21 (24 days old)
> Message-ID : <[email protected]>
> References : http://marc.info/?l=linux-kernel&m=127868797119254&w=2

Should hopefully be fixed by commit e7b96f28c58c ("agp/intel: Use the
correct mask to detect i830 aperture size.")

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16365
> Subject : kernel BUG at fs/btrfs/extent-tree.c:1353
> Submitter : Johannes Hirte <[email protected]>
> Date : 2010-07-08 14:27 (25 days old)
> Message-ID : <[email protected]>
> References : http://marc.info/?l=linux-kernel&m=127859960725931&w=2

This one is reportedly fixed by commit 83ba7b071f30 ("writeback:
simplify the write back thread queue")

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16215
> Subject : sysfs: cannot create duplicate filename '/class/net/bnep0'
> Submitter : Janusz Krzysztofik <[email protected]>
> Date : 2010-06-15 14:55 (48 days old)
> Handled-By : Eric W. Biederman <[email protected]>

Fixed by commit 24b1442d01ae155ea716dfb94ed21605541c317d.

Linus

2010-08-05 09:16:26

by Jens Axboe

Subject: Re: [PATCH RESEND block#for-2.6.36] block_dev: always serialize exclusive open attempts

On 2010-08-04 17:59, Tejun Heo wrote:
> bd_prepare_to_claim() incorrectly allowed multiple attempts for
> exclusive open to progress in parallel if the attempting holders are
> identical. This triggered BUG_ON() as reported in the following bug.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16393
>
> __bd_abort_claiming() is used to finish claiming blocks and doesn't
> work if multiple openers are inside a claiming block. Allowing
> multiple parallel open attempts to continue doesn't gain anything as
> those are serialized down in the call chain anyway. Fix it by always
> allowing only single open attempt in a claiming block.
>
> This problem can easily be reproduced by adding a delay after
> bd_prepare_to_claim() and attempting to mount two partitions of a
> disk.
>
> stable: only applicable to v2.6.35
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: Markus Trippelsdorf <[email protected]>
> Cc: [email protected]

Thanks Tejun, applied.

--
Jens Axboe



2010-08-04 15:40:43

by Tejun Heo

Subject: [PATCH block#for-2.6.36] block_dev: always serialize exclusive open attempts

bd_prepare_to_claim() incorrectly allowed multiple attempts for
exclusive open to progress in parallel if the attempting holders are
identical. This triggered BUG_ON() as reported in the following bug.

https://bugzilla.kernel.org/show_bug.cgi?id=16393

__bd_abort_claiming() is used to finish claiming blocks and doesn't
work if multiple openers are inside a claiming block. Allowing
multiple parallel open attempts to continue doesn't gain anything as
those are serialized down in the call chain anyway. Fix it by always
allowing only single open attempt in a claiming block.

This problem can easily be reproduced by adding a delay after
bd_prepare_to_claim() and attempting to mount two partitions of a
disk.

stable: only applicable to v2.6.35

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Maciej Rutecki <[email protected]>
Cc: [email protected]
---
fs/block_dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 99d6af8..b3171fb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -681,8 +681,8 @@ retry:
 	if (!bd_may_claim(bdev, whole, holder))
 		return -EBUSY;
 
-	/* if someone else is claiming, wait for it to finish */
-	if (whole->bd_claiming && whole->bd_claiming != holder) {
+	/* if claiming is already in progress, wait for it to finish */
+	if (whole->bd_claiming) {
 		wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
 		DEFINE_WAIT(wait);