2009-04-01 21:23:32

by Alexander Beregalov

[permalink] [raw]
Subject: 2.6.29-git: cannot mount ext4/loop

Hi Theodore, Jens

kernel is 2.6.29-07099-g8b53ef3

Mount failed:

EXT4-fs: barriers enabled
kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
max (524288) in inode #2, offset=0
EXT4-fs: get root inode failed
EXT4-fs (device loop0): mount failed

=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
mount/1865 is trying to release lock (&lo->lo_ctl_mutex) at:
[<00000000006fa1c0>] mutex_unlock+0x10/0x20
but there are no more locks to release!

other info that might help us debug this:
1 lock held by mount/1865:
#0: (&bdev->bd_mutex){+.+.+.}, at: [<00000000004dd570>]
__blkdev_put+0x1c/0x14c

stack backtrace:
Call Trace:
[0000000000477538] print_unlock_inbalance_bug+0xe8/0xf8
[0000000000477624] lock_release_non_nested+0xdc/0x290
[0000000000477984] lock_release+0x1ac/0x1d8
[00000000006fa120] __mutex_unlock_slowpath+0xe8/0x178
[00000000006fa1c0] mutex_unlock+0x10/0x20
[0000000000621e38] lo_release+0x70/0x80
[00000000004dd5fc] __blkdev_put+0xa8/0x14c
[00000000004dd6b8] blkdev_put+0x18/0x2c
[00000000004dd704] blkdev_close+0x38/0x4c
[00000000004b6f0c] __fput+0xfc/0x1e4
[00000000004b701c] fput+0x28/0x38
[00000000004b4198] filp_close+0x74/0x88
[00000000004529e0] put_files_struct+0x9c/0xfc
[0000000000452a70] exit_files+0x30/0x40
[000000000045406c] do_exit+0x160/0x734
[00000000004546cc] do_group_exit+0x8c/0xc4


2009-04-01 22:54:02

by Alexander Beregalov

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
> Hi Theodore, Jens
>
> kernel is 2.6.29-07099-g8b53ef3
>
> Mount failed:
>
> EXT4-fs: barriers enabled
> kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
> EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
> max (524288) in inode #2, offset=0
> EXT4-fs: get root inode failed
> EXT4-fs (device loop0): mount failed
>
> =====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> mount/1865 is trying to release lock (&lo->lo_ctl_mutex) at:
> [<00000000006fa1c0>] mutex_unlock+0x10/0x20
> but there are no more locks to release!
>
> other info that might help us debug this:
> 1 lock held by mount/1865:
> #0: (&bdev->bd_mutex){+.+.+.}, at: [<00000000004dd570>]
> __blkdev_put+0x1c/0x14c
>
> stack backtrace:
> Call Trace:
> [0000000000477538] print_unlock_inbalance_bug+0xe8/0xf8
> [0000000000477624] lock_release_non_nested+0xdc/0x290
> [0000000000477984] lock_release+0x1ac/0x1d8
> [00000000006fa120] __mutex_unlock_slowpath+0xe8/0x178
> [00000000006fa1c0] mutex_unlock+0x10/0x20
> [0000000000621e38] lo_release+0x70/0x80
> [00000000004dd5fc] __blkdev_put+0xa8/0x14c
> [00000000004dd6b8] blkdev_put+0x18/0x2c
> [00000000004dd704] blkdev_close+0x38/0x4c
> [00000000004b6f0c] __fput+0xfc/0x1e4
> [00000000004b701c] fput+0x28/0x38
> [00000000004b4198] filp_close+0x74/0x88
> [00000000004529e0] put_files_struct+0x9c/0xfc
> [0000000000452a70] exit_files+0x30/0x40
> [000000000045406c] do_exit+0x160/0x734
> [00000000004546cc] do_group_exit+0x8c/0xc4

From: Alexander Beregalov <[email protected]>
Subject: [PATCH] loop: mutex already unlocked in loop_clr_fd()

mount/1865 is trying to release lock (&lo->lo_ctl_mutex) at:
but there are no more locks to release!

mutex is already unlocked in loop_clr_fd(), we should not
try to unlock it in lo_release() again.

Signed-off-by: Alexander Beregalov <[email protected]>
---

drivers/block/loop.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 40b17d3..ddae808 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1431,6 +1431,7 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
static int lo_release(struct gendisk *disk, fmode_t mode)
{
struct loop_device *lo = disk->private_data;
+ int err;

mutex_lock(&lo->lo_ctl_mutex);

@@ -1442,7 +1443,9 @@ static int lo_release(struct gendisk *disk, fmode_t mode)
* In autoclear mode, stop the loop thread
* and remove configuration after last close.
*/
- loop_clr_fd(lo, NULL);
+ err = loop_clr_fd(lo, NULL);
+ if (!err)
+ goto out_unlocked;
} else {
/*
* Otherwise keep thread (if running) and config,
@@ -1453,7 +1456,7 @@ static int lo_release(struct gendisk *disk, fmode_t mode)

out:
mutex_unlock(&lo->lo_ctl_mutex);

2009-04-02 05:54:05

by Theodore Ts'o

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

On Thu, Apr 02, 2009 at 02:53:51AM +0400, Alexander Beregalov wrote:
> On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
> > Hi Theodore, Jens
> >
> > kernel is 2.6.29-07099-g8b53ef3
> >
> > Mount failed:
> >
> > EXT4-fs: barriers enabled
> > kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
> > EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
> > max (524288) in inode #2, offset=0
> > EXT4-fs: get root inode failed
> > EXT4-fs (device loop0): mount failed

I don't get this failure in a Linux 2.6 kernel based off of 5d80f8e5
with the )now upstream) ext4 patches merged in. I'm guessing sometime
between v2.6.29-3652-g5d80f8e and 2.6.29-07099-g8b53ef3 a regression
was introduced into the loop driver?

- Ted

2009-04-02 07:42:20

by Alexander Beregalov

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

2009/4/2 Theodore Tso <[email protected]>:
> On Thu, Apr 02, 2009 at 02:53:51AM +0400, Alexander Beregalov wrote:
>> On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
>> > Hi Theodore, Jens
>> >
>> > kernel is 2.6.29-07099-g8b53ef3
>> >
>> > Mount failed:
>> >
>> > EXT4-fs: barriers enabled
>> > kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
>> > EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
>> > max (524288) in inode #2, offset=0
>> > EXT4-fs: get root inode failed
>> > EXT4-fs (device loop0): mount failed
>
> I don't get this failure in a Linux 2.6 kernel based off of 5d80f8e5
> with the )now upstream) ext4 patches merged in.  I'm guessing sometime
> between v2.6.29-3652-g5d80f8e and 2.6.29-07099-g8b53ef3 a regression
> was introduced into the loop driver?

It is sparc host again.
2.6.29-06608-g15f7176 worked fine.
There was one commit to loop.c:
53d66608 (loop: add ioctl to resize a loop device)
and ext4 git pull.

I wil try to revert it and probably bisect.

2009-04-02 11:39:28

by Alexander Beregalov

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

2009/4/2 Alexander Beregalov <[email protected]>:
> 2009/4/2 Theodore Tso <[email protected]>:
>> On Thu, Apr 02, 2009 at 02:53:51AM +0400, Alexander Beregalov wrote:
>>> On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
>>> > Hi Theodore, Jens
>>> >
>>> > kernel is 2.6.29-07099-g8b53ef3
>>> >
>>> > Mount failed:
>>> >
>>> > EXT4-fs: barriers enabled
>>> > kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
>>> > EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
>>> > max (524288) in inode #2, offset=0
>>> > EXT4-fs: get root inode failed
>>> > EXT4-fs (device loop0): mount failed
>>
>> I don't get this failure in a Linux 2.6 kernel based off of 5d80f8e5
>> with the )now upstream) ext4 patches merged in.  I'm guessing sometime
>> between v2.6.29-3652-g5d80f8e and 2.6.29-07099-g8b53ef3 a regression
>> was introduced into the loop driver?
>
> It is sparc host again.
> 2.6.29-06608-g15f7176 worked fine.
> There was one commit to loop.c:
> 53d66608 (loop: add ioctl to resize a loop device)
> and ext4 git pull.
>
> I wil try to revert it and probably bisect.

Revert 53d66608 does not help.
Bisecting...

2009-04-02 12:47:08

by Alexander Beregalov

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

2009/4/2 Alexander Beregalov <[email protected]>:
> 2009/4/2 Alexander Beregalov <[email protected]>:
>> 2009/4/2 Theodore Tso <[email protected]>:
>>> On Thu, Apr 02, 2009 at 02:53:51AM +0400, Alexander Beregalov wrote:
>>>> On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
>>>> > Hi Theodore, Jens
>>>> >
>>>> > kernel is 2.6.29-07099-g8b53ef3
>>>> >
>>>> > Mount failed:
>>>> >
>>>> > EXT4-fs: barriers enabled
>>>> > kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
>>>> > EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
>>>> > max (524288) in inode #2, offset=0
>>>> > EXT4-fs: get root inode failed
>>>> > EXT4-fs (device loop0): mount failed
>>>
>>> I don't get this failure in a Linux 2.6 kernel based off of 5d80f8e5
>>> with the )now upstream) ext4 patches merged in.  I'm guessing sometime
>>> between v2.6.29-3652-g5d80f8e and 2.6.29-07099-g8b53ef3 a regression
>>> was introduced into the loop driver?
>>
>> It is sparc host again.
>> 2.6.29-06608-g15f7176 worked fine.
>> There was one commit to loop.c:
>> 53d66608 (loop: add ioctl to resize a loop device)
>> and ext4 git pull.
>>
>> I wil try to revert it and probably bisect.
>
> Revert 53d66608 does not help.
> Bisecting...

Done.
fe2c8191faa29d7a09f4962198f6dfab973ceec4 is first bad commit
commit fe2c8191faa29d7a09f4962198f6dfab973ceec4
Author: Thiemo Nagel <[email protected]>
Date: Tue Mar 31 08:36:10 2009 -0400

ext4: add checks of block references for non-extent inodes

2009-04-02 13:30:35

by Thiemo Nagel

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

Alexander Beregalov wrote:
> 2009/4/2 Alexander Beregalov <[email protected]>:
>> 2009/4/2 Alexander Beregalov <[email protected]>:
>>> 2009/4/2 Theodore Tso <[email protected]>:
>>>> On Thu, Apr 02, 2009 at 02:53:51AM +0400, Alexander Beregalov wrote:
>>>>> On Thu, Apr 02, 2009 at 01:23:28AM +0400, Alexander Beregalov wrote:
>>>>>> Hi Theodore, Jens
>>>>>>
>>>>>> kernel is 2.6.29-07099-g8b53ef3
>>>>>>
>>>>>> Mount failed:
>>>>>>
>>>>>> EXT4-fs: barriers enabled
>>>>>> kjournald2 starting: pid 1867, dev loop0:8, commit interval 5 seconds
>>>>>> EXT4-fs error (device loop0): ext4_iget: block reference 2703228928 >=
>>>>>> max (524288) in inode #2, offset=0
>>>>>> EXT4-fs: get root inode failed
>>>>>> EXT4-fs (device loop0): mount failed

This message indicates that the inode contains a reference to a block
outside the filesystem at inode->i_data[0].

When I added the block range checks, initially I was assuming that
when EXTENTS_FL is not set, the inode->i_data *always* contains
references to further blocks. Ted showed me wrong and added the condition

ISREG() || ISDIR() || ( ISLNK() && !is_fast_symlink() )

before that assumption can be made. But maybe we need some further
restraints?

Kind regards,

Thiemo

2009-04-02 14:54:45

by Theodore Ts'o

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

On Thu, Apr 02, 2009 at 03:30:26PM +0200, Thiemo Nagel wrote:
> When I added the block range checks, initially I was assuming that
> when EXTENTS_FL is not set, the inode->i_data *always* contains
> references to further blocks. Ted showed me wrong and added the condition
>
> ISREG() || ISDIR() || ( ISLNK() && !is_fast_symlink() )
>
> before that assumption can be made. But maybe we need some further
> restraints?

It's a endian-problem; we're missing le32_to_cpu() in that patch.
Sparc is big-endian.

Alexander, thanks for pointing this out. We'll have to get this fixed
and pushed to Linus ASAP.

- Ted

2009-04-02 15:18:43

by Thiemo Nagel

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

Theodore Tso wrote:
> On Thu, Apr 02, 2009 at 03:30:26PM +0200, Thiemo Nagel wrote:
>> When I added the block range checks, initially I was assuming that
>> when EXTENTS_FL is not set, the inode->i_data *always* contains
>> references to further blocks. Ted showed me wrong and added the condition
>>
>> ISREG() || ISDIR() || ( ISLNK() && !is_fast_symlink() )
>>
>> before that assumption can be made. But maybe we need some further
>> restraints?
>
> It's a endian-problem; we're missing le32_to_cpu() in that patch.
> Sparc is big-endian.

Sorry for that.

Thiemo

---
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 98e289a..ec3555d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -377,11 +377,11 @@ static int __ext4_check_blockref(const char
*function, struct inode *inode,
unsigned int maxblocks =
ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es);
unsigned int *bref = p;
while (bref < p+max) {
- if (unlikely(*bref >= maxblocks)) {
+ if (unlikely(le32_to_cpu(*bref) >= maxblocks)) {
ext4_error(inode->i_sb, function,
"block reference %u >= max (%u) "
"in inode #%lu, offset=%d",
- *bref, maxblocks,
+ le32_to_cpu(*bref), maxblocks,
inode->i_ino, (int)(bref-p));
return -EIO;
}

2009-04-02 15:41:23

by Theodore Ts'o

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

On Thu, Apr 02, 2009 at 05:18:39PM +0200, Thiemo Nagel wrote:
> Theodore Tso wrote:
>> On Thu, Apr 02, 2009 at 03:30:26PM +0200, Thiemo Nagel wrote:
>>> When I added the block range checks, initially I was assuming that
>>> when EXTENTS_FL is not set, the inode->i_data *always* contains
>>> references to further blocks. Ted showed me wrong and added the condition
>>>
>>> ISREG() || ISDIR() || ( ISLNK() && !is_fast_symlink() )
>>>
>>> before that assumption can be made. But maybe we need some further
>>> restraints?
>>
>> It's a endian-problem; we're missing le32_to_cpu() in that patch.
>> Sparc is big-endian.
>
> Sorry for that.

Could you also fix the types? bref should have a type of __le32, not
unsigned int, and when you pass in the reference to
__ext4_check_blockref(), there was an inappropriate cast to unsigned
int which hid kernel's natural type checking to catch these sorts of
problems.

I haven't had time yet to check your other patches; could you also
take a quick scan to make sure we have all of the byte-swapping calls
needed for proper big-endian checking, that we're using the correct
__le32 types and not doing any casts? I should have caught this when
I did my review of your patches, so this is also partially my fault.

Many thanks,

- Ted

2009-04-02 17:05:12

by Thiemo Nagel

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 98e289a..849e099 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -372,16 +372,16 @@ static int ext4_block_to_path(struct inode *inode,
}

static int __ext4_check_blockref(const char *function, struct inode *inode,
- unsigned int *p, unsigned int max) {
+ __le32 *p, unsigned int max) {

unsigned int maxblocks = ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es);
- unsigned int *bref = p;
+ __le32 *bref = p;
while (bref < p+max) {
- if (unlikely(*bref >= maxblocks)) {
+ if (unlikely(le32_to_cpu(*bref) >= maxblocks)) {
ext4_error(inode->i_sb, function,
"block reference %u >= max (%u) "
"in inode #%lu, offset=%d",
- *bref, maxblocks,
+ le32_to_cpu(*bref), maxblocks,
inode->i_ino, (int)(bref-p));
return -EIO;
}
@@ -392,7 +392,7 @@ static int __ext4_check_blockref(const char *function, struct inode *inode,


#define ext4_check_indirect_blockref(inode, bh) \
- __ext4_check_blockref(__func__, inode, (__le32 *)(bh)->b_data, \
+ __ext4_check_blockref(__func__, inode, (bh)->b_data, \
EXT4_ADDR_PER_BLOCK((inode)->i_sb))

#define ext4_check_inode_blockref(inode) \


Attachments:
le32_fix (1.25 kB)

2009-04-02 18:29:09

by Alexander Beregalov

[permalink] [raw]
Subject: Re: 2.6.29-git: cannot mount ext4/loop

2009/4/2 Thiemo Nagel <[email protected]>:
> Theodore Tso wrote:
>>
>> On Thu, Apr 02, 2009 at 05:18:39PM +0200, Thiemo Nagel wrote:
>>>
>>> Theodore Tso wrote:
>>>>
>>>> On Thu, Apr 02, 2009 at 03:30:26PM +0200, Thiemo Nagel wrote:
>>>>>
>>>>> When I added the block range checks, initially I was assuming that
>>>>> when EXTENTS_FL is not set, the inode->i_data *always* contains
>>>>> references to further blocks.  Ted showed me wrong and added the
>>>>> condition
>>>>>
>>>>>        ISREG() || ISDIR() || ( ISLNK() && !is_fast_symlink() )
>>>>>
>>>>> before that assumption can be made.  But maybe we need some further
>>>>> restraints?
>>>>
>>>> It's a endian-problem; we're missing le32_to_cpu() in that patch.
>>>> Sparc is big-endian.
>>>
>>> Sorry for that.
>>
>> Could you also fix the types?  bref should have a type of __le32, not
>> unsigned int, and when you pass in the reference to
>> __ext4_check_blockref(), there was an inappropriate cast to unsigned
>> int which hid kernel's natural type checking to catch these sorts of
>> problems.
>
> So I was really asking for things to go wrong...  :-(
> I hope the attached patch handles conversion and types in the right way.
>  It's compile-tested only, the current ext4 tree crashes my machine.

It works fine, thanks.

Reported-by: Alexander Beregalov <[email protected]>
Tested-by: Alexander Beregalov <[email protected]>