2008-02-13 13:54:22

by Zdenek Kabelac

[permalink] [raw]
Subject: losetup INFO: possible circular locking dependency detected

Hi

I'm getting this INFO message from losetup I'm using in my scripts:
if more info is needed from my side just ask.

(haven't seen this with 2.6.24..., using T61, 64bit kernel)

Bye

Zdenek

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-rc1 #29
-------------------------------------------------------
losetup/26595 is trying to acquire lock:
(&bdev->bd_mutex){--..}, at: [<ffffffff810e6258>] __blkdev_put+0x38/0x1e0

but task is already holding lock:
(&lo->lo_ctl_mutex){--..}, at: [<ffffffff883d097e>] lo_ioctl+0x4e/0xaf0 [loop]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&lo->lo_ctl_mutex){--..}:
[<ffffffff81066f80>] __lock_acquire+0xf70/0x1000
[<ffffffff883cf384>] lo_open+0x34/0x50 [loop]
[<ffffffff810670b8>] lock_acquire+0xa8/0xf0
[<ffffffff883cf384>] lo_open+0x34/0x50 [loop]
[<ffffffff812f17ac>] mutex_lock_nested+0xcc/0x370
[<ffffffff883cf384>] lo_open+0x34/0x50 [loop]
[<ffffffff883cf384>] lo_open+0x34/0x50 [loop]
[<ffffffff810e67c9>] do_open+0xa9/0x350
[<ffffffff812f3b80>] _spin_unlock+0x30/0x60
[<ffffffff810e6aad>] blkdev_open+0x3d/0x90
[<ffffffff810b61b0>] __dentry_open+0x110/0x280
[<ffffffff810e6a70>] blkdev_open+0x0/0x90
[<ffffffff810b63e4>] nameidata_to_filp+0x44/0x60
[<ffffffff810b644a>] do_filp_open+0x4a/0x50
[<ffffffff810b603d>] get_unused_fd_flags+0xed/0x140
[<ffffffff810b64c6>] do_sys_open+0x76/0x100
[<ffffffff810b657b>] sys_open+0x1b/0x20
[<ffffffff8100c8ae>] tracesys+0xdc/0xe1
[<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (&bdev->bd_mutex){--..}:
[<ffffffff81066e8d>] __lock_acquire+0xe7d/0x1000
[<ffffffff810658d9>] trace_hardirqs_on+0x139/0x1a0
[<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
[<ffffffff810670b8>] lock_acquire+0xa8/0xf0
[<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
[<ffffffff812f17ac>] mutex_lock_nested+0xcc/0x370
[<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
[<ffffffff810e640b>] blkdev_put+0xb/0x10
[<ffffffff810e6438>] blkdev_close+0x28/0x40
[<ffffffff810b9530>] __fput+0xd0/0x1b0
[<ffffffff810b962d>] fput+0x1d/0x30
[<ffffffff883d091d>] loop_clr_fd+0x1ad/0x1c0 [loop]
[<ffffffff883d0a9f>] lo_ioctl+0x16f/0xaf0 [loop]
[<ffffffff810cccdf>] __d_lookup+0xef/0x180
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff8106634d>] __lock_acquire+0x33d/0x1000
[<ffffffff8106634d>] __lock_acquire+0x33d/0x1000
[<ffffffff810639a6>] get_lock_stats+0x16/0x60
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff810639a6>] get_lock_stats+0x16/0x60
[<ffffffff810639fe>] put_lock_stats+0xe/0x30
[<ffffffff812f2049>] mutex_unlock+0x9/0x10
[<ffffffff8116731d>] blkdev_driver_ioctl+0x8d/0xa0
[<ffffffff811675ad>] blkdev_ioctl+0x27d/0x7e0
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff8106634d>] __lock_acquire+0x33d/0x1000
[<ffffffff81014908>] native_sched_clock+0x98/0xa0
[<ffffffff810639a6>] get_lock_stats+0x16/0x60
[<ffffffff810e5a7b>] block_ioctl+0x1b/0x20
[<ffffffff810c67b1>] vfs_ioctl+0x31/0xa0
[<ffffffff810c6aa3>] do_vfs_ioctl+0x283/0x2f0
[<ffffffff810c6ba9>] sys_ioctl+0x99/0xa0
[<ffffffff8100c8ae>] tracesys+0xdc/0xe1
[<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

1 lock held by losetup/26595:
#0: (&lo->lo_ctl_mutex){--..}, at: [<ffffffff883d097e>]
lo_ioctl+0x4e/0xaf0 [loop]

stack backtrace:
Pid: 26595, comm: losetup Not tainted 2.6.25-rc1 #29

Call Trace:
[<ffffffff81065d54>] print_circular_bug_tail+0x84/0x90
[<ffffffff81066e8d>] __lock_acquire+0xe7d/0x1000
[<ffffffff810658d9>] ? trace_hardirqs_on+0x139/0x1a0
[<ffffffff810e6258>] ? __blkdev_put+0x38/0x1e0
[<ffffffff810670b8>] lock_acquire+0xa8/0xf0
[<ffffffff810e6258>] ? __blkdev_put+0x38/0x1e0
[<ffffffff812f17ac>] mutex_lock_nested+0xcc/0x370
[<ffffffff810e6258>] ? __blkdev_put+0x38/0x1e0
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
[<ffffffff810e640b>] blkdev_put+0xb/0x10
[<ffffffff810e6438>] blkdev_close+0x28/0x40
[<ffffffff810b9530>] __fput+0xd0/0x1b0
[<ffffffff810b962d>] fput+0x1d/0x30
[<ffffffff883d091d>] :loop:loop_clr_fd+0x1ad/0x1c0
[<ffffffff883d0a9f>] :loop:lo_ioctl+0x16f/0xaf0
[<ffffffff810cccdf>] ? __d_lookup+0xef/0x180
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff8106634d>] ? __lock_acquire+0x33d/0x1000
[<ffffffff8106634d>] ? __lock_acquire+0x33d/0x1000
[<ffffffff810639a6>] ? get_lock_stats+0x16/0x60
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff810639a6>] ? get_lock_stats+0x16/0x60
[<ffffffff810639fe>] ? put_lock_stats+0xe/0x30
[<ffffffff812f2049>] ? mutex_unlock+0x9/0x10
[<ffffffff8116731d>] blkdev_driver_ioctl+0x8d/0xa0
[<ffffffff811675ad>] blkdev_ioctl+0x27d/0x7e0
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff8106634d>] ? __lock_acquire+0x33d/0x1000
[<ffffffff81014908>] ? native_sched_clock+0x98/0xa0
[<ffffffff810639a6>] ? get_lock_stats+0x16/0x60
[<ffffffff810e5a7b>] block_ioctl+0x1b/0x20
[<ffffffff810c67b1>] vfs_ioctl+0x31/0xa0
[<ffffffff810c6aa3>] do_vfs_ioctl+0x283/0x2f0
[<ffffffff810c6ba9>] sys_ioctl+0x99/0xa0
[<ffffffff8100c8ae>] tracesys+0xdc/0xe1


2008-02-13 17:47:27

by Jiri Kosina

[permalink] [raw]
Subject: Re: losetup INFO: possible circular locking dependency detected

On Wed, 13 Feb 2008, Zdenek Kabelac wrote:

> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.25-rc1 #29
> -------------------------------------------------------
> losetup/26595 is trying to acquire lock:
> (&bdev->bd_mutex){--..}, at: [<ffffffff810e6258>] __blkdev_put+0x38/0x1e0
> but task is already holding lock:
> (&lo->lo_ctl_mutex){--..}, at: [<ffffffff883d097e>] lo_ioctl+0x4e/0xaf0 [loop]
> which lock already depends on the new lock.

Seems to me that this is valid and there indeed is a AB-BA deadlock in
loop code.

(BTW when looking at this, there seem to be other locking problems in
loop.c, the code for example doesn't seem to be consistent whether
accessing lo->lo_state needs to be protected by lo->lo_lock or not).

What about this ugly fix?



From: Jiri Kosina <[email protected]>

loop - fix deadlock against block

A: B:
bdev_open() for ino X
(locks bd_mutex for bdev Y)
lo_ioctl(LOOP_CLR_FD) for ino X
(locks lo_ctl_mutex)
fput()
__fput()
blkdev_close()
(hangs on bd_mutex for bdev Y)
lo_open()
(hangs on lo_ctl_mutex)

Fix this by letting releasing the lock inside loop_clr_fd() after the
loopback structure has been completely deinitialized, but before calling
final fput().

Signed-off-by: Jiri Kosina <[email protected]>

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 91ebb00..eb9e091 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -881,6 +881,7 @@ static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev)
struct file *filp = lo->lo_backing_file;
gfp_t gfp = lo->old_gfp_mask;

+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_state != Lo_bound)
return -ENXIO;

@@ -916,6 +917,7 @@ static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev)
bd_set_size(bdev, 0);
mapping_set_gfp_mask(filp->f_mapping, gfp);
lo->lo_state = Lo_unbound;
+ mutex_unlock(&lo->lo_ctl_mutex);
fput(filp);
/* This is safe: open() is still holding a reference. */
module_put(THIS_MODULE);
@@ -1143,8 +1145,11 @@ static int lo_ioctl(struct inode * inode, struct file * file,
err = loop_change_fd(lo, file, inode->i_bdev, arg);
break;
case LOOP_CLR_FD:
+ /* loop_clr_fd must do the locking itself, so that it
+ * doesn't deadlock with bdev */
+ mutex_unlock(&lo->lo_ctl_mutex);
err = loop_clr_fd(lo, inode->i_bdev);
- break;
+ goto out_unlocked;
case LOOP_SET_STATUS:
err = loop_set_status_old(lo, (struct loop_info __user *) arg);
break;
@@ -1160,6 +1165,7 @@ static int lo_ioctl(struct inode * inode, struct file * file,
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
+out_unlocked:
mutex_unlock(&lo->lo_ctl_mutex);
return err;
}

2008-02-13 20:22:40

by Jiri Kosina

[permalink] [raw]
Subject: Re: losetup INFO: possible circular locking dependency detected

On Wed, 13 Feb 2008, Jiri Kosina wrote:

> From: Jiri Kosina <[email protected]>
>
> loop - fix deadlock against block
>
> A: B:
> bdev_open() for ino X
> (locks bd_mutex for bdev Y)
> lo_ioctl(LOOP_CLR_FD) for ino X
> (locks lo_ctl_mutex)
> fput()
> __fput()
> blkdev_close()
> (hangs on bd_mutex for bdev Y)
> lo_open()
> (hangs on lo_ctl_mutex)
>
> Fix this by letting releasing the lock inside loop_clr_fd() after the
> loopback structure has been completely deinitialized, but before calling
> final fput().
>
> Signed-off-by: Jiri Kosina <[email protected]>

OK, the previous one was buggy, could you please try this one instead?

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 91ebb00..bfb2e90 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -881,6 +881,7 @@ static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev)
struct file *filp = lo->lo_backing_file;
gfp_t gfp = lo->old_gfp_mask;

+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_state != Lo_bound)
return -ENXIO;

@@ -916,6 +917,7 @@ static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev)
bd_set_size(bdev, 0);
mapping_set_gfp_mask(filp->f_mapping, gfp);
lo->lo_state = Lo_unbound;
+ mutex_unlock(&lo->lo_ctl_mutex);
fput(filp);
/* This is safe: open() is still holding a reference. */
module_put(THIS_MODULE);
@@ -1143,8 +1145,11 @@ static int lo_ioctl(struct inode * inode, struct file * file,
err = loop_change_fd(lo, file, inode->i_bdev, arg);
break;
case LOOP_CLR_FD:
+ /* loop_clr_fd must do the locking itself, so that it
+ * doesn't deadlock with bdev */
+ mutex_unlock(&lo->lo_ctl_mutex);
err = loop_clr_fd(lo, inode->i_bdev);
- break;
+ goto out_unlocked;
case LOOP_SET_STATUS:
err = loop_set_status_old(lo, (struct loop_info __user *) arg);
break;
@@ -1161,6 +1166,7 @@ static int lo_ioctl(struct inode * inode, struct file * file,
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
mutex_unlock(&lo->lo_ctl_mutex);
+out_unlocked:
return err;
}