2008-01-31 22:16:32

by Ingo Molnar

[permalink] [raw]
Subject: [bug] as_merged_requests(): possible recursive locking detected



Jens,

AS still has some locking issues - see the lockdep warning below that
the x86 test-rig just triggered. Config attached. Never saw this one
before. Can send more info if needed.

Ingo

---------->
udev: renamed network interface eth0_rename to eth1

=============================================
[ INFO: possible recursive locking detected ]
2.6.24 #183
---------------------------------------------
vol_id/1769 is trying to acquire lock:
(&ret->lock#2){.+..}, at: [<ffffffff8047afee>] as_merged_requests+0xa7/0x110

but task is already holding lock:
(&ret->lock#2){.+..}, at: [<ffffffff8047afe6>] as_merged_requests+0x9f/0x110

other info that might help us debug this:
2 locks held by vol_id/1769:
#0: (&q->__queue_lock){.+..}, at: [<ffffffff80473ffb>] __make_request+0x5f/0x3fd
#1: (&ret->lock#2){.+..}, at: [<ffffffff8047afe6>] as_merged_requests+0x9f/0x110

stack backtrace:
Pid: 1769, comm: vol_id Not tainted 2.6.24 #183

Call Trace:
[<ffffffff8024f2b9>] print_deadlock_bug+0xcb/0xd6
[<ffffffff8024f314>] check_deadlock+0x50/0x60
[<ffffffff80250c41>] validate_chain+0x1ed/0x289
[<ffffffff80251224>] __lock_acquire+0x547/0x608
[<ffffffff8047afee>] ? as_merged_requests+0xa7/0x110
[<ffffffff8025137e>] lock_acquire+0x99/0xc6
[<ffffffff8047afee>] ? as_merged_requests+0xa7/0x110
[<ffffffff8090004e>] _spin_lock+0x34/0x41
[<ffffffff8047afee>] as_merged_requests+0xa7/0x110
[<ffffffff80471184>] elv_merge_requests+0x28/0x51
[<ffffffff80476c1b>] attempt_merge+0xf5/0x14b
[<ffffffff80476cc4>] attempt_back_merge+0x27/0x30
[<ffffffff8047411c>] __make_request+0x180/0x3fd
[<ffffffff80472fb0>] generic_make_request+0x355/0x390
[<ffffffff802ac5ed>] ? create_empty_buffers+0xa0/0xa9
[<ffffffff804744ff>] submit_bio+0xfe/0x107
[<ffffffff802abfc4>] submit_bh+0xe7/0x10b
[<ffffffff802aefcd>] block_read_full_page+0x289/0x2a5
[<ffffffff802b1d7f>] ? blkdev_get_block+0x0/0x4c
[<ffffffff80268422>] ? add_to_page_cache+0xa1/0xd3
[<ffffffff802b0ef7>] blkdev_readpage+0x13/0x15
[<ffffffff8026f3fb>] read_pages+0x81/0xa1
[<ffffffff8026f5b0>] __do_page_cache_readahead+0x195/0x1b8
[<ffffffff80267fda>] ? find_get_page+0x58/0x64
[<ffffffff8026f7f4>] ondemand_readahead+0xa1/0x155
[<ffffffff8026f93b>] page_cache_sync_readahead+0x17/0x19
[<ffffffff80268c3f>] do_generic_mapping_read+0xa8/0x372
[<ffffffff80267d32>] ? file_read_actor+0x0/0x1ac
[<ffffffff80269f94>] generic_file_aio_read+0x125/0x164
[<ffffffff8028b9cc>] do_sync_read+0xeb/0x132
[<ffffffff80250416>] ? mark_held_locks+0x59/0x75
[<ffffffff8024549f>] ? autoremove_wake_function+0x0/0x38
[<ffffffff802515f9>] ? __lock_release+0x5b/0x64
[<ffffffff808fee67>] ? mutex_unlock+0x9/0xb
[<ffffffff808fee33>] ? __mutex_unlock_slowpath+0x10e/0x139
[<ffffffff802505d7>] ? trace_hardirqs_on+0xfe/0x128
[<ffffffff8028c0c4>] vfs_read+0xa4/0xe3
[<ffffffff8028c440>] sys_read+0x47/0x6f
[<ffffffff8020c10a>] system_call_after_swapgs+0x8a/0x8f

eth0: link down
ADDRCONF(NETDEV_UP): eth0: link is not ready
Adding 3911816k swap on /dev/sda2. Priority:-1 extents:1 across:3911816k
rc.sysinit used greatest stack depth: 3512 bytes left
device: 'vcs1': device_add


Attachments:
(No filename) (3.07 kB)
config (52.41 kB)
Download all attachments

2008-01-31 22:19:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected


* Ingo Molnar <[email protected]> wrote:

> Jens,
>
> AS still has some locking issues - see the lockdep warning below that
> the x86 test-rig just triggered. Config attached. Never saw this one
> before. Can send more info if needed.

btw., it didnt reproduce after a reboot, so it seems like a sporadic
condition.

Ingo

2008-02-01 08:41:37

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Thu, Jan 31 2008, Ingo Molnar wrote:
>
>
> Jens,
>
> AS still has some locking issues - see the lockdep warning below that
> the x86 test-rig just triggered. Config attached. Never saw this one
> before. Can send more info if needed.
>
> Ingo
>
> ---------->
> udev: renamed network interface eth0_rename to eth1
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.24 #183
> ---------------------------------------------
> vol_id/1769 is trying to acquire lock:
> (&ret->lock#2){.+..}, at: [<ffffffff8047afee>] as_merged_requests+0xa7/0x110
>
> but task is already holding lock:
> (&ret->lock#2){.+..}, at: [<ffffffff8047afe6>] as_merged_requests+0x9f/0x110
>
> other info that might help us debug this:
> 2 locks held by vol_id/1769:
> #0: (&q->__queue_lock){.+..}, at: [<ffffffff80473ffb>] __make_request+0x5f/0x3fd
> #1: (&ret->lock#2){.+..}, at: [<ffffffff8047afe6>] as_merged_requests+0x9f/0x110
>
> stack backtrace:
> Pid: 1769, comm: vol_id Not tainted 2.6.24 #183
>
> Call Trace:
> [<ffffffff8024f2b9>] print_deadlock_bug+0xcb/0xd6
> [<ffffffff8024f314>] check_deadlock+0x50/0x60
> [<ffffffff80250c41>] validate_chain+0x1ed/0x289
> [<ffffffff80251224>] __lock_acquire+0x547/0x608
> [<ffffffff8047afee>] ? as_merged_requests+0xa7/0x110
> [<ffffffff8025137e>] lock_acquire+0x99/0xc6
> [<ffffffff8047afee>] ? as_merged_requests+0xa7/0x110
> [<ffffffff8090004e>] _spin_lock+0x34/0x41
> [<ffffffff8047afee>] as_merged_requests+0xa7/0x110
> [<ffffffff80471184>] elv_merge_requests+0x28/0x51
> [<ffffffff80476c1b>] attempt_merge+0xf5/0x14b
> [<ffffffff80476cc4>] attempt_back_merge+0x27/0x30
> [<ffffffff8047411c>] __make_request+0x180/0x3fd
> [<ffffffff80472fb0>] generic_make_request+0x355/0x390
> [<ffffffff802ac5ed>] ? create_empty_buffers+0xa0/0xa9
> [<ffffffff804744ff>] submit_bio+0xfe/0x107
> [<ffffffff802abfc4>] submit_bh+0xe7/0x10b
> [<ffffffff802aefcd>] block_read_full_page+0x289/0x2a5
> [<ffffffff802b1d7f>] ? blkdev_get_block+0x0/0x4c
> [<ffffffff80268422>] ? add_to_page_cache+0xa1/0xd3
> [<ffffffff802b0ef7>] blkdev_readpage+0x13/0x15
> [<ffffffff8026f3fb>] read_pages+0x81/0xa1
> [<ffffffff8026f5b0>] __do_page_cache_readahead+0x195/0x1b8
> [<ffffffff80267fda>] ? find_get_page+0x58/0x64
> [<ffffffff8026f7f4>] ondemand_readahead+0xa1/0x155
> [<ffffffff8026f93b>] page_cache_sync_readahead+0x17/0x19
> [<ffffffff80268c3f>] do_generic_mapping_read+0xa8/0x372
> [<ffffffff80267d32>] ? file_read_actor+0x0/0x1ac
> [<ffffffff80269f94>] generic_file_aio_read+0x125/0x164
> [<ffffffff8028b9cc>] do_sync_read+0xeb/0x132
> [<ffffffff80250416>] ? mark_held_locks+0x59/0x75
> [<ffffffff8024549f>] ? autoremove_wake_function+0x0/0x38
> [<ffffffff802515f9>] ? __lock_release+0x5b/0x64
> [<ffffffff808fee67>] ? mutex_unlock+0x9/0xb
> [<ffffffff808fee33>] ? __mutex_unlock_slowpath+0x10e/0x139
> [<ffffffff802505d7>] ? trace_hardirqs_on+0xfe/0x128
> [<ffffffff8028c0c4>] vfs_read+0xa4/0xe3
> [<ffffffff8028c440>] sys_read+0x47/0x6f
> [<ffffffff8020c10a>] system_call_after_swapgs+0x8a/0x8f

Are you sure this triggered with the as fixup in place? It looks like
the same bug.

--
Jens Axboe

2008-02-01 09:59:46

by Nikanth Karthikesan

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Thu, 2008-01-31 at 23:14 +0100, Ingo Molnar wrote:
>
> Jens,
>
> AS still has some locking issues - see the lockdep warning below that
> the x86 test-rig just triggered. Config attached. Never saw this one
> before. Can send more info if needed.
>

The io_contexts are swapped. And while swapping, the locks were also
getting swapped, which will change the order of locking after that. This
may be the cause of these warning. I am not sure whether not swapping
the locks is the right way to fix this. Using a field of spinlock_t
itself to order locking might be better, instead of the address of the
container.

Now while adding a new member to io_context, one should not forget to
add it here. Also copying whole io_context and then restoring the locks
might have a window where this warning could be triggered.

Thanks
Nikanth Karthikesan

Do not swap locks while swapping io_contexts

Signed-off-by: Nikanth Karthikesan <[email protected]>

---
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 6d16755..b9c6e39 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -179,9 +179,32 @@ EXPORT_SYMBOL(copy_io_context);
void swap_io_context(struct io_context **ioc1, struct io_context **ioc2)
{
struct io_context *temp;
+
+ /*
+ * Do not swap the locks to preserve locking order
+ */
+
temp = *ioc1;
- *ioc1 = *ioc2;
- *ioc2 = temp;
+
+ (*ioc1)->refcount = (*ioc2)->refcount;
+ (*ioc1)->nr_tasks = (*ioc2)->nr_tasks;
+ (*ioc1)->ioprio = (*ioc2)->ioprio;
+ (*ioc1)->ioprio_changed = (*ioc2)->ioprio_changed;
+ (*ioc1)->last_waited = (*ioc2)->last_waited;
+ (*ioc1)->nr_batch_requests = (*ioc2)->nr_batch_requests;
+ (*ioc1)->aic = (*ioc2)->aic;
+ (*ioc1)->radix_root = (*ioc2)->radix_root;
+ (*ioc1)->ioc_data = (*ioc2)->ioc_data;
+
+ (*ioc2)->refcount = (temp)->refcount;
+ (*ioc2)->nr_tasks = (temp)->nr_tasks;
+ (*ioc2)->ioprio = (temp)->ioprio;
+ (*ioc2)->ioprio_changed = (temp)->ioprio_changed;
+ (*ioc2)->last_waited = (temp)->last_waited;
+ (*ioc2)->nr_batch_requests = (temp)->nr_batch_requests;
+ (*ioc2)->aic = (temp)->aic;
+ (*ioc2)->radix_root = (temp)->radix_root;
+ (*ioc2)->ioc_data = (temp)->ioc_data;
}
EXPORT_SYMBOL(swap_io_context);


2008-02-01 10:12:23

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Fri, Feb 01 2008, Nikanth Karthikesan wrote:
> On Thu, 2008-01-31 at 23:14 +0100, Ingo Molnar wrote:
> >
> > Jens,
> >
> > AS still has some locking issues - see the lockdep warning below that
> > the x86 test-rig just triggered. Config attached. Never saw this one
> > before. Can send more info if needed.
> >
>
> The io_contexts are swapped. And while swapping, the locks were also
> getting swapped, which will change the order of locking after that. This
> may be the cause of these warning. I am not sure whether not swapping
> the locks is the right way to fix this. Using a field of spinlock_t
> itself to order locking might be better, instead of the address of the
> container.
>
> Now while adding a new member to io_context, one should not forget to
> add it here. Also copying whole io_context and then restoring the locks
> might have a window where this warning could be triggered.

Oops, the locks should definitely be left alone. It's not just the
locking order, but also it would confuse lockdep.

> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 6d16755..b9c6e39 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -179,9 +179,32 @@ EXPORT_SYMBOL(copy_io_context);
> void swap_io_context(struct io_context **ioc1, struct io_context **ioc2)
> {
> struct io_context *temp;
> +
> + /*
> + * Do not swap the locks to preserve locking order
> + */
> +
> temp = *ioc1;
> - *ioc1 = *ioc2;
> - *ioc2 = temp;
> +
> + (*ioc1)->refcount = (*ioc2)->refcount;
> + (*ioc1)->nr_tasks = (*ioc2)->nr_tasks;
> + (*ioc1)->ioprio = (*ioc2)->ioprio;
> + (*ioc1)->ioprio_changed = (*ioc2)->ioprio_changed;
> + (*ioc1)->last_waited = (*ioc2)->last_waited;
> + (*ioc1)->nr_batch_requests = (*ioc2)->nr_batch_requests;
> + (*ioc1)->aic = (*ioc2)->aic;
> + (*ioc1)->radix_root = (*ioc2)->radix_root;
> + (*ioc1)->ioc_data = (*ioc2)->ioc_data;
> +
> + (*ioc2)->refcount = (temp)->refcount;
> + (*ioc2)->nr_tasks = (temp)->nr_tasks;
> + (*ioc2)->ioprio = (temp)->ioprio;
> + (*ioc2)->ioprio_changed = (temp)->ioprio_changed;
> + (*ioc2)->last_waited = (temp)->last_waited;
> + (*ioc2)->nr_batch_requests = (temp)->nr_batch_requests;
> + (*ioc2)->aic = (temp)->aic;
> + (*ioc2)->radix_root = (temp)->radix_root;
> + (*ioc2)->ioc_data = (temp)->ioc_data;
> }
> EXPORT_SYMBOL(swap_io_context);

Ugh, that's pretty horrible. How about moving the lock first in the
struct and just doing memcpy()? Still ugly, but better.

I think the right solution is to remove swap_io_context() and fix the io
context referencing in as-iosched.c instead.

--
Jens Axboe

2008-02-01 10:32:09

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Fri, Feb 01 2008, Jens Axboe wrote:
> On Fri, Feb 01 2008, Nikanth Karthikesan wrote:
> > On Thu, 2008-01-31 at 23:14 +0100, Ingo Molnar wrote:
> > >
> > > Jens,
> > >
> > > AS still has some locking issues - see the lockdep warning below that
> > > the x86 test-rig just triggered. Config attached. Never saw this one
> > > before. Can send more info if needed.
> > >
> >
> > The io_contexts are swapped. And while swapping, the locks were also
> > getting swapped, which will change the order of locking after that. This
> > may be the cause of these warning. I am not sure whether not swapping
> > the locks is the right way to fix this. Using a field of spinlock_t
> > itself to order locking might be better, instead of the address of the
> > container.
> >
> > Now while adding a new member to io_context, one should not forget to
> > add it here. Also copying whole io_context and then restoring the locks
> > might have a window where this warning could be triggered.
>
> Oops, the locks should definitely be left alone. It's not just the
> locking order, but also it would confuse lockdep.
>
> > diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> > index 6d16755..b9c6e39 100644
> > --- a/block/blk-ioc.c
> > +++ b/block/blk-ioc.c
> > @@ -179,9 +179,32 @@ EXPORT_SYMBOL(copy_io_context);
> > void swap_io_context(struct io_context **ioc1, struct io_context **ioc2)
> > {
> > struct io_context *temp;
> > +
> > + /*
> > + * Do not swap the locks to preserve locking order
> > + */
> > +
> > temp = *ioc1;
> > - *ioc1 = *ioc2;
> > - *ioc2 = temp;
> > +
> > + (*ioc1)->refcount = (*ioc2)->refcount;
> > + (*ioc1)->nr_tasks = (*ioc2)->nr_tasks;
> > + (*ioc1)->ioprio = (*ioc2)->ioprio;
> > + (*ioc1)->ioprio_changed = (*ioc2)->ioprio_changed;
> > + (*ioc1)->last_waited = (*ioc2)->last_waited;
> > + (*ioc1)->nr_batch_requests = (*ioc2)->nr_batch_requests;
> > + (*ioc1)->aic = (*ioc2)->aic;
> > + (*ioc1)->radix_root = (*ioc2)->radix_root;
> > + (*ioc1)->ioc_data = (*ioc2)->ioc_data;
> > +
> > + (*ioc2)->refcount = (temp)->refcount;
> > + (*ioc2)->nr_tasks = (temp)->nr_tasks;
> > + (*ioc2)->ioprio = (temp)->ioprio;
> > + (*ioc2)->ioprio_changed = (temp)->ioprio_changed;
> > + (*ioc2)->last_waited = (temp)->last_waited;
> > + (*ioc2)->nr_batch_requests = (temp)->nr_batch_requests;
> > + (*ioc2)->aic = (temp)->aic;
> > + (*ioc2)->radix_root = (temp)->radix_root;
> > + (*ioc2)->ioc_data = (temp)->ioc_data;
> > }
> > EXPORT_SYMBOL(swap_io_context);
>
> Ugh, that's pretty horrible. How about moving the lock first in the
> struct and just doing memcpy()? Still ugly, but better.
>
> I think the right solution is to remove swap_io_context() and fix the io
> context referencing in as-iosched.c instead.

IOW, the below. I don't know why Nick originally wanted to swap io
contexts for a rq <-> rq merge, there seems little (if any) benefit to
doing so.

diff --git a/block/as-iosched.c b/block/as-iosched.c
index 9603684..852803e 100644
--- a/block/as-iosched.c
+++ b/block/as-iosched.c
@@ -1266,22 +1266,8 @@ static void as_merged_requests(struct request_queue *q, struct request *req,
*/
if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
if (time_before(rq_fifo_time(next), rq_fifo_time(req))) {
- struct io_context *rioc = RQ_IOC(req);
- struct io_context *nioc = RQ_IOC(next);
-
list_move(&req->queuelist, &next->queuelist);
rq_set_fifo_time(req, rq_fifo_time(next));
- /*
- * Don't copy here but swap, because when anext is
- * removed below, it must contain the unused context
- */
- if (rioc != nioc) {
- double_spin_lock(&rioc->lock, &nioc->lock,
- rioc < nioc);
- swap_io_context(&rioc, &nioc);
- double_spin_unlock(&rioc->lock, &nioc->lock,
- rioc < nioc);
- }
}
}

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 6d16755..80245dc 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -176,15 +176,6 @@ void copy_io_context(struct io_context **pdst, struct io_context **psrc)
}
EXPORT_SYMBOL(copy_io_context);

-void swap_io_context(struct io_context **ioc1, struct io_context **ioc2)
-{
- struct io_context *temp;
- temp = *ioc1;
- *ioc1 = *ioc2;
- *ioc2 = temp;
-}
-EXPORT_SYMBOL(swap_io_context);
-
int __init blk_ioc_init(void)
{
iocontext_cachep = kmem_cache_create("blkdev_ioc",
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index baba233..bbe3cf4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -39,7 +39,6 @@ void exit_io_context(void);
struct io_context *get_io_context(gfp_t gfp_flags, int node);
struct io_context *alloc_io_context(gfp_t gfp_flags, int node);
void copy_io_context(struct io_context **pdst, struct io_context **psrc);
-void swap_io_context(struct io_context **ioc1, struct io_context **ioc2);

struct request;
typedef void (rq_end_io_fn)(struct request *, int);

--
Jens Axboe

2008-02-01 10:45:56

by Ingo Molnar

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected


* Jens Axboe <[email protected]> wrote:

> Are you sure this triggered with the as fixup in place? It looks like
> the same bug.

most definitely a separate bug.

Ingo

2008-02-01 10:47:59

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Fri, Feb 01 2008, Ingo Molnar wrote:
>
> * Jens Axboe <[email protected]> wrote:
>
> > Are you sure this triggered with the as fixup in place? It looks like
> > the same bug.
>
> most definitely a separate bug.

yeah, I didn't read it carefully enough. Nikanth found the reason.

--
Jens Axboe

2008-02-01 10:54:51

by Nikanth Karthikesan

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Fri, 2008-02-01 at 11:12 +0100, Jens Axboe wrote:
> On Fri, Feb 01 2008, Nikanth Karthikesan wrote:
> > On Thu, 2008-01-31 at 23:14 +0100, Ingo Molnar wrote:
> > >
> > > Jens,
> > >
> > > AS still has some locking issues - see the lockdep warning below that
> > > the x86 test-rig just triggered. Config attached. Never saw this one
> > > before. Can send more info if needed.
> > >
> >
> > The io_contexts are swapped. And while swapping, the locks were also
> > getting swapped, which will change the order of locking after that. This
> > may be the cause of these warning. I am not sure whether not swapping
> > the locks is the right way to fix this. Using a field of spinlock_t
> > itself to order locking might be better, instead of the address of the
> > container.
> >
> > Now while adding a new member to io_context, one should not forget to
> > add it here. Also copying whole io_context and then restoring the locks
> > might have a window where this warning could be triggered.
>
> Oops, the locks should definitely be left alone. It's not just the
> locking order, but also it would confuse lockdep.

Oops.. It may not need any locking at all! It seems that we just swap
the pointers and do not copy the struct at all! Sorry for confusing with
my patch. I am also confused, as the swap itself seems to do nothing!

Thanks
Nikanth Karthikesan

2008-02-01 11:01:56

by Ingo Molnar

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected


* Jens Axboe <[email protected]> wrote:

> On Fri, Feb 01 2008, Ingo Molnar wrote:
> >
> > * Jens Axboe <[email protected]> wrote:
> >
> > > Are you sure this triggered with the as fixup in place? It looks like
> > > the same bug.
> >
> > most definitely a separate bug.
>
> yeah, I didn't read it carefully enough. Nikanth found the reason.

/me processes his mbox some more and sees the mails

am i right that lockdep complained about real lockup potential here?
(i.e. it caught a real bug) So there's no need to change anything on the
lockdep side, right?

Ingo

2008-02-01 11:22:49

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Fri, Feb 01 2008, Ingo Molnar wrote:
>
> * Jens Axboe <[email protected]> wrote:
>
> > On Fri, Feb 01 2008, Ingo Molnar wrote:
> > >
> > > * Jens Axboe <[email protected]> wrote:
> > >
> > > > Are you sure this triggered with the as fixup in place? It looks like
> > > > the same bug.
> > >
> > > most definitely a separate bug.
> >
> > yeah, I didn't read it carefully enough. Nikanth found the reason.
>
> /me processes his mbox some more and sees the mails
>
> am i right that lockdep complained about real lockup potential here?
> (i.e. it caught a real bug) So there's no need to change anything on the
> lockdep side, right?

Right, no bug in lockdep, the locking code and swap_io_context() are
just screwed up.

--
Jens Axboe

2008-02-02 00:55:29

by Nick Piggin

[permalink] [raw]
Subject: Re: [bug] as_merged_requests(): possible recursive locking detected

On Friday 01 February 2008 21:31, Jens Axboe wrote:
> On Fri, Feb 01 2008, Jens Axboe wrote:

> > I think the right solution is to remove swap_io_context() and fix the io
> > context referencing in as-iosched.c instead.
>
> IOW, the below. I don't know why Nick originally wanted to swap io
> contexts for a rq <-> rq merge, there seems little (if any) benefit to
> doing so.

Yeah, I guess this patch is fine. Simpler is better.

>
> diff --git a/block/as-iosched.c b/block/as-iosched.c
> index 9603684..852803e 100644
> --- a/block/as-iosched.c
> +++ b/block/as-iosched.c
> @@ -1266,22 +1266,8 @@ static void as_merged_requests(struct request_queue
> *q, struct request *req, */
> if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
> if (time_before(rq_fifo_time(next), rq_fifo_time(req))) {
> - struct io_context *rioc = RQ_IOC(req);
> - struct io_context *nioc = RQ_IOC(next);
> -
> list_move(&req->queuelist, &next->queuelist);
> rq_set_fifo_time(req, rq_fifo_time(next));
> - /*
> - * Don't copy here but swap, because when anext is
> - * removed below, it must contain the unused context
> - */
> - if (rioc != nioc) {
> - double_spin_lock(&rioc->lock, &nioc->lock,
> - rioc < nioc);
> - swap_io_context(&rioc, &nioc);
> - double_spin_unlock(&rioc->lock, &nioc->lock,
> - rioc < nioc);
> - }
> }
> }
>
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 6d16755..80245dc 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -176,15 +176,6 @@ void copy_io_context(struct io_context **pdst, struct
> io_context **psrc) }
> EXPORT_SYMBOL(copy_io_context);
>
> -void swap_io_context(struct io_context **ioc1, struct io_context **ioc2)
> -{
> - struct io_context *temp;
> - temp = *ioc1;
> - *ioc1 = *ioc2;
> - *ioc2 = temp;
> -}
> -EXPORT_SYMBOL(swap_io_context);
> -
> int __init blk_ioc_init(void)
> {
> iocontext_cachep = kmem_cache_create("blkdev_ioc",
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index baba233..bbe3cf4 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -39,7 +39,6 @@ void exit_io_context(void);
> struct io_context *get_io_context(gfp_t gfp_flags, int node);
> struct io_context *alloc_io_context(gfp_t gfp_flags, int node);
> void copy_io_context(struct io_context **pdst, struct io_context **psrc);
> -void swap_io_context(struct io_context **ioc1, struct io_context **ioc2);
>
> struct request;
> typedef void (rq_end_io_fn)(struct request *, int);