Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752060Ab2E2Bic (ORCPT ); Mon, 28 May 2012 21:38:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60337 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751760Ab2E2Bia (ORCPT ); Mon, 28 May 2012 21:38:30 -0400 From: Asias He To: Jens Axboe , Tejun Heo Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Asias He Subject: [PATCH V3] block: Mitigate lock unbalance caused by lock switching Date: Tue, 29 May 2012 09:39:01 +0800 Message-Id: <1338255542-22247-1-git-send-email-asias@redhat.com> In-Reply-To: <20120528102214.GB15202@dhcp-172-17-108-109.mtv.corp.google.com> References: <20120528102214.GB15202@dhcp-172-17-108-109.mtv.corp.google.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4490 Lines: 110 Commit 777eb1bf15b8532c396821774bf6451e563438f5 disconnects externally supplied queue_lock before blk_drain_queue(). Switching the lock would introduce lock unbalance because theads which have taken the external lock might unlock the internal lock in the during the queue drain. This patch mitigate this by disconnecting the lock after the queue draining since queue draining makes a lot of request_queue users go away. However, please note, this patch only makes the problem less likely to happen. Anyone who still holds a ref might try to issue a new request on a dead queue after the blk_cleanup_queue() finishes draining, the lock unbalance might still happen in this case. ===================================== [ BUG: bad unlock balance detected! ] 3.4.0+ #288 Not tainted ------------------------------------- fio/17706 is trying to release lock (&(&q->__queue_lock)->rlock) at: [] blk_queue_bio+0x2a2/0x380 but there are no more locks to release! other info that might help us debug this: 1 lock held by fio/17706: #0: (&(&vblk->lock)->rlock){......}, at: [] get_request_wait+0x19a/0x250 stack backtrace: Pid: 17706, comm: fio Not tainted 3.4.0+ #288 Call Trace: [] ? blk_queue_bio+0x2a2/0x380 [] print_unlock_inbalance_bug+0xf9/0x100 [] lock_release_non_nested+0x1df/0x330 [] ? dio_bio_end_aio+0x34/0xc0 [] ? bio_check_pages_dirty+0x85/0xe0 [] ? dio_bio_end_aio+0xb1/0xc0 [] ? blk_queue_bio+0x2a2/0x380 [] ? blk_queue_bio+0x2a2/0x380 [] lock_release+0xd9/0x250 [] _raw_spin_unlock_irq+0x23/0x40 [] blk_queue_bio+0x2a2/0x380 [] generic_make_request+0xca/0x100 [] submit_bio+0x76/0xf0 [] ? set_page_dirty_lock+0x3c/0x60 [] ? bio_set_pages_dirty+0x51/0x70 [] do_blockdev_direct_IO+0xbf8/0xee0 [] ? blkdev_get_block+0x80/0x80 [] __blockdev_direct_IO+0x55/0x60 [] ? blkdev_get_block+0x80/0x80 [] blkdev_direct_IO+0x57/0x60 [] ? blkdev_get_block+0x80/0x80 [] generic_file_aio_read+0x70e/0x760 [] ? __lock_acquire+0x215/0x5a0 [] ? aio_run_iocb+0x54/0x1a0 [] ? grab_cache_page_nowait+0xc0/0xc0 [] aio_rw_vect_retry+0x7c/0x1e0 [] ? aio_fsync+0x30/0x30 [] aio_run_iocb+0x66/0x1a0 [] do_io_submit+0x6f0/0xb80 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] sys_io_submit+0x10/0x20 [] system_call_fastpath+0x16/0x1b Changes since v2: Update commit log to explain how the code is still broken even if we delay the lock switching after the drain. Changes since v1: Update commit log as Tejun suggested. Signed-off-by: Asias He --- block/blk-core.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 1f61b74..7b4f6fe 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -416,15 +416,10 @@ void blk_cleanup_queue(struct request_queue *q) /* mark @q DEAD, no new request or merges will be allowed afterwards */ mutex_lock(&q->sysfs_lock); queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q); - spin_lock_irq(lock); queue_flag_set(QUEUE_FLAG_NOMERGES, q); queue_flag_set(QUEUE_FLAG_NOXMERGES, q); queue_flag_set(QUEUE_FLAG_DEAD, q); - - if (q->queue_lock != &q->__queue_lock) - q->queue_lock = &q->__queue_lock; - spin_unlock_irq(lock); mutex_unlock(&q->sysfs_lock); @@ -440,6 +435,11 @@ void blk_cleanup_queue(struct request_queue *q) del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer); blk_sync_queue(q); + spin_lock_irq(lock); + if (q->queue_lock != &q->__queue_lock) + q->queue_lock = &q->__queue_lock; + spin_unlock_irq(lock); + /* @q is and will stay empty, shutdown and put */ blk_put_queue(q); } -- 1.7.10.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/