Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755195AbbFCMoZ (ORCPT ); Wed, 3 Jun 2015 08:44:25 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:58146 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755981AbbFCMTk (ORCPT ); Wed, 3 Jun 2015 08:19:40 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, NeilBrown , Shaohua Li , Jeff Moyer , Jens Axboe , poma Subject: [PATCH 4.0 102/148] sched: always use blk_schedule_flush_plug in io_schedule_out Date: Wed, 3 Jun 2015 21:09:32 +0900 Message-Id: <20150603114209.828672241@linuxfoundation.org> X-Mailer: git-send-email 2.4.2 In-Reply-To: <20150603114205.337615117@linuxfoundation.org> References: <20150603114205.337615117@linuxfoundation.org> User-Agent: quilt/0.64 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5805 Lines: 114 4.0-stable review patch. If anyone has any objections, please let me know. ------------------ From: Shaohua Li commit 10d784eae2b41e25d8fc6a88096cd27286093c84 upstream. block plug callback could sleep, so we introduce a parameter 'from_schedule' and corresponding drivers can use it to destinguish a schedule plug flush or a plug finish. Unfortunately io_schedule_out still uses blk_flush_plug(). This causes below output (Note, I added a might_sleep() in raid1_unplug to make it trigger faster, but the whole thing doesn't matter if I add might_sleep). In raid1/10, this can cause deadlock. This patch makes io_schedule_out always uses blk_schedule_flush_plug. This should only impact drivers (as far as I know, raid 1/10) which are sensitive to the 'from_schedule' parameter. [ 370.817949] ------------[ cut here ]------------ [ 370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90() [ 370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [] prepare_to_wait+0x2f/0x90 [ 370.817971] Modules linked in: raid1 [ 370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G W 4.0.0+ #361 [ 370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014 [ 370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1) [ 370.817985] ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001 [ 370.817988] ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8 [ 370.817990] ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28 [ 370.817993] Call Trace: [ 370.817999] [] dump_stack+0x4f/0x7b [ 370.818002] [] warn_slowpath_common+0x8c/0xd0 [ 370.818004] [] warn_slowpath_fmt+0x46/0x50 [ 370.818006] [] ? prepare_to_wait+0x2f/0x90 [ 370.818008] [] ? prepare_to_wait+0x2f/0x90 [ 370.818010] [] __might_sleep+0x7f/0x90 [ 370.818014] [] raid1_unplug+0xd3/0x170 [raid1] [ 370.818024] [] blk_flush_plug_list+0x8a/0x1e0 [ 370.818028] [] ? bit_wait+0x50/0x50 [ 370.818031] [] io_schedule_timeout+0x130/0x140 [ 370.818033] [] bit_wait_io+0x36/0x50 [ 370.818034] [] __wait_on_bit+0x65/0x90 [ 370.818041] [] ? ext4_read_block_bitmap_nowait+0xbc/0x630 [ 370.818043] [] ? bit_wait+0x50/0x50 [ 370.818045] [] out_of_line_wait_on_bit+0x72/0x80 [ 370.818047] [] ? autoremove_wake_function+0x40/0x40 [ 370.818050] [] __wait_on_buffer+0x44/0x50 [ 370.818053] [] ext4_wait_block_bitmap+0xe0/0xf0 [ 370.818058] [] ext4_mb_init_cache+0x206/0x790 [ 370.818062] [] ? lru_cache_add+0x1c/0x50 [ 370.818064] [] ext4_mb_init_group+0x11e/0x200 [ 370.818066] [] ext4_mb_load_buddy+0x341/0x360 [ 370.818068] [] ext4_mb_find_by_goal+0x93/0x2f0 [ 370.818070] [] ? ext4_mb_normalize_request+0x1e4/0x5b0 [ 370.818072] [] ext4_mb_regular_allocator+0x67/0x460 [ 370.818074] [] ? ext4_mb_normalize_request+0x1e4/0x5b0 [ 370.818076] [] ext4_mb_new_blocks+0x4cb/0x620 [ 370.818079] [] ext4_ext_map_blocks+0x4c6/0x14d0 [ 370.818081] [] ? ext4_es_lookup_extent+0x4e/0x290 [ 370.818085] [] ext4_map_blocks+0x14d/0x4f0 [ 370.818088] [] ext4_writepages+0x76d/0xe50 [ 370.818094] [] do_writepages+0x21/0x50 [ 370.818097] [] __writeback_single_inode+0x60/0x490 [ 370.818099] [] writeback_sb_inodes+0x2da/0x590 [ 370.818103] [] ? trylock_super+0x1b/0x50 [ 370.818105] [] ? trylock_super+0x1b/0x50 [ 370.818107] [] __writeback_inodes_wb+0x9f/0xd0 [ 370.818109] [] wb_writeback+0x34b/0x3c0 [ 370.818111] [] bdi_writeback_workfn+0x23f/0x550 [ 370.818116] [] process_one_work+0x1c8/0x570 [ 370.818117] [] ? process_one_work+0x14b/0x570 [ 370.818119] [] worker_thread+0x11b/0x470 [ 370.818121] [] ? process_one_work+0x570/0x570 [ 370.818124] [] kthread+0xf8/0x110 [ 370.818126] [] ? kthread_create_on_node+0x210/0x210 [ 370.818129] [] ret_from_fork+0x42/0x70 [ 370.818131] [] ? kthread_create_on_node+0x210/0x210 [ 370.818132] ---[ end trace 7b4deb71e68b6605 ]--- V2: don't change ->in_iowait Cc: NeilBrown Signed-off-by: Shaohua Li Reviewed-by: Jeff Moyer Signed-off-by: Jens Axboe Cc: poma Signed-off-by: Greg Kroah-Hartman --- kernel/sched/core.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4384,10 +4384,7 @@ long __sched io_schedule_timeout(long ti long ret; current->in_iowait = 1; - if (old_iowait) - blk_schedule_flush_plug(current); - else - blk_flush_plug(current); + blk_schedule_flush_plug(current); delayacct_blkio_start(); rq = raw_rq(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/