From: Jeff Moyer
To: Shaohua Li
Subject: Re: [PATCH 2/5] sched: always use blk_schedule_flush_plug in io_schedule_out
Date: Fri, 01 May 2015 13:42:02 -0400
In-Reply-To: <9eedafbfb33bf3e8d273860e75f9f07ca9b0d5ec.1430414610.git.shli@fb.com> (Shaohua Li's message of "Thu, 30 Apr 2015 10:45:15 -0700")

Shaohua Li writes:

> block plug callback could sleep, so we introduce a parameter
> 'from_schedule' and corresponding drivers can use it to destinguish a
> schedule plug flush or a plug finish. Unfortunately io_schedule_out
> still uses blk_flush_plug(). This causes below output (Note, I added a
> might_sleep() in raid1_unplug to make it trigger faster, but the whole
> thing doesn't matter if I add might_sleep). In raid1/10, this can cause
> deadlock.
>
> This patch makes io_schedule_out always uses blk_schedule_flush_plug.
> This should only impact drivers (as far as I know, raid 1/10) which are
> sensitive to the 'from_schedule' parameter.

Why wouldn't you change io_schedule_timeout to use blk_flush_plug_list
instead?

Cheers,
Jeff

>
> [ 370.817949] ------------[ cut here ]------------
> [ 370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90()
> [ 370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [] prepare_to_wait+0x2f/0x90
> [ 370.817971] Modules linked in: raid1
> [ 370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G W 4.0.0+ #361
> [ 370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
> [ 370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1)
> [ 370.817985] ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001
> [ 370.817988] ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8
> [ 370.817990] ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28
> [ 370.817993] Call Trace:
> [ 370.817999] [] dump_stack+0x4f/0x7b
> [ 370.818002] [] warn_slowpath_common+0x8c/0xd0
> [ 370.818004] [] warn_slowpath_fmt+0x46/0x50
> [ 370.818006] [] ? prepare_to_wait+0x2f/0x90
> [ 370.818008] [] ? prepare_to_wait+0x2f/0x90
> [ 370.818010] [] __might_sleep+0x7f/0x90
> [ 370.818014] [] raid1_unplug+0xd3/0x170 [raid1]
> [ 370.818024] [] blk_flush_plug_list+0x8a/0x1e0
> [ 370.818028] [] ? bit_wait+0x50/0x50
> [ 370.818031] [] io_schedule_timeout+0x130/0x140
> [ 370.818033] [] bit_wait_io+0x36/0x50
> [ 370.818034] [] __wait_on_bit+0x65/0x90
> [ 370.818041] [] ? ext4_read_block_bitmap_nowait+0xbc/0x630
> [ 370.818043] [] ? bit_wait+0x50/0x50
> [ 370.818045] [] out_of_line_wait_on_bit+0x72/0x80
> [ 370.818047] [] ? autoremove_wake_function+0x40/0x40
> [ 370.818050] [] __wait_on_buffer+0x44/0x50
> [ 370.818053] [] ext4_wait_block_bitmap+0xe0/0xf0
> [ 370.818058] [] ext4_mb_init_cache+0x206/0x790
> [ 370.818062] [] ? lru_cache_add+0x1c/0x50
> [ 370.818064] [] ext4_mb_init_group+0x11e/0x200
> [ 370.818066] [] ext4_mb_load_buddy+0x341/0x360
> [ 370.818068] [] ext4_mb_find_by_goal+0x93/0x2f0
> [ 370.818070] [] ? ext4_mb_normalize_request+0x1e4/0x5b0
> [ 370.818072] [] ext4_mb_regular_allocator+0x67/0x460
> [ 370.818074] [] ? ext4_mb_normalize_request+0x1e4/0x5b0
> [ 370.818076] [] ext4_mb_new_blocks+0x4cb/0x620
> [ 370.818079] [] ext4_ext_map_blocks+0x4c6/0x14d0
> [ 370.818081] [] ? ext4_es_lookup_extent+0x4e/0x290
> [ 370.818085] [] ext4_map_blocks+0x14d/0x4f0
> [ 370.818088] [] ext4_writepages+0x76d/0xe50
> [ 370.818094] [] do_writepages+0x21/0x50
> [ 370.818097] [] __writeback_single_inode+0x60/0x490
> [ 370.818099] [] writeback_sb_inodes+0x2da/0x590
> [ 370.818103] [] ? trylock_super+0x1b/0x50
> [ 370.818105] [] ? trylock_super+0x1b/0x50
> [ 370.818107] [] __writeback_inodes_wb+0x9f/0xd0
> [ 370.818109] [] wb_writeback+0x34b/0x3c0
> [ 370.818111] [] bdi_writeback_workfn+0x23f/0x550
> [ 370.818116] [] process_one_work+0x1c8/0x570
> [ 370.818117] [] ? process_one_work+0x14b/0x570
> [ 370.818119] [] worker_thread+0x11b/0x470
> [ 370.818121] [] ? process_one_work+0x570/0x570
> [ 370.818124] [] kthread+0xf8/0x110
> [ 370.818126] [] ? kthread_create_on_node+0x210/0x210
> [ 370.818129] [] ret_from_fork+0x42/0x70
> [ 370.818131] [] ? kthread_create_on_node+0x210/0x210
> [ 370.818132] ---[ end trace 7b4deb71e68b6605 ]---
>
> Cc: NeilBrown
> Signed-off-by: Shaohua Li
> ---
>  kernel/sched/core.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f9123a8..fef7eb2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4397,21 +4397,17 @@ EXPORT_SYMBOL_GPL(yield_to);
>   */
>  long __sched io_schedule_timeout(long timeout)
>  {
> -	int old_iowait = current->in_iowait;
>  	struct rq *rq;
>  	long ret;
>
>  	current->in_iowait = 1;
> -	if (old_iowait)
> -		blk_schedule_flush_plug(current);
> -	else
> -		blk_flush_plug(current);
> +	blk_schedule_flush_plug(current);
>
>  	delayacct_blkio_start();
>  	rq = raw_rq();
>  	atomic_inc(&rq->nr_iowait);
>  	ret = schedule_timeout(timeout);
> -	current->in_iowait = old_iowait;
> +	current->in_iowait = 0;
>  	atomic_dec(&rq->nr_iowait);
>  	delayacct_blkio_end();