Date: Tue, 23 Jan 2018 10:22:04 +0100
From: Mike Snitzer
To: Bart Van Assche, axboe@kernel.dk, ming.lei@redhat.com
Cc: "dm-devel@redhat.com", "linux-kernel@vger.kernel.org", "hch@infradead.org", "linux-block@vger.kernel.org", "osandov@fb.com"
Subject: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)
Message-ID: <20180123092204.GA39002@redhat.com>
References: <1516296056.2676.23.camel@wdc.com> <20180118183039.GA20121@redhat.com> <1516301278.2676.35.camel@wdc.com> <20180118204856.GA31679@redhat.com> <1516309128.2676.38.camel@wdc.com>
 <20180118212327.GB31679@redhat.com> <1516311554.2676.50.camel@wdc.com> <20180118220132.GA20860@redhat.com> <1516314012.2676.76.camel@wdc.com>
In-Reply-To: <1516314012.2676.76.camel@wdc.com>

On Thu, Jan 18 2018 at 5:20pm -0500,
Bart Van Assche wrote:

> On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > And yet Laurence cannot reproduce any such lockups with your test...
>
> Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> already succeeded at running an unmodified version of my tests. In one of the
> e-mails Laurence sent me this morning I read that he modified these scripts
> to get past a kernel module unload failure that was reported while starting
> these tests. So the next step is to check which changes were made to the test
> scripts and also whether the test results are still valid.
>
> > Are you absolutely certain this patch doesn't help you?
> > https://patchwork.kernel.org/patch/10174037/
> >
> > If it doesn't then that is actually very useful to know.
>
> The first thing I tried this morning is to run the srp-test software against
> a merge of Jens' for-next branch and your dm-4.16 branch. Since I noticed that
> the dm queue locked up I reinserted a blk_mq_delay_run_hw_queue() call in the
> dm code. Since even that was not sufficient I tried to kick the queues via
> debugfs (for s in /sys/kernel/debug/block/*/state; do echo kick >$s; done).
> Since that was not sufficient to resolve the queue stall I reverted the
> following three patches that are in Jens' tree:
> * "blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"
> * "blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request"
> * "blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy"
>
> Only after I had done this the srp-test software ran again without triggering
> dm queue lockups.

Given that Ming's notifier-based patchset needs more development time I
think we're unfortunately past the point where we can comfortably wait
for that to be ready.

So we need to explore alternatives for fixing this IO stall regression.
Rather than attempt the above block reverts (which is an incomplete
listing given newer changes): might we develop a more targeted code
change to neutralize commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
merging via blk_insert_cloned_request feedback")? -- which, given Bart's
findings above, seems to be the most problematic block commit.

To that end, assuming I drop this commit from dm-4.16:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=316a795ad388e0c3ca613454851a28079d917a92

Here is my proposal for putting this regression behind us for 4.16
(Ming's line of development would continue and hopefully be included in
4.17):

From: Mike Snitzer
Date: Tue, 23 Jan 2018 09:40:22 +0100
Subject: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression

A series of blk-mq changes was intended to improve sequential IO
performance (through improved merging with dm-mpath blk-mq stacked on
an underlying blk-mq device). Unfortunately these changes have caused
dm-mpath blk-mq IO stalls when blk_mq_request_issue_directly()'s call to
q->mq_ops->queue_rq() fails (due to device-specific resource
unavailability).

Fix this by reverting to how blk_insert_cloned_request() functioned
prior to commit 396eaf21ee -- by using blk_mq_request_bypass_insert()
instead of blk_mq_request_issue_directly().

In the future, this commit should be reverted as the first change in a
followup series of changes that implements a comprehensive solution for
allowing an underlying blk-mq queue's resource limitation to trigger the
upper blk-mq queue to run once that underlying limited resource is
replenished.

Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Signed-off-by: Mike Snitzer
---
 block/blk-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cdae69be68e9..a224f282b4a6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2520,7 +2520,8 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		 * bypass a potential scheduler on the bottom device for
 		 * insert.
 		 */
-		return blk_mq_request_issue_directly(rq);
+		blk_mq_request_bypass_insert(rq, true);
+		return BLK_STS_OK;
 	}
 
 	spin_lock_irqsave(q->queue_lock, flags);
-- 
2.15.0
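
For anyone following the thread without the block layer source at hand, here is a rough user-space sketch of the behavioral difference the patch relies on. It is not kernel code: every toy_* name, the fixed-size dispatch array and the tag counter are made up for illustration, and the model assumes only what the commit message states, namely that direct issue hands a queue_rq() failure back to the caller while bypass insert parks the request on the lower queue and reports success.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for kernel status codes; NOT the real blk_status_t values. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

#define TOY_DISPATCH_MAX 16

/* Hypothetical model of the lower (underlying) queue. */
struct toy_queue {
	int free_tags;                  /* driver resources currently available */
	int dispatch[TOY_DISPATCH_MAX]; /* requests parked for a later queue run */
	size_t ndispatch;
	int issued;                     /* requests accepted by the "driver" */
};

/* Models the driver's queue_rq(): fails when no resource is available. */
static enum toy_status toy_queue_rq(struct toy_queue *q, int rq)
{
	(void)rq;
	if (q->free_tags == 0)
		return TOY_STS_RESOURCE;
	q->free_tags--;
	q->issued++;
	return TOY_STS_OK;
}

/*
 * Direct issue: the failure is returned to the caller and the lower queue
 * keeps no record of the request, so somebody else must retry it later.
 */
static enum toy_status toy_issue_directly(struct toy_queue *q, int rq)
{
	return toy_queue_rq(q, rq);
}

/* Bypass insert: always park the request on the lower queue, report OK. */
static enum toy_status toy_bypass_insert(struct toy_queue *q, int rq)
{
	assert(q->ndispatch < TOY_DISPATCH_MAX);
	q->dispatch[q->ndispatch++] = rq;
	return TOY_STS_OK;
}

/* Resource comes back, e.g. when earlier requests complete. */
static void toy_replenish(struct toy_queue *q, int tags)
{
	q->free_tags += tags;
}

/* Queue run: issue parked requests in order until the driver refuses one. */
static size_t toy_run_queue(struct toy_queue *q)
{
	size_t done = 0;

	while (done < q->ndispatch &&
	       toy_queue_rq(q, q->dispatch[done]) == TOY_STS_OK)
		done++;

	/* Keep whatever could not be issued, preserving order. */
	memmove(q->dispatch, q->dispatch + done,
		(q->ndispatch - done) * sizeof(q->dispatch[0]));
	q->ndispatch -= done;
	return done;
}
```

With free_tags at zero, toy_issue_directly() returns TOY_STS_RESOURCE and leaves nothing queued, so progress depends on the upper layer being re-run at the right time. toy_bypass_insert() leaves the request on the lower queue, where toy_run_queue() picks it up after toy_replenish(). The cost is that the upper layer no longer gets the busy feedback that commit 396eaf21ee used to improve merging.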