Date: Tue, 5 Dec 2017 13:16:30 +0800
From: Ming Lei
To: Holger Hoffstätte
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()
Message-ID: <20171205051624.GB9989@ming.t460p>
References: <20171202163150.1273-1-ming.lei@redhat.com> <1512400159.23838.1.camel@wdc.com> <20171204224507.GB6888@ming.t460p>

On Mon, Dec 04, 2017 at 11:48:07PM +0000, Holger Hoffstätte wrote:
> On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
>
> > On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
> >> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> >> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> >>
> >> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
> >> issues introduced by that commit for kernel version v4.15 ...
> >
> > What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> > and can be fixed by this simple patch, which one can be thought as cleanup
> > too.
>
> Even with this patch I've encountered at least one hang that
> seemed related.
> I'm using most of block/scsi-4.15 on top of 4.14 and
> the hang in question was on a rotating disk. It could be solved by activating
> a different scheduler on the hanging device; all hanging sync/df processes got
> unstuck and all was fine again, which leads me to believe that there is at least
> one more rare condition where delaying requests (as done in the budget patch)
> leads to a hang.
>
> This happened with mq-deadline which I was testing specifically to avoid
> any BFQ-related side effects.

OK, this looks like a new report.

Without any log we can't make progress, and we can't even guess what the
issue is related to.

Could you post your dmesg log (including the stack traces of the hung
processes)? And could you dump the debugfs log with the following script
when the hang happens?

	http://people.redhat.com/minlei/tests/tools/dump-blk-info

BTW, you just need to pass the disk name to the script, such as /dev/sda.

--
Ming
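
The debugfs dump requested above can be approximated by hand if the script is
not available. What follows is only a minimal sketch and is not the
dump-blk-info script itself: it assumes debugfs is mounted at
/sys/kernel/debug and that the kernel exposes per-disk block layer state
under block/<disk>/ there. The helper names debugfs_dir and dump_tree are
made up for illustration.

```python
"""Sketch: collect the block-layer debugfs attributes of one disk.

Assumptions (not from the original mail): debugfs is mounted at
/sys/kernel/debug and per-disk state lives in block/<disk>/ below it.
"""
import os


def debugfs_dir(disk, debugfs="/sys/kernel/debug"):
    """Map a disk name such as /dev/sda (or just sda) to its debugfs dir."""
    return os.path.join(debugfs, "block", os.path.basename(disk))


def dump_tree(root):
    """Return (path, text) for every file below root, in a stable order.

    Files that cannot be read (for example write-only control files) are
    kept in the result with a placeholder instead of aborting the dump.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as f:
                    text = f.read()
            except OSError as exc:
                text = "<unreadable: %s>" % exc.strerror
            entries.append((path, text))
    return entries
```

Run as root while the hang is in progress, something like
`for path, text in dump_tree(debugfs_dir("/dev/sda")): print(path, text)`
would produce output that can be attached to a report next to the dmesg log.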