Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
From: Bart Van Assche
To: Ming Lei, Jens Axboe, linux-block@vger.kernel.org, Mike Snitzer, dm-devel@redhat.com
Cc: Christoph Hellwig, Bart Van Assche, linux-kernel@vger.kernel.org, Omar Sandoval
Date: Thu, 18 Jan 2018 08:50:43 -0800
In-Reply-To: <20180118024124.8079-1-ming.lei@redhat.com>

On 01/17/18 18:41, Ming Lei wrote:
> BLK_STS_RESOURCE can be returned by a driver when some resource runs
> out, and that resource need not be related to tags, for example a
> failed kmalloc(GFP_ATOMIC). When the queue is idle and this kind of
> BLK_STS_RESOURCE is returned, RESTART no longer works, so an I/O hang
> may be caused.
>
> Most drivers call kmalloc(GFP_ATOMIC) in the I/O path, and almost all
> of them return BLK_STS_RESOURCE in this situation. For dm-mpath it can
> be triggered a bit more easily, since the request pool of the
> underlying queue is consumed more quickly. In reality it is still not
> easy to trigger.
> I have run all kinds of tests on dm-mpath/scsi-debug with all kinds of
> scsi_debug parameters and could not trigger this issue at all. It was
> finally triggered by Bart's SRP test, which seems to have been made by
> a genius. :-)
>
> [ ... ]
>
>  static void blk_mq_timeout_work(struct work_struct *work)
>  {
>  	struct request_queue *q =
> @@ -966,8 +1045,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
>  		 */
>  		queue_for_each_hw_ctx(q, hctx, i) {
>  			/* the hctx may be unmapped, so check it here */
> -			if (blk_mq_hw_queue_mapped(hctx))
> +			if (blk_mq_hw_queue_mapped(hctx)) {
>  				blk_mq_tag_idle(hctx);
> +				blk_mq_fixup_restart(hctx);
> +			}
>  		}
>  	}
>  	blk_queue_exit(q);

Hello Ming,

My comments about the above are as follows:

- It can take up to q->rq_timeout jiffies after a .queue_rq()
  implementation has returned BLK_STS_RESOURCE before
  blk_mq_timeout_work() gets called. However, the condition that caused
  .queue_rq() to return BLK_STS_RESOURCE may clear again only a few
  milliseconds later. So the above approach can result in long delays
  during which the queue appears to be stuck. Additionally, I think that
  the block driver, and not the block layer core, should decide how long
  to wait before a queue is rerun.

- The lockup that I reported occurs only with the dm driver and not with
  any other block driver. So why modify the block layer core when this
  can be fixed by modifying the dm driver?

- A much simpler fix, and one that is known to work, already exists,
  namely inserting a blk_mq_delay_run_hw_queue() call in the dm driver
  (a rough, untested sketch follows after my signature).

Bart.
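P.S. For illustration only, here is an untested sketch of how a .queue_rq()
implementation could combine BLK_STS_RESOURCE with
blk_mq_delay_run_hw_queue(). This is not code from the dm driver;
my_queue_rq(), struct my_cmd, MY_BUF_SIZE and MY_QUEUE_RERUN_DELAY_MS are
made-up names, and the sketch assumes the tag set was registered with
cmd_size == sizeof(struct my_cmd).

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/slab.h>

/* Driver-chosen delay before the hardware queue is rerun (made up). */
#define MY_QUEUE_RERUN_DELAY_MS	100
#define MY_BUF_SIZE		64

struct my_cmd {
	void *buf;	/* example of a resource that is not tied to tags */
};

static blk_status_t my_queue_rq(struct blk_mq_hw_ctx *hctx,
				const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;
	struct my_cmd *cmd = blk_mq_rq_to_pdu(rq);

	/*
	 * An atomic allocation made while preparing the command. If it
	 * fails while the queue is idle, the RESTART mechanism will not
	 * rerun the queue, so ask the block layer core to rerun this
	 * hardware queue after a driver-chosen delay before returning
	 * BLK_STS_RESOURCE.
	 */
	cmd->buf = kmalloc(MY_BUF_SIZE, GFP_ATOMIC);
	if (!cmd->buf) {
		blk_mq_delay_run_hw_queue(hctx, MY_QUEUE_RERUN_DELAY_MS);
		return BLK_STS_RESOURCE;
	}

	blk_mq_start_request(rq);
	/* ... submit to the hardware; free cmd->buf upon completion ... */
	return BLK_STS_OK;
}

With such a call in place the queue is rerun shortly after the resource
shortage instead of having to wait for blk_mq_timeout_work(), and the
delay is chosen by the driver rather than by the block layer core.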