Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
From: Bart Van Assche
To: Jens Axboe, Ming Lei
Cc: snitzer@redhat.com, dm-devel@redhat.com, hch@infradead.org,
    linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
    osandov@fb.com
Date: Mon, 29 Jan 2018 14:37:00 -0800

On 01/19/18 07:24, Jens Axboe wrote:
> That's what I thought. So for a low queue depth underlying queue, it's
> quite possible that this situation can happen. Two potential solutions
> I see:
>
> 1) As described earlier in this thread, having a mechanism for being
>    notified when the scarce resource becomes available. It would not
>    be hard to tap into the existing sbitmap wait queue for that.
>
> 2) Have dm set BLK_MQ_F_BLOCKING and just sleep on the resource
>    allocation. I haven't read the dm code to know if this is a
>    possibility or not.
>
> I'd probably prefer #1. It's a classic case of trying to get the
> request, and if it fails, add ourselves to the sbitmap tag wait
> queue head, retry, and bail if that also fails. Connecting the
> scarce resource and the consumer is the only way to really fix
> this, without bogus arbitrary delays.
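
The "try, add ourselves to the wait queue, retry, bail" sequence from
option #1 in the quoted message can be sketched in plain user-space C.
This is only an illustration of the pattern, not blk-mq or sbitmap code:
struct tag_pool, get_tag_or_wait() and put_tag() are hypothetical names,
and the sketch is single-threaded, so the locking and memory ordering a
real implementation needs are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a scarce resource such as a tag pool. */
struct tag_pool {
	unsigned int free_tags;		/* tags currently available */
	unsigned int nr_waiters;	/* consumers waiting for a notification */
};

enum get_result { GOT_TAG, WAITING };

/* Take one tag if any is free. */
static bool try_get_tag(struct tag_pool *pool)
{
	if (pool->free_tags == 0)
		return false;
	pool->free_tags--;
	return true;
}

/*
 * Try, register as a waiter, retry, and only then bail out.
 * Registering before the retry is what closes the window in which a
 * concurrent release could otherwise go unnoticed.
 */
static enum get_result get_tag_or_wait(struct tag_pool *pool)
{
	if (try_get_tag(pool))
		return GOT_TAG;

	pool->nr_waiters++;		/* register first */

	if (try_get_tag(pool)) {
		pool->nr_waiters--;	/* a tag showed up; no need to wait */
		return GOT_TAG;
	}

	return WAITING;			/* put_tag() will report the wakeup */
}

/* Release a tag; returns true if a registered waiter should be rerun. */
static bool put_tag(struct tag_pool *pool)
{
	pool->free_tags++;
	if (pool->nr_waiters == 0)
		return false;
	pool->nr_waiters--;
	return true;
}
```

A caller that gets WAITING stops dispatching and is rerun when put_tag()
returns true, which ties the consumer to the resource instead of to an
arbitrary delay.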
(replying to an e-mail from ten days ago)

Implementing a notification mechanism for all cases in which
blk_insert_cloned_request() returns BLK_STS_RESOURCE today would require
a lot of work. If, for example, a SCSI LLD returns one of the
SCSI_MLQUEUE_*_BUSY return codes from its .queuecommand() implementation,
then the SCSI core will translate that return code into BLK_STS_RESOURCE.
From scsi_queue_rq():

	reason = scsi_dispatch_cmd(cmd);
	if (reason) {
		scsi_set_blocked(cmd, reason);
		ret = BLK_STS_RESOURCE;
		goto out_dec_host_busy;
	}

In other words, implementing a notification mechanism for all cases in
which blk_insert_cloned_request() can return BLK_STS_RESOURCE would
require modifying all SCSI LLDs.

Bart.
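
To illustrate why the translation in the scsi_queue_rq() excerpt makes a
generic notification hard, here is a minimal user-space sketch. All names
(demo_busy_reason, demo_blk_status, demo_queue_rq()) are hypothetical
stand-ins with arbitrary values, not the kernel's definitions; the only
thing carried over from the excerpt is its shape: any nonzero reason is
folded into one status.

```c
/* Hypothetical busy reasons a low-level driver might report. */
enum demo_busy_reason {
	DEMO_OK = 0,
	DEMO_HOST_BUSY,
	DEMO_DEVICE_BUSY,
	DEMO_TARGET_BUSY,
};

/* Hypothetical status values seen by the layer above. */
enum demo_blk_status {
	DEMO_STS_OK = 0,
	DEMO_STS_RESOURCE,
};

/*
 * Same shape as the excerpt: every nonzero reason becomes the same
 * "resource" status, so the caller cannot tell which resource ran out.
 */
static enum demo_blk_status demo_queue_rq(enum demo_busy_reason reason)
{
	if (reason)
		return DEMO_STS_RESOURCE;
	return DEMO_STS_OK;
}
```

Since demo_queue_rq(DEMO_HOST_BUSY) and demo_queue_rq(DEMO_TARGET_BUSY)
return the same value, the upper layer has nothing to subscribe to unless
every source of a busy reason also provides its own wakeup.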