Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
From: Jens Axboe
To: Ming Lei
Cc: Bart Van Assche, "snitzer@redhat.com", "dm-devel@redhat.com", "hch@infradead.org", "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org", "osandov@fb.com"
Date: Fri, 19 Jan 2018 08:48:55 -0700
Message-ID: <540e1239-c415-766b-d4ff-bb0b7f3517a7@kernel.dk>
In-Reply-To: <20180119154047.GB14827@ming.t460p>

On 1/19/18 8:40 AM, Ming Lei wrote:
>>>> Where does the dm STS_RESOURCE error usually come from - what's exact
>>>> resource are we running out of?
>>>
>>> It is from blk_get_request(underlying queue), see
>>> multipath_clone_and_map().
>>
>> That's what I thought. So for a low queue depth underlying queue, it's
>> quite possible that this situation can happen. Two potential solutions
>> I see:
>>
>> 1) As described earlier in this thread, having a mechanism for being
>>    notified when the scarce resource becomes available. It would not
>>    be hard to tap into the existing sbitmap wait queue for that.
>>
>> 2) Have dm set BLK_MQ_F_BLOCKING and just sleep on the resource
>>    allocation. I haven't read the dm code to know if this is a
>>    possibility or not.
>>
>> I'd probably prefer #1. It's a classic case of trying to get the
>> request, and if it fails, add ourselves to the sbitmap tag wait
>> queue head, retry, and bail if that also fails. Connecting the
>> scarce resource and the consumer is the only way to really fix
>> this, without bogus arbitrary delays.
>
> Right, as I have replied to Bart, using mod_delayed_work_on() with
> returning BLK_STS_NO_DEV_RESOURCE(or sort of name) for the scarce
> resource should fix this issue.

It'll fix the forever stall, but it won't really fix the underlying
problem, as we'll slow down the dm device by some random amount.

A simple test case would be to have a null_blk device with a queue depth
of one, and dm on top of that. Start a fio job that runs two jobs: one
that does IO to the underlying device, and one that does IO to the dm
device. If the job on the dm device runs substantially slower than the
one to the underlying device, then the problem isn't really fixed.

That said, I'm fine with first ensuring that we always make forward
progress, and then we can come up with a proper solution to the issue.
The forward progress guarantee will be needed for the rarer failure
cases, like allocation failures. nvme needs that too, for instance, for
the discard range struct allocation.

-- 
Jens Axboe
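
Editorial sketch of the "try, add ourselves to the wait queue, retry,
bail" pattern described as option #1 in the quoted text. This is a
single-threaded userspace toy in Python, not kernel code: TagPool,
get_or_wait() and rerun_dispatch() are hypothetical names invented for
illustration and do not correspond to the sbitmap or blk-mq APIs. The
point it tries to show is why the retry after registering matters: a tag
freed between the first attempt and the registration would otherwise be
missed and the consumer would never be woken.

```python
from collections import deque


class TagPool:
    """Toy stand-in for a fixed-depth tag set (hypothetical, not a kernel API)."""

    def __init__(self, depth):
        self.free = deque(range(depth))
        self.waiters = deque()

    def try_get(self):
        # Non-blocking allocation: return a tag, or None if none are free.
        return self.free.popleft() if self.free else None

    def add_waiter(self, callback):
        self.waiters.append(callback)

    def remove_waiter(self, callback):
        if callback in self.waiters:
            self.waiters.remove(callback)

    def put(self, tag):
        # Release the tag, then wake exactly one waiter so it can rerun.
        self.free.append(tag)
        if self.waiters:
            self.waiters.popleft()()


def get_or_wait(pool, callback):
    """Try, register as a waiter, retry, and bail if that also fails."""
    tag = pool.try_get()
    if tag is not None:
        return tag
    pool.add_waiter(callback)
    # Retry after registering, in case a tag was freed in between.
    tag = pool.try_get()
    if tag is not None:
        pool.remove_waiter(callback)
        return tag
    return None  # bail; callback runs when a tag is released


pool = TagPool(depth=1)  # queue depth of one, as in the suggested test case
events = []


def rerun_dispatch():
    # Stand-in for rerunning the stalled queue once the resource is back.
    events.append(("rerun", get_or_wait(pool, rerun_dispatch)))


first = get_or_wait(pool, rerun_dispatch)   # takes the only tag
second = get_or_wait(pool, rerun_dispatch)  # fails, waiter registered
pool.put(first)                             # release wakes the waiter
```

Here the consumer is rerun as a direct consequence of the release, so no
arbitrary delay is involved; a timer-based rerun would instead have to
guess how long to wait.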