Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
To: Ming Lei
Cc: Bart Van Assche, snitzer@redhat.com, dm-devel@redhat.com,
 hch@infradead.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, osandov@fb.com
References: <1516296056.2676.23.camel@wdc.com>
 <20180118183039.GA20121@redhat.com>
 <1516301278.2676.35.camel@wdc.com>
 <20180119023212.GA25413@ming.t460p>
 <20180119072623.GB25369@ming.t460p>
 <047f68ec-f51b-190f-2f89-f413325c2540@kernel.dk>
 <20180119154047.GB14827@ming.t460p>
 <540e1239-c415-766b-d4ff-bb0b7f3517a7@kernel.dk>
 <20180119160518.GC14827@ming.t460p>
From: Jens Axboe
Message-ID: <4a5c049f-0fab-bbaf-bfe2-eb5bca73f2c8@kernel.dk>
Date: Fri, 19 Jan 2018 09:19:24 -0700
In-Reply-To: <20180119160518.GC14827@ming.t460p>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/19/18 9:05 AM, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 08:48:55AM -0700, Jens Axboe wrote:
>> On 1/19/18 8:40 AM, Ming Lei wrote:
>>>>>> Where does the dm STS_RESOURCE error usually come from - what exact
>>>>>> resource are we running out of?
>>>>>
>>>>> It is from blk_get_request(underlying queue), see
>>>>> multipath_clone_and_map().
>>>>
>>>> That's what I thought. So for a low queue depth underlying queue, it's
>>>> quite possible that this situation can happen. Two potential solutions
>>>> I see:
>>>>
>>>> 1) As described earlier in this thread, having a mechanism for being
>>>>    notified when the scarce resource becomes available. It would not
>>>>    be hard to tap into the existing sbitmap wait queue for that.
>>>>
>>>> 2) Have dm set BLK_MQ_F_BLOCKING and just sleep on the resource
>>>>    allocation. I haven't read the dm code to know if this is a
>>>>    possibility or not.
>>>>
>>>> I'd probably prefer #1. It's a classic case of trying to get the
>>>> request, and if it fails, add ourselves to the sbitmap tag wait
>>>> queue head, retry, and bail if that also fails. Connecting the
>>>> scarce resource and the consumer is the only way to really fix
>>>> this, without bogus arbitrary delays.
>>>
>>> Right, as I have replied to Bart, using mod_delayed_work_on() and
>>> returning BLK_STS_NO_DEV_RESOURCE (or some name like that) for the
>>> scarce resource should fix this issue.
>>
>> It'll fix the forever stall, but it won't really fix the problem, as
>> we'll slow down the dm device by some random amount.
>>
>> A simple test case would be to have a null_blk device with a queue depth
>> of one, and dm on top of that. Start a fio job that runs two jobs: one
>> that does IO to the underlying device, and one that does IO to the dm
>> device. If the job on the dm device runs substantially slower than the
>> one to the underlying device, then the problem isn't really fixed.
>
> I remember trying this test on scsi-debug and on dm-mpath over
> scsi-debug, and I did not observe this issue. Could you explain a bit
> why IO over dm-mpath may be slower? Both IO contexts call the same
> get_request(), and in theory dm-mpath should be a bit quicker since it
> uses direct issue for the underlying queue, without an IO scheduler
> involved.

Because if you lose the race for getting the request, you'll potentially
have some arbitrary delay before trying again. Compare that to the direct
user of the underlying device, who will simply sleep on the resource and
get woken the instant it's available.

-- 
Jens Axboe
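
The timing argument in the reply above can be put into numbers with a
small model. This is only an illustrative sketch in Python, not kernel
code: the function names, the integer time units, and the assumption that
nobody else grabs the resource in between are all hypothetical and are
not taken from the thread. It compares a waiter that is woken the instant
the resource is freed with a consumer that lost the race and retries on a
fixed delay.

```python
def wakeup_acquire_time(free_time):
    """Waiter sleeps on the resource and is woken when it is freed.

    Hypothetical model: the wakeup itself costs nothing.
    """
    return free_time


def delayed_retry_acquire_time(fail_time, free_time, delay):
    """Consumer failed at `fail_time` and retries every `delay` units.

    Retries happen at fail_time + k * delay for k = 1, 2, ...
    Returns the time of the first retry at or after `free_time`.
    Assumes no other consumer takes the resource first.
    """
    if delay <= 0:
        raise ValueError("delay must be positive")
    gap = free_time - fail_time
    # Integer ceiling division, with at least one retry.
    k = max(1, -(-gap // delay))
    return fail_time + k * delay


def extra_latency(fail_time, free_time, delay):
    """Time the resource sits free before the delayed retry notices."""
    return delayed_retry_acquire_time(fail_time, free_time, delay) - free_time


if __name__ == "__main__":
    # Allocation fails at t=0, the resource is freed at t=7,
    # and the retry delay is 3 time units.
    print(wakeup_acquire_time(7))               # 7
    print(delayed_retry_acquire_time(0, 7, 3))  # 9
    print(extra_latency(0, 7, 3))               # 2
```

In this model the woken waiter never leaves the resource idle, while the
delayed retry leaves it idle for anywhere between zero and just under one
full delay period, depending on where the free time falls between
retries. That per-request idle time is one way to read the "arbitrary
delay" mentioned above.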