Subject: Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case
To: Keith Busch
Cc: axboe@fb.com, linux-kernel@vger.kernel.org, hch@lst.de, linux-nvme@lists.infradead.org, sagi@grimberg.me
References: <1517554849-7802-1-git-send-email-jianchao.w.wang@oracle.com>
 <1517554849-7802-3-git-send-email-jianchao.w.wang@oracle.com>
 <20180202182413.GH24417@localhost.localdomain>
 <20180205151314.GP24417@localhost.localdomain>
 <20180206151335.GE31110@localhost.localdomain>
 <20180207161345.GB1337@localhost.localdomain>
From: "jianchao.wang"
Message-ID: <1826ebc1-d419-23da-12d4-dd7b1b3fe598@oracle.com>
Date: Thu, 8 Feb 2018 09:40:06 +0800
In-Reply-To: <20180207161345.GB1337@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Keith

Thank you very much for
your precious time and kind guidance. That's really appreciated. :)

On 02/08/2018 12:13 AM, Keith Busch wrote:
> On Wed, Feb 07, 2018 at 10:13:51AM +0800, jianchao.wang wrote:
>> What's the difference? Can you please point it out?
>> I have shared my understanding below.
>> But actually, I don't get what difference you are referring to.
>
> It sounds like you have all the pieces. Just keep this in mind: we don't
> want to fail IO if we can prevent it.
>
Yes, absolutely.

> A request is allocated from an hctx pool of tags. Once the request is
> allocated, it is permanently tied to that hctx because that's where its
> tag came from. If that hctx becomes invalid, the request has to be ended
> with an error, and we can't do anything about that[*].
>
> Prior to a reset, we currently halt new requests from being allocated by
> freezing the request queues. We unfreeze the queues after the new state
> of the hctx's is established. This way all IO requests that were gating
> on the unfreeze are guaranteed to enter into a valid context.
>
> You are proposing to skip freeze on a reset. New requests will then be
> allocated before we've established the hctx map. Any request allocated
> will have to be terminated in failure if the hctx is no longer valid
> once the reset completes.

Yes, if any previous hctx doesn't come back, the requests on that hctx
will be drained with BLK_STS_IOERR:

__blk_mq_update_nr_hw_queues
  -> blk_mq_freeze_queue
    -> blk_freeze_queue
      -> blk_mq_freeze_queue_wait

But the nvmeq's cq_vector is -1.

> Yes, it's entirely possible today a request allocated prior to the reset
> may need to be terminated after the reset. There's nothing we can do
> about those except end them in failure, but we can prevent new ones from
> sharing the same fate. You are removing that prevention, and that's what
> I am complaining about.

Thanks again for taking the time to explain this in detail.
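The ordering described in the quoted text above can be illustrated with a small user-space model. This is only a sketch and not kernel code: every name in it (`toy_queue`, `toy_request`, `toy_alloc`, `toy_issue`, `toy_reset_scenario`) is hypothetical, the "hctx map" is reduced to `cpu % nr_hctx`, and a frozen queue simply reports that the submitter has to wait. It assumes a reset that comes back with fewer hardware contexts than before.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_status { TOY_OK = 0, TOY_IOERR = 1, TOY_WAIT = 2 };

struct toy_queue {
    int nr_hctx;   /* number of currently valid hardware contexts */
    bool frozen;   /* true while new allocations must wait */
};

struct toy_request {
    int hctx;      /* fixed at allocation time, like a tag's home hctx */
};

/* Allocate a request for a submitter on 'cpu'; reports TOY_WAIT if frozen. */
static enum toy_status toy_alloc(const struct toy_queue *q, int cpu,
                                 struct toy_request *rq)
{
    assert(q->nr_hctx > 0);
    if (q->frozen)
        return TOY_WAIT;
    rq->hctx = cpu % q->nr_hctx;   /* simplified stand-in for the hctx map */
    return TOY_OK;
}

/* A request can only be issued if its hctx survived the reset. */
static enum toy_status toy_issue(const struct toy_queue *q,
                                 const struct toy_request *rq)
{
    return rq->hctx < q->nr_hctx ? TOY_OK : TOY_IOERR;
}

/* One reset that shrinks 4 hctxs to 2, with or without freezing around it. */
enum toy_status toy_reset_scenario(bool freeze)
{
    struct toy_queue q = { .nr_hctx = 4, .frozen = false };
    struct toy_request rq = { .hctx = 0 };
    enum toy_status st;

    q.frozen = freeze;               /* reset begins */
    st = toy_alloc(&q, 3, &rq);      /* submitter on CPU 3 arrives mid-reset */
    q.nr_hctx = 2;                   /* controller came back with 2 queues */
    q.frozen = false;                /* new hctx map is established */
    if (st == TOY_WAIT)
        st = toy_alloc(&q, 3, &rq);  /* the gated submitter now proceeds */
    if (st != TOY_OK)
        return st;
    return toy_issue(&q, &rq);
}
```

In this model, `toy_reset_scenario(false)` binds the request to hctx 3 before the reset and ends it with `TOY_IOERR`, while `toy_reset_scenario(true)` holds the submitter back until the new map exists and the request lands on a valid hctx.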
So what you are concerned about is that this patch no longer freezes the
queue in the reset case. New requests may then enter, and they will be
failed if the associated hctx doesn't come back during the reset
procedure. This should be avoided.

I will change this in the next version (V3).

> * Future consideration: we recently obtained a way to "steal" bios that
> looks like it may be used to back out certain types of requests and let
> the bio create a new one.
>
Yeah, that would be a great way to reduce the loss when an hctx is gone.

Sincerely
Jianchao