Subject: Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
To: Keith Busch
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, maxg@mellanox.com,
    james.smart@broadcom.com, linux-nvme@lists.infradead.org,
    linux-kernel@vger.kernel.org
References: <1516270202-8051-1-git-send-email-jianchao.w.wang@oracle.com>
    <20180119080130.GE12043@localhost.localdomain>
From: "jianchao.wang"
Message-ID: <0639aa2f-d153-5aac-ce08-df0d4b45f9a0@oracle.com>
Date: Fri, 19 Jan 2018 16:14:02 +0800
In-Reply-To: <20180119080130.GE12043@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Keith

Thanks for taking the time to look into this.

On 01/19/2018 04:01 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote:
>> Hello
>>
>> Please consider the following scenario.
>> nvme_reset_ctrl
>>   -> set state to RESETTING
>>   -> queue reset_work
>>       (scheduling)
>> nvme_reset_work
>>   -> nvme_dev_disable
>>     -> quiesce queues
>>     -> nvme_cancel_request
>>        on outstanding requests
>> -------------------------------_boundary_
>>   -> nvme initializing (issue request on adminq)
>>
>> Before the _boundary_, not only quiesce the queues, but also cancel
>> all the outstanding requests.
>>
>> A request could expire when the ctrl state is RESETTING.
>>  - If the timeout occurs before the _boundary_, the expired requests
>>    are from the previous work.
>>  - Otherwise, the expired requests are from the controller initializing
>>    procedure, such as sending cq/sq create commands to adminq to set up
>>    io queues.
>> In current implementation, nvme_timeout cannot identify the _boundary_
>> so only handles second case above.
>
> Bear with me a moment, as I'm only just now getting a real chance to look
> at this, and I'm not quite sure I follow what problem this is solving.
>
> The nvme_dev_disable routine makes forward progress without depending on
> timeout handling to complete expired commands. Once controller disabling
> completes, there can't possibly be any started requests that can expire.
> So we don't need nvme_timeout to do anything for requests above the
> boundary.
>

Yes, once controller disabling completes, any started requests will have
been handled and cannot expire. But before the _boundary_, an nvme_timeout
context can run in parallel with nvme_dev_disable. If the timeout path
grabs a request first, nvme_dev_disable cannot get it and cancel it. So
even after nvme_dev_disable completes, there can still be a request in the
nvme_timeout context.

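To make that interleaving concrete, here is a minimal user-space sketch. It
is only a model under stated assumptions, not the driver or blk-mq code: the
names model_rq, rq_state, timeout_claim, disable_cancel_all,
owned_by_timeout and replay_race are all hypothetical, and "claiming" a
request is reduced to a plain state change replayed in a fixed order rather
than a real concurrent race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request states for this model; not the blk-mq definitions. */
enum rq_state { RQ_IN_FLIGHT, RQ_CLAIMED_BY_TIMEOUT, RQ_CANCELLED };

struct model_rq {
	enum rq_state state;
};

/* Timeout path: take ownership of an expired request that is still in
 * flight. Returns false if another path already owns the request. */
bool timeout_claim(struct model_rq *rq)
{
	if (rq->state != RQ_IN_FLIGHT)
		return false;
	rq->state = RQ_CLAIMED_BY_TIMEOUT;
	return true;
}

/* Disable path: cancel every request it can still see as in flight.
 * Requests already claimed by the timeout path are skipped.
 * Returns the number of requests cancelled. */
size_t disable_cancel_all(struct model_rq *rqs, size_t n)
{
	size_t cancelled = 0;

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].state == RQ_IN_FLIGHT) {
			rqs[i].state = RQ_CANCELLED;
			cancelled++;
		}
	}
	return cancelled;
}

/* Count the requests that are still held by the timeout path. */
size_t owned_by_timeout(const struct model_rq *rqs, size_t n)
{
	size_t owned = 0;

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].state == RQ_CLAIMED_BY_TIMEOUT)
			owned++;
	}
	return owned;
}

/* Replay the interleaving: the timeout path claims rqs[victim] first,
 * then the disable path runs to completion. Returns how many requests
 * are still in the timeout context after disabling has finished. */
size_t replay_race(struct model_rq *rqs, size_t n, size_t victim)
{
	assert(victim < n);

	for (size_t i = 0; i < n; i++)
		rqs[i].state = RQ_IN_FLIGHT;

	timeout_claim(&rqs[victim]);
	disable_cancel_all(rqs, n);
	return owned_by_timeout(rqs, n);
}
```

In this model, calling replay_race() on a four-entry array with victim 2
returns 1: the disable pass has completed, yet one request is still owned by
the timeout path, which is the situation described above.
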
The worst case is:

nvme_timeout                            nvme_reset_work
if (ctrl->state == RESETTING)             nvme_dev_disable
    nvme_dev_disable                      initializing procedure

Here the nvme_dev_disable called from nvme_timeout runs in parallel with
the reinit procedure in nvme_reset_work.

Thanks
Jianchao