Subject: Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
To: Keith Busch
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, maxg@mellanox.com,
        james.smart@broadcom.com, linux-nvme@lists.infradead.org,
        linux-kernel@vger.kernel.org
References: <1516270202-8051-1-git-send-email-jianchao.w.wang@oracle.com>
        <20180119080130.GE12043@localhost.localdomain>
        <0639aa2f-d153-5aac-ce08-df0d4b45f9a0@oracle.com>
        <20180119084218.GF12043@localhost.localdomain>
From: "jianchao.wang"
Message-ID: <84b4e3bc-fe23-607e-9d5e-bb5644eedb54@oracle.com>
Date: Fri, 19 Jan 2018 17:02:06 +0800
In-Reply-To: <20180119084218.GF12043@localhost.localdomain>
Hi Keith,

Thanks for your kind and detailed response and patch.

On 01/19/2018 04:42 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote:
>> On 01/19/2018 04:01 PM, Keith Busch wrote:
>>> The nvme_dev_disable routine makes forward progress without depending on
>>> timeout handling to complete expired commands. Once controller disabling
>>> completes, there can't possibly be any started requests that can expire.
>>> So we don't need nvme_timeout to do anything for requests above the
>>> boundary.
>>>
>> Yes, once controller disabling completes, any started requests will be
>> handled and cannot expire. But before that _boundary_, an nvme_timeout
>> context could still be running in parallel with nvme_dev_disable. If the
>> timeout path grabs a request first, nvme_dev_disable cannot get and cancel
>> it, so even after nvme_dev_disable completes, a request may still be held
>> in an nvme_timeout context.
>>
>> The worst case is:
>>
>>   nvme_timeout                          nvme_reset_work
>>   if (ctrl->state == RESETTING)           nvme_dev_disable
>>       nvme_dev_disable                    initializing procedure
>>
>> i.e. the nvme_dev_disable in the timeout path runs in parallel with the
>> re-initialization procedure in nvme_reset_work.
>
> Okay, I see what you're saying. That's a pretty obscure case, as normally
> we enter nvme_reset_work with the queues already disabled, but there
> are a few cases where we need nvme_reset_work to handle that.
>
> But if we are in that case, I think we really just want to sync the
> queues. What do you think of this?
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index fde6fd2e7eef..516383193416 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3520,6 +3520,17 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
>  }
>  EXPORT_SYMBOL_GPL(nvme_stop_queues);
>  
> +void nvme_sync_queues(struct nvme_ctrl *ctrl)
> +{
> +	struct nvme_ns *ns;
> +
> +	mutex_lock(&ctrl->namespaces_mutex);
> +	list_for_each_entry(ns, &ctrl->namespaces, list)
> +		blk_sync_queue(ns->queue);
> +	mutex_unlock(&ctrl->namespaces_mutex);
> +}
> +EXPORT_SYMBOL_GPL(nvme_sync_queues);
> +
>  void nvme_start_queues(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 8e7fc1b041b7..45b1b8ceddb6 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -374,6 +374,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
>  
>  void nvme_stop_queues(struct nvme_ctrl *ctrl);
>  void nvme_start_queues(struct nvme_ctrl *ctrl);
> +void nvme_sync_queues(struct nvme_ctrl *ctrl);
>  void nvme_kill_queues(struct nvme_ctrl *ctrl);
>  void nvme_unfreeze(struct nvme_ctrl *ctrl);
>  void nvme_wait_freeze(struct nvme_ctrl *ctrl);
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index a2ffb557b616..1fe00be22ad1 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2289,8 +2289,10 @@ static void nvme_reset_work(struct work_struct *work)
>  	 * If we're called to reset a live controller first shut it down before
>  	 * moving on.
>  	 */
> -	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
> +	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) {
>  		nvme_dev_disable(dev, false);
> +		nvme_sync_queues(&dev->ctrl);
> +	}
>  
>  	result = nvme_pci_enable(dev);
>  	if (result)
> --
> 

We should not use blk_sync_queue here, because it would also cancel the
requeue_work and run_work. Just flushing the timeout work with
flush_work(&q->timeout_work) should be enough.

In addition, we could check NVME_CC_ENABLE in nvme_dev_disable itself to
avoid invoking it redundantly. :)

Thanks
Jianchao
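P.S. To make the flush_work alternative concrete, here is a rough, untested
sketch of nvme_sync_queues along those lines. It keeps the same walk over
ctrl->namespaces under namespaces_mutex as in your patch, and only swaps
blk_sync_queue for a flush of the queue's timeout work; whether flushing
q->timeout_work alone covers every path is exactly the open question above:

void nvme_sync_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	/*
	 * Only wait for any in-flight nvme_timeout context to finish.
	 * Unlike blk_sync_queue, this leaves the queue's requeue_work
	 * and run_work untouched.
	 */
	mutex_lock(&ctrl->namespaces_mutex);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		flush_work(&ns->queue->timeout_work);
	mutex_unlock(&ctrl->namespaces_mutex);
}
EXPORT_SYMBOL_GPL(nvme_sync_queues);

nvme_reset_work would then call it at the same spot as in your patch, right
after the nvme_dev_disable for a live controller.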