Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755350AbeAON3A (ORCPT + 1 other); Mon, 15 Jan 2018 08:29:00 -0500 Received: from mail-ve1eur01on0041.outbound.protection.outlook.com ([104.47.1.41]:61856 "EHLO EUR01-VE1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755315AbeAON2z (ORCPT ); Mon, 15 Jan 2018 08:28:55 -0500 Authentication-Results: spf=pass (sender IP is 193.47.165.134) smtp.mailfrom=mellanox.com; lists.infradead.org; dkim=none (message not signed) header.d=none;lists.infradead.org; dmarc=pass action=none header.from=mellanox.com; Subject: Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting To: Sagi Grimberg , Jianchao Wang , , , CC: , References: <1515647268-1717-1-git-send-email-jianchao.w.wang@oracle.com> <1515647268-1717-2-git-send-email-jianchao.w.wang@oracle.com> From: Max Gurtovoy Message-ID: Date: Mon, 15 Jan 2018 15:28:47 +0200 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1255"; format=flowed Content-Language: he Content-Transfer-Encoding: 8bit X-Originating-IP: [10.223.3.143] X-EOPAttributedMessage: 0 X-MS-Office365-Filtering-HT: Tenant X-Forefront-Antispam-Report: CIP:193.47.165.134;IPV:NLI;CTRY:IL;EFV:NLI;SFV:NSPM;SFS:(10009020)(376002)(39860400002)(396003)(39380400002)(346002)(2980300002)(438002)(24454002)(189003)(199004)(110136005)(58126008)(77096006)(31696002)(65956001)(316002)(2906002)(65806001)(64126003)(67846002)(229853002)(106466001)(50466002)(8936002)(83506002)(2201001)(16526018)(36756003)(76176011)(59450400001)(6306002)(86362001)(8676002)(81166006)(106002)(81156014)(16576012)(2950100002)(53546011)(47776003)(54906003)(4326008)(356003)(5660300001)(6246003)(478600001)(7736002)(305945005)(6116002)(2870700001)(1720100001)(65826007)(3846002)(31686004)(966005)(3940600001);DIR:OUT;SFP:1101;SCL:1;SRVR:VI1PR0501MB2271;H:mtlcas13.mtl.com;FPR:;SPF:Pass;PTR:mail13.mellanox.com;MX:1;A:1;LANG:en; X-Microsoft-Exchange-Diagnostics: 1;AM5EUR03FT064;1:uQgXXa5UjVUZZZKWxClPo1lq7hrKvKjxkinIHkLkzfjjPlUklGCIJGKd+/Wo+rdgT7ldxynRFcghXsV5lSU9JYfVZyABvoc7hNQ47yItVfkftYIquiHaQ1+rWYhGxlAv X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 526368c2-df55-4bf2-2fed-08d55c1beaa1 X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:(7020095)(4652020)(5600026)(4604075)(4608076)(2017052603307)(7153060)(7193020);SRVR:VI1PR0501MB2271; X-Microsoft-Exchange-Diagnostics: 1;VI1PR0501MB2271;3:yB2fmdCm/1DHTamW9xdJr3bNeLqJfATJR+Y68KlzOeqeU6q8KTZauJtXhl2G9UAQho3yR4duFikxhF3lLulkdztwkWcAJKhffySBWEcu17meb/kC8i9hT2ZOwWK5irppzaSOUeSCfDIy28/PDduhpXdPjG1+GyfUNZKdii4eHHaqxEjA/Jy6gZd68i7frKMc0i++DVFgP1pAJRZqXtq2vrIdCEXJ/DrT3IuIpWIfqcVk0RQTtOt1LR0qw1Hw//rrq2ZfCZbPSHpXZXY1tckeoweLY4jVIP2ZoHau217yeFhs8yAyztaw1alpH5pvv+q4cH/bXUF+ZqgvKUMU8r05z282Jcfl/SNIaBX+QwP33Qg=;25:JkhFklDG2KYer+WFbn5n2eJF49MQY4zoqOuhGItN3wHXkISUGyRJhr/8t5FjRax70ZHLmPyS1f1aPhHj/uR6jEU3dV/9E6yhN4CGdxpV2DtdCX7qB1t0t4drhUGyUbzv/zILKm0BieyUb5EmjXUuRqJwoIvBSai2UDl41ekXsNhL4Vd9+NefzM5hJXct4WxwvFsqQNZHwfqEyYaWTLK/YGOvFIuFT66CODtiua92aeZ4GARv+EbphsqWkPmFO4/27dJZMITwSnDh5pnpl7PCzinlMGXYKOqyfzB27TK/uGqyBq+AwU52jmO+Ej6Lzazsz1R9cI/qhvXTHmPWOwvvYA== X-MS-TrafficTypeDiagnostic: VI1PR0501MB2271: X-Microsoft-Exchange-Diagnostics: 1;VI1PR0501MB2271;31:TB9GGmDg085tjRTYnAD1/xVGHQElhZyGpUlPY6idegzJ0kxdWt70Snp0hCFi7dFFpqVJSAX8/d4GjFYgxyc6veLCTX7BlJupLsM1dmnC4AYYRSV6EyK+0btvUz/QGku2YxZ7Fr50wkE1qLwKtnpFyHuQfEBh42gmoj7W0RQE4yr1ecxkp3TdliKob1Y0f2BDLFA+MWUYt6/BghhjwIisBAK5HuyMcMcmKJ7yO0ATz0g=;20:3zlEiPJLNTKqd5XS05wW3RlV1NZx6qgvmrZHUgBJUaOZBhrKdT7ju3YOdo8wmN7nDKEcH1FzcHuh6mvoLL2yiDrbNWzvF4blfX7Y8XfKVm89mZDxM5mSA173/7E+7dsCtqZrxByHxp1HAffMDxdWhrT9WKFyYn2pDPOTE558Lpmf5cwYvy/n78UXAjyBF462pDWSbBA/WbxuIxEi/vL62fwaGV0vMOnMJuAMUeLg/IC4BZF63e0jpZ2S7RQDraRJ/jcL4wIUvXEMAVMkcMyOsg89YAHPy2Sw1EXafr2gK//cvm1kyPz24bmilYi/Sq5bGbIyYlwmBAi8Zu6Oj+EfzVi+D2e7nTTz2wMHtzhUzG55bWyi3WklRRnKXhG6lwVQlqO4p54iV5QHFNUEEq4s6D9NzLd7NwoIKNKEX4ZLGcQU1ejyn11jqMLy/R9NZF296rncFNeXB6g++fpHWlSCyBJZwLUtz/QC5Ies7clcmmjmZyLsVRtVvl1ebxkDOZlf X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:(258649278758335)(146099531331640); X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(6040470)(2401047)(8121501046)(5005006)(3231023)(944501161)(3002001)(10201501046)(93006095)(93004095)(6055026)(6041268)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123564045)(20161123562045)(20161123558120)(20161123560045)(6072148)(201708071742011);SRVR:VI1PR0501MB2271;BCL:0;PCL:0;RULEID:(100000803101)(100110400095);SRVR:VI1PR0501MB2271; X-Microsoft-Exchange-Diagnostics: 1;VI1PR0501MB2271;4:kl8kDjV8WS8d1NTZNaC/3vcrPNUG6xcDZDWqomWGVaYcXbu4tDtYRZpqQZUiLXSucoWMaXJLZNlGIp5mcNPV8xvkHBD9ha+Qiwhx0ljp3PNHMtX+hELSZEmoZyr8Y9TDMzGpgyP3ndJ/zq3rUwXrmh2x4fVz9Mg4alcyaBzWTPlTeHdmclUVkWpCOEeRG5zOnrYn4/zJbvKoCb3b6LEJh9LK9DpsPXCYPLCFOvOkS5VGIqZz+OwD6JGyzq8lBcGJpxj/YSVVy1R01ONWwq/Az43mHLs2ROPtSw+/FgVfQ+zT4nL2kmvpO5EJuRMrxmDBOZHsYGVf2qzqsXwcPcIMFfx7MlNdL/rcdN+Z/wGgfc8= X-Forefront-PRVS: 0553CBB77A X-Microsoft-Exchange-Diagnostics: =?windows-1255?Q?1;VI1PR0501MB2271;23:9vSLaIcuwTrWN4cpiZ6PiFqQbX4vwT/9KGk?= =?windows-1255?Q?+4armbtut9db2SoEzJJ/YNqDHrevUF6OoakBehoE97CVAa3dSiqw6z7v?= =?windows-1255?Q?P9UTr5FtX+pLv2vOS8bk7LVh+KyGpQzwVDdPOe8TeZuGvL8cuOR6hLcE?= =?windows-1255?Q?C4xf9vB3sDnVP0X6SphoScYCPmm9Zm6el0LwUwA6ovOr9Ndeesz9mk2k?= =?windows-1255?Q?lP8bsgfrwm/sb02WeFOebfeyD9ueSO0Vq4wSkFWmV/TFmcYy38XLpo/7?= =?windows-1255?Q?dPDFIwSWlJqMqoFc0CWHyDZiFbWleu82zqnspP7JFCnJ54vum7ilr2D3?= =?windows-1255?Q?TpO7Ud0gcI+REvIMe8WwzCvMKrbId4LxgTAvDQ4hbe9deOPJokCXP8Ow?= =?windows-1255?Q?sIuOCDAYcvOMT2/EHRq6dMmoecgYZTKeHh3fyxr2dIx3AhHMndJD07dF?= =?windows-1255?Q?2q4l+vBrgVUBpUD3YOtV26FyT1TPJ9MprKpnW9pmeW3Cjlx2Rn8Nlrtf?= =?windows-1255?Q?u9fSob94uqYOSeFemPxbVNe01Js2pwgDkpsorPMBLH7RLbmsCZFwgJE9?= =?windows-1255?Q?J4gHFq3PFG2YGYfWING0Q9wyE9KJ5Q2UTe3G9sV+EzMLB2/dGsdznNRb?= =?windows-1255?Q?i81Bw75scH6HwGUaXrdSmVq1tKqzxYQI57ig6srZwQwrD+J7J2pnEtwI?= =?windows-1255?Q?BOawE6b1vXPXyhMmiQMqnzjDpHGDNogIlRQaKTFfUq7mz4Qm7l6/1V6P?= =?windows-1255?Q?GPDqf/53IXOaIpr05buJ46WxoAxyXIsN9IQ6xud7uy3hE9BrWM0vbxvf?= =?windows-1255?Q?KuiLh8zjZYjqsUXM6eCAv0RfUH0qkgM/4ve09+bEaWzwvB73Q2L2mBEI?= =?windows-1255?Q?XAf5wQY1oKonde/MyVDIqhFEYs4Mm8FJlLMDq39R7VrVT5oAhaFvbDM9?= =?windows-1255?Q?YpHhrVhin5/Z+hi8kKEdH/G6Rlo0gna0BPDgyZoRRP4zNYaGdPOhs2tb?= =?windows-1255?Q?h539bRDZhacvxG63lav38FMgj6dW/AmtGNuS1rVp/6YFgWV4chJTsjaJ?= =?windows-1255?Q?ipQY0uQtRpARUYIeGAzP5U/Nee7BhVBlEd+kGingzYtbhXoWMiMvNjIo?= =?windows-1255?Q?Aklm8TYHsHQkwBUz9p6VcRX7Em3tQBHN66j6NvOsxzQdWdvZlr/nfDac?= =?windows-1255?Q?qIDKATLNfitFqSclP8p5lth/is4jTR12yz1nSYfxxh8TyDAe3swH/5nr?= =?windows-1255?Q?CYS9EVDK92QRI4Fzp/wXUlp5ND1FuRI9a4ygibA6N42/uHwtpo7d6rh+?= =?windows-1255?Q?NEivN1B82mlkHV6k6o9a8nmA6Lyi9ckc+RKXT7Et9jJlPGxRGKSOzOQP?= =?windows-1255?Q?eUV0J3vFoc/j5ZmpYiPVE9KwZCnCI16OQkw=3D=3D?= X-Microsoft-Exchange-Diagnostics: 1;VI1PR0501MB2271;6:QOL5bl2dDXn0c9Eo4KEZMtw4qA169FQWBgg91TLJp/XFNoHlDzQiRKEp5uwn4WqkOatPLIktyv9x30SARufD1qZ1ncYY6HvKWVMg9UF7wMerYayzs02xhORg18ozmQHSR0zkoaYrhFzZ6sovBSpLCoYVu5nM+2oh6fEGKi90P29JqImYpR0XY4t/K+Ad5oDjd6L/5GLHBT9GazF6vCo0efmQlQPskXGdYvm/qf1xwGn1cgTNAtQZvyfRQ6kdJVpnMC44hWZ/YjnevzPgVEn2dGZArL5p5q+lhjoq9x+HDbNPkscD0Jso1bA2HF29Uwz6isUpHusavx3zwL0n70OQp7cB12X8KDRbeS8TqiuxNVA=;5:5G9d1RQY1hediYgTmjDRCrknNvWDpzIvCFvxXQ2Pj5ZHO3BZvC7IlqzA++3DKt6QFtuXJf5NwWHsrm1Y9JpX9umE5ocha90IonN2x8GACwG0huiENO7dHjXiqq70HC2VzG/2muDDbVdUvXLXUBIo4rRYf0Q0nSYR//KWZfblDKo=;24:SJ5h4aMnG7/3OTQoxdN1XeCIvhFPZqo8ZX/FIFazImrsta6bW6YoP1N9A96RH8aLdN3/RAvEISiDVNTumUssFkYrFHJO3+UdeK92J1Uz1Fk=;7:zuFkdZdH8OfNQ+C/6REv7mVS/VIe5kTwJTfCUBseDZXZ3gkva8LxvCibLujMnPjhM6mPmf5d452bilajUrurIvq9pewiJQNZ0/aTljattFvuziMUOX+iZPwbUvNVogOk+vHgBQPUjzS29Sk1pqP4UvqRTXNR+E3ySznyBt2NV++ILC4yPtb3RAb1f+0pWOZ02AGngZONAZM/OOPN2Gc78uCARuJo8DfdLnpjyPQVsEHf/jaRKKNfjRgg3usuW0Ix SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jan 2018 13:28:50.9503 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 526368c2-df55-4bf2-2fed-08d55c1beaa1 X-MS-Exchange-CrossTenant-Id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=a652971c-7d2e-4d9b-a6a4-d149256f461b;Ip=[193.47.165.134];Helo=[mtlcas13.mtl.com] X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0501MB2271 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On 1/14/2018 11:48 AM, Sagi Grimberg wrote: > >> Currently, the ctrl->state will be changed to NVME_CTRL_RESETTING >> before queue the reset work. This is not so strict. There could be >> a big gap before the reset_work callback is invoked. In addition, >> there is some disable work in the reset_work callback, strictly >> speaking, not part of reset work, and could lead to some confusion. >> >> This patch splits the NVME_CTRL_RESETTING into NVME_CTRL_RESET_PREPARE >> and NVME_CTRL_RESETTING. Before queue the reset work, changes state >> to NVME_CTRL_RESET_PREPARE, after disable work completes, changes >> state to NVME_CTRL_RESETTING. > > What guarantees that nvme_timeout will do the right thing? If it > is relying on the fact that nvme-pci sets the state to RESETTING > only _after_ quiescing the requests queues it needs to be documented > as its a real delicate dependency. > > > >> >> Suggested-by: Christoph Hellwig >> Signed-off-by: Jianchao Wang >> --- >> ? drivers/nvme/host/core.c?? | 17 +++++++++++++++-- >> ? drivers/nvme/host/fc.c???? |? 2 ++ >> ? drivers/nvme/host/nvme.h?? |? 1 + >> ? drivers/nvme/host/pci.c??? | 28 ++++++++++++++++++++++------ >> ? drivers/nvme/host/rdma.c?? |? 8 ++++++++ >> ? drivers/nvme/target/loop.c |? 5 +++++ >> ? 6 files changed, 53 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c >> index 1e46e60..106a437 100644 >> --- a/drivers/nvme/host/core.c >> +++ b/drivers/nvme/host/core.c >> @@ -87,7 +87,7 @@ static __le32 nvme_get_log_dw10(u8 lid, size_t size) >> ? int nvme_reset_ctrl(struct nvme_ctrl *ctrl) >> ? { >> -??? if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) >> +??? if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESET_PREPARE)) >> ????????? return -EBUSY; >> ????? if (!queue_work(nvme_wq, &ctrl->reset_work)) >> ????????? return -EBUSY; >> @@ -243,7 +243,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl, >> ????????????? break; >> ????????? } >> ????????? break; >> -??? case NVME_CTRL_RESETTING: >> +??? case NVME_CTRL_RESET_PREPARE: >> ????????? switch (old_state) { >> ????????? case NVME_CTRL_NEW: >> ????????? case NVME_CTRL_LIVE: >> @@ -253,10 +253,21 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl, >> ????????????? break; >> ????????? } >> ????????? break; >> + >> +??? case NVME_CTRL_RESETTING: >> +??????? switch (old_state) { >> +??????? case NVME_CTRL_RESET_PREPARE: >> +??????????? changed = true; >> +??????????? /* FALLTHRU */ >> +??????? default: >> +??????????? break; >> +??????? } >> +??????? break; >> ????? case NVME_CTRL_RECONNECTING: >> ????????? switch (old_state) { >> ????????? case NVME_CTRL_LIVE: >> ????????? case NVME_CTRL_RESETTING: >> +??????? case NVME_CTRL_RESET_PREPARE: >> ????????????? changed = true; >> ????????????? /* FALLTHRU */ >> ????????? default: >> @@ -267,6 +278,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl, >> ????????? switch (old_state) { >> ????????? case NVME_CTRL_LIVE: >> ????????? case NVME_CTRL_RESETTING: >> +??????? case NVME_CTRL_RESET_PREPARE: >> ????????? case NVME_CTRL_RECONNECTING: >> ????????????? changed = true; >> ????????????? /* FALLTHRU */ >> @@ -2603,6 +2615,7 @@ static ssize_t nvme_sysfs_show_state(struct >> device *dev, >> ????????? [NVME_CTRL_NEW]??????? = "new", >> ????????? [NVME_CTRL_LIVE]??? = "live", >> ????????? [NVME_CTRL_RESETTING]??? = "resetting", >> +??????? [NVME_CTRL_RESET_PREPARE]??? = "reset-prepare", >> ????????? [NVME_CTRL_RECONNECTING]= "reconnecting", >> ????????? [NVME_CTRL_DELETING]??? = "deleting", >> ????????? [NVME_CTRL_DEAD]??? = "dead", >> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c >> index 794e66e..516c1ea 100644 >> --- a/drivers/nvme/host/fc.c >> +++ b/drivers/nvme/host/fc.c >> @@ -547,6 +547,7 @@ nvme_fc_resume_controller(struct nvme_fc_ctrl *ctrl) >> ????????? break; >> ????? case NVME_CTRL_RESETTING: >> +??? case NVME_CTRL_RESET_PREPARE: >> ????????? /* >> ?????????? * Controller is already in the process of terminating the >> ?????????? * association. No need to do anything further. The reconnect >> @@ -790,6 +791,7 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl >> *ctrl) >> ????????? break; >> ????? case NVME_CTRL_RESETTING: >> +??? case NVME_CTRL_RESET_PREPARE: >> ????????? /* >> ?????????? * Controller is already in the process of terminating the >> ?????????? * association.? No need to do anything further. The reconnect > > James, can you take a look at this? > >> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c >> index f5800c3..e477c35 100644 >> --- a/drivers/nvme/host/pci.c >> +++ b/drivers/nvme/host/pci.c >> @@ -1141,8 +1141,13 @@ static bool nvme_should_reset(struct nvme_dev >> *dev, u32 csts) >> ????? bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO); >> ????? /* If there is a reset ongoing, we shouldn't reset again. */ >> -??? if (dev->ctrl.state == NVME_CTRL_RESETTING) >> +??? switch (dev->ctrl.state) { >> +??? case NVME_CTRL_RESETTING: >> +??? case NVME_CTRL_RESET_PREPARE: >> ????????? return false; >> +??? default: >> +??????? break; >> +??? } > > Isn't the indentation off? also in other places... > >> @@ -2292,7 +2303,7 @@ static void nvme_reset_work(struct work_struct >> *work) >> ????? bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL); >> ????? int result = -ENODEV; >> -??? if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) >> +??? if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESET_PREPARE)) >> ????????? goto out; >> ????? /* >> @@ -2302,6 +2313,11 @@ static void nvme_reset_work(struct work_struct >> *work) >> ????? if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) >> ????????? nvme_dev_disable(dev, false); >> +??? if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) { >> +??????? WARN_ON_ONCE(dev->ctrl.state != NVME_CTRL_DELETING); >> +??????? goto out; >> +??? } >> + > > This needs documentation on _why_ here. > >> -??? nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING); >> +??? nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESET_PREPARE); >> ????? dev_info(dev->ctrl.device, "pci function %s\n", >> dev_name(&pdev->dev)); > > Can you rebase on top of my commit: > 4caff8fc19f1 nvme-pci: don't open-code nvme_reset_ctrl > >> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c >> index 37af565..8ae073e 100644 >> --- a/drivers/nvme/host/rdma.c >> +++ b/drivers/nvme/host/rdma.c >> @@ -1753,6 +1753,14 @@ static void nvme_rdma_reset_ctrl_work(struct >> work_struct *work) >> ????? nvme_stop_ctrl(&ctrl->ctrl); >> ????? nvme_rdma_shutdown_ctrl(ctrl, false); >> +??? changed = nvme_change_ctrl_state(&ctrl->ctrl, >> +??????????? NVME_CTRL_RESET_PREPARE); >> +??? if (!changed) { >> +??????? /* state change failure is ok if we're in DELETING state */ >> +??????? WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING); >> +??????? return; >> +??? } >> + > > setting RESET_PREPARE here?? > > Also, the error recovery code is mutually excluded from reset_work > by trying to set the same state which is protected by the ctrl state > machine, so a similar change is needed there. Sagi, Do you mean to add this (if so, I agree): diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index d06641b..44ef52a 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -957,6 +957,12 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) struct nvme_rdma_ctrl *ctrl = container_of(work, struct nvme_rdma_ctrl, err_work); + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETING)) { + /* state change failure should never happen */ + WARN_ON_ONCE(1); + return; + } + nvme_stop_keep_alive(&ctrl->ctrl); if (ctrl->ctrl.queue_count > 1) { @@ -989,7 +995,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl) { - if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESET_PREPARE)) return; queue_work(nvme_wq, &ctrl->err_work); @@ -1760,6 +1766,12 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work) int ret; bool changed; + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETING)) { + /* state change failure should never happen */ + WARN_ON_ONCE(1); + return; + } + nvme_stop_ctrl(&ctrl->ctrl); nvme_rdma_shutdown_ctrl(ctrl, false); > > So in order not to break rdma ctrl reset (again) please look at: > bcdacf7fc3d8 nvme-rdma: fix concurrent reset and reconnect > > _______________________________________________ > Linux-nvme mailing list > Linux-nvme@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme