Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755529AbeAOOI0 (ORCPT + 1 other); Mon, 15 Jan 2018 09:08:26 -0500 Received: from userp2130.oracle.com ([156.151.31.86]:33770 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755514AbeAOOIX (ORCPT ); Mon, 15 Jan 2018 09:08:23 -0500 Subject: Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting To: Max Gurtovoy , Sagi Grimberg , keith.busch@intel.com, axboe@fb.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org References: <1515647268-1717-1-git-send-email-jianchao.w.wang@oracle.com> <1515647268-1717-2-git-send-email-jianchao.w.wang@oracle.com> From: "jianchao.wang" Message-ID: Date: Mon, 15 Jan 2018 22:07:32 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1255 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=8774 signatures=668652 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1711220000 definitions=main-1801150201 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: Hi max Thanks for your kindly response and comment. On 01/15/2018 09:28 PM, Max Gurtovoy wrote: >>> >> >> setting RESET_PREPARE here?? >> >> Also, the error recovery code is mutually excluded from reset_work >> by trying to set the same state which is protected by the ctrl state >> machine, so a similar change is needed there. > > Sagi, > Do you mean to add this (if so, I agree): > > > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index d06641b..44ef52a 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -957,6 +957,12 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) > ??? struct nvme_rdma_ctrl *ctrl = container_of(work, > ??????????? struct nvme_rdma_ctrl, err_work); > > +?? if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETING)) { > +?????? /* state change failure should never happen */ > +?????? WARN_ON_ONCE(1); > +?????? return; > +?? } > + > ??? nvme_stop_keep_alive(&ctrl->ctrl); > > ??? if (ctrl->ctrl.queue_count > 1) { > @@ -989,7 +995,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) > > ?static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl) > ?{ > -?? if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) > +?? if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESET_PREPARE)) > ??????? return; > > ??? queue_work(nvme_wq, &ctrl->err_work); > @@ -1760,6 +1766,12 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work) > ??? int ret; > ??? bool changed; > > +?? if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETING)) { > +?????? /* state change failure should never happen */ > +?????? WARN_ON_ONCE(1); > +?????? return; > +?? } > + > ??? nvme_stop_ctrl(&ctrl->ctrl); > ??? nvme_rdma_shutdown_ctrl(ctrl, false); RESET_PREPARE state should include not only the scheduling gap of reset_work, but also the device disable procedure. All the previous state and work are cleared and canceled, then start new one. It is a very obvious boundary there. nvme_stop_ctrl(&ctrl->ctrl); nvme_rdma_shutdown_ctrl(ctrl, false); + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETING)) { + /* state change failure should never happen */ + WARN_ON_ONCE(1); + return; + } What do you think about ? :) Thanks Jianchao