Received: by 10.223.148.5 with SMTP id 5csp6960395wrq; Wed, 17 Jan 2018 22:25:30 -0800 (PST) X-Google-Smtp-Source: ACJfBosR6+k8RwL5bHXaPfh6K0SLTArDe0Sm8xlY7msdPRjd55zFe8iHDIJg3baVbEnQFpVhLaMb X-Received: by 10.98.214.129 with SMTP id a1mr35008343pfl.221.1516256730534; Wed, 17 Jan 2018 22:25:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516256730; cv=none; d=google.com; s=arc-20160816; b=PNrf8KCSRxU6HNF5sxjm3nQ4CaWi3YPnQ1mb9B09d65zZPsRXXNZwQcVOJQM9AF5+2 irOHk1aylQntD3ysb1Hr6DL5y6M+G86NAGI4DU1eJxHEsWS6Ev7kZ8Pfm6Ny/aOrTlpD JA1rY9UKWFsVjWeteWITqAnezmI6ZmyQMkDb+dayXcuB6Ao69ir4ZiqSigQ85JI8v6Qt PYrvCGepZB6rPzer48vxtQ02XsHKs3IDbBVAziDpjL6Lr9jP8U155EcUSSG9grTq305d 5alYIVsyqFW1J6uUIfAa8CMehpYC4Tb+DNw1v1w/K3xeM1T6OSgnxhF6fOYChM15bHXX jSEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=+L1hgSLg0o+a4idvxpxtuk3U6prs1P0cjyPHxFh6bQ8=; b=I1cg2JzO0HUcuI/7vEV0gYWEGahI0e2s0nLZ3oJWY/BY/erxrXyL1PcWA1YV+8iYz3 mAHW+J7Lrgsh9Jj/jqbV5VnXRF9DOWA9xPyAJlubL3YP80+8QPP1tsZw2tjqeAuY6Ip4 9h/gYuzzg4LNODUXFed38p0aBll0x1fKwoJgwubv2q/I8bwkM7MnrWxSTYJWWpyWPL8O FC3+ZRAOAjLkKrMXnkscgjtozj+aBLvbu9L79xVeENoD+IQkOGIbi72RAKAT7AKVuid0 cteakhENkA2odM+sAArhijwr0IHjV3FMKnsg93q27Gl8jvqm7pyI7bZ0Lmuw18Kn4eOd adPg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=CCGL2ZyX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 68si5860715pff.213.2018.01.17.22.25.07; Wed, 17 Jan 2018 22:25:30 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=CCGL2ZyX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754498AbeARGYb (ORCPT + 99 others); Thu, 18 Jan 2018 01:24:31 -0500 Received: from mail-qt0-f176.google.com ([209.85.216.176]:46014 "EHLO mail-qt0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752306AbeARGYa (ORCPT ); Thu, 18 Jan 2018 01:24:30 -0500 Received: by mail-qt0-f176.google.com with SMTP id x27so19682778qtm.12 for ; Wed, 17 Jan 2018 22:24:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=+L1hgSLg0o+a4idvxpxtuk3U6prs1P0cjyPHxFh6bQ8=; b=CCGL2ZyXvo8LYPbVUe0Ygn0hXu3CigaGgyYxcDN+k6tAKBU2V1Xmc5EkJscpeAoHjE B6DUxefrJDXQWCxltCgq/atXiLiUKr79vUUSBldQmhUjN0Tj8HplTzKiOziHo2Pw695K IANf+mW5mtNp8FydZwGZAYd0q9I6N0GXuiSAc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=+L1hgSLg0o+a4idvxpxtuk3U6prs1P0cjyPHxFh6bQ8=; b=ZYsrDH54HjBt4Ns4O/++CdRsC+culI0dwiUEoQ07O/AGdhQ2My8m7KOmt2E2wVbzQT SJAfxpzIV0VJDg5pS/i3SA3NF8+dO3I9/UfPuimGEmU8Cpxd/+rnVIIS+TmTmRODNgxa Hs7sfj8sTz8Lj5ZZsoexMLrUESvR5/obdNuTZRrnMTROxH0Gce3Qouvm1Wd1HadRgHJQ hQT9iTXd91jKD3z1o8qdpK1bvZrU4qak36kvdu/ZbpEFijFCHp95l92yfl/KcGVBaKLk cjlJg7eQyQ/9VCz8vFEsHrVk28xhu3pHQku7lblF2xNggcrTI4ar4aqqHBgIfKgH/li7 +oLw== X-Gm-Message-State: AKwxytcoknkQ2Xzys0P7AbWLgAfZ7v/WV8PP9lBiqo035qkwF62exN46 GSH+gkzg7CyjAbWDeq9D14kyHg== X-Received: by 10.55.165.204 with SMTP id o195mr24893294qke.262.1516256669639; Wed, 17 Jan 2018 22:24:29 -0800 (PST) Received: from [10.10.115.224] ([192.19.218.250]) by smtp.gmail.com with ESMTPSA id c188sm3536793qkg.92.2018.01.17.22.24.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 17 Jan 2018 22:24:29 -0800 (PST) Subject: Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state To: "jianchao.wang" , Max Gurtovoy , keith.busch@intel.com, axboe@fb.com, hch@lst.de, sagi@grimberg.me Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org References: <1516164877-2170-1-git-send-email-jianchao.w.wang@oracle.com> <1516164877-2170-2-git-send-email-jianchao.w.wang@oracle.com> <11106a93-2e78-c853-e6eb-35c652dab3a9@mellanox.com> <50ca626e-7933-5a60-33f9-52e2ac8d0f25@broadcom.com> <9a42cfac-642a-6838-625a-eeb51bec5c78@oracle.com> From: James Smart Message-ID: Date: Wed, 17 Jan 2018 22:24:26 -0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: <9a42cfac-642a-6838-625a-eeb51bec5c78@oracle.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Thanks jianchoa. This helped. On 1/17/2018 7:13 PM, jianchao.wang wrote: > Actually, this patchset is to fix a issue in nvme_timeout. > Please consider the following scenario. > > nvme_reset_ctrl > -> set state to RESETTING > -> queue reset_work > (scheduling) > nvme_reset_work -> nvme_dev_disable > -> quiesce queues > -> nvme_cancel_request > on outstanding requests > --------------------------------------------------------------------_boundary_ > -> nvme initializing (may issue request on adminq) > > Before the boundary, not only quiesce the queues, but only cancel all the outstanding requests. > > A request could expire when the ctrl state is RESETTING. > - If the timeout occur before the _boundary_, the expired requests are from the previous work. > - Otherwise, the expired requests are from the controller initializing procedure, such as sending cq/sq > create commands to adminq to setup io queues. > In current implementation, nvme_timeout cannot identify the _boundary_ so only handles second case above. So what you've described very well is the pci adapter and the fact that it doesn't use a RECONNECTING state when it starts to reinit the controller like rdma/fc does.  Note: we had left it that way as a "grandfathering" as the code already existed and we didn't see an issue just leaving the reinit in the resetting handler. > .... > > So in the patch, RESETTING in nvme-fc/rdma is changed to RESET_PREPARE. Then we get: > nvme-fc/rdma RESET_PREPARE -> RECONNECTING -> LIVE > nvme-pci RESET_PREPARE -> RESETTING -> LIVE Right - so another way to look at this is you renamed RESETTING to RESET_PREPARE and added a new RESETTING state in the nvme-pci when in reinits.  Why not simply have the nvme-pci transport transition to RECONNECTING like the other transports. No new states, semantics are all the same. Here's what the responsibility of the states are: RESETTING: -quiescing the blk-mq queues os errors don't bubble back to callees and new io is suspended -terminating the link-side association: resets the controller via register access/SET_PROPERTY, deletes connections/queues, cleans out active io and lets them queue for resending, etc. -end result is nvme block device is present, but idle, and no active controller on the link side  (link being a transport specific link such as pci, or ethernet/ib for rdma or fc link). RECONNECTING: - "establish an association at the transport" on PCI this is immediate - you can either talk to the pci function or you can't. With rdma/fc we send transport traffic to create an admin q connection. - if association succeeded: the controller is init'd via register accesses/fabric GET/SET_PROPERTIES and admin-queue command, io connections/queues created, blk-mq queues unquiesced, and finally transition to NVME_CTRL_LIVE. - if association failed: check controller timeout., if not exceeded, schedule timer and retry later.  With pci, this could cover the temporary loss of the pci function access (a hot plug event) while keeping the nvme blk device live in the system. It matches what you are describing but using what we already have in place - with the only difference being the addition of your "boundary" by adding the RECONNECTING state to nvme-pci. > >>> I don't believe RESETTING and RECONNECTING are that similar - unless - you are looking at that "reinit" part that we left grandfathered into PCI. > Yes. I have got the point. Thanks for your directive. > > Both nvme-pc and nvme-fc/rdma have "shutdown" part to tear down queues/connects, quiesce queues, cancel outstanding requests... > We could call this part as "shutdowning" as well as the scheduling gap. > Because the difference of initializing between the nvme-pci and nvme-fc/rdma, we could call "reiniting" for nvme-pci and > call "reconnecting" for nvme-fc/rdma I don't think we need different states for the two. The actions taken are really very similar. How they do the actions varies slightly, but not what they are trying to do.   Thus, I'd prefer we keep the existing RECONNECTING state and use it in nvme-pci as well. I'm open to renaming it if there's something about the name that is not agreeable. -- james