2022-01-28 02:57:15

by Daniel Wagner

[permalink] [raw]
Subject: [RFC] nvme: Do not reject dynamic controller cntlid

The NVME spec states in 3.1.1 Controller Model there are two types
controllers: dynamic and static.

The nvme_validate_cntlid() check ensures that the newly discovered
controller does not have the same cntlid as a known controller. For
the dynamic controller model this check doesn't make sense as all
cntlid will be 0xffff. In this case ignore this check.

Signed-off-by: Daniel Wagner <[email protected]>
---

I got a bug report with this:

nvme nvme10: pci function 0000:00:19.0
nvme 0000:00:19.0: enabling device (0000 -> 0002)
nvme nvme10: Duplicate cntlid 65535 with nvme7, rejecting
nvme nvme10: Removing after probe failure status: -22

From reading the spec I got the impression that it should be enough to
ignore the cntlid check if the cntlid is 0xffff. Though I might be
completely wrong here.

drivers/nvme/host/core.c | 2 +-
drivers/nvme/host/nvme.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd18861f77c0..918abb54771c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2701,7 +2701,7 @@ static bool nvme_validate_cntlid(struct nvme_subsystem *subsys,
if (nvme_state_terminal(tmp))
continue;

- if (tmp->cntlid == ctrl->cntlid) {
+ if (!nvme_ctrl_dynamic(tmp) && tmp->cntlid == ctrl->cntlid) {
dev_err(ctrl->device,
"Duplicate cntlid %u with %s, subsys %s, rejecting\n",
ctrl->cntlid, dev_name(tmp->device),
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a162f6c6da6e..ed75245263f5 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -539,6 +539,11 @@ static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
}

+static inline bool nvme_ctrl_dynamic(struct nvme_ctrl *ctrl)
+{
+ return ctrl->cntlid == 0xffff;
+}
+
#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
void nvme_fault_inject_init(struct nvme_fault_inject *fault_inj,
const char *dev_name);
--
2.29.2


2022-01-28 11:23:49

by Keith Busch

[permalink] [raw]
Subject: Re: [RFC] nvme: Do not reject dynamic controller cntlid

On Thu, Jan 27, 2022 at 02:36:48PM +0100, Daniel Wagner wrote:
> index dd18861f77c0..918abb54771c 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2701,7 +2701,7 @@ static bool nvme_validate_cntlid(struct nvme_subsystem *subsys,
> if (nvme_state_terminal(tmp))
> continue;
>
> - if (tmp->cntlid == ctrl->cntlid) {
> + if (!nvme_ctrl_dynamic(tmp) && tmp->cntlid == ctrl->cntlid) {
> dev_err(ctrl->device,
> "Duplicate cntlid %u with %s, subsys %s, rejecting\n",
> ctrl->cntlid, dev_name(tmp->device),
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index a162f6c6da6e..ed75245263f5 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -539,6 +539,11 @@ static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
> return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
> }
>
> +static inline bool nvme_ctrl_dynamic(struct nvme_ctrl *ctrl)
> +{
> + return ctrl->cntlid == 0xffff;
> +}

It's probably safe to assume 0xffff is dynamic, but spec suggests we
check ID_CTRL.FCATT bit 0.

2022-01-30 23:41:20

by Daniel Wagner

[permalink] [raw]
Subject: Re: [RFC] nvme: Do not reject dynamic controller cntlid

On Thu, Jan 27, 2022 at 09:17:58AM -0800, Keith Busch wrote:
> > +static inline bool nvme_ctrl_dynamic(struct nvme_ctrl *ctrl)
> > +{
> > + return ctrl->cntlid == 0xffff;
> > +}
>
> It's probably safe to assume 0xffff is dynamic, but spec suggests we
> check ID_CTRL.FCATT bit 0.

Okay, but this one is only defined for fabrics. I haven't found anything
so far which is equivalent to FCATT bit 0 for memory based transport.

2022-01-31 11:00:07

by Daniel Wagner

[permalink] [raw]
Subject: Re: [RFC] nvme: Do not reject dynamic controller cntlid

On Fri, Jan 28, 2022 at 11:31:37AM +0100, Daniel Wagner wrote:
> On Thu, Jan 27, 2022 at 09:17:58AM -0800, Keith Busch wrote:
> > > +static inline bool nvme_ctrl_dynamic(struct nvme_ctrl *ctrl)
> > > +{
> > > + return ctrl->cntlid == 0xffff;
> > > +}
> >
> > It's probably safe to assume 0xffff is dynamic, but spec suggests we
> > check ID_CTRL.FCATT bit 0.
>
> Okay, but this one is only defined for fabrics. I haven't found anything
> so far which is equivalent to FCATT bit 0 for memory based transport.

Never mind. After discussing it with Hannes it turns out there is no
problem here. I didn't understand the spec correctly (it's difficult to
get an exact answer) on this topic.

2022-01-31 11:22:42

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [RFC] nvme: Do not reject dynamic controller cntlid

On 1/28/22 13:36, Daniel Wagner wrote:
> On Fri, Jan 28, 2022 at 11:31:37AM +0100, Daniel Wagner wrote:
>> On Thu, Jan 27, 2022 at 09:17:58AM -0800, Keith Busch wrote:
>>>> +static inline bool nvme_ctrl_dynamic(struct nvme_ctrl *ctrl)
>>>> +{
>>>> + return ctrl->cntlid == 0xffff;
>>>> +}
>>>
>>> It's probably safe to assume 0xffff is dynamic, but spec suggests we
>>> check ID_CTRL.FCATT bit 0.
>>
>> Okay, but this one is only defined for fabrics. I haven't found anything
>> so far which is equivalent to FCATT bit 0 for memory based transport.
>
> Never mind. After discussing it with Hannes it turns out there is no
> problem here. I didn't understand the spec correctly (it's difficult to
> get an exact answer) on this topic.
>
Weasely wording in the spec again.
It talks at great length on controller IDs for fabrics, which values are
allowed and which not, and how one should do dynamic controller ids _on
fabrics_, but is strangely silent for memory-based (ie PCI) transports.
There, apparently, 0xFFFF _is_ a valid controller id, and the only
restriction is that the controller ID must be unique.
And even that is never stated directly, but must be inferred from the
various command payload descriptions.
Guess it's a topic for the fmds call.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer