2022-03-14 11:45:02

by Sungup Moon

Subject: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

When a multi-controller subsystem, managed by a special admin command, has
private namespaces with the same NSID, the current Linux driver raises a
"Duplicate unshared namespace" error. However, the NVMe specification
defines NSID usage as follows:

If Namespace Management, ANA Reporting, or NVM Sets are supported, the
NSIDs shall be unique within the NVM subsystem. If the Namespace
Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique.
(reference: 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec)

So, if a multi-controller subsystem that is not managed by the Namespace
Management function creates private namespaces without ANA and NVM Sets,
duplicated NSIDs should be allowed, because they do not violate the NVMe
specification.

However, the current nvme driver only checks whether the namespace is
shared, so I propose the following patch:
1. nvme_ctrl has a unique_nsid field that indicates whether the
controller must assign unique NSIDs.
2. nvme_init_ns_head creates a new nvme_ns_head instance not only when
head is NULL, but also when the controller's unique_nsid is false (no
flagged attribute) and the namespace is not shared.
3. When creating the bdev device file, nvme_mpath_set_disk_name returns
false when unique_nsid is false and the namespace is not shared.
4. Likewise, nvme_mpath_alloc_disk returns 0 under the same condition.

In this version, the unique_nsid mechanism has been changed from a flag
to a checking function.
- v1: flag based initial patch
- v2: change from unique_nsid flag to nvme_check_unique_nsid

Signed-off-by: Sungup Moon <[email protected]>
---
drivers/nvme/host/core.c | 9 ++++++++-
drivers/nvme/host/multipath.c | 5 +++--
drivers/nvme/host/nvme.h | 16 ++++++++++++++++
include/linux/nvme.h | 1 +
4 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 51c08f206cbf..5eedef9a781c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3816,7 +3816,14 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,

mutex_lock(&ctrl->subsys->lock);
head = nvme_find_ns_head(ctrl->subsys, nsid);
- if (!head) {
+ if (!head || !(nvme_check_unique_nsid(ctrl, head) || is_shared)) {
+ /*
+ * If the found ns head is null or both of ns are not shared
+ * without the unique namespace condition (this means both
+ * namespace are private namespaces and those can share the
+ * same nsid), allocate the new head. Private namespace can
+ * reuse nsid with the others.
+ */
ret = nvme_subsys_check_duplicate_ids(ctrl->subsys, ids);
if (ret) {
dev_err(ctrl->device,
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index ff775235534c..4671dc1b32da 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -88,7 +88,7 @@ void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
*/
bool nvme_mpath_set_disk_name(struct nvme_ns *ns, char *disk_name, int *flags)
{
- if (!multipath)
+ if (!multipath || !nvme_check_unique_nsid(ns->ctrl, ns->head))
return false;
if (!ns->head->disk) {
sprintf(disk_name, "nvme%dn%d", ns->ctrl->subsys->instance,
@@ -507,7 +507,8 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
* We also do this for private namespaces as the namespace sharing data could
* change after a rescan.
*/
- if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) || !multipath)
+ if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) || !multipath ||
+ !nvme_check_unique_nsid(ctrl, head))
return 0;

head->disk = blk_alloc_disk(ctrl->numa_node);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index e7ccdb119ede..50091ed3713b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -719,6 +719,22 @@ static inline bool nvme_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
return queue_live;
return __nvme_check_ready(ctrl, rq, queue_live);
}
+static inline bool nvme_check_unique_nsid(struct nvme_ctrl *ctrl,
+ struct nvme_ns_head *head)
+{
+ /*
+ * NSID should be unique on the following condition
+ * 1. Namespace Management support; or
+ * 2. ANA Reporting support; or
+ * 3. NVM Set support; or
+ * 4. Namespace is shared
+ * Other case, private namespace are not required to be unique.
+ */
+ return (ctrl->oacs & NVME_CTRL_OACS_NS_MNGT_SUPP) ||
+ (ctrl->subsys->cmic & NVME_CTRL_CMIC_ANA) ||
+ (ctrl->ctratt & NVME_CTRL_CTRATT_NVM_SETS) ||
+ (head->shared);
+}
int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
void *buf, unsigned bufflen);
int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 4f44f83817a9..f626a445d1a8 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -346,6 +346,7 @@ enum {
NVME_CTRL_ONCS_TIMESTAMP = 1 << 6,
NVME_CTRL_VWC_PRESENT = 1 << 0,
NVME_CTRL_OACS_SEC_SUPP = 1 << 0,
+ NVME_CTRL_OACS_NS_MNGT_SUPP = 1 << 3,
NVME_CTRL_OACS_DIRECTIVES = 1 << 5,
NVME_CTRL_OACS_DBBUF_SUPP = 1 << 8,
NVME_CTRL_LPA_CMD_EFFECTS_LOG = 1 << 1,
--
2.25.1
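
For readers following the thread, the spec rule quoted in the commit message can be sketched as a standalone user-space predicate. This is only an illustration, not the kernel code: the `sketch_` struct and the `SKETCH_` constants are hypothetical stand-ins. The Namespace Management bit (bit 3 of OACS) is the value added by the patch; the ANA and NVM Sets bit positions are assumptions about the Identify Controller layout and should be checked against the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Identify Controller capability bits. */
#define SKETCH_OACS_NS_MNGT    (1u << 3) /* Namespace Management (from the patch) */
#define SKETCH_CMIC_ANA        (1u << 3) /* ANA Reporting (assumed bit position) */
#define SKETCH_CTRATT_NVM_SETS (1u << 2) /* NVM Sets (assumed bit position) */

/* Hypothetical, reduced view of the controller fields the rule looks at. */
struct sketch_ctrl {
	uint16_t oacs;
	uint8_t cmic;
	uint32_t ctratt;
};

/*
 * Returns true when the NSID has to be unique within the subsystem:
 * either the namespace is shared, or the controller supports Namespace
 * Management, ANA Reporting or NVM Sets. Only a private namespace on a
 * controller with none of these features may reuse an NSID.
 */
bool sketch_nsid_must_be_unique(const struct sketch_ctrl *ctrl, bool ns_shared)
{
	assert(ctrl != NULL);
	return ns_shared ||
	       (ctrl->oacs & SKETCH_OACS_NS_MNGT) ||
	       (ctrl->cmic & SKETCH_CMIC_ANA) ||
	       (ctrl->ctratt & SKETCH_CTRATT_NVM_SETS);
}
```

With all three capability fields cleared, the predicate is false for a private namespace and true for a shared one, which matches cases b) and a) of the quoted spec text.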


2022-03-15 07:30:38

by Christoph Hellwig

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

I looked at this a bit more and found two issues:

- nvme_init_ns_head will now leak the ns_head for the private namespaces
with potentially duplicate IDs case.
- nvme_mpath_set_disk_name still needs to use the subsystem-wide IDA
for the nvme instance name as the subsystem and controller ones
could otherwise clash.

Let me know what you think of this version:

---
From 1b217962cc10fa59eae98fc112adc64bddc462b3 Mon Sep 17 00:00:00 2001
From: Sungup Moon <[email protected]>
Date: Mon, 14 Mar 2022 20:05:45 +0900
Subject: nvme: allow duplicated NSIDs for the private namespaces

An NVMe subsystem with multiple controllers can have private namespaces
that use the same NSID under some conditions:

"If Namespace Management, ANA Reporting, or NVM Sets are supported, the
NSIDs shall be unique within the NVM subsystem. If the Namespace
Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique."

Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.

Make sure this specific setup is supported in Linux.

Signed-off-by: Sungup Moon <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/core.c | 7 ++++++-
drivers/nvme/host/multipath.c | 7 ++++---
drivers/nvme/host/nvme.h | 19 +++++++++++++++++++
include/linux/nvme.h | 1 +
4 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f8084ded69e50..c7127d439b3de 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3657,7 +3657,12 @@ static struct nvme_ns_head *nvme_find_ns_head(struct nvme_subsystem *subsys,
lockdep_assert_held(&subsys->lock);

list_for_each_entry(h, &subsys->nsheads, entry) {
- if (h->ns_id != nsid)
+ /*
+ * Private namespaces can share NSIDs under some conditions.
+ * In that case we can't use the same ns_head for namespaces
+ * with the same NSID.
+ */
+ if (h->ns_id != nsid || !nvme_is_uniqueue_nsid(ctrl, head))
continue;
if (!list_empty(&h->list) && nvme_tryget_ns_head(h))
return h;
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index d13b81cd6225c..6b6df1016cb91 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -504,10 +504,11 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)

/*
* Add a multipath node if the subsystems supports multiple controllers.
- * We also do this for private namespaces as the namespace sharing data could
- * change after a rescan.
+ * We also do this for private namespaces as the namespace sharing flag
+ * could change after a rescan.
*/
- if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) || !multipath)
+ if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) ||
+ !nvme_is_uniqueue_nsid(ctrl, head) || !multipath)
return 0;

head->disk = blk_alloc_disk(ctrl->numa_node);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 587d92df118b7..9add586434929 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -718,6 +718,25 @@ static inline bool nvme_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
return queue_live;
return __nvme_check_ready(ctrl, rq, queue_live);
}
+
+/*
+ * NSID shall be unique for all shared namespaces, or if at least one of the
+ * following conditions is met:
+ * 1. Namespace Management is supported by the controller
+ * 2. ANA is supported by the controller
+ * 3. NVM Set are supported by the controller
+ *
+ * In other case, private namespace are not required to report a unique NSID.
+ */
+static inline bool nvme_is_uniqueue_nsid(struct nvme_ctrl *ctrl,
+ struct nvme_ns_head *head)
+{
+ return head->shared ||
+ (ctrl->oacs & NVME_CTRL_OACS_NS_MNGT_SUPP) ||
+ (ctrl->subsys->cmic & NVME_CTRL_CMIC_ANA) ||
+ (ctrl->ctratt & NVME_CTRL_CTRATT_NVM_SETS);
+}
+
int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
void *buf, unsigned bufflen);
int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 9dbc3ef4daf7c..2dcee34d467d6 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -345,6 +345,7 @@ enum {
NVME_CTRL_ONCS_TIMESTAMP = 1 << 6,
NVME_CTRL_VWC_PRESENT = 1 << 0,
NVME_CTRL_OACS_SEC_SUPP = 1 << 0,
+ NVME_CTRL_OACS_NS_MNGT_SUPP = 1 << 3,
NVME_CTRL_OACS_DIRECTIVES = 1 << 5,
NVME_CTRL_OACS_DBBUF_SUPP = 1 << 8,
NVME_CTRL_LPA_CMD_EFFECTS_LOG = 1 << 1,
--
2.30.2

2022-03-15 11:27:07

by Christoph Hellwig

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

On Tue, Mar 15, 2022 at 08:12:30AM +0100, [email protected] wrote:
> I looked at this a bit more and found two issues:
>
> - nvme_init_ns_head will now leak the ns_head for the private namespaces
> with potentially duplicate IDs case.
> - nvme_mpath_set_disk_name still needs to use the subsystem-wide IDA
> for the nvme instance name as the subsystem and controller ones
> could otherwise clash.
>
> Let me know what you think of this version:

Except that this had the parts to actually make it compile uncommitted,
so here is the proper one:

---
From c6deed0b18d66460b090d22ee18f37d631d0fd12 Mon Sep 17 00:00:00 2001
From: Sungup Moon <[email protected]>
Date: Mon, 14 Mar 2022 20:05:45 +0900
Subject: nvme: allow duplicated NSIDs for the private namespaces

An NVMe subsystem with multiple controllers can have private namespaces
that use the same NSID under some conditions:

"If Namespace Management, ANA Reporting, or NVM Sets are supported, the
NSIDs shall be unique within the NVM subsystem. If the Namespace
Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique."

Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.

Make sure this specific setup is supported in Linux.

Signed-off-by: Sungup Moon <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/core.c | 12 +++++++++---
drivers/nvme/host/multipath.c | 7 ++++---
drivers/nvme/host/nvme.h | 19 +++++++++++++++++++
include/linux/nvme.h | 1 +
4 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f8084ded69e50..31f7a479fa08d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3649,15 +3649,21 @@ static const struct attribute_group *nvme_dev_attr_groups[] = {
NULL,
};

-static struct nvme_ns_head *nvme_find_ns_head(struct nvme_subsystem *subsys,
+static struct nvme_ns_head *nvme_find_ns_head(struct nvme_ctrl *ctrl,
unsigned nsid)
{
+ struct nvme_subsystem *subsys = ctrl->subsys;
struct nvme_ns_head *h;

lockdep_assert_held(&subsys->lock);

list_for_each_entry(h, &subsys->nsheads, entry) {
- if (h->ns_id != nsid)
+ /*
+ * Private namespaces can share NSIDs under some conditions.
+ * In that case we can't use the same ns_head for namespaces
+ * with the same NSID.
+ */
+ if (h->ns_id != nsid || !nvme_is_uniqueue_nsid(ctrl, h))
continue;
if (!list_empty(&h->list) && nvme_tryget_ns_head(h))
return h;
@@ -3851,7 +3857,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
}

mutex_lock(&ctrl->subsys->lock);
- head = nvme_find_ns_head(ctrl->subsys, nsid);
+ head = nvme_find_ns_head(ctrl, nsid);
if (!head) {
ret = nvme_subsys_check_duplicate_ids(ctrl->subsys, ids);
if (ret) {
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index d13b81cd6225c..6b6df1016cb91 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -504,10 +504,11 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)

/*
* Add a multipath node if the subsystems supports multiple controllers.
- * We also do this for private namespaces as the namespace sharing data could
- * change after a rescan.
+ * We also do this for private namespaces as the namespace sharing flag
+ * could change after a rescan.
*/
- if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) || !multipath)
+ if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) ||
+ !nvme_is_uniqueue_nsid(ctrl, head) || !multipath)
return 0;

head->disk = blk_alloc_disk(ctrl->numa_node);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 587d92df118b7..9add586434929 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -718,6 +718,25 @@ static inline bool nvme_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
return queue_live;
return __nvme_check_ready(ctrl, rq, queue_live);
}
+
+/*
+ * NSID shall be unique for all shared namespaces, or if at least one of the
+ * following conditions is met:
+ * 1. Namespace Management is supported by the controller
+ * 2. ANA is supported by the controller
+ * 3. NVM Set are supported by the controller
+ *
+ * In other case, private namespace are not required to report a unique NSID.
+ */
+static inline bool nvme_is_uniqueue_nsid(struct nvme_ctrl *ctrl,
+ struct nvme_ns_head *head)
+{
+ return head->shared ||
+ (ctrl->oacs & NVME_CTRL_OACS_NS_MNGT_SUPP) ||
+ (ctrl->subsys->cmic & NVME_CTRL_CMIC_ANA) ||
+ (ctrl->ctratt & NVME_CTRL_CTRATT_NVM_SETS);
+}
+
int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
void *buf, unsigned bufflen);
int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 9dbc3ef4daf7c..2dcee34d467d6 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -345,6 +345,7 @@ enum {
NVME_CTRL_ONCS_TIMESTAMP = 1 << 6,
NVME_CTRL_VWC_PRESENT = 1 << 0,
NVME_CTRL_OACS_SEC_SUPP = 1 << 0,
+ NVME_CTRL_OACS_NS_MNGT_SUPP = 1 << 3,
NVME_CTRL_OACS_DIRECTIVES = 1 << 5,
NVME_CTRL_OACS_DBBUF_SUPP = 1 << 8,
NVME_CTRL_LPA_CMD_EFFECTS_LOG = 1 << 1,
--
2.30.2
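
The effect of the reworked lookup can be pictured with a small user-space model. This is a sketch under simplifying assumptions, not the driver code: the `sketch_head` type and `sketch_find_head` are hypothetical, the subsystem list is replaced by a plain array, the three capability checks are folded into one `caps_require_unique` flag, and reference counting and the empty-list check are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced namespace head: just the NSID and the shared flag. */
struct sketch_head {
	unsigned nsid;
	bool shared;
};

/*
 * Look up an existing head for nsid. caps_require_unique is true when the
 * controller supports Namespace Management, ANA Reporting or NVM Sets.
 * A private head on a controller without those features may legally share
 * its NSID with another private namespace, so it is never reused.
 */
struct sketch_head *sketch_find_head(struct sketch_head *heads, size_t n,
				     unsigned nsid, bool caps_require_unique)
{
	assert(heads != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct sketch_head *h = &heads[i];

		if (h->nsid != nsid)
			continue;
		/* NSID may be duplicated here: treat the head as a non-match. */
		if (!h->shared && !caps_require_unique)
			continue;
		return h;
	}
	return NULL;
}
```

When the lookup returns NULL for a matching but non-unique NSID, the caller falls into its existing "no head found" path and allocates a fresh head, so two private namespaces with the same NSID end up with separate heads.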

2022-03-15 14:32:58

by Christoph Hellwig

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

On Tue, Mar 15, 2022 at 10:42:56AM +0200, Sagi Grimberg wrote:
>>>> + * We also do this for private namespaces as the namespace sharing flag
>>>> + * could change after a rescan.
>>>
>>> What happens in this case? we now have non-unique shared namespaces?
>>
>> The non-uniqueue NSIDs can only happen for private namespaces.
>
> But what happens if this changes upon a rescan as you commented?

Well, it can't change to shared as the nsids are non-unique. If we
want to be paranoid we could add a sanity check for that, but then
again there are a bunch of other things where we could be more paranoid.

2022-03-15 15:27:20

by Sagi Grimberg

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns


> Well, it can't change to shared as the nsids are non-unique. If we
> want to be paranoid we could add a sanity check for that, but then
> again there are a bunch of other things where we could be more paranoid.

*** paranoid person looking over his shoulder meme ***

2022-03-16 02:11:51

by Sungup Moon

Subject: RE:(2) [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

Let me respond to your comments.

1. ns_head leak issue

I don't think the ns_head is leaked. Even though the IDs and the NSID are the same
across these namespaces, each one is an independent namespace and should have its own
independent data structure.
A private namespace with a duplicated NSID is different from a shared namespace, even
though it reports the same information.

2. nvme_mpath_set_disk_name issue

Yes, I agree that the subsystem-wide IDA is very important. However, when I first
implemented this without modifying nvme_mpath_set_disk_name, it was hard to tell
which namespaces were connected to the private controller.

As you know, the NVMe controllers start initializing one at a time. The controller
structures are therefore initialized sequentially, but the namespace structures are
initialized independently because of multiprocessing on the CPUs. As a result, a
namespace can get a different instance number on every boot, which makes it hard to
track and control a private namespace after a controller or device failure.

In any case, a private namespace is in the same situation as the non-multipath case
(because a private namespace cannot be shared between controllers), so I think private
namespaces should follow the non-multipath naming rule.

 
 
 

2022-03-16 21:11:44

by Sungup Moon

Subject: RE:(2) [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

First of all, the private namespaces have to be created by a vendor-specific command,
because Namespace Management, ANA and NVM Sets must be disabled at the controller
level. A normal NVMe device that supports Namespace Management will therefore return
true through the namespace management field (ctrl->oacs & NVME_CTRL_OACS_NS_MNGT_SUPP).

If the user creates namespaces with the Namespace Management admin command, my patch
checks the namespace management field, and that subsystem is handled through the same
path as a multipath NVMe device.

So, if the user wants to create a shared namespace on that NVMe subsystem, they must
destroy all namespaces with the target NSID and then create the new shared namespace
using the vendor-specific admin command.

 
 
 

2022-03-17 03:57:35

by Christoph Hellwig

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

On Tue, Mar 15, 2022 at 10:18:30AM +0200, Sagi Grimberg wrote:
>> +static struct nvme_ns_head *nvme_find_ns_head(struct nvme_ctrl *ctrl,
>> unsigned nsid)
>> {
>> + struct nvme_subsystem *subsys = ctrl->subsys;
>> struct nvme_ns_head *h;
>> lockdep_assert_held(&subsys->lock);
>
> IMO it is a bit strange that we now don't pass in the subsystem but
> require that the subsys->lock is taken...

We do things like that in various places, mostly because information
needed that is subsystem-wide hangs off the nvme_ctrl structure, in
this case the various feature bitmaps. We could move them to the
subsystem structure, which would be the right thing to do but a fair
amount of churn for little savings.

>> +++ b/drivers/nvme/host/multipath.c
>> @@ -504,10 +504,11 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
>> /*
>> * Add a multipath node if the subsystems supports multiple controllers.
>> - * We also do this for private namespaces as the namespace sharing data could
>> - * change after a rescan.
>> + * We also do this for private namespaces as the namespace sharing flag
>> + * could change after a rescan.
>
> What happens in this case? we now have non-unique shared namespaces?

The non-unique NSIDs can only happen for private namespaces.

2022-03-17 05:21:09

by Christoph Hellwig

Subject: Re: (2) [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

On Tue, Mar 15, 2022 at 06:56:12PM +0900, Sungup Moon wrote:
> I'll answer your opinion.
>
> 1. ns_head leak issue
>
> I don't think that is leaked ns_head. Because although all ids and nsid are same
> through all namespaces, each namespaces are indenpendent namespace and each of that
> should have independent data structure.
> Duplicated nsid private namespace is different from the shared namespace even though
> same information.

In your patch, if nvme_find_ns_head returns a head for a private namespace
with a non-unique ID, that returned head already has an additional
reference, which we'll never drop as the head variable is overridden just
below.
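
The leak described here can be illustrated with a toy reference count in user space. This is a sketch, not the driver code: `sketch_head`, `sketch_get`, `sketch_put` and the two `sketch_init_*` functions are hypothetical names, and a plain integer stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical head with a plain integer reference count. */
struct sketch_head {
	int refs;
};

/* Models a lookup that takes a reference on the head it returns. */
struct sketch_head *sketch_get(struct sketch_head *h)
{
	assert(h != NULL);
	h->refs++;
	return h;
}

/* Drops one reference. */
void sketch_put(struct sketch_head *h)
{
	assert(h != NULL && h->refs > 0);
	h->refs--;
}

/* Leaky pattern: the looked-up head is overwritten without a put. */
struct sketch_head *sketch_init_leaky(struct sketch_head *existing,
				      struct sketch_head *fresh)
{
	struct sketch_head *head = sketch_get(existing);

	head = fresh;		/* the reference on 'existing' is lost here */
	head->refs = 1;
	return head;
}

/* Balanced pattern: drop the lookup reference before replacing the head. */
struct sketch_head *sketch_init_balanced(struct sketch_head *existing,
					 struct sketch_head *fresh)
{
	struct sketch_head *head = sketch_get(existing);

	sketch_put(head);	/* give back the reference the lookup took */
	head = fresh;
	head->refs = 1;
	return head;
}
```

In the leaky variant the existing head keeps one reference that nobody owns any more, so it can never be freed. The version posted earlier in the thread avoids the situation in a different way: the lookup skips such heads before a reference is taken.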

> 2. nvme_mpath_set_disk_name issue
>
> Yes, I also agree that subsystem-wide IDA is very important data. However, I
> implemented without nvme_mpath_set_disk_name modification at the first time, it is
> hard to decide which namespace are connected to private controller.

It is not just very important, it can't work without that. The two
separate IDAs can have conflicts, so you can end up with two namespaces
with the same name without that change.

> As you know, each nvme controller start initiating at a time. So, each controller
> structures are sequentially initiated, but each namespaces structures are initiated
> independently because of multi-processing on cpu. So, all namespace can have different
> instance number every boot-up time, and it makes hard to track and control the private
> namespace on the controller or device failure.

Yes, but that is true of all Linux device enumeration. That is why
everyone should use stable identifiers like the UUID or GUID to identify
the namespaces.

> Anyway, the private namespace is same condition with no-multipath situation (because
> private namespace cannot shared between controllers) so I think that the private
> namespace should follow the naming rule with no-multipath situation.

We can't use the non-multipath capable naming as it will cause conflicts
in the naming. If anything in the system supports multipathing we have
to use subsystem based instances for the naming.
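
The naming conflict can be shown with two toy ID allocators. This is a user-space sketch under stated assumptions: `sketch_ida` is a hypothetical counter that only hands out increasing numbers (the kernel IDA does more), and only the "nvme%dn%d" name format is taken from the patch context above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical allocator: hands out 0, 1, 2, ... and never frees. */
struct sketch_ida {
	int next;
};

int sketch_ida_alloc(struct sketch_ida *ida)
{
	assert(ida != NULL);
	return ida->next++;
}

/* Builds a disk name from an instance number and a namespace instance. */
void sketch_disk_name(char *buf, size_t len, int instance, int ns_instance)
{
	snprintf(buf, len, "nvme%dn%d", instance, ns_instance);
}

/* True when two generated names are identical, i.e. they would clash. */
bool sketch_names_clash(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

If one name takes its instance number from a subsystem allocator and another from a separate controller allocator, both allocators can return 0 and both disks come out with the same name. Drawing every instance number from one shared allocator makes that impossible, which is the reason given for keeping the subsystem-based instances.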

2022-03-17 05:51:18

by Sungup Moon

Subject: RE:(2) (2) [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns

>
> On Tue, Mar 15, 2022 at 06:56:12PM +0900, Sungup Moon wrote:
> > I'll answer your opinion.
> >
> > 1. ns_head leak issue
> >
> > I don't think that is leaked ns_head. Because although all ids and nsid are same
> > through all namespaces, each namespaces are indenpendent namespace and each of that
> > should have independent data structure.
> > Duplicated nsid private namespace is different from the shared namespace even though
> > same information.
>
> In your patch if nvme_find_ns_head returns a head for a private namespace
> with a non-uniqueue ID, that returned head already has an additional
> reference, which we'll never drop as the head variable is overriden just
> below.

OK, I will review my patch with my team and reply to your comment as soon as possible.

> > 2. nvme_mpath_set_disk_name issue
> >
> > Yes, I also agree that subsystem-wide IDA is very important data. However, I
> > implemented without nvme_mpath_set_disk_name modification at the first time, it is
> > hard to decide which namespace are connected to private controller.
>
> It is not just very important, it can't work without that. The two
> separate IDAs can will have conflicts, so you can up with two namespaces
> with the same name without that change.
>
> > As you know, each nvme controller start initiating at a time. So, each controller
> > structures are sequentially initiated, but each namespaces structures are initiated
> > independently because of multi-processing on cpu. So, all namespace can have different
> > instance number every boot-up time, and it makes hard to track and control the private
> > namespace on the controller or device failure.
>
> Yes, but that is true of all Linux device enumeration. That is why
> everyone should use table identifiers like the UUID or GUID to identify
> the namespaces.
>
> > Anyway, the private namespace is same condition with no-multipath situation (because
> > private namespace cannot shared between controllers) so I think that the private
> > namespace should follow the naming rule with no-multipath situation.
>
> We can't use the non-multipath cabale naming as it will cause conflicts
> in the naming. If anything in the system supports multipathing we have
> to use subsystem based instances for the naming.
>

OK, I agree with you. My change to nvme_mpath_set_disk_name would cause naming confusion
within the subsystem, and the controller <-> namespace relationship can also be found
through the sysfs directory structure. I'll remove that part of the patch and resubmit,
together with a fix for the first issue (the ns_head leak), after reviewing it.

2022-03-17 06:12:44

by Sagi Grimberg

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns



On 3/15/22 09:19, [email protected] wrote:
> On Tue, Mar 15, 2022 at 08:12:30AM +0100, [email protected] wrote:
>> I looked at this a bit more and found two issues:
>>
>> - nvme_init_ns_head will now leak the ns_head for the private namespaces
>> with potentially duplicate IDs case.
>> - nvme_mpath_set_disk_name still needs to use the subsystem-wide IDA
>> for the nvme instance name as the subsystem and controller ones
>> could otherwise clash.
>>
>> Let me know what you think of this version:
>
> Except that this had the parts to actually make it compile uncommitted,
> so here is the proper one:
>
> ---
> From c6deed0b18d66460b090d22ee18f37d631d0fd12 Mon Sep 17 00:00:00 2001
> From: Sungup Moon <[email protected]>
> Date: Mon, 14 Mar 2022 20:05:45 +0900
> Subject: nvme: allow duplicated NSIDs for the private namespaces
>
> An NVMe subsystem with multiple controllers can have private namespaces
> that use the same NSID under some conditions:
>
> "If Namespace Management, ANA Reporting, or NVM Sets are supported, the
> NSIDs shall be unique within the NVM subsystem. If the Namespace
> Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
> a) for shared namespace shall be unique; and
> b) for private namespace are not required to be unique."
>
> Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.
>
> Make sure this specific setup is supported in Linux.
>
> Signed-off-by: Sungup Moon <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/nvme/host/core.c | 12 +++++++++---
> drivers/nvme/host/multipath.c | 7 ++++---
> drivers/nvme/host/nvme.h | 19 +++++++++++++++++++
> include/linux/nvme.h | 1 +
> 4 files changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f8084ded69e50..31f7a479fa08d 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3649,15 +3649,21 @@ static const struct attribute_group *nvme_dev_attr_groups[] = {
> NULL,
> };
>
> -static struct nvme_ns_head *nvme_find_ns_head(struct nvme_subsystem *subsys,
> +static struct nvme_ns_head *nvme_find_ns_head(struct nvme_ctrl *ctrl,
> unsigned nsid)
> {
> + struct nvme_subsystem *subsys = ctrl->subsys;
> struct nvme_ns_head *h;
>
> lockdep_assert_held(&subsys->lock);

IMO it is a bit strange that we now don't pass in the subsystem but
require that the subsys->lock is taken...

>
> list_for_each_entry(h, &subsys->nsheads, entry) {
> - if (h->ns_id != nsid)
> + /*
> + * Private namespaces can share NSIDs under some conditions.
> + * In that case we can't use the same ns_head for namespaces
> + * with the same NSID.
> + */
> + if (h->ns_id != nsid || !nvme_is_uniqueue_nsid(ctrl, h))
> continue;
> if (!list_empty(&h->list) && nvme_tryget_ns_head(h))
> return h;
> @@ -3851,7 +3857,7 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
> }
>
> mutex_lock(&ctrl->subsys->lock);
> - head = nvme_find_ns_head(ctrl->subsys, nsid);
> + head = nvme_find_ns_head(ctrl, nsid);
> if (!head) {
> ret = nvme_subsys_check_duplicate_ids(ctrl->subsys, ids);
> if (ret) {
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index d13b81cd6225c..6b6df1016cb91 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -504,10 +504,11 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
>
> /*
> * Add a multipath node if the subsystems supports multiple controllers.
> - * We also do this for private namespaces as the namespace sharing data could
> - * change after a rescan.
> + * We also do this for private namespaces as the namespace sharing flag
> + * could change after a rescan.

What happens in this case? we now have non-unique shared namespaces?

> */
> - if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) || !multipath)
> + if (!(ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) ||
> + !nvme_is_uniqueue_nsid(ctrl, head) || !multipath)
> return 0;
>
> head->disk = blk_alloc_disk(ctrl->numa_node);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 587d92df118b7..9add586434929 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -718,6 +718,25 @@ static inline bool nvme_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
> return queue_live;
> return __nvme_check_ready(ctrl, rq, queue_live);
> }
> +
> +/*
> + * NSID shall be unique for all shared namespaces, or if at least one of the
> + * following conditions is met:
> + * 1. Namespace Management is supported by the controller
> + * 2. ANA is supported by the controller
> + * 3. NVM Set are supported by the controller
> + *
> + * In other case, private namespace are not required to report a unique NSID.
> + */
> +static inline bool nvme_is_uniqueue_nsid(struct nvme_ctrl *ctrl,
> + struct nvme_ns_head *head)
> +{
> + return head->shared ||
> + (ctrl->oacs & NVME_CTRL_OACS_NS_MNGT_SUPP) ||
> + (ctrl->subsys->cmic & NVME_CTRL_CMIC_ANA) ||
> + (ctrl->ctratt & NVME_CTRL_CTRATT_NVM_SETS);
> +}
> +
> int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
> void *buf, unsigned bufflen);
> int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
> diff --git a/include/linux/nvme.h b/include/linux/nvme.h
> index 9dbc3ef4daf7c..2dcee34d467d6 100644
> --- a/include/linux/nvme.h
> +++ b/include/linux/nvme.h
> @@ -345,6 +345,7 @@ enum {
> NVME_CTRL_ONCS_TIMESTAMP = 1 << 6,
> NVME_CTRL_VWC_PRESENT = 1 << 0,
> NVME_CTRL_OACS_SEC_SUPP = 1 << 0,
> + NVME_CTRL_OACS_NS_MNGT_SUPP = 1 << 3,
> NVME_CTRL_OACS_DIRECTIVES = 1 << 5,
> NVME_CTRL_OACS_DBBUF_SUPP = 1 << 8,
> NVME_CTRL_LPA_CMD_EFFECTS_LOG = 1 << 1,
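
For readers following along, here is a minimal user-space sketch of the rule the helper in the patch encodes, and of how the ns_head lookup uses it. It is not the kernel code: the `sketch_*` structs and functions are hypothetical stand-ins for `nvme_ctrl`, `nvme_subsystem` and `nvme_ns_head`, locking and reference counting are left out, and the bit positions used for ANA and NVM Sets are assumptions (only the Namespace Management bit, `1 << 3`, appears in the patch above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Namespace Management bit in OACS, as added by the patch. */
#define SKETCH_OACS_NS_MNGT_SUPP (1u << 3)
/* Assumed bit positions for ANA (CMIC) and NVM Sets (CTRATT). */
#define SKETCH_CMIC_ANA          (1u << 3)
#define SKETCH_CTRATT_NVM_SETS   (1u << 2)

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct sketch_subsys { uint8_t cmic; };
struct sketch_ctrl {
	uint16_t oacs;
	uint32_t ctratt;
	struct sketch_subsys *subsys;
};
struct sketch_head {
	unsigned ns_id;
	bool shared;
};

/*
 * The NSID must be unique if the namespace is shared, or if the
 * controller supports Namespace Management, ANA, or NVM Sets.
 * Otherwise a private namespace may reuse an NSID.
 */
static bool sketch_is_unique_nsid(const struct sketch_ctrl *ctrl,
				  const struct sketch_head *head)
{
	assert(ctrl && ctrl->subsys && head);
	return head->shared ||
	       (ctrl->oacs & SKETCH_OACS_NS_MNGT_SUPP) ||
	       (ctrl->subsys->cmic & SKETCH_CMIC_ANA) ||
	       (ctrl->ctratt & SKETCH_CTRATT_NVM_SETS);
}

/*
 * Look up an existing head by NSID. Heads whose NSID is not required
 * to be unique are skipped, so the caller allocates a fresh head for
 * each private namespace instead of reusing one with the same NSID.
 */
static struct sketch_head *sketch_find_head(const struct sketch_ctrl *ctrl,
					    struct sketch_head *heads,
					    size_t nr_heads, unsigned nsid)
{
	for (size_t i = 0; i < nr_heads; i++) {
		if (heads[i].ns_id != nsid ||
		    !sketch_is_unique_nsid(ctrl, &heads[i]))
			continue;
		return &heads[i];
	}
	return NULL;
}
```

With none of the three features set, a lookup for a private namespace's NSID returns NULL, while a shared namespace with a matching NSID is still found. Setting any one of the feature bits makes the private head reusable again, which matches the pre-patch behaviour.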

2022-03-17 06:48:11

by Sagi Grimberg

Subject: Re: [PATCH v2] driver/nvme/host: Support duplicated nsid for the private ns


>>> +static struct nvme_ns_head *nvme_find_ns_head(struct nvme_ctrl *ctrl,
>>> unsigned nsid)
>>> {
>>> + struct nvme_subsystem *subsys = ctrl->subsys;
>>> struct nvme_ns_head *h;
>>> lockdep_assert_held(&subsys->lock);
>>
>> IMO it is a bit strange that we now don't pass in the subsystem but
>> require that the subsys->lock is taken...
>
> We do things like that in various places, mostly because information
> needed that is subsystem-wide hangs off the nvme_ctrl structure, in
> this case the various feature bitmaps. We could move them to the
> subsystem structure, which would be the right thing to do but a fair
> amount of churn for little savings.

Yeah, I understand it's not universally enforced; it just popped up.

>>> +++ b/drivers/nvme/host/multipath.c
>>> @@ -504,10 +504,11 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
>>> /*
>>> * Add a multipath node if the subsystems supports multiple controllers.
>>> - * We also do this for private namespaces as the namespace sharing data could
>>> - * change after a rescan.
>>> + * We also do this for private namespaces as the namespace sharing flag
>>> + * could change after a rescan.
>>
>> What happens in this case? we now have non-unique shared namespaces?
>
> The non-unique NSIDs can only happen for private namespaces.

But what happens if this changes upon a rescan as you commented?