2019-08-02 02:29:42

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 00/14] nvmet: add target passthru commands support

Hi,

This is v7 of the passthru patchset. This largely addresses the
feedback from v6: cleaning up the nvme_ctrl_get_by_path() and
allowing multipath over fabrics to the passthru target.

Multipath is now supported simply by allowing multiple connections
from the same hostnqn and overriding the appropriate cmic bit. Passing
through to multiport devices as a target is still not supported.

--

Chaitainya has asked us to take on these patches as we have an
interest in getting them into upstream. To that end, we've done
a large amount of testing, bug fixes and cleanup.

Passthru support for nvmet allows users to export an entire
NVMe controller through NVMe-oF. When exported in this way (as opposed
to exporting each namespace as a block device), all the NVMe commands
are passed to the given controller unmodified, including most admin
commands and Vendor Unique Commands (VUCs). A passthru target will
expose all namespaces for a given device to the remote host.

There are three major non-bugfix changes that we've done to the series:

1) Instead of using a seperate special passthru subsystem in
configfs simply add a passthru directory that's analogous to
the existing namespace directories. The directories have
very similar attributes to namespaces but are mutually exclusive.
If a user enables a namespaces, they can't then enable
passthru controller and vice versa. This simplifies the code
required to implement passthru configfs and IMO creates a much
clearer and uniform interface.

2) Instead of taking a bare controller name (ie. "nvme1"), take a
full device path to the controller's char device. This is more
consistent with the regular namespaces which take a path and
also allows users to make use of udev rules and symlinks to
manage their controllers instead of the potentially unstable
device names.

3) Implement block accounting for the passthru devices. This is so
the target OS can still track device usage using /proc/diskstats.

Besides these three changes, we've also found a large number of bugs
and crashes and did a bunch of testing with KASAN, lockdep and kmemleak.
A more complete list of changes is given below.

Additionally, we've written some new blktests to test the passthru
code. A branch is available here[1] and can be submitted once these
patches are upstream.

These patches are based off of v5.3-rc1 and a git branch is available
at [2].

Thanks,

Logan

[1] https://github.com/Eideticom/blktests nvmet_passthru
[2] https://github.com/sbates130272/linux-p2pmem/ nvmet_passthru_v6

--

v7 Changes:
1. Rebased onto v5.3-rc2
2. Rework nvme_ctrl_get_by_path() to use filp_open() instead of
the cdev changes that were in v6. (Per Al)
3. Override the cmic bit to allow multipath and allow
multiple connections from the same hostnqn. (At the same
time I cleaned up the method of rejecting multiple connections.)
See Patch 8)
4. Found a bug when used with the tcp transport (See Patch 10)

v6 Changes:
1. Rebased onto v5.3-rc1
2. Rework configfs interface to simply be a passthru directory
within the existing subsystem. The directory is similar to
and consistent with a namespace directory.
3. Have the configfs take a path instead of a bare controller name
4. Renamed the main passthru file to io-cmd-passthru.c for consistency
with the file and block-dev methods.
5. Cleaned up all the CONFIG_NVME_TARGET_PASSTHRU usage to remove
all the inline #ifdefs
6. Restructured nvmet_passthru_make_request() a bit for clearer code
7. Moved nvme_find_get_ns() call into nvmet_passthru_execute_cmd()
seeing calling it in nvmet_req_init() causes a lockdep warning
due to nvme_find_get_ns() being able to sleep.
8. Added a check in nvmet_passthru_execute_cmd() to ensure we don't
violate queue_max_segments or queue_max_hw_sectors and overrode
mdts to ensure hosts don't intentionally submit commands
that will exceed these limits.
9. Reworked the code which ensures there's only one subsystem per
passthru controller to use an xarray instead of a list as this is
simpler and more easily fixed some bugs triggered by disabling
subsystems that weren't enabled.
10. Removed the overide of the target cntlid with the passthru cntlid;
this seemed like a really bad idea especially in the presence of
mixed systems as you could end up with two ctrlrs with the same
cntlid. For now, commands that depend on cntlid are black listed.
11. Implement block accounting for passthru so the target can track
usage using /proc/diskstats
12. A number of other minor bug fixes and cleanups

v5 Changes (not sent to list, from Chaitanya):
1. Added workqueue for admin commands.
2. Added kconfig option for the pass-thru feature.
3. Restructure the parsing code according to your suggestion,
call nvmet_xxx_parse_cmd() from nvmet_passthru_parse_cmd().
4. Use pass-thru instead of pt.
5. Several cleanups and add comments at the appropriate locations.
6. Minimize the code for checking pass-thru ns across all the subsystems.
7. Removed the delays in the ns related admin commands since I was
not able to reproduce the previous bug.

v4 Changes:
1. Add request polling interface to the block layer.
2. Use request polling interface in the NVMEoF target passthru code
path.
3. Add checks suggested by Sagi for creating one target ctrl per
passthru ctrl.
4. Don't enable the namespace if it belongs to the configured passthru
ctrl.
5. Adjust the code latest kernel.

v3 Changes:
1. Split the addition of passthru command handlers and integration
into two different patches since we add guards to create one target
controller per passthru controller. This way it will be easier to
review the code.
2. Adjust the code for 4.18.

v2 Changes:
1. Update the new nvme core controller find API naming and
changed the string comparison of the ctrl.
2. Get rid of the newly added #defines for target ctrl values.
3. Use the newly added structure members in the same patch where
they are used. Aggregate the passthru command handling support
and integration with nvmet-core into one patch.
4. Introduce global NVMe Target subsystem list for connected and
not connected subsystems on the target side.
5. Add check when configuring the target ns and target
passthru ctrl to allow only one target controller to be created
for one passthru subsystem.
6. Use the passthru ctrl cntlid when creating the
target controller.

--

Chaitanya Kulkarni (5):
nvme-core: export existing ctrl and ns interfaces
nvmet: add return value to nvmet_add_async_event()
nvmet-passthru: update KConfig with config passthru option
nvmet-passthru: add passthru code to process commands
nvmet-core: don't check the data len for pt-ctrl

Logan Gunthorpe (9):
nvme-core: introduce nvme_ctrl_get_by_path()
nvmet: make nvmet_copy_ns_identifier() non-static
nvmet-passthru: add enable/disable helpers
nvmet-core: allow one host per passthru-ctrl
nvmet-tcp: don't check data_len in nvmet_tcp_map_data()
nvmet-configfs: introduce passthru configfs interface
block: don't check blk_rq_is_passthrough() in blk_do_io_stat()
block: call blk_account_io_start() in blk_execute_rq_nowait()
nvmet-passthru: support block accounting

block/blk-exec.c | 2 +
block/blk-mq.c | 2 +-
block/blk.h | 5 +-
drivers/nvme/host/core.c | 41 +-
drivers/nvme/host/nvme.h | 9 +
drivers/nvme/target/Kconfig | 10 +
drivers/nvme/target/Makefile | 1 +
drivers/nvme/target/admin-cmd.c | 4 +-
drivers/nvme/target/configfs.c | 99 ++++
drivers/nvme/target/core.c | 39 +-
drivers/nvme/target/io-cmd-passthru.c | 683 ++++++++++++++++++++++++++
drivers/nvme/target/nvmet.h | 68 ++-
drivers/nvme/target/tcp.c | 2 +-
13 files changed, 947 insertions(+), 18 deletions(-)
create mode 100644 drivers/nvme/target/io-cmd-passthru.c

--
2.20.1


2019-08-02 02:29:47

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 05/14] nvmet-passthru: update KConfig with config passthru option

From: Chaitanya Kulkarni <[email protected]>

This patch updates KConfig file for the NVMeOF target where we add new
option so that user can selectively enable/disable passthru code.

Signed-off-by: Chaitanya Kulkarni <[email protected]>
[[email protected]: fixed some of the wording in the help message]
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/nvme/target/Kconfig | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
index d7f48c0fb311..2478cb5a932d 100644
--- a/drivers/nvme/target/Kconfig
+++ b/drivers/nvme/target/Kconfig
@@ -15,6 +15,16 @@ config NVME_TARGET
To configure the NVMe target you probably want to use the nvmetcli
tool from http://git.infradead.org/users/hch/nvmetcli.git.

+config NVME_TARGET_PASSTHRU
+ bool "NVMe Target Passthrough support"
+ depends on NVME_CORE
+ depends on NVME_TARGET
+ help
+ This enables target side NVMe passthru controller support for the
+ NVMe Over Fabrics protocol. It allows for hosts to manage and
+ directly access an actual NVMe controller residing on the target
+ side, incuding executing Vendor Unique Commands.
+
config NVME_TARGET_LOOP
tristate "NVMe loopback device support"
depends on NVME_TARGET
--
2.20.1

2019-08-02 02:29:48

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 09/14] nvmet-core: don't check the data len for pt-ctrl

From: Chaitanya Kulkarni <[email protected]>

Right now, data_len is calculated before the transfer len after we
parse the command, With passthru interface we allow VUCs (Vendor-Unique
Commands). In order to make the code simple and compact, instead of
assigning the data len or each VUC in the command parse function
just use the transfer len as it is. This may result in error if expected
data_len != transfer_len.

Signed-off-by: Chaitanya Kulkarni <[email protected]>
[[email protected]:
* added definition of VUC to the commit message and comment
* use nvmet_req_passthru_ctrl() helper seeing we can't dereference
subsys->passthru_ctrl if CONFIG_NVME_TARGET_PASSTHRU is not set]
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/nvme/target/core.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index c655f26db3da..1cda94437fbf 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -941,7 +941,16 @@ EXPORT_SYMBOL_GPL(nvmet_req_uninit);

void nvmet_req_execute(struct nvmet_req *req)
{
- if (unlikely(req->data_len != req->transfer_len)) {
+ /*
+ * data_len is calculated before the transfer len after we parse
+ * the command, With passthru interface we allow VUC (Vendor-Unique
+ * Commands)'s. In order to make the code simple and compact,
+ * instead of assinging the dala len for each VUC in the command
+ * parse function just use the transfer len as it is. This may
+ * result in error if expected data_len != transfer_len.
+ */
+ if (!(req->sq->ctrl && nvmet_req_passthru_ctrl(req)) &&
+ unlikely(req->data_len != req->transfer_len)) {
req->error_loc = offsetof(struct nvme_common_command, dptr);
nvmet_req_complete(req, NVME_SC_SGL_INVALID_DATA | NVME_SC_DNR);
} else
--
2.20.1

2019-08-02 02:29:50

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 07/14] nvmet-passthru: add enable/disable helpers

This patch adds helper functions which are used in the NVMeOF configfs
when the user is configuring the passthru subsystem. Here we ensure
that only one subsys is assigned to each nvme_ctrl by using an xarray
on the cntlid.

[[email protected]: this patch is very roughly based
on a similar one by Chaitanya]
Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/nvme/target/core.c | 8 +++
drivers/nvme/target/io-cmd-passthru.c | 77 +++++++++++++++++++++++++++
drivers/nvme/target/nvmet.h | 10 ++++
3 files changed, 95 insertions(+)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 50c01b2da568..2e75968af7f4 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -519,6 +519,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)

mutex_lock(&subsys->lock);
ret = 0;
+
+ if (nvmet_passthru_ctrl(subsys)) {
+ pr_info("cannot enable both passthru and regular namespaces for a single subsystem");
+ goto out_unlock;
+ }
+
if (ns->enabled)
goto out_unlock;

@@ -1439,6 +1445,8 @@ static void nvmet_subsys_free(struct kref *ref)

WARN_ON_ONCE(!list_empty(&subsys->namespaces));

+ nvmet_passthru_subsys_free(subsys);
+
kfree(subsys->subsysnqn);
kfree(subsys);
}
diff --git a/drivers/nvme/target/io-cmd-passthru.c b/drivers/nvme/target/io-cmd-passthru.c
index 46c58fec5608..b199785500ad 100644
--- a/drivers/nvme/target/io-cmd-passthru.c
+++ b/drivers/nvme/target/io-cmd-passthru.c
@@ -11,6 +11,11 @@
#include "../host/nvme.h"
#include "nvmet.h"

+/*
+ * xarray to maintain one passthru subsystem per nvme controller.
+ */
+static DEFINE_XARRAY(passthru_subsystems);
+
static struct workqueue_struct *passthru_wq;

int nvmet_passthru_init(void)
@@ -27,6 +32,78 @@ void nvmet_passthru_destroy(void)
destroy_workqueue(passthru_wq);
}

+int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
+{
+ struct nvme_ctrl *ctrl;
+ int ret = -EINVAL;
+ void *old;
+
+ mutex_lock(&subsys->lock);
+ if (!subsys->passthru_ctrl_path)
+ goto out_unlock;
+ if (subsys->passthru_ctrl)
+ goto out_unlock;
+
+ if (subsys->nr_namespaces) {
+ pr_info("cannot enable both passthru and regular namespaces for a single subsystem");
+ goto out_unlock;
+ }
+
+ ctrl = nvme_ctrl_get_by_path(subsys->passthru_ctrl_path);
+ if (IS_ERR(ctrl)) {
+ ret = PTR_ERR(ctrl);
+ pr_err("failed to open nvme controller %s\n",
+ subsys->passthru_ctrl_path);
+
+ goto out_unlock;
+ }
+
+ old = xa_cmpxchg(&passthru_subsystems, ctrl->cntlid, NULL,
+ subsys, GFP_KERNEL);
+ if (xa_is_err(old)) {
+ ret = xa_err(old);
+ goto out_put_ctrl;
+ }
+
+ if (old)
+ goto out_put_ctrl;
+
+ subsys->passthru_ctrl = ctrl;
+ ret = 0;
+
+ goto out_unlock;
+
+out_put_ctrl:
+ nvme_put_ctrl(ctrl);
+out_unlock:
+ mutex_unlock(&subsys->lock);
+ return ret;
+}
+
+static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
+{
+ if (subsys->passthru_ctrl) {
+ xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
+ nvme_put_ctrl(subsys->passthru_ctrl);
+ }
+ subsys->passthru_ctrl = NULL;
+}
+
+void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
+{
+ mutex_lock(&subsys->lock);
+ __nvmet_passthru_ctrl_disable(subsys);
+ mutex_unlock(&subsys->lock);
+}
+
+void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
+{
+ mutex_lock(&subsys->lock);
+ __nvmet_passthru_ctrl_disable(subsys);
+ kfree(subsys->passthru_ctrl_path);
+ mutex_unlock(&subsys->lock);
+}
+
static void nvmet_passthru_req_complete(struct nvmet_req *req,
struct request *rq, u16 status)
{
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index bd11114ebbb9..aff4db03269d 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -230,6 +230,7 @@ struct nvmet_subsys {

#ifdef CONFIG_NVME_TARGET_PASSTHRU
struct nvme_ctrl *passthru_ctrl;
+ char *passthru_ctrl_path;
#endif /* CONFIG_NVME_TARGET_PASSTHRU */
};

@@ -509,6 +510,9 @@ static inline u32 nvmet_rw_len(struct nvmet_req *req)

int nvmet_passthru_init(void);
void nvmet_passthru_destroy(void);
+void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys);
+int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys);
+void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys);
u16 nvmet_parse_passthru_cmd(struct nvmet_req *req);

static inline
@@ -526,6 +530,12 @@ static inline int nvmet_passthru_init(void)
static inline void nvmet_passthru_destroy(void)
{
}
+static inline void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
+{
+}
+static inline void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
+{
+}
static inline u16 nvmet_parse_passthru_cmd(struct nvmet_req *req)
{
return 0;
--
2.20.1

2019-08-02 02:29:54

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 13/14] block: call blk_account_io_start() in blk_execute_rq_nowait()

All existing users of blk_execute_rq[_nowait]() are for passthrough
commands and will thus be rejected by blk_do_io_stat().

This allows passthrough requests to opt-in to IO accounting by setting
RQF_IO_STAT.

Signed-off-by: Logan Gunthorpe <[email protected]>
---
block/blk-exec.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/block/blk-exec.c b/block/blk-exec.c
index 1db44ca0f4a6..e20a852ae432 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -55,6 +55,8 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
rq->rq_disk = bd_disk;
rq->end_io = done;

+ blk_account_io_start(rq, true);
+
/*
* don't check dying flag for MQ because the request won't
* be reused after dying flag is set
--
2.20.1

2019-08-02 02:29:57

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 14/14] nvmet-passthru: support block accounting

Support block disk accounting by setting the RQF_IO_STAT flag
and gendisk in the request.

After this change, IO counts will be reflected correctly in
/proc/diskstats for drives being used by passthru.

Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/nvme/target/io-cmd-passthru.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/target/io-cmd-passthru.c b/drivers/nvme/target/io-cmd-passthru.c
index 06a919283cc5..cb193b434545 100644
--- a/drivers/nvme/target/io-cmd-passthru.c
+++ b/drivers/nvme/target/io-cmd-passthru.c
@@ -441,6 +441,9 @@ static struct request *nvmet_passthru_blk_make_request(struct nvmet_req *req,
if (unlikely(IS_ERR(rq)))
return rq;

+ if (blk_queue_io_stat(q) && cmd->common.opcode != nvme_cmd_flush)
+ rq->rq_flags |= RQF_IO_STAT;
+
if (req->sg_cnt) {
ret = nvmet_passthru_map_sg(req, rq);
if (unlikely(ret)) {
@@ -505,7 +508,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)

rq->end_io_data = req;
if (req->sq->qid != 0) {
- blk_execute_rq_nowait(rq->q, NULL, rq, 0,
+ blk_execute_rq_nowait(rq->q, ns->disk, rq, 0,
nvmet_passthru_req_done);
} else {
req->p.rq = rq;
--
2.20.1

2019-08-02 02:31:20

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl

This patch rejects any new connection to the passthru-ctrl if this
controller is already connected to a different host. At the time of
allocating the controller we check if the subsys associated with
the passthru ctrl is already connected to a host and reject it
if the hostnqn differs.

Connections from the same host (by hostnqn) are supported to allow
for multipath.

[[email protected]: based conceptually on a similar patch but
different implementation]
Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/nvme/target/core.c | 4 ++++
drivers/nvme/target/io-cmd-passthru.c | 31 +++++++++++++++++++++++++++
drivers/nvme/target/nvmet.h | 7 ++++++
3 files changed, 42 insertions(+)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 2e75968af7f4..c655f26db3da 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -1278,6 +1278,10 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
if (!ctrl->sqs)
goto out_free_cqs;

+ ret = nvmet_passthru_alloc_ctrl(subsys, hostnqn);
+ if (ret)
+ goto out_free_sqs;
+
ret = ida_simple_get(&cntlid_ida,
NVME_CNTLID_MIN, NVME_CNTLID_MAX,
GFP_KERNEL);
diff --git a/drivers/nvme/target/io-cmd-passthru.c b/drivers/nvme/target/io-cmd-passthru.c
index b199785500ad..06a919283cc5 100644
--- a/drivers/nvme/target/io-cmd-passthru.c
+++ b/drivers/nvme/target/io-cmd-passthru.c
@@ -104,6 +104,37 @@ void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
mutex_unlock(&subsys->lock);
}

+int nvmet_passthru_alloc_ctrl(struct nvmet_subsys *subsys,
+ const char *hostnqn)
+{
+ struct nvmet_ctrl *ctrl;
+
+ /*
+ * Check here if this subsystem is already connected to the passthru
+ * ctrl. We allow only one host to connect to a given passthru
+ * subsystem.
+ */
+ int rc = 0;
+
+ mutex_lock(&subsys->lock);
+
+ if (!subsys->passthru_ctrl)
+ goto out;
+
+ if (list_empty(&subsys->ctrls))
+ goto out;
+
+ ctrl = list_first_entry(&subsys->ctrls, struct nvmet_ctrl,
+ subsys_entry);
+
+ if (strcmp(hostnqn, ctrl->hostnqn))
+ rc = -ENODEV;
+
+out:
+ mutex_unlock(&subsys->lock);
+ return rc;
+}
+
static void nvmet_passthru_req_complete(struct nvmet_req *req,
struct request *rq, u16 status)
{
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index aff4db03269d..6436cb990905 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -513,6 +513,8 @@ void nvmet_passthru_destroy(void);
void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys);
int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys);
void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys);
+int nvmet_passthru_alloc_ctrl(struct nvmet_subsys *subsys,
+ const char *hostnqn);
u16 nvmet_parse_passthru_cmd(struct nvmet_req *req);

static inline
@@ -536,6 +538,11 @@ static inline void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
static inline void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
{
}
+static inline int nvmet_passthru_alloc_ctrl(struct nvmet_subsys *subsys,
+ const char *hostnqn)
+{
+ return 0;
+}
static inline u16 nvmet_parse_passthru_cmd(struct nvmet_req *req)
{
return 0;
--
2.20.1

2019-08-02 07:35:15

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v7 12/14] block: don't check blk_rq_is_passthrough() in blk_do_io_stat()

Instead of checking blk_rq_is_passthruough() for every call to
blk_do_io_stat(), don't set RQF_IO_STAT for passthrough requests.
This should be equivalent, and opens the possibility of passthrough
requests specifically requesting statistics tracking.

Signed-off-by: Logan Gunthorpe <[email protected]>
---
block/blk-mq.c | 2 +-
block/blk.h | 5 ++---
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f78d3287dd82..66cb5bfa2ebf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -318,7 +318,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
rq->cmd_flags = op;
if (data->flags & BLK_MQ_REQ_PREEMPT)
rq->rq_flags |= RQF_PREEMPT;
- if (blk_queue_io_stat(data->q))
+ if (blk_queue_io_stat(data->q) && !blk_rq_is_passthrough(rq))
rq->rq_flags |= RQF_IO_STAT;
INIT_LIST_HEAD(&rq->queuelist);
INIT_HLIST_NODE(&rq->hash);
diff --git a/block/blk.h b/block/blk.h
index de6b2e146d6e..554efa769bfe 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -234,13 +234,12 @@ int blk_dev_init(void);
*
* a) it's attached to a gendisk, and
* b) the queue had IO stats enabled when this request was started, and
- * c) it's a file system request
+ * c) it's a file system request (RQF_IO_STAT will not be set otherwise)
*/
static inline bool blk_do_io_stat(struct request *rq)
{
return rq->rq_disk &&
- (rq->rq_flags & RQF_IO_STAT) &&
- !blk_rq_is_passthrough(rq);
+ (rq->rq_flags & RQF_IO_STAT);
}

static inline void req_set_nomerge(struct request_queue *q, struct request *req)
--
2.20.1

2019-08-14 14:34:51

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH v7 05/14] nvmet-passthru: update KConfig with config passthru option


On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
> From: Chaitanya Kulkarni <[email protected]>
>
> This patch updates KConfig file for the NVMeOF target where we add new
> option so that user can selectively enable/disable passthru code.
>
> Signed-off-by: Chaitanya Kulkarni <[email protected]>
> [[email protected]: fixed some of the wording in the help message]
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> drivers/nvme/target/Kconfig | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
> index d7f48c0fb311..2478cb5a932d 100644
> --- a/drivers/nvme/target/Kconfig
> +++ b/drivers/nvme/target/Kconfig
> @@ -15,6 +15,16 @@ config NVME_TARGET
> To configure the NVMe target you probably want to use the nvmetcli
> tool from http://git.infradead.org/users/hch/nvmetcli.git.
>
> +config NVME_TARGET_PASSTHRU
> + bool "NVMe Target Passthrough support"
> + depends on NVME_CORE
> + depends on NVME_TARGET
> + help
> + This enables target side NVMe passthru controller support for the
> + NVMe Over Fabrics protocol. It allows for hosts to manage and
> + directly access an actual NVMe controller residing on the target
> + side, incuding executing Vendor Unique Commands.
> +
> config NVME_TARGET_LOOP
> tristate "NVMe loopback device support"
> depends on NVME_TARGET


Looks good,

Reviewed-by: Max Gurtovoy <[email protected]>


2019-08-15 12:47:47

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH v7 07/14] nvmet-passthru: add enable/disable helpers


On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
> This patch adds helper functions which are used in the NVMeOF configfs
> when the user is configuring the passthru subsystem. Here we ensure
> that only one subsys is assigned to each nvme_ctrl by using an xarray
> on the cntlid.
>
> [[email protected]: this patch is very roughly based
> on a similar one by Chaitanya]
> Signed-off-by: Chaitanya Kulkarni <[email protected]>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> drivers/nvme/target/core.c | 8 +++
> drivers/nvme/target/io-cmd-passthru.c | 77 +++++++++++++++++++++++++++
> drivers/nvme/target/nvmet.h | 10 ++++
> 3 files changed, 95 insertions(+)
>
> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
> index 50c01b2da568..2e75968af7f4 100644
> --- a/drivers/nvme/target/core.c
> +++ b/drivers/nvme/target/core.c
> @@ -519,6 +519,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
>
> mutex_lock(&subsys->lock);
> ret = 0;
> +
> + if (nvmet_passthru_ctrl(subsys)) {
> + pr_info("cannot enable both passthru and regular namespaces for a single subsystem");
> + goto out_unlock;
> + }
> +
> if (ns->enabled)
> goto out_unlock;
>
> @@ -1439,6 +1445,8 @@ static void nvmet_subsys_free(struct kref *ref)
>
> WARN_ON_ONCE(!list_empty(&subsys->namespaces));
>
> + nvmet_passthru_subsys_free(subsys);
> +
> kfree(subsys->subsysnqn);
> kfree(subsys);
> }
> diff --git a/drivers/nvme/target/io-cmd-passthru.c b/drivers/nvme/target/io-cmd-passthru.c
> index 46c58fec5608..b199785500ad 100644
> --- a/drivers/nvme/target/io-cmd-passthru.c
> +++ b/drivers/nvme/target/io-cmd-passthru.c
> @@ -11,6 +11,11 @@
> #include "../host/nvme.h"
> #include "nvmet.h"
>
> +/*
> + * xarray to maintain one passthru subsystem per nvme controller.
> + */
> +static DEFINE_XARRAY(passthru_subsystems);
> +
> static struct workqueue_struct *passthru_wq;
>
> int nvmet_passthru_init(void)
> @@ -27,6 +32,78 @@ void nvmet_passthru_destroy(void)
> destroy_workqueue(passthru_wq);
> }
>
> +int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
> +{
> + struct nvme_ctrl *ctrl;
> + int ret = -EINVAL;
> + void *old;
> +
> + mutex_lock(&subsys->lock);
> + if (!subsys->passthru_ctrl_path)
> + goto out_unlock;
> + if (subsys->passthru_ctrl)
> + goto out_unlock;
> +
> + if (subsys->nr_namespaces) {
> + pr_info("cannot enable both passthru and regular namespaces for a single subsystem");
> + goto out_unlock;
> + }
> +
> + ctrl = nvme_ctrl_get_by_path(subsys->passthru_ctrl_path);
> + if (IS_ERR(ctrl)) {
> + ret = PTR_ERR(ctrl);
> + pr_err("failed to open nvme controller %s\n",
> + subsys->passthru_ctrl_path);
> +
> + goto out_unlock;
> + }
> +
> + old = xa_cmpxchg(&passthru_subsystems, ctrl->cntlid, NULL,
> + subsys, GFP_KERNEL);
> + if (xa_is_err(old)) {
> + ret = xa_err(old);
> + goto out_put_ctrl;
> + }
> +
> + if (old)
> + goto out_put_ctrl;
> +
> + subsys->passthru_ctrl = ctrl;
> + ret = 0;
> +
> + goto out_unlock;

can we re-arrange the code here ?

it's not so common to see goto in a good flow.

maybe have 1 good flow the goto's will go bellow it as we usually do in
this subsystem.


> +
> +out_put_ctrl:
> + nvme_put_ctrl(ctrl);
> +out_unlock:
> + mutex_unlock(&subsys->lock);
> + return ret;
> +}
> +
> +static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
> +{
> + if (subsys->passthru_ctrl) {
> + xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
> + nvme_put_ctrl(subsys->passthru_ctrl);
> + }
> + subsys->passthru_ctrl = NULL;
> +}
> +
> +void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
> +{
> + mutex_lock(&subsys->lock);
> + __nvmet_passthru_ctrl_disable(subsys);
> + mutex_unlock(&subsys->lock);
> +}
> +
> +void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
> +{
> + mutex_lock(&subsys->lock);
> + __nvmet_passthru_ctrl_disable(subsys);
> + kfree(subsys->passthru_ctrl_path);
> + mutex_unlock(&subsys->lock);
> +}
> +
> static void nvmet_passthru_req_complete(struct nvmet_req *req,
> struct request *rq, u16 status)
> {
> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
> index bd11114ebbb9..aff4db03269d 100644
> --- a/drivers/nvme/target/nvmet.h
> +++ b/drivers/nvme/target/nvmet.h
> @@ -230,6 +230,7 @@ struct nvmet_subsys {
>
> #ifdef CONFIG_NVME_TARGET_PASSTHRU
> struct nvme_ctrl *passthru_ctrl;
> + char *passthru_ctrl_path;
> #endif /* CONFIG_NVME_TARGET_PASSTHRU */
> };
>
> @@ -509,6 +510,9 @@ static inline u32 nvmet_rw_len(struct nvmet_req *req)
>
> int nvmet_passthru_init(void);
> void nvmet_passthru_destroy(void);
> +void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys);
> +int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys);
> +void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys);
> u16 nvmet_parse_passthru_cmd(struct nvmet_req *req);
>
> static inline
> @@ -526,6 +530,12 @@ static inline int nvmet_passthru_init(void)
> static inline void nvmet_passthru_destroy(void)
> {
> }
> +static inline void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys)
> +{
> +}
> +static inline void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
> +{
> +}
> static inline u16 nvmet_parse_passthru_cmd(struct nvmet_req *req)
> {
> return 0;

2019-08-15 14:51:26

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
> This patch rejects any new connection to the passthru-ctrl if this
> controller is already connected to a different host. At the time of
> allocating the controller we check if the subsys associated with
> the passthru ctrl is already connected to a host and reject it
> if the hostnqn differs.

This is a big limitation.

Are we plan to enable many front-end ctrl's to connect to the single
back-end ctrl in the future ?

I guess it can be incremental to this series.


2019-08-15 16:05:42

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v7 07/14] nvmet-passthru: add enable/disable helpers



On 2019-08-15 6:20 a.m., Max Gurtovoy wrote:
>
> On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
>> This patch adds helper functions which are used in the NVMeOF configfs
>> when the user is configuring the passthru subsystem. Here we ensure
>> that only one subsys is assigned to each nvme_ctrl by using an xarray
>> on the cntlid.
>>
>> [[email protected]: this patch is very roughly based
>>   on a similar one by Chaitanya]
>> Signed-off-by: Chaitanya Kulkarni <[email protected]>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>> ---
>>   drivers/nvme/target/core.c            |  8 +++
>>   drivers/nvme/target/io-cmd-passthru.c | 77 +++++++++++++++++++++++++++
>>   drivers/nvme/target/nvmet.h           | 10 ++++
>>   3 files changed, 95 insertions(+)
>>
>> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
>> index 50c01b2da568..2e75968af7f4 100644
>> --- a/drivers/nvme/target/core.c
>> +++ b/drivers/nvme/target/core.c
>> @@ -519,6 +519,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
>>         mutex_lock(&subsys->lock);
>>       ret = 0;
>> +
>> +    if (nvmet_passthru_ctrl(subsys)) {
>> +        pr_info("cannot enable both passthru and regular namespaces
>> for a single subsystem");
>> +        goto out_unlock;
>> +    }
>> +
>>       if (ns->enabled)
>>           goto out_unlock;
>>   @@ -1439,6 +1445,8 @@ static void nvmet_subsys_free(struct kref *ref)
>>         WARN_ON_ONCE(!list_empty(&subsys->namespaces));
>>   +    nvmet_passthru_subsys_free(subsys);
>> +
>>       kfree(subsys->subsysnqn);
>>       kfree(subsys);
>>   }
>> diff --git a/drivers/nvme/target/io-cmd-passthru.c
>> b/drivers/nvme/target/io-cmd-passthru.c
>> index 46c58fec5608..b199785500ad 100644
>> --- a/drivers/nvme/target/io-cmd-passthru.c
>> +++ b/drivers/nvme/target/io-cmd-passthru.c
>> @@ -11,6 +11,11 @@
>>   #include "../host/nvme.h"
>>   #include "nvmet.h"
>>   +/*
>> + * xarray to maintain one passthru subsystem per nvme controller.
>> + */
>> +static DEFINE_XARRAY(passthru_subsystems);
>> +
>>   static struct workqueue_struct *passthru_wq;
>>     int nvmet_passthru_init(void)
>> @@ -27,6 +32,78 @@ void nvmet_passthru_destroy(void)
>>       destroy_workqueue(passthru_wq);
>>   }
>>   +int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
>> +{
>> +    struct nvme_ctrl *ctrl;
>> +    int ret = -EINVAL;
>> +    void *old;
>> +
>> +    mutex_lock(&subsys->lock);
>> +    if (!subsys->passthru_ctrl_path)
>> +        goto out_unlock;
>> +    if (subsys->passthru_ctrl)
>> +        goto out_unlock;
>> +
>> +    if (subsys->nr_namespaces) {
>> +        pr_info("cannot enable both passthru and regular namespaces
>> for a single subsystem");
>> +        goto out_unlock;
>> +    }
>> +
>> +    ctrl = nvme_ctrl_get_by_path(subsys->passthru_ctrl_path);
>> +    if (IS_ERR(ctrl)) {
>> +        ret = PTR_ERR(ctrl);
>> +        pr_err("failed to open nvme controller %s\n",
>> +               subsys->passthru_ctrl_path);
>> +
>> +        goto out_unlock;
>> +    }
>> +
>> +    old = xa_cmpxchg(&passthru_subsystems, ctrl->cntlid, NULL,
>> +             subsys, GFP_KERNEL);
>> +    if (xa_is_err(old)) {
>> +        ret = xa_err(old);
>> +        goto out_put_ctrl;
>> +    }
>> +
>> +    if (old)
>> +        goto out_put_ctrl;
>> +
>> +    subsys->passthru_ctrl = ctrl;
>> +    ret = 0;
>> +
>> +    goto out_unlock;
>
> can we re-arrange the code here ?
>
> it's not so common to see goto in a good flow.
>
> maybe have 1 good flow the goto's will go bellow it as we usually do in
> this subsystem.

OK, queued up for v8.

Thanks,

Logan

2019-08-15 16:08:28

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl



On 2019-08-15 6:36 a.m., Max Gurtovoy wrote:
>
> On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
>> This patch rejects any new connection to the passthru-ctrl if this
>> controller is already connected to a different host. At the time of
>> allocating the controller we check if the subsys associated with
>> the passthru ctrl is already connected to a host and reject it
>> if the hostnqn differs.
>
> This is a big limitation.
>
> Are we plan to enable many front-end ctrl's to connect to the single
> back-end ctrl in the future ?

Honestly, I don't know that it's really necessary, but the limitation
was requested by Sagi the first time this patch-set was submitted[1]
citing unspecified user troubles. If there's consensus to remove the
restriction I certainly can.

Logan

[1] http://lists.infradead.org/pipermail/linux-nvme/2018-April/016588.html

2019-08-18 10:36:33

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


On 8/15/2019 7:06 PM, Logan Gunthorpe wrote:
>
> On 2019-08-15 6:36 a.m., Max Gurtovoy wrote:
>> On 8/2/2019 2:45 AM, Logan Gunthorpe wrote:
>>> This patch rejects any new connection to the passthru-ctrl if this
>>> controller is already connected to a different host. At the time of
>>> allocating the controller we check if the subsys associated with
>>> the passthru ctrl is already connected to a host and reject it
>>> if the hostnqn differs.
>> This is a big limitation.
>>
>> Are we plan to enable many front-end ctrl's to connect to the single
>> back-end ctrl in the future ?
> Honestly, I don't know that it's really necessary, but the limitation
> was requested by Sagi the first time this patch-set was submitted[1]
> citing unspecified user troubles. If there's consensus to remove the
> restriction I certainly can.
>
> Logan
>
> [1] http://lists.infradead.org/pipermail/linux-nvme/2018-April/016588.html

I don't understand why we don't limit a regular ctrl to single access
and we do it for the PT ctrl.

I guess the block layer helps to sync between multiple access in
parallel but we can do it as well.

Also, let's say you limit the access to this subsystem to 1 user, the
bdev is still accessibly for local user and also you can create a
different subsystem that will use this device (PT and non-PT ctrl).

Sagi,

can you explain the trouble you meant and how this limitation solve it ?

2019-08-22 00:14:03

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


> I don't understand why we don't limit a regular ctrl to single access
> and we do it for the PT ctrl.
>
> I guess the block layer helps to sync between multiple access in
> parallel but we can do it as well.
>
> Also, let's say you limit the access to this subsystem to 1 user, the
> bdev is still accessibly for local user and also you can create a
> different subsystem that will use this device (PT and non-PT ctrl).
>
> Sagi,
>
> can you explain the trouble you meant and how this limitation solve it ?

Its different to emulate the controller with all its admin
commands vs. passing it through to the nvme device.. (think of format nvm)



2019-08-22 09:58:42

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


On 8/22/2019 3:09 AM, Sagi Grimberg wrote:
>
>> I don't understand why we don't limit a regular ctrl to single access
>> and we do it for the PT ctrl.
>>
>> I guess the block layer helps to sync between multiple access in
>> parallel but we can do it as well.
>>
>> Also, let's say you limit the access to this subsystem to 1 user, the
>> bdev is still accessibly for local user and also you can create a
>> different subsystem that will use this device (PT and non-PT ctrl).
>>
>> Sagi,
>>
>> can you explain the trouble you meant and how this limitation solve it ?
>
> Its different to emulate the controller with all its admin
> commands vs. passing it through to the nvme device.. (think of format
> nvm)
>
>
>
we don't need to support format command for PT ctrl as we don't support
other commands such create_sq/cq.

2019-08-23 07:02:07

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


>>> I don't understand why we don't limit a regular ctrl to single access
>>> and we do it for the PT ctrl.
>>>
>>> I guess the block layer helps to sync between multiple access in
>>> parallel but we can do it as well.
>>>
>>> Also, let's say you limit the access to this subsystem to 1 user, the
>>> bdev is still accessibly for local user and also you can create a
>>> different subsystem that will use this device (PT and non-PT ctrl).
>>>
>>> Sagi,
>>>
>>> can you explain the trouble you meant and how this limitation solve it ?
>>
>> Its different to emulate the controller with all its admin
>> commands vs. passing it through to the nvme device.. (think of format
>> nvm)
>>
>>
>>
> we don't need to support format command for PT ctrl as we don't support
> other commands such create_sq/cq.

That is just an example, basically every command that we are not aware
of we simply passthru to the drive without knowing the implications
on a multi-host environment..

2019-08-23 07:35:21

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl


>>>> I don't understand why we don't limit a regular ctrl to single
>>>> access and we do it for the PT ctrl.
>>>>
>>>> I guess the block layer helps to sync between multiple access in
>>>> parallel but we can do it as well.
>>>>
>>>> Also, let's say you limit the access to this subsystem to 1 user,
>>>> the bdev is still accessibly for local user and also you can create
>>>> a different subsystem that will use this device (PT and non-PT ctrl).
>>>>
>>>> Sagi,
>>>>
>>>> can you explain the trouble you meant and how this limitation solve
>>>> it ?
>>>
>>> Its different to emulate the controller with all its admin
>>> commands vs. passing it through to the nvme device.. (think of format
>>> nvm)
>>>
>>>
>>>
>> we don't need to support format command for PT ctrl as we don't
>> support other commands such create_sq/cq.
>
> That is just an example, basically every command that we are not aware
> of we simply passthru to the drive without knowing the implications
> on a multi-host environment..

If we were to change the logic of nvmet_parse_passthru_admin_cmd to
have the default case do nvmet_parse_admin_cmd, and only have
the vendor-specific space opcodes do nvmet_passthru_execute_cmd
then I could not see at the moment how we can break a multi-host
export...

2019-08-23 12:53:33

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v7 08/14] nvmet-core: allow one host per passthru-ctrl



On 2019-08-22 1:17 p.m., Sagi Grimberg wrote:
>
>>>>> I don't understand why we don't limit a regular ctrl to single
>>>>> access and we do it for the PT ctrl.
>>>>>
>>>>> I guess the block layer helps to sync between multiple access in
>>>>> parallel but we can do it as well.
>>>>>
>>>>> Also, let's say you limit the access to this subsystem to 1 user,
>>>>> the bdev is still accessibly for local user and also you can create
>>>>> a different subsystem that will use this device (PT and non-PT ctrl).
>>>>>
>>>>> Sagi,
>>>>>
>>>>> can you explain the trouble you meant and how this limitation solve
>>>>> it ?
>>>>
>>>> Its different to emulate the controller with all its admin
>>>> commands vs. passing it through to the nvme device.. (think of
>>>> format nvm)
>>>>
>>>>
>>>>
>>> we don't need to support format command for PT ctrl as we don't
>>> support other commands such create_sq/cq.
>>
>> That is just an example, basically every command that we are not aware
>> of we simply passthru to the drive without knowing the implications
>> on a multi-host environment..
>
> If we were to change the logic of nvmet_parse_passthru_admin_cmd to
> have the default case do nvmet_parse_admin_cmd, and only have
> the vendor-specific space opcodes do nvmet_passthru_execute_cmd
> then I could not see at the moment how we can break a multi-host
> export...

That makes sense. I'll make that change and resend a v8 next week.

Logan