2021-03-26 21:06:08

by Niklas Cassel

[permalink] [raw]
Subject: [RFC PATCH] nvme: allow NVME_IOCTL_IO_CMD on controller char dev even when multiple ns

From: Niklas Cassel <[email protected]>

Currently when doing NVME_IOCTL_IO_CMD on the controller character device,
the command is rejected if there is more than one namespace in the
ctrl->namespaces list.

There is not really any reason for this restriction.
Instead, check the nsid value specified in the passthru command, and try
to find the matching namespace in ctrl->namespaces list.
If found, call nvme_user_cmd() on the namespace.
If not found, reject the command.

While at it, remove the warning that says that NVME_IOCTL_IO_CMD is
deprecated on the controller character device.
There is no comment saying why it is deprecated.
It might be very unsafe to send a passthru command, but if that is
the issue with this IOCTL, then we should add a warning about that
instead.

Signed-off-by: Niklas Cassel <[email protected]>
---
drivers/nvme/host/core.c | 30 +++++++++++++-----------------
1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a50352ea3f7b..b50fdf143b90 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3264,35 +3264,31 @@ static int nvme_dev_release(struct inode *inode, struct file *file)

static int nvme_dev_user_cmd(struct nvme_ctrl *ctrl, void __user *argp)
{
+ struct nvme_passthru_cmd cmd;
struct nvme_ns *ns;
int ret;

down_read(&ctrl->namespaces_rwsem);
if (list_empty(&ctrl->namespaces)) {
- ret = -ENOTTY;
- goto out_unlock;
+ up_read(&ctrl->namespaces_rwsem);
+ return -ENOTTY;
}
+ up_read(&ctrl->namespaces_rwsem);

- ns = list_first_entry(&ctrl->namespaces, struct nvme_ns, list);
- if (ns != list_last_entry(&ctrl->namespaces, struct nvme_ns, list)) {
- dev_warn(ctrl->device,
- "NVME_IOCTL_IO_CMD not supported when multiple namespaces present!\n");
- ret = -EINVAL;
- goto out_unlock;
- }
+ if (copy_from_user(&cmd, argp, sizeof(cmd)))
+ return -EFAULT;

- dev_warn(ctrl->device,
- "using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!\n");
- kref_get(&ns->kref);
- up_read(&ctrl->namespaces_rwsem);
+ ns = nvme_find_get_ns(ctrl, cmd.nsid);
+ if (!ns) {
+ dev_err(ctrl->device,
+ "%s: could not find namespace with nsid %u\n",
+ current->comm, cmd.nsid);
+ return -EINVAL;
+ }

ret = nvme_user_cmd(ctrl, ns, argp);
nvme_put_ns(ns);
return ret;
-
-out_unlock:
- up_read(&ctrl->namespaces_rwsem);
- return ret;
}

static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
--
2.30.2


2021-03-30 18:34:57

by Javier González

[permalink] [raw]
Subject: Re: [RFC PATCH] nvme: allow NVME_IOCTL_IO_CMD on controller char dev even when multiple ns

On 26.03.2021 20:59, Niklas Cassel wrote:
>From: Niklas Cassel <[email protected]>
>
>Currently when doing NVME_IOCTL_IO_CMD on the controller character device,
>the command is rejected if there is more than one namespace in the
>ctrl->namespaces list.
>
>There is not really any reason for this restriction.
>Instead, check the nsid value specified in the passthru command, and try
>to find the matching namespace in ctrl->namespaces list.
>If found, call nvme_user_cmd() on the namespace.
>If not found, reject the command.
>
>While at it, remove the warning that says that NVME_IOCTL_IO_CMD is
>deprecated on the controller character device.
>There is no comment saying why it is deprecated.
>It might be very unsafe to send a passthru command, but if that is
>the issue with this IOCTL, then we should add a warning about that
>instead.
>
>Signed-off-by: Niklas Cassel <[email protected]>

I think the idea is OK, but I have 3 questions:

1. Wouldn't this break user-space when nsid is not specified?
2. What is the use case for this? As I understand it, this char device
is primarily for admin commands sent to the controller. Do you see a
use case for sending commands to the namespace using the controller
char device?
3. Following up on the above, if the use-case is I/O, don't you think
it is cleaner to use the ongoing per-namespace char device effort? We
would very much like to get your input there and eventually send a
series together. When this is merged, we could wire that logic to the
controller char device if there is an use-case for it.

Javier

2021-03-31 14:19:41

by Niklas Cassel

[permalink] [raw]
Subject: Re: [RFC PATCH] nvme: allow NVME_IOCTL_IO_CMD on controller char dev even when multiple ns

On Tue, Mar 30, 2021 at 08:30:22PM +0200, [email protected] wrote:
> On 26.03.2021 20:59, Niklas Cassel wrote:
> > From: Niklas Cassel <[email protected]>
> >
> > Currently when doing NVME_IOCTL_IO_CMD on the controller character device,
> > the command is rejected if there is more than one namespace in the
> > ctrl->namespaces list.
> >
> > There is not really any reason for this restriction.
> > Instead, check the nsid value specified in the passthru command, and try
> > to find the matching namespace in ctrl->namespaces list.
> > If found, call nvme_user_cmd() on the namespace.
> > If not found, reject the command.
> >
> > While at it, remove the warning that says that NVME_IOCTL_IO_CMD is
> > deprecated on the controller character device.
> > There is no comment saying why it is deprecated.
> > It might be very unsafe to send a passthru command, but if that is
> > the issue with this IOCTL, then we should add a warning about that
> > instead.
> >
> > Signed-off-by: Niklas Cassel <[email protected]>
>
> I think the idea is OK, but I have 3 questions:
>
> 1. Wouldn't this break user-space when nsid is not specified?

Since this is an ioctl, the kernel will always read some value
from cmd.nsid, so I assume you mean when specifying cmd.nsid == 0.

I don't think we have anything to worry about because:

a)
Like Keith said in the other thread:
"There are no IO commands accepting a 0 NSID, so rejecting those from the
driver should be okay."

Currently, when sending a NVME_IOCTL_IO_CMD on the ctrl char dev with
cmd.nsid == 0, we will take the first namespace in the list, use the
request_queue of that namespace, and then send the command there.

Since there are no I/O commands that accept a NSID == 0, whatever you
specified in cmd.opcode, you should get an "Invalid Namespace or Format"
error back from the controller.

I don't think that there is any harm in adding a check (which is essentially
what this RFC does), that will reject the command before sending it down to
the controller.

(A potential improvement in the future, on top of this patch, is to allow
nsid.cmd == broadcast address, but that is out of scope for this patch.
And no, since the current behavior on master does reject any cmd when there
is more than one namespace attached to the controller (more than one ns in
ctrl->namespaces list), I wouldn't say that master handles this case.)


b)
If you use nvme-cli then all commands that calls nvme_submit_io_passthru()
already does:

if (!cfg.namespace_id) {
err = cfg.namespace_id = nvme_get_nsid(fd);
if (err < 0) {
perror("get-namespace-id");
goto close_fd;
}
}

So either if you do a:
nvme write-zeroes -s 0 -c 0 --namespace-id=0 /dev/nvme0
or if you completely omit --namespace-id:
nvme write-zeroes -s 0 -c 0 /dev/nvme0

nvme-cli will already reject it (since the controller char device does not
(and can not) implement NVME_IOCTL_ID).

Sure, nvme-cli is just a single user of this ioctl (there might be other
users), but it is probably the most common one.

If nvme-cli already rejects it in user space, and we concluded that the
controller will reject it, I think it should be safe to reject it also
at the kernel side.

Without this patch, we are already rejecting any command, to any nsid,
if the controller has more than one namespace attached, which I think
makes less sense.

> 2. What is the use case for this? As I understand it, this char device
> is primarily for admin commands sent to the controller. Do you see a
> use case for sending commands to the namespace using the controller
> char device?

I don't have any use case for this, more than allowing to specify whatever
nsid you want in the --namespace-id parameter for nvme, when opening
/dev/nvme0, even when the controller has more than one namespace.

Why allow NVME_IOCTL_IO_CMD on /dev/nvme0 when the controller has one
namespace attached, but not when it has more than one namespace attached?

Doesn't make sense to me.

> 3. Following up on the above, if the use-case is I/O, don't you think
> it is cleaner to use the ongoing per-namespace char device effort? We
> would very much like to get your input there and eventually send a
> series together. When this is merged, we could wire that logic to the
> controller char device if there is an use-case for it.

While I'm not against your per-namespace char device effort, I'm not sure
if clean is the word I would use to decribe creating a bunch of extra
character devices, that almost no one will use, in the root of /dev/ even
(which is what the current proposal suggests.)

Perhaps it would be better if you had to do a
echo $nsid > /sys/class/nvme/nvme0/export_unsupported_namespace
to actually create one of these suggested additional per namespace
character devices. But I do realize that things such as udev rules might
be harder to do, since not all devices would be there automatically.

However, I do agree about using the existing per namespace block devices:
e.g.:
nvme write-zeroes -s 0 -c 0 /dev/nvme0n1
is cleaner than
nvme write-zeroes -s 0 -c 0 --namespace-id=1 /dev/nvme0

And I also agree that if we manage to support IOCTLs on rejected
namespaces, then it would be nice if we could do something like:
nvme write-zeroes -s 0 -c 0 --namespace-id=4 /dev/nvme0
if NSID was a namespace that was rejected/unsupported by the kernel.


Kind regards,
Niklas

2021-04-02 17:17:20

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC PATCH] nvme: allow NVME_IOCTL_IO_CMD on controller char dev even when multiple ns

Well, there is at least one good reason for not allowing this
retroactively:

Old users of the ioctl could have complete garbage in the field, and
might send the command to a random namespace now instead of the first
one.

So unles we have a very good reason I think we should keep
NVME_IOCTL_IO_CMD on the controller char dev deprecated and maybe
eventually remove it.