2020-03-28 05:11:33

by Nick Bowler

[permalink] [raw]
Subject: [PATCH v2 0/2] nvme: compat ioctl fixes

On review of my earlier patch to correct how 32-bit addresses in the
NVME_IOCTL_ADMIN_CMD compat ioctl (via nvme_user_cmd function) were
handled, similar problems were noted in the nvme_user_cmd64 function.

Additionally, NVME_IOCTL_SUBMIT_IO is busted in the compat case because
it not only has the same 32-bit address problem, but additionally the
corresponding nvme_user_io structure padding differs between 32-bit and
64-bit x86 (and some other arches presumably have the same problem).

Note that since I do not know of any users of the NVME_IOCTL_IO64_CMD
or NVME_IOCTL_ADMIN64_CMD ioctls, I have not tested the changes to the
nvme_user_cmd64 function (but these changes are virtually identical
to those done in the other functions function).

Nick Bowler (2):
nvme: Fix compat NVME_IOCTL_SUBMIT_IO numbering
nvme: Fix compat address handling in several ioctls

drivers/nvme/host/core.c | 47 ++++++++++++++++++++++++---------
include/uapi/linux/nvme_ioctl.h | 25 ++++++++++++++++++
2 files changed, 59 insertions(+), 13 deletions(-)

--
2.24.1


2020-03-28 05:11:33

by Nick Bowler

[permalink] [raw]
Subject: [PATCH v2 1/2] nvme: Fix compat NVME_IOCTL_SUBMIT_IO numbering

When __u64 has 64-bit alignment, the nvme_user_io structure has trailing
padding. This causes problems in the compat case with 32-bit userspace
that has less strict alignment because the size of the structure differs.

Since the NVME_IOCTL_SUBMIT_IO macro encodes the structure size itself,
the result is that this ioctl does not work at all in such a scenario:

# nvme read /dev/nvme0n1 -z 512
submit-io: Inappropriate ioctl for device

But by the same token, this makes it easy to handle both cases and
since the structures differ only in unused trailing padding bytes
we can simply not read those bytes.

Signed-off-by: Nick Bowler <[email protected]>
---
drivers/nvme/host/core.c | 19 +++++++++++++------
include/uapi/linux/nvme_ioctl.h | 25 +++++++++++++++++++++++++
2 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a4d8c90ee7cc..9eccf56494de 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1248,14 +1248,19 @@ static void nvme_enable_aen(struct nvme_ctrl *ctrl)
queue_work(nvme_wq, &ctrl->async_event_work);
}

-static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
+static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio,
+ size_t uio_size)
{
struct nvme_user_io io;
struct nvme_command c;
unsigned length, meta_len;
void __user *metadata;

- if (copy_from_user(&io, uio, sizeof(io)))
+ /*
+ * To handle the compat case the amount of data read may be reduced as
+ * the only difference is in unused padding at the end of the structure.
+ */
+ if (copy_from_user(&io, uio, uio_size))
return -EFAULT;
if (io.flags)
return -EINVAL;
@@ -1559,6 +1564,11 @@ static int nvme_ioctl(struct block_device *bdev, fmode_t mode,
if (is_ctrl_ioctl(cmd))
return nvme_handle_ctrl_ioctl(ns, cmd, argp, head, srcu_idx);

+ if (cmd == NVME_IOCTL_SUBMIT_IO || cmd == NVME_IOCTL_SUBMIT_IO_COMPAT) {
+ ret = nvme_submit_io(ns, argp, _IOC_SIZE(cmd));
+ goto out;
+ }
+
switch (cmd) {
case NVME_IOCTL_ID:
force_successful_syscall_return();
@@ -1567,9 +1577,6 @@ static int nvme_ioctl(struct block_device *bdev, fmode_t mode,
case NVME_IOCTL_IO_CMD:
ret = nvme_user_cmd(ns->ctrl, ns, argp);
break;
- case NVME_IOCTL_SUBMIT_IO:
- ret = nvme_submit_io(ns, argp);
- break;
case NVME_IOCTL_IO64_CMD:
ret = nvme_user_cmd64(ns->ctrl, ns, argp);
break;
@@ -1579,7 +1586,7 @@ static int nvme_ioctl(struct block_device *bdev, fmode_t mode,
else
ret = -ENOTTY;
}
-
+out:
nvme_put_ns_from_disk(head, srcu_idx);
return ret;
}
diff --git a/include/uapi/linux/nvme_ioctl.h b/include/uapi/linux/nvme_ioctl.h
index d99b5a772698..60e20f23bec9 100644
--- a/include/uapi/linux/nvme_ioctl.h
+++ b/include/uapi/linux/nvme_ioctl.h
@@ -9,6 +9,31 @@

#include <linux/types.h>

+#ifdef __KERNEL__
+/*
+ * The nvme_user_io structure has excess padding at the end when __u64 has
+ * 64-bit alignment. In the compat case with less strict alignment, there
+ * is no such padding. The nvme_user_io_compat structure has otherwise
+ * identical layout to nvme_user_io as there is no padding between members.
+ */
+struct nvme_user_io_compat {
+ __u8 opcode;
+ __u8 flags;
+ __u16 control;
+ __u16 nblocks;
+ __u16 rsvd;
+ __u64 metadata;
+ __u64 addr;
+ __u64 slba;
+ __u32 dsmgmt;
+ __u32 reftag;
+ __u16 apptag;
+ __u16 appmask;
+} __attribute__((packed));
+
+#define NVME_IOCTL_SUBMIT_IO_COMPAT _IOW('N', 0x42, struct nvme_user_io_compat)
+#endif
+
struct nvme_user_io {
__u8 opcode;
__u8 flags;
--
2.24.1

2020-03-28 05:12:09

by Nick Bowler

[permalink] [raw]
Subject: [PATCH v2 2/2] nvme: Fix compat address handling in several ioctls

On a 32-bit kernel, the upper bits of userspace addresses passed
via various ioctls are silently ignored by the nvme driver.

However on a 64-bit kernel running a compat task, these upper bits are
not ignored and are in fact required to be zero for the ioctls to work.

Unfortunately, this difference matters. 32-bit smartctl submits the
NVME_IOCTL_ADMIN_CMD ioctl with garbage in these upper bits because
it seems the pointer value it puts into the nvme_passthru_cmd structure
is sign extended. This works fine on 32-bit kernels but fails on a
64-bit one because (at least on my setup) the addresses smartctl uses
are consistently above 2G. For example:

# smartctl -x /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.11] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, http://www.smartmontools.org

Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Bad address

Since changing 32-bit kernels to actually check all of the submitted
address bits now would break existing userspace, this patch fixes the
compat problem by explicitly zeroing the upper bits in the compat case.
This enables 32-bit smartctl to work on a 64-bit kernel.

Signed-off-by: Nick Bowler <[email protected]>
---
drivers/nvme/host/core.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9eccf56494de..f265ccd69dd7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -6,6 +6,7 @@

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
+#include <linux/compat.h>
#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/hdreg.h>
@@ -1248,6 +1249,19 @@ static void nvme_enable_aen(struct nvme_ctrl *ctrl)
queue_work(nvme_wq, &ctrl->async_event_work);
}

+/*
+ * Convert integer values from ioctl structures to user pointers, silently
+ * ignoring the upper bits in the compat case to match behaviour of 32-bit
+ * kernels.
+ */
+static void __user *nvme_to_user_ptr(uintptr_t ptrval)
+{
+ if (in_compat_syscall())
+ ptrval = (compat_uptr_t)ptrval;
+
+ return (void __user *)ptrval;
+}
+
static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio,
size_t uio_size)
{
@@ -1276,7 +1290,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio,

length = (io.nblocks + 1) << ns->lba_shift;
meta_len = (io.nblocks + 1) * ns->ms;
- metadata = (void __user *)(uintptr_t)io.metadata;
+ metadata = nvme_to_user_ptr(io.metadata);

if (ns->ext) {
length += meta_len;
@@ -1299,7 +1313,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio,
c.rw.appmask = cpu_to_le16(io.appmask);

return nvme_submit_user_cmd(ns->queue, &c,
- (void __user *)(uintptr_t)io.addr, length,
+ nvme_to_user_ptr(io.addr), length,
metadata, meta_len, lower_32_bits(io.slba), NULL, 0);
}

@@ -1419,9 +1433,9 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,

effects = nvme_passthru_start(ctrl, ns, cmd.opcode);
status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
- (void __user *)(uintptr_t)cmd.addr, cmd.data_len,
- (void __user *)(uintptr_t)cmd.metadata,
- cmd.metadata_len, 0, &result, timeout);
+ nvme_to_user_ptr(cmd.addr), cmd.data_len,
+ nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
+ 0, &result, timeout);
nvme_passthru_end(ctrl, effects);

if (status >= 0) {
@@ -1466,8 +1480,8 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,

effects = nvme_passthru_start(ctrl, ns, cmd.opcode);
status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
- (void __user *)(uintptr_t)cmd.addr, cmd.data_len,
- (void __user *)(uintptr_t)cmd.metadata, cmd.metadata_len,
+ nvme_to_user_ptr(cmd.addr), cmd.data_len,
+ nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
0, &cmd.result, timeout);
nvme_passthru_end(ctrl, effects);

--
2.24.1

2020-03-28 08:27:41

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] nvme: Fix compat NVME_IOCTL_SUBMIT_IO numbering

On Sat, Mar 28, 2020 at 01:09:08AM -0400, Nick Bowler wrote:
> When __u64 has 64-bit alignment, the nvme_user_io structure has trailing
> padding. This causes problems in the compat case with 32-bit userspace
> that has less strict alignment because the size of the structure differs.
>
> Since the NVME_IOCTL_SUBMIT_IO macro encodes the structure size itself,
> the result is that this ioctl does not work at all in such a scenario:
>
> # nvme read /dev/nvme0n1 -z 512
> submit-io: Inappropriate ioctl for device
>
> But by the same token, this makes it easy to handle both cases and
> since the structures differ only in unused trailing padding bytes
> we can simply not read those bytes.
>
> Signed-off-by: Nick Bowler <[email protected]>

I think we already have a similar patch titled
"nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO" in
linux-next, with the difference of actually implementing the
.compat_ioctl entry point.

2020-03-28 08:27:41

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] nvme: Fix compat address handling in several ioctls

Looks good, and I really like the new nvme_to_user_ptr helper!

Reviewed-by: Christoph Hellwig <[email protected]>

2020-03-28 13:57:16

by Nick Bowler

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] nvme: Fix compat NVME_IOCTL_SUBMIT_IO numbering

On 28/03/2020, Christoph Hellwig <[email protected]> wrote:
> On Sat, Mar 28, 2020 at 01:09:08AM -0400, Nick Bowler wrote:
>> When __u64 has 64-bit alignment, the nvme_user_io structure has trailing
>> padding. This causes problems in the compat case with 32-bit userspace
>> that has less strict alignment because the size of the structure differs.
>>
>> Since the NVME_IOCTL_SUBMIT_IO macro encodes the structure size itself,
>> the result is that this ioctl does not work at all in such a scenario:
>>
>> # nvme read /dev/nvme0n1 -z 512
>> submit-io: Inappropriate ioctl for device
>>
>> But by the same token, this makes it easy to handle both cases and
>> since the structures differ only in unused trailing padding bytes
>> we can simply not read those bytes.
>>
>> Signed-off-by: Nick Bowler <[email protected]>
>
> I think we already have a similar patch titled
> "nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO" in
> linux-next, with the difference of actually implementing the
> .compat_ioctl entry point.

OK, I found that one and it looks to solve the same problem.

I'm not sure about copying the nonexistent trailing padding from 32-bit
userspace but that may not be a problem in practice.

So feel free to drop this patch.

Thanks,
Nick