2023-04-21 11:49:53

by Breno Leitao

Subject: [PATCH v2 0/3] io_uring: Pass the whole sqe to commands

These three patches prepare for the sock support in the io_uring cmd, as
described in the following RFC:

https://lore.kernel.org/lkml/[email protected]/

Since the support linked above depends on other refactors, such as the sock
ioctl() refactor [1], I would like to start integrating patches that have
consensus and can bring value right now. This will also reduce the patchset
size later.

These three patches are simple changes that make the io_uring cmd
subsystem more flexible (by passing the whole SQE to the command) and
clean up an unnecessary compile-time check.

These patches were tested by creating a file system and mounting an NVME disk
using ubdsrv/ublkb0.

[1] https://lore.kernel.org/lkml/[email protected]/

V1 -> V2 :
* Create a helper to return the size of the SQE

Breno Leitao (3):
io_uring: Create a helper to return the SQE size
io_uring: Pass whole sqe to commands
io_uring: Remove unnecessary BUILD_BUG_ON

drivers/block/ublk_drv.c | 24 ++++++++++++------------
drivers/nvme/host/ioctl.c | 2 +-
include/linux/io_uring.h | 2 +-
io_uring/io_uring.h | 3 +++
io_uring/opdef.c | 2 +-
io_uring/uring_cmd.c | 13 ++++---------
io_uring/uring_cmd.h | 8 --------
7 files changed, 22 insertions(+), 32 deletions(-)

--
2.34.1


2023-04-21 11:49:53

by Breno Leitao

Subject: [PATCH v2 1/3] io_uring: Create a helper to return the SQE size

Create a simple helper that returns the size of the SQE. The SQE can
have two sizes, depending on the flags.

If the IORING_SETUP_SQE128 flag is set, return the size of a double SQE;
otherwise return sizeof(struct io_uring_sqe) (64 bytes).

Signed-off-by: Breno Leitao <[email protected]>
---
io_uring/io_uring.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 25515d69d205..25597a771929 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -394,4 +394,7 @@ static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
io_req_task_work_add(req);
}

+#define uring_sqe_size(ctx) \
+ ((1 + !!(ctx->flags & IORING_SETUP_SQE128)) * sizeof(struct io_uring_sqe))
+
#endif
--
2.34.1

2023-04-21 11:50:26

by Breno Leitao

Subject: [PATCH v2 3/3] io_uring: Remove unnecessary BUILD_BUG_ON

In io_uring_cmd_prep_async() there is an unnecessary compile-time
check that cmd is correctly placed at offset 48 of the SQE.

This is unnecessary, since this check is already in place in
io_uring_init():

BUILD_BUG_SQE_ELEM(48, __u64, addr3);

Remove it and the uring_cmd_pdu_size() function, which is not used
anymore.

Keith started a discussion about this topic in the following thread:
https://lore.kernel.org/lkml/[email protected]/
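
The constants 16 and 80 in the removed checks follow from the SQE layout: a 64-byte SQE whose payload starts at byte 48. The sketch below reproduces that arithmetic in userspace. It assumes a stand-in struct (`sqe_sketch`) and a stand-in macro (`pdu_size_sketch`), both hypothetical; only the total size and the payload offset are meant to mirror struct io_uring_sqe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct io_uring_sqe. Only the total size (64)
 * and the payload offset (48) are meant to match the real layout.
 */
struct sqe_sketch {
	uint8_t head[48];	/* everything before the payload */
	uint8_t cmd[16];	/* payload area, at the same offset as addr3 */
};

/* Same arithmetic as the removed uring_cmd_pdu_size() macro. */
#define pdu_size_sketch(is_sqe128) \
	((1 + !!(is_sqe128)) * sizeof(struct sqe_sketch) - \
	 offsetof(struct sqe_sketch, cmd))

/* Userspace counterparts of the offset check and the two removed checks. */
_Static_assert(offsetof(struct sqe_sketch, cmd) == 48, "payload at byte 48");
_Static_assert(pdu_size_sketch(0) == 16, "64 - 48");
_Static_assert(pdu_size_sketch(1) == 80, "2 * 64 - 48");
```

Once the offset of the payload is asserted in one place, the two size checks add no further guarantee, which is the point this patch makes.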

Signed-off-by: Breno Leitao <[email protected]>
---
io_uring/uring_cmd.c | 3 ---
io_uring/uring_cmd.h | 8 --------
2 files changed, 11 deletions(-)

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index a1be746cd009..743d1496431b 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -71,9 +71,6 @@ int io_uring_cmd_prep_async(struct io_kiocb *req)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
size_t size = uring_sqe_size(req->ctx);

- BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
- BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
-
memcpy(req->async_data, ioucmd->sqe, size);
ioucmd->sqe = req->async_data;
return 0;
diff --git a/io_uring/uring_cmd.h b/io_uring/uring_cmd.h
index 7c6697d13cb2..8117684ec3ca 100644
--- a/io_uring/uring_cmd.h
+++ b/io_uring/uring_cmd.h
@@ -3,11 +3,3 @@
int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags);
int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
int io_uring_cmd_prep_async(struct io_kiocb *req);
-
-/*
- * The URING_CMD payload starts at 'cmd' in the first sqe, and continues into
- * the following sqe if SQE128 is used.
- */
-#define uring_cmd_pdu_size(is_sqe128) \
- ((1 + !!(is_sqe128)) * sizeof(struct io_uring_sqe) - \
- offsetof(struct io_uring_sqe, cmd))
--
2.34.1

2023-04-21 11:50:30

by Breno Leitao

Subject: [PATCH v2 2/3] io_uring: Pass whole sqe to commands

Currently the uring CMD operation relies on having large SQEs, but future
operations might want to use normal SQEs.

The io_uring_cmd currently only saves the payload (cmd) part of the SQE,
but, for commands that use the normal SQE size, it might be necessary to
access the initial SQE fields outside of the payload/cmd block. So,
save the whole SQE rather than just the pdu.

This slightly changes how io_uring_cmd works, since the cmd
structures and callbacks are not opaque to io_uring anymore. I.e., the
callbacks can look at all the SQE fields, not only the cmd structure.

The main advantage is that we don't need to create custom structures for
simple commands.
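
A minimal userspace sketch of the idea, under stated assumptions: `sqe_sketch`, `cmd_sketch`, `prep_async_sketch()` and `handler_sketch()` are hypothetical names, and the struct is a trimmed-down stand-in that does not match the real SQE layout. It shows the command holding a pointer to the whole SQE, the async-prep step copying the entire entry, and a handler reading ordinary SQE fields with no custom payload structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down SQE: field names echo the real one, layout does not. */
struct sqe_sketch {
	uint8_t  opcode;
	uint32_t cmd_op;
	uint64_t addr;
	uint32_t len;
	uint8_t  cmd[16];
};

/* With this change the command keeps a pointer to the whole SQE, not just cmd. */
struct cmd_sketch {
	const struct sqe_sketch *sqe;
};

/* Mirrors the new prep_async step: copy the entire SQE, then point at the copy. */
static void prep_async_sketch(struct cmd_sketch *c, struct sqe_sketch *async_data)
{
	memcpy(async_data, c->sqe, sizeof(*async_data));
	c->sqe = async_data;
}

/* A handler for a simple command can use regular SQE fields directly. */
static uint64_t handler_sketch(const struct cmd_sketch *c)
{
	return c->sqe->addr + c->sqe->len;
}
```

Because the copy covers the whole entry, the handler still sees addr and len after the original ring slot has been reused.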

Suggested-by: Pavel Begunkov <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
drivers/block/ublk_drv.c | 24 ++++++++++++------------
drivers/nvme/host/ioctl.c | 3 ++-
include/linux/io_uring.h | 2 +-
io_uring/opdef.c | 2 +-
io_uring/uring_cmd.c | 10 ++++------
5 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index c73cc57ec547..ec23a3c9fac8 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1263,7 +1263,7 @@ static void ublk_handle_need_get_data(struct ublk_device *ub, int q_id,

static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
- struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->cmd;
+ struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->sqe->cmd;
struct ublk_device *ub = cmd->file->private_data;
struct ublk_queue *ubq;
struct ublk_io *io;
@@ -1567,7 +1567,7 @@ static struct ublk_device *ublk_get_device_from_id(int idx)

static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
int ublksrv_pid = (int)header->data[0];
struct gendisk *disk;
int ret = -EINVAL;
@@ -1630,7 +1630,7 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
static int ublk_ctrl_get_queue_affinity(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
void __user *argp = (void __user *)(unsigned long)header->addr;
cpumask_var_t cpumask;
unsigned long queue;
@@ -1681,7 +1681,7 @@ static inline void ublk_dump_dev_info(struct ublksrv_ctrl_dev_info *info)

static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
void __user *argp = (void __user *)(unsigned long)header->addr;
struct ublksrv_ctrl_dev_info info;
struct ublk_device *ub;
@@ -1844,7 +1844,7 @@ static int ublk_ctrl_del_dev(struct ublk_device **p_ub)

static inline void ublk_ctrl_cmd_dump(struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;

pr_devel("%s: cmd_op %x, dev id %d qid %d data %llx buf %llx len %u\n",
__func__, cmd->cmd_op, header->dev_id, header->queue_id,
@@ -1863,7 +1863,7 @@ static int ublk_ctrl_stop_dev(struct ublk_device *ub)
static int ublk_ctrl_get_dev_info(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
void __user *argp = (void __user *)(unsigned long)header->addr;

if (header->len < sizeof(struct ublksrv_ctrl_dev_info) || !header->addr)
@@ -1894,7 +1894,7 @@ static void ublk_ctrl_fill_params_devt(struct ublk_device *ub)
static int ublk_ctrl_get_params(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
void __user *argp = (void __user *)(unsigned long)header->addr;
struct ublk_params_header ph;
int ret;
@@ -1925,7 +1925,7 @@ static int ublk_ctrl_get_params(struct ublk_device *ub,
static int ublk_ctrl_set_params(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
void __user *argp = (void __user *)(unsigned long)header->addr;
struct ublk_params_header ph;
int ret = -EFAULT;
@@ -1983,7 +1983,7 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
static int ublk_ctrl_start_recovery(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
int ret = -EINVAL;
int i;

@@ -2025,7 +2025,7 @@ static int ublk_ctrl_start_recovery(struct ublk_device *ub,
static int ublk_ctrl_end_recovery(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
int ublksrv_pid = (int)header->data[0];
int ret = -EINVAL;

@@ -2092,7 +2092,7 @@ static int ublk_char_dev_permission(struct ublk_device *ub,
static int ublk_ctrl_uring_cmd_permission(struct ublk_device *ub,
struct io_uring_cmd *cmd)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
bool unprivileged = ub->dev_info.flags & UBLK_F_UNPRIVILEGED_DEV;
void __user *argp = (void __user *)(unsigned long)header->addr;
char *dev_path = NULL;
@@ -2171,7 +2171,7 @@ static int ublk_ctrl_uring_cmd_permission(struct ublk_device *ub,
static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
unsigned int issue_flags)
{
- struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
struct ublk_device *ub = NULL;
int ret = -EINVAL;

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d24ea2e05156..b69604485670 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -552,7 +552,8 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec)
{
struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
- const struct nvme_uring_cmd *cmd = ioucmd->cmd;
+ const struct nvme_uring_cmd *cmd =
+ (struct nvme_uring_cmd *)ioucmd->sqe->cmd;
struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
struct nvme_uring_data d;
struct nvme_command c;
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 35b9328ca335..2dfc81dd6d1a 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -24,7 +24,7 @@ enum io_uring_cmd_flags {

struct io_uring_cmd {
struct file *file;
- const void *cmd;
+ const struct io_uring_sqe *sqe;
union {
/* callback to defer completions to task context */
void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index cca7c5b55208..3b9c6489b8b6 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -627,7 +627,7 @@ const struct io_cold_def io_cold_defs[] = {
},
[IORING_OP_URING_CMD] = {
.name = "URING_CMD",
- .async_size = uring_cmd_pdu_size(1),
+ .async_size = 2 * sizeof(struct io_uring_sqe),
.prep_async = io_uring_cmd_prep_async,
},
[IORING_OP_SEND_ZC] = {
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 5113c9a48583..a1be746cd009 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -69,15 +69,13 @@ EXPORT_SYMBOL_GPL(io_uring_cmd_done);
int io_uring_cmd_prep_async(struct io_kiocb *req)
{
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
- size_t cmd_size;
+ size_t size = uring_sqe_size(req->ctx);

BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);

- cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
-
- memcpy(req->async_data, ioucmd->cmd, cmd_size);
- ioucmd->cmd = req->async_data;
+ memcpy(req->async_data, ioucmd->sqe, size);
+ ioucmd->sqe = req->async_data;
return 0;
}

@@ -103,7 +101,7 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
req->imu = ctx->user_bufs[index];
io_req_set_rsrc_node(req, ctx, 0);
}
- ioucmd->cmd = sqe->cmd;
+ ioucmd->sqe = sqe;
ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
return 0;
}
--
2.34.1

2023-04-24 07:22:22

by Christoph Hellwig

Subject: Re: [PATCH v2 1/3] io_uring: Create a helper to return the SQE size

On Fri, Apr 21, 2023 at 04:44:38AM -0700, Breno Leitao wrote:
> +#define uring_sqe_size(ctx) \
> + ((1 + !!(ctx->flags & IORING_SETUP_SQE128)) * sizeof(struct io_uring_sqe))

Please turn this into an actually readable inline function:

/*
* IORING_SETUP_SQE128 contexts allocate twice the normal SQE size for each
* slot.
*/
static inline size_t uring_sqe_size(struct io_ring_ctx *ctx)
{
if (ctx->flags & IORING_SETUP_SQE128)
return 2 * sizeof(struct io_uring_sqe);
return sizeof(struct io_uring_sqe);
}
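
As a quick sanity check that the two forms agree, here is a userspace sketch. The names `ctx_sketch`, `sqe_sketch`, `SKETCH_SETUP_SQE128`, `sqe_size_macro` and `sqe_size_inline` are hypothetical stand-ins, and the flag's bit position is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IORING_SETUP_SQE128; the bit chosen here is an assumption. */
#define SKETCH_SETUP_SQE128 (1U << 10)

struct sqe_sketch { uint8_t bytes[64]; };	/* 64-byte stand-in SQE */
struct ctx_sketch { unsigned int flags; };	/* stand-in for io_ring_ctx */

/* Macro form, as posted in v2 (with the argument parenthesized). */
#define sqe_size_macro(ctx) \
	((1 + !!((ctx)->flags & SKETCH_SETUP_SQE128)) * sizeof(struct sqe_sketch))

/* Inline form, as suggested in the review. */
static inline size_t sqe_size_inline(const struct ctx_sketch *ctx)
{
	if (ctx->flags & SKETCH_SETUP_SQE128)
		return 2 * sizeof(struct sqe_sketch);
	return sizeof(struct sqe_sketch);
}
```

The inline version also gets type checking on its argument, which the macro does not provide.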

2023-04-24 07:31:52

by Christoph Hellwig

Subject: Re: [PATCH v2 2/3] io_uring: Pass whole sqe to commands

On Fri, Apr 21, 2023 at 04:44:39AM -0700, Breno Leitao wrote:
> - struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->cmd;
> + struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->sqe->cmd;

As mentioned in my (late) reply to the previous series, please
add a helper like:

static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
{
return sqe->cmd;
}

and then avoid all these casts.
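
A rough userspace sketch of why such a helper removes the casts: a `const void *` return converts implicitly to any object pointer type in C. All names below (`sqe_sketch`, `dev_cmd_sketch`, `sqe_cmd_sketch()`, `handler_tag_sketch()`) are hypothetical. Kernel code would typically assign the returned pointer directly to a typed pointer; the sketch copies the payload out instead so that it makes no alignment or aliasing assumptions in userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: 64-byte SQE with the payload at byte 48. */
struct sqe_sketch {
	uint8_t head[48];
	uint8_t cmd[16];
};

/* Hypothetical driver-specific payload carried inside cmd[]. */
struct dev_cmd_sketch {
	uint16_t q_id;
	uint16_t tag;
};

/* Helper in the spirit of the suggestion: hand back the payload as const void *. */
static inline const void *sqe_cmd_sketch(const struct sqe_sketch *sqe)
{
	return sqe->cmd;
}

/* No cast at the call site: const void * is accepted as-is by memcpy(). */
static unsigned int handler_tag_sketch(const struct sqe_sketch *sqe)
{
	struct dev_cmd_sketch c;

	memcpy(&c, sqe_cmd_sketch(sqe), sizeof(c));
	return c.tag;
}
```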

> int io_uring_cmd_prep_async(struct io_kiocb *req)
> {
> struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> + size_t size = uring_sqe_size(req->ctx);
>
> BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
> BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
>
> + memcpy(req->async_data, ioucmd->sqe, size);
> + ioucmd->sqe = req->async_data;

This can skip the size local variable now.

2023-04-24 07:31:56

by Christoph Hellwig

Subject: Re: [PATCH v2 3/3] io_uring: Remove unnecessary BUILD_BUG_ON

Looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2023-04-28 17:28:01

by Jens Axboe

Subject: Re: [PATCH v2 3/3] io_uring: Remove unnecessary BUILD_BUG_ON

On 4/21/23 5:44 AM, Breno Leitao wrote:
> In io_uring_cmd_prep_async() there is an unnecessary compile-time
> check that cmd is correctly placed at offset 48 of the SQE.
>
> This is unnecessary, since this check is already in place in
> io_uring_init():
>
> BUILD_BUG_SQE_ELEM(48, __u64, addr3);
>
> Remove it and the uring_cmd_pdu_size() function, which is not used
> anymore.
>
> Keith started a discussion about this topic in the following thread:
> https://lore.kernel.org/lkml/[email protected]/

Just turn that into a:

Link: https://lore.kernel.org/lkml/[email protected]/

instead.

--
Jens Axboe

2023-04-28 17:31:13

by Jens Axboe

Subject: Re: [PATCH v2 0/3] io_uring: Pass the whole sqe to commands

On 4/21/23 5:44 AM, Breno Leitao wrote:
> These three patches prepare for the sock support in the io_uring cmd, as
> described in the following RFC:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Since the support linked above depends on other refactors, such as the sock
> ioctl() refactor [1], I would like to start integrating patches that have
> consensus and can bring value right now. This will also reduce the patchset
> size later.
>
> These three patches are simple changes that make the io_uring cmd
> subsystem more flexible (by passing the whole SQE to the command) and
> clean up an unnecessary compile-time check.
>
> These patches were tested by creating a file system and mounting an NVME disk
> using ubdsrv/ublkb0.

Looks mostly good to me, do agree with Christoph's comments on the two
patches. Can you spin a v3? Would be annoying to miss 6.4 with this, as
other things will be built on top of it.

--
Jens Axboe

2023-04-30 15:04:11

by Breno Leitao

Subject: Re: [PATCH v2 0/3] io_uring: Pass the whole sqe to commands

Hello Jens,

On Fri, Apr 28, 2023 at 11:28:14AM -0600, Jens Axboe wrote:
> On 4/21/23 5:44 AM, Breno Leitao wrote:
> > These three patches prepare for the sock support in the io_uring cmd, as
> > described in the following RFC:
> >
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > Since the support linked above depends on other refactors, such as the sock
> > ioctl() refactor [1], I would like to start integrating patches that have
> > consensus and can bring value right now. This will also reduce the patchset
> > size later.
> >
> > These three patches are simple changes that make the io_uring cmd
> > subsystem more flexible (by passing the whole SQE to the command) and
> > clean up an unnecessary compile-time check.
> >
> > These patches were tested by creating a file system and mounting an NVME disk
> > using ubdsrv/ublkb0.
>
> Looks mostly good to me, do agree with Christoph's comments on the two
> patches. Can you spin a v3? Would be annoying to miss 6.4 with this, as
> other things will be built on top of it.

Sure. I've just sent V3 with all the fixes discussed in this email
thread.

Here is the link: https://lkml.org/lkml/2023/4/30/91