2023-08-19 10:48:27

by Breno Leitao

[permalink] [raw]
Subject: [PATCH v3 0/9] io_uring: Initial support for {s,g}etsockopt commands

This patchset adds support for getsockopt (SOCKET_URING_OP_GETSOCKOPT)
and setsockopt (SOCKET_URING_OP_SETSOCKOPT) in io_uring commands.
SOCKET_URING_OP_SETSOCKOPT implements generic case, covering all levels
and optnames. SOCKET_URING_OP_GETSOCKOPT is limited, for now, to SOL_SOCKET
level, which seems to be the most common level parameter for get/setsockopt(2).

In order to keep the implementation (and tests) simple, some refactors were done
prior to the changes, as follows:

Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
become flexible enough to accept user or kernel pointers for optval/optlen.

Patch 3: Extract the core setsockopt() core function from __sys_setsockopt, so,
the code code could be reused by other callers, such as io_uring.

Patch 4: Pass compat mode to the file/socket callbacks.

Patch 5: Move io_uring helpers from io_uring_zerocopy_tx to a generic io_uring
headers. This simplify the testcase (last patch)

PS1: For getsockopt command, the optlen field is not a userspace
pointers, but an absolute value, so this is slightly different from
getsockopt(2) behaviour. The new optlen value is returned in cqe->res.

PS2: The userspace pointers need to be alive until the operation is
completed.

These changes were tested with a new test[1] in liburing, as also with
bpf/progs/sockopt test case, which is now adapted to run using both system
calls and io_uring commands.

[1] Link: https://github.com/leitao/liburing/blob/getsock/test/socket-getsetsock-cmd.c

RFC -> V1:
* Copy user memory at io_uring subsystem, and call proto_ops
callbacks using kernel memory
* Implement all the cases for SOCKET_URING_OP_SETSOCKOPT

V1 -> V2
* Implemented the BPF part
* Using user pointers from optval to avoid kmalloc in io_uring part.

V2 -> V3:
* Break down __sys_setsockopt and reuse the core code, avoiding
duplicated code. This removed the requirement to export
sock_use_custom_sol_socket() as done in v2.
* Added io_uring test to selftests/bpf/sockopt.
* Fixed "compat" argument, by passing it to the issue_flags.

Breno Leitao (9):
bpf: Leverage sockptr_t in BPF getsockopt hook
bpf: Leverage sockptr_t in BPF setsockopt hook
net/socket: Break down __sys_setsockopt
io_uring/cmd: Pass compat mode in issue_flags
selftests/net: Extract uring helpers to be reusable
io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT
io_uring/cmd: BPF hook for getsockopt cmd
selftests/bpf/sockopt: Add io_uring support


2023-08-19 16:58:29

by Breno Leitao

[permalink] [raw]
Subject: [PATCH v3 7/9] io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT

Add initial support for SOCKET_URING_OP_SETSOCKOPT. This new command is
similar to setsockopt. This implementation leverages the function
do_sock_setsockopt(), which is shared with the setsockopt() system call
path.

Important to say that userspace needs to keep the pointer's memory alive
until the operation is completed. I.e, the memory could not be
deallocated before the CQE is returned to userspace.

Signed-off-by: Breno Leitao <[email protected]>
---
include/uapi/linux/io_uring.h | 1 +
io_uring/uring_cmd.c | 16 ++++++++++++++++
2 files changed, 17 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 8152151080db..3fe82df06abf 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -736,6 +736,7 @@ enum {
SOCKET_URING_OP_SIOCINQ = 0,
SOCKET_URING_OP_SIOCOUTQ,
SOCKET_URING_OP_GETSOCKOPT,
+ SOCKET_URING_OP_SETSOCKOPT,
};

#ifdef __cplusplus
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 7b768f1bf4df..a567dd32df00 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -197,6 +197,20 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
return -EOPNOTSUPP;
}

+static inline int io_uring_cmd_setsockopt(struct socket *sock,
+ struct io_uring_cmd *cmd,
+ unsigned int issue_flags)
+{
+ void __user *optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
+ bool compat = !!(issue_flags & IO_URING_F_COMPAT);
+ int optname = READ_ONCE(cmd->sqe->optname);
+ sockptr_t optval_s = USER_SOCKPTR(optval);
+ int optlen = READ_ONCE(cmd->sqe->optlen);
+ int level = READ_ONCE(cmd->sqe->level);
+
+ return do_sock_setsockopt(sock, compat, level, optname, optval_s, optlen);
+}
+
#if defined(CONFIG_NET)
int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
@@ -221,6 +235,8 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
return arg;
case SOCKET_URING_OP_GETSOCKOPT:
return io_uring_cmd_getsockopt(sock, cmd, issue_flags);
+ case SOCKET_URING_OP_SETSOCKOPT:
+ return io_uring_cmd_setsockopt(sock, cmd, issue_flags);
default:
return -EOPNOTSUPP;
}
--
2.34.1


2023-08-20 16:23:59

by Breno Leitao

[permalink] [raw]
Subject: [PATCH v3 3/9] net/socket: Break down __sys_setsockopt

Split __sys_setsockopt() into two functions by removing the core
logic into a sub-function (do_sock_setsockopt()). This will avoid
code duplication when doing the same operation in other callers, for
instance.

do_sock_setsockopt() will be called by io_uring setsockopt() command
operation in the following patch.

Signed-off-by: Breno Leitao <[email protected]>
---
include/net/sock.h | 2 ++
net/socket.c | 39 +++++++++++++++++++++++++--------------
2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 2eb916d1ff64..2a0324275347 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1853,6 +1853,8 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
sockptr_t optval, unsigned int optlen);
int sock_setsockopt(struct socket *sock, int level, int op,
sockptr_t optval, unsigned int optlen);
+int do_sock_setsockopt(struct socket *sock, bool compat, int level,
+ int optname, sockptr_t optval, int optlen);

int sk_getsockopt(struct sock *sk, int level, int optname,
sockptr_t optval, sockptr_t optlen);
diff --git a/net/socket.c b/net/socket.c
index 2c3ef8862b4f..b5e4398a6b4d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2221,30 +2221,20 @@ static bool sock_use_custom_sol_socket(const struct socket *sock)
return test_bit(SOCK_CUSTOM_SOCKOPT, &sock->flags);
}

-/*
- * Set a socket option. Because we don't know the option lengths we have
- * to pass the user mode parameter for the protocols to sort out.
- */
-int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
- int optlen)
+int do_sock_setsockopt(struct socket *sock, bool compat, int level,
+ int optname, sockptr_t optval, int optlen)
{
- sockptr_t optval = USER_SOCKPTR(user_optval);
char *kernel_optval = NULL;
- int err, fput_needed;
- struct socket *sock;
+ int err;

if (optlen < 0)
return -EINVAL;

- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- return err;
-
err = security_socket_setsockopt(sock, level, optname);
if (err)
goto out_put;

- if (!in_compat_syscall())
+ if (!compat)
err = BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock->sk, &level, &optname,
optval, &optlen,
&kernel_optval);
@@ -2266,6 +2256,27 @@ int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
optlen);
kfree(kernel_optval);
out_put:
+ return err;
+}
+EXPORT_SYMBOL(do_sock_setsockopt);
+
+/* Set a socket option. Because we don't know the option lengths we have
+ * to pass the user mode parameter for the protocols to sort out.
+ */
+int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
+ int optlen)
+{
+ sockptr_t optval = USER_SOCKPTR(user_optval);
+ bool compat = in_compat_syscall();
+ int err, fput_needed;
+ struct socket *sock;
+
+ sock = sockfd_lookup_light(fd, &err, &fput_needed);
+ if (!sock)
+ return err;
+
+ err = do_sock_setsockopt(sock, compat, level, optname, optval, optlen);
+
fput_light(sock->file, fput_needed);
return err;
}
--
2.34.1