2022-05-09 07:16:28

by Hao Xu

Subject: [PATCH v3 0/4] fast poll multishot mode

Let fast poll support multishot mode. For now, accept is added as its
first consumer.
Theoretical analysis:
1) when connections come in fast
  - singleshot:
        add accept sqe(userspace) --> accept inline
                  ^                        |
                  |------------------------|
  - multishot:
        add accept sqe(userspace) --> accept inline
                                         ^     |
                                         |--*--|

    we accept repeatedly at the * until we get EAGAIN

2) when connections come in at a low rate
  similar to 1): we save a lot of userspace-kernel context switches
  and useless vfs_poll() calls
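
The difference between the two modes can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: struct backlog, try_accept(),
singleshot_submissions() and multishot_submissions() are hypothetical
names, and the listening socket is reduced to a counter of queued
connections.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a listening socket: a count of queued connections. */
struct backlog {
	int pending;
};

/* Mimics a nonblocking accept: 0 on success, -EAGAIN once the queue is empty. */
static int try_accept(struct backlog *b)
{
	assert(b->pending >= 0);
	if (b->pending == 0)
		return -EAGAIN;
	b->pending--;
	return 0;
}

/*
 * Singleshot: userspace must add a new accept SQE for every connection.
 * Returns the number of SQEs submitted until the backlog is drained.
 */
static int singleshot_submissions(struct backlog *b, int *accepted)
{
	int submissions = 0;

	*accepted = 0;
	for (;;) {
		submissions++;			/* add accept sqe (userspace) */
		if (try_accept(b) == -EAGAIN)	/* accept inline */
			break;			/* would arm poll and wait */
		(*accepted)++;
	}
	return submissions;
}

/*
 * Multishot: one SQE, and the accept is retried at the '*' spot
 * until it returns -EAGAIN.
 */
static int multishot_submissions(struct backlog *b, int *accepted)
{
	*accepted = 0;
	while (try_accept(b) == 0)
		(*accepted)++;
	return 1;				/* the single accept sqe */
}
```

In this model, five queued connections cost six submissions in
singleshot mode (five that accept plus one that hits -EAGAIN) and a
single submission in multishot mode.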


Tests:
I ran a test that goes like this:

    server    client(multiple)
    accept    connect
    read      write
    write     read
    close     close

Basically, start a number of clients (on the same machine as the server)
that connect to the server and write some data to it. The server writes
the data back to the client after receiving it, and closes the
connection after the write returns. The client then reads the data and
closes the connection. Here I tested 10000 clients connecting to one
server, with a data size of 128 bytes. Each client has its own goroutine,
so they all reach the server within a short time.
I ran the test 20 times before and after this patchset. Time spent (in
clock ticks, i.e. the return value of clock()):
before:
(1930136+1940725+1907981+1947601+1923812+1928226+1911087+1905897+1941075
+1934374+1906614+1912504+1949110+1908790+1909951+1941672+1969525+1934984
+1934226+1914385)/20.0 = 1927633.75
after:
(1858905+1917104+1895455+1963963+1892706+1889208+1874175+1904753+1874112
+1874985+1882706+1884642+1864694+1906508+1916150+1924250+1869060+1889506
+1871324+1940803)/20.0 = 1894750.45

(1927633.75 - 1894750.45) / 1927633.75 = 1.71%
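
The averages and the relative improvement can be re-derived from the raw
samples with a few lines of C. This is a sketch for checking the
arithmetic only; mean() and improvement_pct() are hypothetical helper
names, and the arrays simply copy the clock() readings listed above.

```c
#include <assert.h>
#include <stddef.h>

/* clock() readings copied from the "before" and "after" runs above. */
static const long before_runs[20] = {
	1930136, 1940725, 1907981, 1947601, 1923812, 1928226, 1911087,
	1905897, 1941075, 1934374, 1906614, 1912504, 1949110, 1908790,
	1909951, 1941672, 1969525, 1934984, 1934226, 1914385,
};

static const long after_runs[20] = {
	1858905, 1917104, 1895455, 1963963, 1892706, 1889208, 1874175,
	1904753, 1874112, 1874985, 1882706, 1884642, 1864694, 1906508,
	1916150, 1924250, 1869060, 1889506, 1871324, 1940803,
};

/* Arithmetic mean of n samples. */
static double mean(const long *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += (double)v[i];
	return sum / (double)n;
}

/* Relative reduction of "after" versus "before", in percent. */
static double improvement_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Calling improvement_pct(mean(before_runs, 20), mean(after_runs, 20))
yields the relative reduction in time spent.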


A liburing test is here:
https://github.com/HowHsu/liburing/blob/multishot_accept/test/accept.c

v1->v2:
- re-implement it against the reworked poll code

v2->v3:
- fold in code tweaks and cleanups from Jens
- use io_issue_sqe() rather than io_queue_sqe(), since the former
  returns the internal error to the caller, which makes more sense
- remove io_poll_clean() and its friends since they are not needed


Hao Xu (4):
  io_uring: add IORING_ACCEPT_MULTISHOT for accept
  io_uring: add REQ_F_APOLL_MULTISHOT for requests
  io_uring: let fast poll support multishot
  io_uring: implement multishot mode for accept

 fs/io_uring.c                 | 94 +++++++++++++++++++++++++++--------
 include/uapi/linux/io_uring.h |  5 ++
 2 files changed, 79 insertions(+), 20 deletions(-)


base-commit: 0a194603ba7ee67b4e39ec0ee5cda70a356ea618
--
2.36.0



2022-05-09 08:02:03

by Hao Xu

Subject: [PATCH 2/4] io_uring: add REQ_F_APOLL_MULTISHOT for requests

From: Hao Xu <[email protected]>

Add a flag to indicate multishot mode for fast poll. Currently only
accept uses it, but more operations may leverage it in the future.
Also add a mask, IO_APOLL_MULTI_POLLED, which stands for
REQ_F_APOLL_MULTISHOT | REQ_F_POLLED, to make the code shorter and
cleaner.

Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b6d491c9a25f..c2ee184ac693 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -116,6 +116,8 @@
#define IO_REQ_CLEAN_SLOW_FLAGS (REQ_F_REFCOUNT | REQ_F_LINK | REQ_F_HARDLINK |\
IO_REQ_CLEAN_FLAGS)

+#define IO_APOLL_MULTI_POLLED (REQ_F_APOLL_MULTISHOT | REQ_F_POLLED)
+
#define IO_TCTX_REFS_CACHE_NR (1U << 10)

struct io_uring {
@@ -810,6 +812,7 @@ enum {
REQ_F_SINGLE_POLL_BIT,
REQ_F_DOUBLE_POLL_BIT,
REQ_F_PARTIAL_IO_BIT,
+ REQ_F_APOLL_MULTISHOT_BIT,
/* keep async read/write and isreg together and in order */
REQ_F_SUPPORT_NOWAIT_BIT,
REQ_F_ISREG_BIT,
@@ -874,6 +877,8 @@ enum {
REQ_F_DOUBLE_POLL = BIT(REQ_F_DOUBLE_POLL_BIT),
/* request has already done partial IO */
REQ_F_PARTIAL_IO = BIT(REQ_F_PARTIAL_IO_BIT),
+ /* fast poll multishot mode */
+ REQ_F_APOLL_MULTISHOT = BIT(REQ_F_APOLL_MULTISHOT_BIT),
};

struct async_poll {
--
2.36.0
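
A combined mask such as IO_APOLL_MULTI_POLLED is typically used to check
that both flags are set with one comparison. The standalone sketch below
shows that idiom. It is an illustration, not code from this series: the
bit positions are made up (the real enum in fs/io_uring.c has many more
entries) and req_is_armed_multishot() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n)	(1U << (n))

/* Hypothetical bit positions, chosen only for this example. */
enum {
	REQ_F_POLLED_BIT,
	REQ_F_APOLL_MULTISHOT_BIT,
};

enum {
	REQ_F_POLLED		= BIT(REQ_F_POLLED_BIT),
	REQ_F_APOLL_MULTISHOT	= BIT(REQ_F_APOLL_MULTISHOT_BIT),
};

#define IO_APOLL_MULTI_POLLED (REQ_F_APOLL_MULTISHOT | REQ_F_POLLED)

static_assert((REQ_F_POLLED & REQ_F_APOLL_MULTISHOT) == 0,
	      "flags must use distinct bits");

/* Hypothetical helper: true only if both flags are set on the request. */
static bool req_is_armed_multishot(unsigned int flags)
{
	/* "flags & mask" alone would also be true with just one flag set. */
	return (flags & IO_APOLL_MULTI_POLLED) == IO_APOLL_MULTI_POLLED;
}
```

Comparing against the full mask, rather than testing for a non-zero
result, is what distinguishes "both set" from "either set".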


2022-05-09 11:15:39

by Hao Xu

Subject: [PATCH 1/4] io_uring: add IORING_ACCEPT_MULTISHOT for accept

From: Hao Xu <[email protected]>

Add an accept flag, IORING_ACCEPT_MULTISHOT, to support multishot mode
for accept.

Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
include/uapi/linux/io_uring.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 06621a278cb6..f4d9ca62a5a6 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -223,6 +223,11 @@ enum {
*/
#define IORING_RECVSEND_POLL_FIRST (1U << 0)

+/*
+ * accept flags stored in accept_flags
+ */
+#define IORING_ACCEPT_MULTISHOT (1U << 15)
+
/*
* IO completion data structure (Completion Queue Entry)
*/
--
2.36.0
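
As a rough illustration of how a flag bit like this can travel in the
same word as other accept flags, the sketch below separates the
multishot bit from the rest. It assumes the bit shares a field with
flags such as SOCK_NONBLOCK; split_accept_flags() is a hypothetical
helper and not part of this series or of liburing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>	/* SOCK_NONBLOCK, used in the usage note below */

/* Value taken from the patch above. */
#define IORING_ACCEPT_MULTISHOT	(1U << 15)

/*
 * Hypothetical helper: report whether multishot was requested and
 * return the remaining flags with that bit stripped.
 */
static unsigned int split_accept_flags(unsigned int flags, bool *multishot)
{
	assert(multishot != NULL);
	*multishot = (flags & IORING_ACCEPT_MULTISHOT) != 0;
	return flags & ~IORING_ACCEPT_MULTISHOT;
}
```

For example, on a platform where SOCK_NONBLOCK does not occupy bit 15,
split_accept_flags(SOCK_NONBLOCK | IORING_ACCEPT_MULTISHOT, &ms) sets ms
to true and returns SOCK_NONBLOCK.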