2023-07-26 11:02:48

by Hao Xu

Subject: [RFC 0/7] io_uring lseek

From: Hao Xu <[email protected]>

This series adds lseek support to io_uring. The motivation comes from
the previous io_uring getdents patchset: we lack a way to rewind the
file cursor once it reaches the end of the file. Another reason is that
lseek is a common syscall, so supporting it is good for coding
consistency when users use io_uring as their main loop.

Patch 1 is a code cleanup for iomap
Patch 2 adds IOMAP_NOWAIT logic for iomap lseek
Patch 3 adds a nowait parameter to iomap_seek() for IOMAP_NOWAIT control
Patch 4 adds llseek_nowait() to file_operations so that individual
filesystems can implement it for nowait lseek
Patch 5 adds an llseek_nowait() implementation for xfs
Patch 6 adds a new vfs wrapper for io_uring use
Patch 7 is the main io_uring lseek implementation

Note, this series depends on the previous io_uring getdents series.

This is marked RFC since there is (at least) one issue to be discussed:
The work in this series mainly resolves the problem that the current
llseek() in struct file_operations has no way to pass nowait
information. Adding an argument to it would require updating the llseek
implementations of all filesystems (35 functions), so here I introduce
a new llseek_nowait() as a workaround.

For performance, there is about a 20%~30% improvement in IOPS.
The test program is just like the one for io_uring getdents; here is the
link to it: https://github.com/HowHsu/liburing/blob/llseek/test/lseek.c
- Each test issues about 30000 async requests/sync syscalls
- Each test is run 100 times and the average value is taken
- The offset is a randomly generated value
- The file is a 1M all-zero file
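
As a rough illustration of the synchronous baseline described in the
list above, here is a minimal sketch. It is not the linked test
program: the helper names make_zero_file() and time_sync_lseek(), the
/tmp path, and the choice of clock are all assumptions made for this
example. It uses only the standard lseek() syscall wrapper; SEEK_DATA
and SEEK_HOLE could be timed the same way but need _GNU_SOURCE on
glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define FILE_SIZE (1024 * 1024)	/* 1 MiB of zeroes, as in the test setup */

/*
 * Hypothetical helper: create an unlinked temporary file holding
 * FILE_SIZE zero bytes. Returns the fd, or -1 on error.
 */
static int make_zero_file(void)
{
	static const char zeros[4096];
	char path[] = "/tmp/lseek-bench-XXXXXX";
	size_t done;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	for (done = 0; done < FILE_SIZE; done += sizeof(zeros)) {
		if (write(fd, zeros, sizeof(zeros)) != (ssize_t)sizeof(zeros)) {
			close(fd);
			return -1;
		}
	}
	return fd;
}

/*
 * Hypothetical helper: issue `count` lseek(fd, offset, whence) calls.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error.
 */
static double time_sync_lseek(int fd, off_t offset, int whence, int count)
{
	struct timespec start, end;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < count; i++) {
		if (lseek(fd, offset, whence) == (off_t)-1)
			return -1.0;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

A caller would create the file once, then call
time_sync_lseek(fd, offset, SEEK_SET, 30000) repeatedly and average the
results, mirroring the "100 runs" step above.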

[howeyxu@~]$ python3 run_lseek.py
test args: seek mode:SEEK_SET, offset: 334772
Average of sync : 0.012300650000000002
Average of iouring : 0.008528009999999999
30.67%

[howeyxu@~]$ python3 run_lseek.py
test args: seek mode:SEEK_CUR, offset: 389292
Average of sync : 0.012736129999999995
Average of iouring : 0.00928725
27.08%

[howeyxu@~]$ python3 run_lseek.py
test args: seek mode:SEEK_END, offset: 281141
Average of sync : 0.01221595
Average of iouring : 0.008442890000000003
30.89%

[howeyxu@~]$ python3 run_lseek.py
test args: seek mode:SEEK_DATA, offset: 931103
Average of sync : 0.015496230000000005
Average of iouring : 0.012341509999999998
20.36%

[howeyxu@~]$ python3 run_lseek.py
test args: seek mode:SEEK_HOLE, offset: 430194
Average of sync : 0.01555663000000001
Average of iouring : 0.012064940000000003
22.45%
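
The percentages printed above are consistent with the relative
reduction in the average value, (sync - iouring) / sync. A small sketch
of that arithmetic follows; the function name improvement_pct() is
made up for this example and is not taken from run_lseek.py.

```c
/*
 * Relative reduction of the io_uring average versus the sync average,
 * expressed as a percentage of the sync average.
 */
static double improvement_pct(double sync_avg, double uring_avg)
{
	return (sync_avg - uring_avg) / sync_avg * 100.0;
}
```

For the SEEK_SET run, (0.01230065 - 0.00852801) / 0.01230065 is about
0.3067, which matches the reported 30.67%.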


Hao Xu (7):
  iomap: merge iomap_seek_hole() and iomap_seek_data()
  xfs: add nowait support for xfs_seek_iomap_begin()
  add nowait parameter for iomap_seek()
  add llseek_nowait() for struct file_operations
  add llseek_nowait support for xfs
  add vfs_lseek_nowait()
  add lseek for io_uring

 fs/ext4/file.c                |  9 ++---
 fs/gfs2/inode.c               |  4 +--
 fs/iomap/seek.c               | 42 ++++++-----------------
 fs/read_write.c               | 18 ++++++++++
 fs/xfs/xfs_file.c             | 34 ++++++++++++++++---
 fs/xfs/xfs_iomap.c            |  4 ++-
 include/linux/fs.h            |  4 +++
 include/linux/iomap.h         |  6 ++--
 include/uapi/linux/io_uring.h |  1 +
 io_uring/fs.c                 | 63 +++++++++++++++++++++++++++++++++++
 io_uring/fs.h                 |  3 ++
 io_uring/opdef.c              |  8 +++++
 12 files changed, 145 insertions(+), 51 deletions(-)


base-commit: 4a4b046082eca8ae90b654d772fccc30e9f23f4d
--
2.25.1



2023-07-26 11:16:00

by Hao Xu

Subject: [PATCH 4/7] add llseek_nowait() for struct file_operations

From: Hao Xu <[email protected]>

Add a new function member llseek_nowait() to struct file_operations for
nowait llseek. It acts just like llseek() but takes an extra boolean
parameter called nowait that indicates whether this is a nowait
attempt; if so, the implementation avoids IO and locks.

Signed-off-by: Hao Xu <[email protected]>
---
 include/linux/fs.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3e315e8efdd..d37290da2d7e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1823,6 +1823,7 @@ struct file_operations {
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
+	loff_t (*llseek_nowait)(struct file *, loff_t, int, bool);
 } __randomize_layout;

struct inode_operations {
--
2.25.1
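
To make the intended contract concrete, here is a userspace sketch of
what a llseek_nowait()-style callback could look like: identical to a
normal seek, except that it returns -EAGAIN instead of waiting when
nowait is set and the lock is contended. Everything here (struct
seek_state, sketch_llseek_nowait(), the atomic_flag used as a lock) is
a hypothetical stand-in and not the xfs implementation from patch 5.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for a file whose position is lock-protected. */
struct seek_state {
	atomic_flag lock;	/* stand-in for the lock a filesystem takes */
	long long pos;		/* current file position */
	long long size;		/* file size, used for SEEK_END */
};

/*
 * Behaves like a plain seek, but with nowait set it never waits for
 * the lock: it returns -EAGAIN so the caller can retry elsewhere.
 */
static long long sketch_llseek_nowait(struct seek_state *s, long long offset,
				      int whence, bool nowait)
{
	long long pos, ret;

	if (atomic_flag_test_and_set(&s->lock)) {
		if (nowait)
			return -EAGAIN;
		/* Blocking path: a real implementation would sleep here. */
		while (atomic_flag_test_and_set(&s->lock))
			;
	}

	switch (whence) {
	case SEEK_SET:
		pos = offset;
		break;
	case SEEK_CUR:
		pos = s->pos + offset;
		break;
	case SEEK_END:
		pos = s->size + offset;
		break;
	default:
		pos = -1;	/* unsupported whence in this sketch */
		break;
	}

	if (pos < 0) {
		ret = -EINVAL;
	} else {
		s->pos = pos;
		ret = pos;
	}

	atomic_flag_clear(&s->lock);
	return ret;
}
```

With the lock free, a nowait call succeeds immediately; with the lock
held by someone else, the same call returns -EAGAIN and leaves the
position untouched.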


2023-07-26 12:17:56

by Hao Xu

Subject: [PATCH 6/7] add vfs_lseek_nowait()

From: Hao Xu <[email protected]>

Add a new vfs wrapper for io_uring lseek usage. The reason is that the
current vfs_llseek() calls llseek(), but what we need is llseek_nowait().

Signed-off-by: Hao Xu <[email protected]>
---
 fs/read_write.c    | 18 ++++++++++++++++++
 include/linux/fs.h |  3 +++
2 files changed, 21 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index b07de77ef126..b4c3bcf706e2 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -290,6 +290,24 @@ loff_t vfs_llseek(struct file *file, loff_t offset, int whence)
 }
 EXPORT_SYMBOL(vfs_llseek);

+loff_t vfs_lseek_nowait(struct file *file, off_t offset,
+			 int whence, bool nowait)
+{
+	if (!(file->f_mode & FMODE_LSEEK))
+		return -ESPIPE;
+	/*
+	 * This function is only used by io_uring. The
+	 * nonblocking first attempt is made internally by
+	 * io_uring, not requested by the user, so returning
+	 * -ENOTSUPP to the user here would not be sane.
+	 * Return -EAGAIN instead.
+	 */
+	if (!file->f_op->llseek_nowait)
+		return -EAGAIN;
+	return file->f_op->llseek_nowait(file, offset, whence, nowait);
+}
+EXPORT_SYMBOL(vfs_lseek_nowait);
+
 static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
 {
 	off_t retval;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d37290da2d7e..cb804d1f1650 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2654,6 +2654,9 @@ extern loff_t default_llseek(struct file *file, loff_t offset, int whence);

 extern loff_t vfs_llseek(struct file *file, loff_t offset, int whence);

+extern loff_t vfs_lseek_nowait(struct file *file, off_t offset,
+			       int whence, bool nowait);
+
 extern int inode_init_always(struct super_block *, struct inode *);
 extern void inode_init_once(struct inode *);
 extern void address_space_init_once(struct address_space *mapping);
--
2.25.1
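
The wrapper's fallback behaviour can be sketched in userspace as
follows. A missing llseek_nowait callback is reported as -EAGAIN, and
the caller then retries through the ordinary llseek path. How patch 7
actually reacts to -EAGAIN is not shown in this thread, so the retry in
mock_issue_lseek() is an assumption, and all mock_* names are
hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_file;

/* Hypothetical cut-down ops table with the old and the new callback. */
struct mock_file_ops {
	long long (*llseek)(struct mock_file *file, long long offset,
			    int whence);
	long long (*llseek_nowait)(struct mock_file *file, long long offset,
				   int whence, bool nowait);
};

struct mock_file {
	const struct mock_file_ops *ops;
	long long pos;
};

/* Mirrors the wrapper's check: no nowait callback means -EAGAIN. */
static long long mock_vfs_lseek_nowait(struct mock_file *file,
				       long long offset, int whence,
				       bool nowait)
{
	if (!file->ops->llseek_nowait)
		return -EAGAIN;
	return file->ops->llseek_nowait(file, offset, whence, nowait);
}

/*
 * Assumed caller behaviour: try the nowait path first and, on -EAGAIN,
 * retry through the regular llseek, which is allowed to block.
 */
static long long mock_issue_lseek(struct mock_file *file, long long offset,
				  int whence)
{
	long long ret = mock_vfs_lseek_nowait(file, offset, whence, true);

	if (ret != -EAGAIN)
		return ret;
	return file->ops->llseek(file, offset, whence);
}

/* Toy llseek that treats every request as an absolute position. */
static long long mock_set_pos(struct mock_file *file, long long offset,
			      int whence)
{
	(void)whence;
	file->pos = offset;
	return offset;
}
```

A filesystem that only fills in .llseek = mock_set_pos still works
through mock_issue_lseek(); the user never sees an "unsupported" error
for a nonblocking attempt they did not ask for.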


2023-07-26 13:43:29

by Christian Brauner

Subject: Re: [RFC 0/7] io_uring lseek

On Wed, Jul 26, 2023 at 06:25:56PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
>
> This series adds lseek support to io_uring. The motivation comes from
> the previous io_uring getdents patchset: we lack a way to rewind the
> file cursor once it reaches the end of the file. Another reason is that
> lseek is a common syscall, so supporting it is good for coding
> consistency when users use io_uring as their main loop.

While I understand this, it is a time-consuming review to make sure
things work correctly. So before we get this thing going we'd better
get getdents correct first.

>
> Patch 1 is a code cleanup for iomap
> Patch 2 adds IOMAP_NOWAIT logic for iomap lseek
> Patch 3 adds a nowait parameter to iomap_seek() for IOMAP_NOWAIT control
> Patch 4 adds llseek_nowait() to file_operations so that individual
> filesystems can implement it for nowait lseek
> Patch 5 adds an llseek_nowait() implementation for xfs
> Patch 6 adds a new vfs wrapper for io_uring use
> Patch 7 is the main io_uring lseek implementation
>
> Note, this series depends on the previous io_uring getdents series.
>
> This is marked RFC since there is (at least) one issue to be discussed:
> The work in this series mainly resolves the problem that the current
> llseek() in struct file_operations has no way to pass nowait
> information. Adding an argument to it would require updating the llseek
> implementations of all filesystems (35 functions), so here I introduce
> a new llseek_nowait() as a workaround.

My intuition would be to update all filesystems. Adding new inode
operations always starts as a temporary thing and then we live with two
different methods for the next years or possibly forever.

But it'd be good to hear what others think.

2023-07-27 13:28:23

by Matthew Wilcox

Subject: Re: [PATCH 4/7] add llseek_nowait() for struct file_operations

On Wed, Jul 26, 2023 at 06:26:00PM +0800, Hao Xu wrote:
> + loff_t (*llseek_nowait)(struct file *, loff_t, int, bool);

You don't have to name the struct file, but an unnamed int and an
unnamed bool is just not acceptable.
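
A declaration along the lines requested here might name the scalar
parameters while leaving struct file unnamed. The following is only a
sketch of that style: sketch_loff_t and struct sketch_file_operations
are stand-ins invented for this example, and the parameter names are
borrowed from vfs_lseek_nowait() in patch 6 rather than from any
revised version of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

struct file;				/* opaque stand-in for the kernel type */
typedef long long sketch_loff_t;	/* stand-in for the kernel's loff_t */

/* Hypothetical cut-down ops table holding only the member discussed. */
struct sketch_file_operations {
	sketch_loff_t (*llseek_nowait)(struct file *, sketch_loff_t offset,
				       int whence, bool nowait);
};

/* Trivial callback, present only to show that the pointer type-checks. */
static sketch_loff_t sketch_seek(struct file *file, sketch_loff_t offset,
				 int whence, bool nowait)
{
	(void)file;
	(void)whence;
	(void)nowait;
	return offset;
}

static const struct sketch_file_operations sketch_fops = {
	.llseek_nowait = sketch_seek,
};
```

With the names in place, a reader of the header can tell which int is
the whence value and what the trailing bool controls without looking up
an implementation.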

2023-07-27 13:37:06

by Hao Xu

Subject: Re: [RFC 0/7] io_uring lseek

Hi folks,
I'll leave this patchset here until the io_uring getdents series is
merged, then come back and update this one.

Thanks,
Hao