2020-10-12 13:56:13

by Jens Axboe

Subject: [GIT PULL] io_uring updates for 5.10-rc1

Hi Linus,

Here are the io_uring updates for 5.10. This pull request contains:

- Add blkcg accounting for io-wq offload (Dennis)

- A use-after-free fix for io-wq (Hillf)

- Cancelation fixes and improvements

- Use proper files_struct references for offload

- Cleanup of io_uring_get_socket() since that can now go into our own
  header

- SQPOLL fixes and cleanups, and support for sharing the thread

- Improvement to how page accounting is done for registered buffers and
  huge pages, accounting the real pinned state

- Series cleaning up the xarray code (Willy)

- Various cleanups, refactoring, and improvements (Pavel)

- Use raw spinlock for io-wq (Sebastian)

- Add support for ring restrictions (Stefano)

Please pull!


The following changes since commit c8d317aa1887b40b188ec3aaa6e9e524333caed1:

io_uring: fix async buffered reads when readahead is disabled (2020-09-29 07:54:00 -0600)

are available in the Git repository at:

git://git.kernel.dk/linux-block.git tags/io_uring-5.10-2020-10-12

for you to fetch changes up to b2e9685283127f30e7f2b466af0046ff9bd27a86:

io_uring: keep a pointer ref_node in file_data (2020-10-10 12:49:25 -0600)

----------------------------------------------------------------
io_uring-5.10-2020-10-12

----------------------------------------------------------------
Dennis Zhou (1):
io_uring: add blkcg accounting to offloaded operations

Hillf Danton (1):
io-wq: fix use-after-free in io_wq_worker_running

Jens Axboe (29):
Merge branch 'io_uring-5.9' into for-5.10/io_uring
io_uring: allow timeout/poll/files killing to take task into account
io_uring: move dropping of files into separate helper
io_uring: stash ctx task reference for SQPOLL
io_uring: unconditionally grab req->task
io_uring: return cancelation status from poll/timeout/files handlers
io_uring: enable task/files specific overflow flushing
io_uring: don't rely on weak ->files references
io_uring: reference ->nsproxy for file table commands
io_uring: move io_uring_get_socket() into io_uring.h
io_uring: io_sq_thread() doesn't need to flush signals
fs: align IOCB_* flags with RWF_* flags
io_uring: use private ctx wait queue entries for SQPOLL
io_uring: move SQPOLL post-wakeup ring need wakeup flag into wake handler
io_uring: split work handling part of SQPOLL into helper
io_uring: split SQPOLL data into separate structure
io_uring: base SQPOLL handling off io_sq_data
io_uring: enable IORING_SETUP_ATTACH_WQ to attach to SQPOLL thread too
io_uring: mark io_uring_fops/io_op_defs as __read_mostly
io_uring: provide IORING_ENTER_SQ_WAIT for SQPOLL SQ ring waits
io_uring: get rid of req->io/io_async_ctx union
io_uring: cap SQ submit size for SQPOLL with multiple rings
io_uring: improve registered buffer accounting for huge pages
io_uring: process task work in io_uring_register()
io-wq: kill unused IO_WORKER_F_EXITING
io_uring: kill callback_head argument for io_req_task_work_add()
io_uring: batch account ->req_issue and task struct references
io_uring: no need to call xa_destroy() on empty xarray
io_uring: fix break condition for __io_uring_register() waiting

Joseph Qi (1):
io_uring: show sqthread pid and cpu in fdinfo

Matthew Wilcox (Oracle) (3):
io_uring: Fix use of XArray in __io_uring_files_cancel
io_uring: Fix XArray usage in io_uring_add_task_file
io_uring: Convert advanced XArray uses to the normal API

Pavel Begunkov (23):
io_uring: simplify io_rw_prep_async()
io_uring: refactor io_req_map_rw()
io_uring: fix overlapped memcpy in io_req_map_rw()
io_uring: kill extra user_bufs check
io_uring: simplify io_alloc_req()
io_uring: io_kiocb_ppos() style change
io_uring: remove F_NEED_CLEANUP check in *prep()
io_uring: set/clear IOCB_NOWAIT into io_read/write
io_uring: remove nonblock arg from io_{rw}_prep()
io_uring: decouple issuing and req preparation
io_uring: move req preps out of io_issue_sqe()
io_uring: don't io_prep_async_work() linked reqs
io_uring: clean up ->files grabbing
io_uring: kill extra check in fixed io_file_get()
io_uring: simplify io_file_get()
io_uring: improve submit_state.ios_left accounting
io_uring: use a separate struct for timeout_remove
io_uring: remove timeout.list after hrtimer cancel
io_uring: clean leftovers after splitting issue
io_uring: don't delay io_init_req() error check
io_uring: clean file_data access in files_register
io_uring: refactor *files_register()'s error paths
io_uring: keep a pointer ref_node in file_data

Sebastian Andrzej Siewior (1):
io_wq: Make io_wqe::lock a raw_spinlock_t

Stefano Garzarella (3):
io_uring: use an enumeration for io_uring_register(2) opcodes
io_uring: add IOURING_REGISTER_RESTRICTIONS opcode
io_uring: allow disabling rings during the creation

Zheng Bin (1):
io_uring: remove unneeded semicolon

 fs/exec.c                     |    6 +
 fs/file.c                     |    2 +
 fs/io-wq.c                    |  200 +++---
 fs/io-wq.h                    |    4 +
 fs/io_uring.c                 | 2181 ++++++++++++++++++++++++++++++++++++--------------------
 include/linux/fs.h            |   46 +-
 include/linux/io_uring.h      |   58 ++
 include/linux/sched.h         |    5 +
 include/uapi/linux/io_uring.h |   61 +-
 init/init_task.c              |    3 +
 kernel/fork.c                 |    6 +
 net/unix/scm.c                |    1 +
 12 files changed, 1662 insertions(+), 911 deletions(-)
 create mode 100644 include/linux/io_uring.h

--
Jens Axboe


2020-10-13 19:48:46

by pr-tracker-bot

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

The pull request you sent on Mon, 12 Oct 2020 07:46:45 -0600:

> git://git.kernel.dk/linux-block.git tags/io_uring-5.10-2020-10-12

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6ad4bf6ea1609fb539a62f10fca87ddbd53a0315

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2020-10-13 19:52:33

by Jens Axboe

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On 10/13/20 1:46 PM, Linus Torvalds wrote:
> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <[email protected]> wrote:
>>
>> Here are the io_uring updates for 5.10.
>
> Very strange. My clang build gives a warning I've never seen before:
>
> /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
> attributes for .data..read_mostly
>
> and looking at what clang generates for the *.s file, it seems to be
> the "section" line in:
>
> .type io_op_defs,@object # @io_op_defs
> .section .data..read_mostly,"a",@progbits
> .p2align 4
>
> I think it's the combination of "const" and "__read_mostly".
>
> I think the warning is sensible: how can a piece of data be both
> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
> read - it had better be _always_ read.
>
> I'm letting it go, and I've pulled this (gcc doesn't complain), but
> please have a look.

Huh weird, I'll take a look. FWIW, the construct isn't unique across
the kernel.

What clang are you using?

--
Jens Axboe

2020-10-13 19:54:31

by Linus Torvalds

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On Tue, Oct 13, 2020 at 12:49 PM Jens Axboe <[email protected]> wrote:
>
> What clang are you using?

I have a self-built clang version from their development tree, since
I've been using it for the "asm goto with outputs" testing.

But I bet this happens with just regular reasonably up-to-date clang too.

Linus

2020-10-13 20:51:12

by Rasmus Villemoes

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On 13/10/2020 21.49, Jens Axboe wrote:
> On 10/13/20 1:46 PM, Linus Torvalds wrote:
>> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <[email protected]> wrote:
>>>
>>> Here are the io_uring updates for 5.10.
>>
>> Very strange. My clang build gives a warning I've never seen before:
>>
>> /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
>> attributes for .data..read_mostly
>>
>> and looking at what clang generates for the *.s file, it seems to be
>> the "section" line in:
>>
>> .type io_op_defs,@object # @io_op_defs
>> .section .data..read_mostly,"a",@progbits
>> .p2align 4
>>
>> I think it's the combination of "const" and "__read_mostly".
>>
>> I think the warning is sensible: how can a piece of data be both
>> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
>> read - it had better be _always_ read.
>>
>> I'm letting it go, and I've pulled this (gcc doesn't complain), but
>> please have a look.
>
> Huh weird, I'll take a look. FWIW, the construct isn't unique across
> the kernel.

Citation needed. There's lots of "pointer to const foo" stuff declared
as __read_mostly, but I can't find any objects that are themselves both
const and __read_mostly. Other than that io_op_defs and io_uring_fops now.

But... there's something a little weird:

$ grep read_most -- fs/io_uring.s
.section .data..read_mostly,"a",@progbits
$ readelf --wide -S fs/io_uring.o | grep read_most
[32] .data..read_mostly PROGBITS 0000000000000000 01b4e0 000188
00 WA 0 0 32

(this is with gcc/gas). So despite that .section directive not saying
"aw", the section got the W flag anyway. There are lots of

.section "__tracepoints_ptrs", "a"
.pushsection .smp_locks,"a"

in the .s file, and those sections do end up with just the A bit in the
.o file. Does gas maybe somehow special-case a section name starting
with .data?

Rasmus
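
[Archive note] The mismatch described above is between the flag string in the `.section` directive ("a" versus "aw") and the flag letters readelf prints for the resulting section ("A" versus "WA"). The following is a minimal sketch of that mapping, not code from the kernel or binutils. The names `demo_parse_gas_flags`, `demo_readelf_letters` and the `DEMO_SHF_*` constants are made up for illustration; only the three bit values come from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>

/* Section flag bits with the values the ELF specification assigns to
 * SHF_WRITE, SHF_ALLOC and SHF_EXECINSTR. The DEMO_ prefix marks them
 * as local to this sketch. */
#define DEMO_SHF_WRITE     0x1u
#define DEMO_SHF_ALLOC     0x2u
#define DEMO_SHF_EXECINSTR 0x4u

/* Translate the flag string of a .section directive ("a", "aw", "ax")
 * into sh_flags bits. Letters other than a/w/x are ignored here. */
unsigned demo_parse_gas_flags(const char *s)
{
    unsigned flags = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == 'a')
            flags |= DEMO_SHF_ALLOC;
        else if (*s == 'w')
            flags |= DEMO_SHF_WRITE;
        else if (*s == 'x')
            flags |= DEMO_SHF_EXECINSTR;
    }
    return flags;
}

/* Render the three bits as letters, lowest bit first (W, A, X), which
 * matches the "WA" shown in the readelf output quoted above.
 * out must have room for 4 bytes including the terminator. */
void demo_readelf_letters(unsigned flags, char out[4])
{
    size_t n = 0;

    assert(out != NULL);
    if (flags & DEMO_SHF_WRITE)
        out[n++] = 'W';
    if (flags & DEMO_SHF_ALLOC)
        out[n++] = 'A';
    if (flags & DEMO_SHF_EXECINSTR)
        out[n++] = 'X';
    out[n] = '\0';
}
```

With this mapping, a directive that says "a" would be expected to yield just "A"; the readelf output above shows "WA" for .data..read_mostly, which is the discrepancy being asked about.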

2020-10-14 06:23:30

by Linus Torvalds

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <[email protected]> wrote:
>
> Here are the io_uring updates for 5.10.

Very strange. My clang build gives a warning I've never seen before:

/tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
attributes for .data..read_mostly

and looking at what clang generates for the *.s file, it seems to be
the "section" line in:

.type io_op_defs,@object # @io_op_defs
.section .data..read_mostly,"a",@progbits
.p2align 4

I think it's the combination of "const" and "__read_mostly".

I think the warning is sensible: how can a piece of data be both
"const" and "__read_mostly"? If it's "const", then it's not "mostly"
read - it had better be _always_ read.

I'm letting it go, and I've pulled this (gcc doesn't complain), but
please have a look.

Linus

2020-10-14 06:39:34

by Arvind Sankar

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On Tue, Oct 13, 2020 at 01:49:01PM -0600, Jens Axboe wrote:
> On 10/13/20 1:46 PM, Linus Torvalds wrote:
> > On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <[email protected]> wrote:
> >>
> >> Here are the io_uring updates for 5.10.
> >
> > Very strange. My clang build gives a warning I've never seen before:
> >
> > /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
> > attributes for .data..read_mostly
> >
> > and looking at what clang generates for the *.s file, it seems to be
> > the "section" line in:
> >
> > .type io_op_defs,@object # @io_op_defs
> > .section .data..read_mostly,"a",@progbits
> > .p2align 4
> >
> > I think it's the combination of "const" and "__read_mostly".
> >
> > I think the warning is sensible: how can a piece of data be both
> > "const" and "__read_mostly"? If it's "const", then it's not "mostly"
> > read - it had better be _always_ read.
> >
> > I'm letting it go, and I've pulled this (gcc doesn't complain), but
> > please have a look.
>
> Huh weird, I'll take a look. FWIW, the construct isn't unique across
> the kernel.
>
> What clang are you using?
>
> --
> Jens Axboe
>

If const and non-const __read_mostly appeared in the same file, both gcc
and clang would give errors.
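
[Archive note] A minimal userspace sketch of the situation described above, assuming a GCC-compatible compiler on an ELF target. `demo_read_mostly` is a hypothetical stand-in for the kernel's `__read_mostly`, and `DEMO_CONFLICT` is a made-up switch; neither exists in the kernel tree. As written, only a const object lives in the section, so the file should build. Defining `DEMO_CONFLICT` adds a writable object to the same section, which is the case the message says both compilers reject (the diagnostic is usually worded as a "section type conflict"); that behaviour is taken from the message and has not been verified here for every compiler version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __read_mostly: place the object in the
 * section named in the assembler output quoted in this thread.
 * On non-ELF targets it expands to nothing so the sketch still builds. */
#if defined(__ELF__)
#define demo_read_mostly __attribute__((__section__(".data..read_mostly")))
#else
#define demo_read_mostly
#endif

/* A const object placed in the read-mostly section: the construct
 * that triggered the warning discussed in this thread. */
static const int demo_limits[3] demo_read_mostly = { 8, 64, 512 };

#ifdef DEMO_CONFLICT
/* A writable object in the same section. Mixing this with the const
 * object above is the combination expected to be rejected. */
int demo_counter demo_read_mostly = 0;
#endif

/* Read one entry from the const table. */
int demo_limit(size_t i)
{
    assert(i < sizeof(demo_limits) / sizeof(demo_limits[0]));
    return demo_limits[i];
}
```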

2020-10-14 10:43:28

by Jens Axboe

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

On 10/13/20 2:49 PM, Rasmus Villemoes wrote:
> On 13/10/2020 21.49, Jens Axboe wrote:
>> On 10/13/20 1:46 PM, Linus Torvalds wrote:
>>> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <[email protected]> wrote:
>>>>
>>>> Here are the io_uring updates for 5.10.
>>>
>>> Very strange. My clang build gives a warning I've never seen before:
>>>
>>> /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
>>> attributes for .data..read_mostly
>>>
>>> and looking at what clang generates for the *.s file, it seems to be
>>> the "section" line in:
>>>
>>> .type io_op_defs,@object # @io_op_defs
>>> .section .data..read_mostly,"a",@progbits
>>> .p2align 4
>>>
>>> I think it's the combination of "const" and "__read_mostly".
>>>
>>> I think the warning is sensible: how can a piece of data be both
>>> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
>>> read - it had better be _always_ read.
>>>
>>> I'm letting it go, and I've pulled this (gcc doesn't complain), but
>>> please have a look.
>>
>> Huh weird, I'll take a look. FWIW, the construct isn't unique across
>> the kernel.
>
> Citation needed. There's lots of "pointer to const foo" stuff declared
> as __read_mostly, but I can't find any objects that are themselves both
> const and __read_mostly. Other than that io_op_defs and io_uring_fops now.

You are right, they are all pointers, so not the same. I'll just revert
the patch.

--
Jens Axboe
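
[Archive note] The distinction drawn above, between a pointer to const data marked read-mostly and an object that is itself const, can be sketched as follows. This is an illustration under stated assumptions, not the io_uring code: `demo_read_mostly`, `struct demo_op_def`, `demo_op_defs` and `demo_active_defs` are hypothetical names, and the macro assumes a GCC-compatible compiler on an ELF target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __read_mostly (see the section name in the
 * assembler output quoted in this thread). Expands to nothing on
 * non-ELF targets so the sketch still builds there. */
#if defined(__ELF__)
#define demo_read_mostly __attribute__((__section__(".data..read_mostly")))
#else
#define demo_read_mostly
#endif

struct demo_op_def {
    int needs_file;
    int async_size;
};

/* The object itself is const and carries no section attribute, so the
 * compiler is free to put it in read-only data. */
static const struct demo_op_def demo_op_defs[] = {
    { 0, 0 },
    { 1, 16 },
    { 1, 32 },
};

/* A pointer to const data. The pointer variable is writable (it could
 * be reassigned), so marking it read-mostly is not contradictory. */
static const struct demo_op_def *demo_active_defs demo_read_mostly = demo_op_defs;

/* Look up a field through the read-mostly pointer. */
int demo_async_size(size_t op)
{
    assert(op < sizeof(demo_op_defs) / sizeof(demo_op_defs[0]));
    return demo_active_defs[op].async_size;
}
```

Only the writable pointer is placed in the section here, which corresponds to the existing kernel uses mentioned above; the table stays a plain const object.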

2020-10-14 17:48:13

by Nick Desaulniers

Subject: Re: [GIT PULL] io_uring updates for 5.10-rc1

Sorry for not reporting it sooner. It looks to me like a GNU `as` bug:
https://github.com/ClangBuiltLinux/linux/issues/1153#issuecomment-692265433
When I'm done with the three build breakages that popped up overnight I'll try
to report it to GNU binutils folks.

(We run an issue tracker out of
https://github.com/ClangBuiltLinux/linux/issues, if you're interested to see what
the outstanding known issues are, or recently solved ones. We try to
aggressively track when and where patches land for the inevitable backports.
We have 118 people in our GitHub group!)