2023-07-28 18:25:47

by Jens Axboe

[permalink] [raw]
Subject: [PATCHSET v4] Add io_uring futex/futexv support

s patchset adds support for first futex wake and wait, and then
futexv.

For both wait/wake/waitv, we support the bitset variant, as the
"normal" variants can be easily implemented on top of that.

PI and requeue are not supported through io_uring, just the above
mentioned parts. This may change in the future, but in the spirit
of keeping this small (and based on what people have been asking for),
this is what we currently have.

When I did these patches, I forgot that Pavel had previously posted a
futex variant for io_uring. The major thing that had been holding me
back from people asking about futexes and io_uring, is that I wanted
to do this what I consider the right way - no usage of io-wq or thread
offload, an actually async implementation that is efficient to use
and don't rely on a blocking thread for futex wait/waitv. This is what
this patchset attempts to do, while being minimally invasive on the
futex side. I believe the diffstat reflects that.

As far as I can recall, the first request for futex support with
io_uring came from Andres Freund, working on postgres. His aio rework
of postgres was one of the early adopters of io_uring, and futex
support was a natural extension for that. This is relevant from both
a usability point of view, as well as for effiency and performance.
In Andres's words, for the former:

"Futex wait support in io_uring makes it a lot easier to avoid deadlocks
in concurrent programs that have their own buffer pool: Obviously pages in
the application buffer pool have to be locked during IO. If the initiator
of IO A needs to wait for a held lock B, the holder of lock B might wait
for the IO A to complete. The ability to wait for a lock and IO
completions at the same time provides an efficient way to avoid such
deadlocks."

and in terms of effiency, even without unlocking the full potential yet,
Andres says:

"Futex wake support in io_uring is useful because it allows for more
efficient directed wakeups. For some "locks" postgres has queues
implemented in userspace, with wakeup logic that cannot easily be
implemented with FUTEX_WAKE_BITSET on a single "futex word" (imagine
waiting for journal flushes to have completed up to a certain point). Thus
a "lock release" sometimes need to wake up many processes in a row. A
quick-and-dirty conversion to doing these wakeups via io_uring lead to a
3% throughput increase, with 12% fewer context switches, albeit in a
fairly extreme workload."

Some basic io_uring futex support and test cases are available in the
liburing 'futex' branch:

https://git.kernel.dk/cgit/liburing/log/?h=futex

testing all of the variants. I originally wrote this code about a
month ago and Andres has been using it with postgres, and I'm not
aware of any bugs in it. That's not to say it's perfect, obviously,
and I welcome some feedback so we can move this forward and hash out
any potential issues.

include/linux/io_uring_types.h | 3 +
include/uapi/linux/futex.h | 17 +-
include/uapi/linux/io_uring.h | 4 +
io_uring/Makefile | 4 +-
io_uring/cancel.c | 5 +
io_uring/cancel.h | 4 +
io_uring/futex.c | 376 +++++++++++++++++++++++++++++++++
io_uring/futex.h | 36 ++++
io_uring/io_uring.c | 5 +
io_uring/opdef.c | 35 ++-
kernel/futex/futex.h | 92 +++++++-
kernel/futex/requeue.c | 3 +-
kernel/futex/syscalls.c | 41 ++--
kernel/futex/waitwake.c | 53 +++--
14 files changed, 630 insertions(+), 48 deletions(-)

You can also find the code here:

https://git.kernel.dk/cgit/linux/log/?h=io_uring-futex

V4:
- Refactor the prep setup so it's fully independent between the vectoed
and non-vectored futex handling.
- Ensure we -EINVAL any futex/futexv wait/waitv that specifies unused
fields.
- Fix a comment typo
- Update the patches from Peter.
- Fix two kerneldoc warnings
- Add a prep patch moving FUTEX2_MASK to futex.h

--
Jens Axboe




2023-07-31 17:08:23

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHSET v4] Add io_uring futex/futexv support

On Fri, Jul 28 2023 at 10:42, Jens Axboe wrote:
> s patchset adds support for first futex wake and wait, and then
> futexv.

Can you please just wait until the futex core bits have been agreed on
and merged? No need to contribute more mess in everyones inbox.

2023-08-06 19:42:25

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCHSET v4] Add io_uring futex/futexv support

On 7/31/23 10:06?AM, Thomas Gleixner wrote:
> On Fri, Jul 28 2023 at 10:42, Jens Axboe wrote:
>> s patchset adds support for first futex wake and wait, and then
>> futexv.
>
> Can you please just wait until the futex core bits have been agreed on
> and merged? No need to contribute more mess in everyones inbox.

Also no need to keep dragging out the review of the other bits. The
dependency is only there so we can use FUTEX2 flags for this - which
does make sense to me, but we should probably split Peter's series in
two as there's no dependency on the functional bits on that patch
series. As we're getting ever closer to the merge window, and I have
other things sitting on top of the futex series, that's problematic for
me.

--
Jens Axboe


2023-08-07 02:33:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHSET v4] Add io_uring futex/futexv support

Jens!

On Sun, Aug 06 2023 at 10:44, Jens Axboe wrote:
> On 7/31/23 10:06?AM, Thomas Gleixner wrote:
>> Can you please just wait until the futex core bits have been agreed on
>> and merged? No need to contribute more mess in everyones inbox.
>
> Also no need to keep dragging out the review of the other bits. The
> dependency is only there so we can use FUTEX2 flags for this - which
> does make sense to me, but we should probably split Peter's series in
> two as there's no dependency on the functional bits on that patch
> series. As we're getting ever closer to the merge window, and I have
> other things sitting on top of the futex series, that's problematic for
> me.

Seriously?

You are still trying to sell me "Features first - corrrectness
later/never"?

Go and look at the amount of fallout this has caused in the last years.
io-urine is definitely the leader of the pack in my [email protected]
inbox.

Vs. the problem at hand. I've failed to catch a major issue with futex
patches in the past and I'm not seeing any reason to rush any of this to
repeat the experience just because.

You know very well that your whatever depends on this series has to wait
until the basics are sorted and there is absolutely no reason that your
so important things have to be rushed into the next merge window.

It surely makes sense to split these things up into independent series,
but _you_ could have done that weeks ago instead of just reposting an
umodified and unreviewed RFC series from Peter and then coming out now
and complaining about the lack of progress.

Sorry no.

Thanks,

tglx


2023-08-07 18:59:04

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCHSET v4] Add io_uring futex/futexv support

On 8/6/23 7:23?PM, Thomas Gleixner wrote:
> Jens!
>
> On Sun, Aug 06 2023 at 10:44, Jens Axboe wrote:
>> On 7/31/23 10:06?AM, Thomas Gleixner wrote:
>>> Can you please just wait until the futex core bits have been agreed on
>>> and merged? No need to contribute more mess in everyones inbox.
>>
>> Also no need to keep dragging out the review of the other bits. The
>> dependency is only there so we can use FUTEX2 flags for this - which
>> does make sense to me, but we should probably split Peter's series in
>> two as there's no dependency on the functional bits on that patch
>> series. As we're getting ever closer to the merge window, and I have
>> other things sitting on top of the futex series, that's problematic for
>> me.
>
> Seriously?
>
> You are still trying to sell me "Features first - corrrectness
> later/never"?

That's not what I'm saying at all. I wrote these patches 3 months ago,
and like I mentioned, I think doing the futex2 flags for that side is a
good suggestion from Peter. As those initial prep patches are all these
require, rather than the full futex2 series, there's no reason not to
review these at the same time too, if people should be so inclined.

> Go and look at the amount of fallout this has caused in the last years.
> io-urine is definitely the leader of the pack in my [email protected]
> inbox.

We're now resorting to name calling? Sorry, but I think that's pretty
low and not very professional.

> Vs. the problem at hand. I've failed to catch a major issue with futex
> patches in the past and I'm not seeing any reason to rush any of this to
> repeat the experience just because.

I'm not asking anyone to rush anything.

> You know very well that your whatever depends on this series has to wait
> until the basics are sorted and there is absolutely no reason that your
> so important things have to be rushed into the next merge window.

Again, you're making assumptions.

> It surely makes sense to split these things up into independent series,
> but _you_ could have done that weeks ago instead of just reposting an
> umodified and unreviewed RFC series from Peter and then coming out now
> and complaining about the lack of progress.

It's Peter's series, I'm not going to split his series and step on his
toes. I already separately tested, and will do so with the updated
series as well when I get back, since I saw he posted one.

--
Jens Axboe


2023-08-19 20:26:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHSET v4] Add io_uring futex/futexv support

On Mon, Aug 14 2023 at 18:18, Jens Axboe wrote:
> On 8/14/23 6:12 PM, Thomas Gleixner wrote:

>>> We're now resorting to name calling? Sorry, but I think that's pretty
>>> low and not very professional.
>>
>> I'm not resorting to that. If you got offended by the meme which
>> happened to elapse into my reply, then I can definitely understand
>> that. That was not my intention at all. But you might think about why
>> that meme exists in the first place.
>
> It's been there since day 1 because a) the spelling is close, and b)
> some people are just childish. Same reason kids in the 3rd grade come up
> with nicknames for each others. And that's fine, but most people grow
> out of that, and it certainly has no place in what is supposedly a
> professional setting.

Sure. Repeat that as often you want. I already made clear in my reply
that this was unintentional, no?

Though the fact that this "rush the feature" ends up in my security
inbox more than justified has absolutely nothing to do with my
potentially childish and non-professionl attitude, right?

Though you gracefully ignored that. Fair enough...