2022-05-21 15:41:47

by Jens Axboe

Subject: Re: [PATCH] char/mem: only use {read,write}_iter, not the old {read,write} functions

On 5/20/22 9:09 AM, Al Viro wrote:
> On Fri, May 20, 2022 at 03:50:30PM +0200, Jason A. Donenfeld wrote:
>> Currently mem.c implements both the {read,write}_iter functions and the
>> {read,write} functions. But with {read,write} going away at some point
>> in the future,
>
> Not likely to happen, unfortunately.
>
>> and most kernel code made to prefer {read,write}_iter,
>> there's no point in keeping around the old code.
>
> Profile and you'll see ;-/

Weren't you working on bits to get us to performance parity there?
What's the status of that?

It really is an unfortunate situation we're currently in with two
methods for either read or write, with one being greatly preferred as we
can pass in non-file associated state (like IOCB_NOWAIT, etc) but the
older variant being a bit faster. It leaves us in a bad place, imho.

--
Jens Axboe



2022-05-23 07:17:30

by Jens Axboe

Subject: Re: [PATCH] char/mem: only use {read,write}_iter, not the old {read,write} functions

On 5/20/22 9:11 AM, Jens Axboe wrote:
> On 5/20/22 9:09 AM, Al Viro wrote:
>> On Fri, May 20, 2022 at 03:50:30PM +0200, Jason A. Donenfeld wrote:
>>> Currently mem.c implements both the {read,write}_iter functions and the
>>> {read,write} functions. But with {read,write} going away at some point
>>> in the future,
>>
>> Not likely to happen, unfortunately.
>>
>>> and most kernel code made to prefer {read,write}_iter,
>>> there's no point in keeping around the old code.
>>
>> Profile and you'll see ;-/
>
> Weren't you working on bits to get us to performance parity there?
> What's the status of that?

Totally unscientific test on the current kernel, running:

dd if=/dev/zero of=/dev/null bs=4k status=progress

With the current tree, I get 8.8GB/sec, and if I drop fops->read() for
/dev/zero, then I get 8.6GB/sec. That's roughly 2%, which isn't nothing,
but it's also not a huge loss for moving us in the right direction.
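
For anyone who wants to repeat the measurement without dd, here is a
minimal userspace sketch of the same loop: read() a block from /dev/zero,
write() it to /dev/null, and time the whole run. The function name
copy_zero_to_null() and its parameters are made up for illustration, and
a bounded run like this is no more scientific than the dd test above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): copy `count` blocks of `bs`
 * bytes from /dev/zero to /dev/null, roughly what
 * `dd if=/dev/zero of=/dev/null bs=<bs> count=<count>` does.
 * Returns the number of bytes copied, or -1 on error. If `elapsed` is
 * non-NULL, the wall-clock duration in seconds is stored there.
 */
long long copy_zero_to_null(size_t bs, long count, double *elapsed)
{
	static char buf[1 << 16];	/* bounce buffer; bs may not exceed it */
	struct timespec t0, t1;
	long long total = 0;
	int in, out;

	if (bs == 0 || bs > sizeof(buf))
		return -1;

	in = open("/dev/zero", O_RDONLY);
	if (in < 0)
		return -1;
	out = open("/dev/null", O_WRONLY);
	if (out < 0) {
		close(in);
		return -1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < count; i++) {
		/* Each read() here goes through the kernel's read path for /dev/zero. */
		ssize_t n = read(in, buf, bs);

		if (n <= 0 || write(out, buf, (size_t)n) != n) {
			total = -1;
			break;
		}
		total += n;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (elapsed)
		*elapsed = (double)(t1.tv_sec - t0.tv_sec) +
			   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

	close(in);
	close(out);
	return total;
}
```

Dividing the returned byte count by the elapsed seconds gives a
throughput figure to compare before and after dropping fops->read().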

Looking at a perf diff, it's mostly:

+0.34% [kernel.kallsyms] [k] new_sync_read
+0.33% [kernel.kallsyms] [k] init_sync_kiocb
+0.07% [kernel.kallsyms] [k] iov_iter_init
+0.80% [kernel.kallsyms] [k] iov_iter_zero

with these being gone after the switch to ->read_iter():

0.63% [kernel.kallsyms] [k] read_zero
0.13% [kernel.kallsyms] [k] __clear_user

Didn't look closer, but I'm assuming this is _mostly_ tied to needing to
init 48 bytes of kiocb for each one. There might be ways to embed a
sync_kiocb inside the kiocb for the bits we need there, at least that
could get us down to 32 bytes.
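
To make the extra work concrete, here is a simplified userspace model of
the dispatch: if a file provides ->read() it is called directly,
otherwise a sync wrapper has to set up a kiocb and an iterator on the
stack before calling ->read_iter(). Every mock_* name below is invented
for illustration and the struct layouts are only loosely modeled on the
kernel's; this is a sketch of the shape of the two paths, not the actual
kernel code.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct mock_file;

/* Stand-in for struct kiocb; the fields are an assumption for illustration. */
struct mock_kiocb {
	struct mock_file *ki_filp;
	long long ki_pos;
	void (*ki_complete)(struct mock_kiocb *iocb, long ret);
	void *private_data;
	int ki_flags;
	unsigned short ki_ioprio;
	void *ki_waitq;
};

/* Stand-in for struct iov_iter: one destination buffer and a remaining count. */
struct mock_iter {
	char *buf;
	size_t count;
};

struct mock_fops {
	ssize_t (*read)(struct mock_file *f, char *buf, size_t len, long long *ppos);
	ssize_t (*read_iter)(struct mock_kiocb *iocb, struct mock_iter *to);
};

struct mock_file {
	const struct mock_fops *f_op;
};

/* Old-style path: zero the caller's buffer directly. */
static ssize_t zero_read(struct mock_file *f, char *buf, size_t len, long long *ppos)
{
	(void)f;
	(void)ppos;
	memset(buf, 0, len);
	return (ssize_t)len;
}

/* Iterator path: same result, but the buffer is reached through the iterator. */
static ssize_t zero_read_iter(struct mock_kiocb *iocb, struct mock_iter *to)
{
	size_t n = to->count;

	(void)iocb;
	memset(to->buf, 0, n);
	to->buf += n;
	to->count = 0;
	return (ssize_t)n;
}

/* Sync wrapper: the per-call kiocb and iterator setup is the added cost. */
static ssize_t mock_sync_read(struct mock_file *f, char *buf, size_t len, long long *ppos)
{
	struct mock_kiocb kiocb;
	struct mock_iter iter = { buf, len };
	ssize_t ret;

	memset(&kiocb, 0, sizeof(kiocb));	/* initialised on every call */
	kiocb.ki_filp = f;
	kiocb.ki_pos = ppos ? *ppos : 0;

	ret = f->f_op->read_iter(&kiocb, &iter);
	if (ppos)
		*ppos = kiocb.ki_pos;
	return ret;
}

/* Dispatch: prefer ->read() when present, else fall back to ->read_iter(). */
static ssize_t mock_vfs_read(struct mock_file *f, char *buf, size_t len, long long *ppos)
{
	if (f->f_op->read)
		return f->f_op->read(f, buf, len, ppos);
	if (f->f_op->read_iter)
		return mock_sync_read(f, buf, len, ppos);
	return -1;
}

/* Two hypothetical devices: one keeps ->read(), one is ->read_iter() only. */
static const struct mock_fops fops_with_read = { zero_read, zero_read_iter };
static const struct mock_fops fops_iter_only = { NULL, zero_read_iter };
```

Both devices fill the buffer with zeroes; the difference is only that
the iterator-only one pays for the wrapper's setup on each call, which
is the kind of overhead the perf diff above points at.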

> It really is an unfortunate situation we're currently in with two
> methods for either read or write, with one being greatly preferred as we
> can pass in non-file associated state (like IOCB_NOWAIT, etc) but the
> older variant being a bit faster. It leaves us in a bad place, imho.

And splice etc, for example...

--
Jens Axboe