2024-05-13 13:31:22

by Liu Wei

[permalink] [raw]
Subject: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),
when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
process context will not be blocked.

However, if the device already has page cache in memory, EAGAIN will be
returned. And even when trying to reissue the AIO with direct I/O and
IOCB_NOWAIT again, we consistently receive EAGAIN.

Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
the same time.

Signed-off-by: Liu Wei <[email protected]>
---
mm/filemap.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 30de18c4fd28..1852a00caf31 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)

if (iocb->ki_flags & IOCB_NOWAIT) {
/* we could block if there are any pages in the range */
- if (filemap_range_has_page(mapping, pos, end))
+ if (filemap_range_has_page(mapping, pos, end)) {
+ if (mapping_needs_writeback(mapping)) {
+ __filemap_fdatawrite_range(mapping,
+ pos, end, WB_SYNC_NONE);
+ }
+ invalidate_mapping_pages(mapping,
+ pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
return -EAGAIN;
+ }
} else {
ret = filemap_write_and_wait_range(mapping, pos, end);
if (ret)
--
2.42.1





2024-05-23 20:09:14

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On Mon, 13 May 2024 21:23:39 +0800 Liu Wei <[email protected]> wrote:

> After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),
> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
> process context will not be blocked.
>
> However, if the device already has page cache in memory, EAGAIN will be
> returned. And even when trying to reissue the AIO with direct I/O and
> IOCB_NOWAIT again, we consistently receive EAGAIN.
>
> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
> the same time.

Can't userspace do this? If EAGAIN, sync the fd and retry the IO?


2024-05-23 21:09:16

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On Mon, May 13, 2024 at 09:23:39PM +0800, Liu Wei wrote:
> After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),

If you're reporting problems with a particular commit, it's good form
to cc the people who actually wrote that commit.

> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
> process context will not be blocked.
>
> However, if the device already has page cache in memory, EAGAIN will be
> returned. And even when trying to reissue the AIO with direct I/O and
> IOCB_NOWAIT again, we consistently receive EAGAIN.
>
> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
> the same time.
>
> Signed-off-by: Liu Wei <[email protected]>
> ---
> mm/filemap.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 30de18c4fd28..1852a00caf31 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)
>
> if (iocb->ki_flags & IOCB_NOWAIT) {
> /* we could block if there are any pages in the range */
> - if (filemap_range_has_page(mapping, pos, end))
> + if (filemap_range_has_page(mapping, pos, end)) {
> + if (mapping_needs_writeback(mapping)) {
> + __filemap_fdatawrite_range(mapping,
> + pos, end, WB_SYNC_NONE);
> + }
> + invalidate_mapping_pages(mapping,
> + pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
> return -EAGAIN;
> + }
> } else {
> ret = filemap_write_and_wait_range(mapping, pos, end);
> if (ret)
> --
> 2.42.1
>
>
>

2024-05-23 21:18:12

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On 5/23/24 3:08 PM, Matthew Wilcox wrote:
> On Mon, May 13, 2024 at 09:23:39PM +0800, Liu Wei wrote:
>> After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),
>
> If you're reporting problems with a particular commit, it's good form
> to cc the people who actually wrote that commit.

Yeah indeed...

>> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
>> process context will not be blocked.
>>
>> However, if the device already has page cache in memory, EAGAIN will be
>> returned. And even when trying to reissue the AIO with direct I/O and
>> IOCB_NOWAIT again, we consistently receive EAGAIN.

-EAGAIN doesn't mean "just try again and it'll work".

>> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
>> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
>> the same time.
>>
>> Signed-off-by: Liu Wei <[email protected]>
>> ---
>> mm/filemap.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 30de18c4fd28..1852a00caf31 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)
>>
>> if (iocb->ki_flags & IOCB_NOWAIT) {
>> /* we could block if there are any pages in the range */
>> - if (filemap_range_has_page(mapping, pos, end))
>> + if (filemap_range_has_page(mapping, pos, end)) {
>> + if (mapping_needs_writeback(mapping)) {
>> + __filemap_fdatawrite_range(mapping,
>> + pos, end, WB_SYNC_NONE);
>> + }

I don't think WB_SYNC_NONE tells it not to block, it just says not to
wait for it... So this won't work as-is.

>> + invalidate_mapping_pages(mapping,
>> + pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
>> return -EAGAIN;
>> + }
>> } else {
>> ret = filemap_write_and_wait_range(mapping, pos, end);
>> if (ret)
>> --
>> 2.42.1
>>
>>
>>

--
Jens Axboe


2024-05-23 21:18:34

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On 5/23/24 3:11 PM, Matthew Wilcox wrote:
> On Thu, May 23, 2024 at 01:08:02PM -0700, Andrew Morton wrote:
>> On Mon, 13 May 2024 21:23:39 +0800 Liu Wei <[email protected]> wrote:
>>
>>> After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),
>>> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
>>> process context will not be blocked.
>>>
>>> However, if the device already has page cache in memory, EAGAIN will be
>>> returned. And even when trying to reissue the AIO with direct I/O and
>>> IOCB_NOWAIT again, we consistently receive EAGAIN.
>>>
>>> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
>>> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
>>> the same time.
>>
>> Can't userspace do this? If EAGAIN, sync the fd and retry the IO?
>
> I don't think that it can, because the pages will still be there, even
> if now clean? I think the idea was to punt to a worker thread which
> could sleep and retry without NOWAIT. But let's see what someone
> involved in this patch has to say about the intent.

Right, the idea is that if you get -EAGAIN, a non-blocking attempt
wasn't possible. You'd need to retry from somewhere where you CAN block.
Any issuer very much can do that, whether it's in-kernel or not.

It'd be somewhat fragile to make assumptions on what can cause the
-EAGAIN and try to rectify them, and then try again with IOCB_NOWAIT.

--
Jens Axboe


2024-05-24 01:00:45

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On Thu, May 23, 2024 at 01:08:02PM -0700, Andrew Morton wrote:
> On Mon, 13 May 2024 21:23:39 +0800 Liu Wei <[email protected]> wrote:
>
> > After commit (6be96d3ad3 fs: return if direct I/O will trigger writeback),
> > when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
> > process context will not be blocked.
> >
> > However, if the device already has page cache in memory, EAGAIN will be
> > returned. And even when trying to reissue the AIO with direct I/O and
> > IOCB_NOWAIT again, we consistently receive EAGAIN.
> >
> > Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
> > with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
> > the same time.
>
> Can't userspace do this? If EAGAIN, sync the fd and retry the IO?

I don't think that it can, because the pages will still be there, even
if now clean? I think the idea was to punt to a worker thread which
could sleep and retry without NOWAIT. But let's see what someone
involved in this patch has to say about the intent.

2024-05-27 10:22:57

by Liu Wei

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

I am a newer, thanks for the reminder.

>
> >> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
> >> process context will not be blocked.
> >>
> >> However, if the device already has page cache in memory, EAGAIN will be
> >> returned. And even when trying to reissue the AIO with direct I/O and
> >> IOCB_NOWAIT again, we consistently receive EAGAIN.
>
> -EAGAIN doesn't mean "just try again and it'll work".
>
> >> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
> >> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
> >> the same time.
> >>
> >> Signed-off-by: Liu Wei <[email protected]>
> >> ---
> >> mm/filemap.c | 9 ++++++++-
> >> 1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/filemap.c b/mm/filemap.c
> >> index 30de18c4fd28..1852a00caf31 100644
> >> --- a/mm/filemap.c
> >> +++ b/mm/filemap.c
> >> @@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)
> >>
> >> if (iocb->ki_flags & IOCB_NOWAIT) {
> >> /* we could block if there are any pages in the range */
> >> - if (filemap_range_has_page(mapping, pos, end))
> >> + if (filemap_range_has_page(mapping, pos, end)) {
> >> + if (mapping_needs_writeback(mapping)) {
> >> + __filemap_fdatawrite_range(mapping,
> >> + pos, end, WB_SYNC_NONE);
> >> + }
>
> I don't think WB_SYNC_NONE tells it not to block, it just says not to
> wait for it... So this won't work as-is.

Yes, but I think an asynchronous writex-back is better than simply return EAGAIN.
By using __filemap_fdatawrite_range to trigger a writeback, subsequent retries
may have a higher chance of success.

>
> >> + invalidate_mapping_pages(mapping,
> >> + pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
> >> return -EAGAIN;
> >> + }
> >> } else {
> >> ret = filemap_write_and_wait_range(mapping, pos, end);
> >> if (ret)
> >> --
> >> 2.42.1
> >>
> >>
> >>



2024-05-27 15:36:39

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: invalidating pages is still necessary when io with IOCB_NOWAIT

On 5/27/24 4:09 AM, Liu Wei wrote:
> I am a newer, thanks for the reminder.
>
>>
>>>> when we issuing AIO with direct I/O and IOCB_NOWAIT on a block device, the
>>>> process context will not be blocked.
>>>>
>>>> However, if the device already has page cache in memory, EAGAIN will be
>>>> returned. And even when trying to reissue the AIO with direct I/O and
>>>> IOCB_NOWAIT again, we consistently receive EAGAIN.
>>
>> -EAGAIN doesn't mean "just try again and it'll work".
>>
>>>> Maybe a better way to deal with it: filemap_fdatawrite_range dirty pages
>>>> with WB_SYNC_NONE flag, and invalidate_mapping_pages unmapped pages at
>>>> the same time.
>>>>
>>>> Signed-off-by: Liu Wei <[email protected]>
>>>> ---
>>>> mm/filemap.c | 9 ++++++++-
>>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/filemap.c b/mm/filemap.c
>>>> index 30de18c4fd28..1852a00caf31 100644
>>>> --- a/mm/filemap.c
>>>> +++ b/mm/filemap.c
>>>> @@ -2697,8 +2697,15 @@ int kiocb_invalidate_pages(struct kiocb *iocb, size_t count)
>>>>
>>>> if (iocb->ki_flags & IOCB_NOWAIT) {
>>>> /* we could block if there are any pages in the range */
>>>> - if (filemap_range_has_page(mapping, pos, end))
>>>> + if (filemap_range_has_page(mapping, pos, end)) {
>>>> + if (mapping_needs_writeback(mapping)) {
>>>> + __filemap_fdatawrite_range(mapping,
>>>> + pos, end, WB_SYNC_NONE);
>>>> + }
>>
>> I don't think WB_SYNC_NONE tells it not to block, it just says not to
>> wait for it... So this won't work as-is.
>
> Yes, but I think an asynchronous writex-back is better than simply
> return EAGAIN. By using __filemap_fdatawrite_range to trigger a
> writeback, subsequent retries may have a higher chance of success.

And what's the application supposed to do, just hammer on the same
IOCB_NOWAIT submission until it then succeeds? The only way this can
reasonably work for that would be if yo can do:

1) Issue IOCB_NOWAIT IO
2) Get -EAGAIN
3) Sync kick off writeback, wait for it to be done
4) Issue IOCB_NOWAIT IO again
5) Success

If you just kick it off, then you'd repeat steps 1..2 ad nauseam until
it works out, not tenable.

And this doesn't even include the other point I mentioned, which is
__filemap_fdatawrite_range() IO issue blocking in the first place.

So no, NAK on this patch.

--
Jens Axboe