2022-06-10 14:47:27

by Matthew Wilcox

Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()

On Fri, Jun 10, 2022 at 03:34:11PM +0100, Matthew Wilcox wrote:
> On Mon, Jun 06, 2022 at 09:10:03AM +0800, Yu Kuai wrote:
> > On 2022/06/03 2:30, Matthew Wilcox wrote:
> > > On Thu, Jun 02, 2022 at 04:21:29PM +0800, Yu Kuai wrote:
> > > > In filemap_read(), 'ra->prev_pos' is set to 'iocb->ki_pos + copied',
> > > > while it should be 'iocb->ki_pos'.
> > >
> > > Can you walk me through your reasoning which leads you to believe that
> > > it should be ki_pos instead of ki_pos + copied? As I understand it,
> > > prev_pos is the end of the previous read, not the beginning of the
> > > previous read.
> >
> > Hi, Matthew
> >
> > The main reason is the following check in filemap_read():
> >
> > if (iocb->ki_pos >> PAGE_SHIFT != -> current page
> > ra->prev_pos >> PAGE_SHIFT) -> previous page
> > folio_mark_accessed(fbatch.folios[0]);
> >
> > Which means if current page is the same as previous page, don't mark
> > page accessed. However, prev_pos is set to 'ki_pos + copied' during last
> > read, which will cause 'prev_pos >> PAGE_SHIFT' to be current page
> > instead of previous page.
> >
> > I was thinking that if prev_pos is set to the beginning of the previous
> > read, 'prev_pos >> PAGE_SHIFT' will be the previous page as expected.
> > Setting it to the end of the previous read is ok too; however, I think
> > the calculation of the previous page should then be
> > '(prev_pos - 1) >> PAGE_SHIFT' instead.
>
> OK, I think Kent broke this in 723ef24b9b37 ("mm/filemap.c: break
> generic_file_buffered_read up into multiple functions"). Before:
>
> - prev_index = ra->prev_pos >> PAGE_SHIFT;
> - prev_offset = ra->prev_pos & (PAGE_SIZE-1);
> ...
> - if (prev_index != index || offset != prev_offset)
> - mark_page_accessed(page);
>
> After:
> + if (iocb->ki_pos >> PAGE_SHIFT != ra->prev_pos >> PAGE_SHIFT)
> + mark_page_accessed(page);
>
> So surely this should have been:
>
> + if (iocb->ki_pos != ra->prev_pos)
> + mark_page_accessed(page);
>
> Kent, do you recall why you changed it the way you did?

Oh, and if this is the right diagnosis, then this is the fix for the
current tree:

+++ b/mm/filemap.c
@@ -2673,8 +2673,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
* When a sequential read accesses a page several times, only
* mark it as accessed the first time.
*/
- if (iocb->ki_pos >> PAGE_SHIFT !=
- ra->prev_pos >> PAGE_SHIFT)
+ if (iocb->ki_pos != ra->prev_pos)
folio_mark_accessed(fbatch.folios[0]);

for (i = 0; i < folio_batch_count(&fbatch); i++) {
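
For readers following the "Before" and "So surely" snippets quoted above: comparing the page index and the in-page offset separately is equivalent to comparing the two byte positions directly, since index and offset together reconstruct the position. The following is only a user-space sketch under stated assumptions: 4 KiB pages, and the names DEMO_PAGE_SHIFT, DEMO_PAGE_SIZE, differs_split() and differs_direct() are made up for illustration and are not kernel symbols. It models the check for the first page of a read only, not the per-page loop of the old code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Illustrative names, not the kernel macros. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((int64_t)1 << DEMO_PAGE_SHIFT)

/* Pre-refactor style: split both positions into (index, offset) and compare. */
static bool differs_split(int64_t pos, int64_t prev_pos)
{
	int64_t index = pos >> DEMO_PAGE_SHIFT;
	int64_t offset = pos & (DEMO_PAGE_SIZE - 1);
	int64_t prev_index = prev_pos >> DEMO_PAGE_SHIFT;
	int64_t prev_offset = prev_pos & (DEMO_PAGE_SIZE - 1);

	return prev_index != index || offset != prev_offset;
}

/* Direct comparison of the two byte positions. */
static bool differs_direct(int64_t pos, int64_t prev_pos)
{
	return pos != prev_pos;
}
```

For a strictly sequential reader, the new read starts exactly where the previous one ended, so both functions return false at the start of every read after the first.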



2022-06-10 18:28:04

by Kent Overstreet

Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()

On 6/10/22 10:36, Matthew Wilcox wrote:
> On Fri, Jun 10, 2022 at 03:34:11PM +0100, Matthew Wilcox wrote:
>> On Mon, Jun 06, 2022 at 09:10:03AM +0800, Yu Kuai wrote:
>>> On 2022/06/03 2:30, Matthew Wilcox wrote:
>>>> On Thu, Jun 02, 2022 at 04:21:29PM +0800, Yu Kuai wrote:
>>>>> In filemap_read(), 'ra->prev_pos' is set to 'iocb->ki_pos + copied',
>>>>> while it should be 'iocb->ki_pos'.
>>>>
>>>> Can you walk me through your reasoning which leads you to believe that
>>>> it should be ki_pos instead of ki_pos + copied? As I understand it,
>>>> prev_pos is the end of the previous read, not the beginning of the
>>>> previous read.
>>>
>>> Hi, Matthew
>>>
>>> The main reason is the following check in filemap_read():
>>>
>>> if (iocb->ki_pos >> PAGE_SHIFT != -> current page
>>> ra->prev_pos >> PAGE_SHIFT) -> previous page
>>> folio_mark_accessed(fbatch.folios[0]);
>>>
>>> Which means if current page is the same as previous page, don't mark
>>> page accessed. However, prev_pos is set to 'ki_pos + copied' during last
>>> read, which will cause 'prev_pos >> PAGE_SHIFT' to be current page
>>> instead of previous page.
>>>
>>> I was thinking that if prev_pos is set to the beginning of the previous
>>> read, 'prev_pos >> PAGE_SHIFT' will be the previous page as expected.
>>> Setting it to the end of the previous read is ok too; however, I think
>>> the calculation of the previous page should then be
>>> '(prev_pos - 1) >> PAGE_SHIFT' instead.
>>
>> OK, I think Kent broke this in 723ef24b9b37 ("mm/filemap.c: break
>> generic_file_buffered_read up into multiple functions"). Before:
>>
>> - prev_index = ra->prev_pos >> PAGE_SHIFT;
>> - prev_offset = ra->prev_pos & (PAGE_SIZE-1);
>> ...
>> - if (prev_index != index || offset != prev_offset)
>> - mark_page_accessed(page);
>>
>> After:
>> + if (iocb->ki_pos >> PAGE_SHIFT != ra->prev_pos >> PAGE_SHIFT)
>> + mark_page_accessed(page);
>>
>> So surely this should have been:
>>
>> + if (iocb->ki_pos != ra->prev_pos)
>> + mark_page_accessed(page);
>>
>> Kent, do you recall why you changed it the way you did?
>
> Oh, and if this is the right diagnosis, then this is the fix for the
> current tree:
>
> +++ b/mm/filemap.c
> @@ -2673,8 +2673,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
> * When a sequential read accesses a page several times, only
> * mark it as accessed the first time.
> */
> - if (iocb->ki_pos >> PAGE_SHIFT !=
> - ra->prev_pos >> PAGE_SHIFT)
> + if (iocb->ki_pos != ra->prev_pos)
> folio_mark_accessed(fbatch.folios[0]);
>
> for (i = 0; i < folio_batch_count(&fbatch); i++) {
>
>

I think this is the fix we want - I think Yu basically had the right
idea and had the off by one fix, this should be clearer though:

Yu, can you confirm the fix?

-- >8 --
Subject: [PATCH] filemap: Fix off by one error when marking folios accessed

In filemap_read() we mark pages accessed as we read them - but we don't
want to do so redundantly, if the previous read already did so.

But there was an off by one error: we want to check if the current page
was the same as the last page we read from, but the last page we read
from was (ra->prev_pos - 1) >> PAGE_SHIFT.

Reported-by: Yu Kuai <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>

diff --git a/mm/filemap.c b/mm/filemap.c
index 9daeaab360..8d5c8043cb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2704,7 +2704,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
* mark it as accessed the first time.
*/
if (iocb->ki_pos >> PAGE_SHIFT !=
- ra->prev_pos >> PAGE_SHIFT)
+ (ra->prev_pos - 1) >> PAGE_SHIFT)
folio_mark_accessed(fbatch.folios[0]);

for (i = 0; i < folio_batch_count(&fbatch); i++) {
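
To make the off-by-one concrete, here is a small user-space sketch of the two conditions from the patch above. It assumes 4 KiB pages; DEMO_PAGE_SHIFT and both function names are hypothetical and exist only for this illustration. ra->prev_pos is modelled as the byte just past the end of the previous read, as described earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Illustrative name, not the kernel macro. */
#define DEMO_PAGE_SHIFT 12

/*
 * Original condition: compares the page of the new read's start with the
 * page containing prev_pos, i.e. the byte one past the previous read.
 */
static bool marks_accessed_old(int64_t ki_pos, int64_t prev_pos)
{
	return (ki_pos >> DEMO_PAGE_SHIFT) != (prev_pos >> DEMO_PAGE_SHIFT);
}

/*
 * Patched condition: compares against the page holding the last byte that
 * the previous read actually touched.
 */
static bool marks_accessed_fixed(int64_t ki_pos, int64_t prev_pos)
{
	return (ki_pos >> DEMO_PAGE_SHIFT) !=
	       ((prev_pos - 1) >> DEMO_PAGE_SHIFT);
}
```

Suppose the previous read covered bytes 0..4095, so prev_pos is 4096, and the next read starts at ki_pos 4096. The old condition sees page 1 on both sides and skips the mark, although page 1 has not been read before; the patched condition compares page 1 with page 0 and marks it. When the previous read stopped mid-page (prev_pos 100, ki_pos 100), both conditions skip the mark.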

2022-06-10 18:49:09

by Matthew Wilcox

Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()

On Fri, Jun 10, 2022 at 01:47:02PM -0400, Kent Overstreet wrote:
> I think this is the fix we want - I think Yu basically had the right idea
> and had the off by one fix, this should be clearer though:
>
> Yu, can you confirm the fix?
>
> -- >8 --
> Subject: [PATCH] filemap: Fix off by one error when marking folios accessed
>
> In filemap_read() we mark pages accessed as we read them - but we don't
> want to do so redundantly, if the previous read already did so.
>
> But there was an off by one error: we want to check if the current page
> was the same as the last page we read from, but the last page we read
> from was (ra->prev_pos - 1) >> PAGE_SHIFT.
>
> Reported-by: Yu Kuai <[email protected]>
> Signed-off-by: Kent Overstreet <[email protected]>
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 9daeaab360..8d5c8043cb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2704,7 +2704,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
> * mark it as accessed the first time.
> */
> if (iocb->ki_pos >> PAGE_SHIFT !=
> - ra->prev_pos >> PAGE_SHIFT)
> + (ra->prev_pos - 1) >> PAGE_SHIFT)
> folio_mark_accessed(fbatch.folios[0]);
>
> for (i = 0; i < folio_batch_count(&fbatch); i++) {
>

This is going to mark the folio as accessed multiple times if it's
a multi-page folio. How about this one?


diff --git a/mm/filemap.c b/mm/filemap.c
index 5f227b5420d7..a30587f2e598 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2599,6 +2599,13 @@ static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
return err;
}

+static inline bool pos_same_folio(loff_t pos1, loff_t pos2, struct folio *folio)
+{
+ unsigned int shift = folio_shift(folio);
+
+ return (pos1 >> shift == pos2 >> shift);
+}
+
/**
* filemap_read - Read data from the page cache.
* @iocb: The iocb to read.
@@ -2670,11 +2677,11 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
writably_mapped = mapping_writably_mapped(mapping);

/*
- * When a sequential read accesses a page several times, only
+ * When a read accesses the same folio several times, only
* mark it as accessed the first time.
*/
- if (iocb->ki_pos >> PAGE_SHIFT !=
- ra->prev_pos >> PAGE_SHIFT)
+ if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
+ fbatch.folios[0]))
folio_mark_accessed(fbatch.folios[0]);

for (i = 0; i < folio_batch_count(&fbatch); i++) {
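
The difference between the per-page comparison and the per-folio comparison can be sketched in user space as follows. This is an illustration under assumptions, not the kernel code: pages are taken to be 4 KiB, and demo_folio_shift() and demo_pos_same_folio() are hypothetical stand-ins that take a folio order or shift directly instead of a struct folio. In the kernel, folio_shift() is PAGE_SHIFT plus the folio's order, which is what demo_folio_shift() mimics.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Illustrative name, not the kernel macro. */
#define DEMO_PAGE_SHIFT 12

/* Stand-in for folio_shift(): log2 of the folio size in bytes. */
static unsigned int demo_folio_shift(unsigned int order)
{
	return DEMO_PAGE_SHIFT + order;
}

/* Same comparison as the helper in the patch, with the shift passed in. */
static bool demo_pos_same_folio(int64_t pos1, int64_t pos2, unsigned int shift)
{
	return (pos1 >> shift) == (pos2 >> shift);
}
```

Take an order-2 (16 KiB) folio and a previous read that ended at byte 4096. A new read at ki_pos 4096 is compared with prev_pos - 1 = 4095: both fall in folio 0, so the folio is not marked a second time, whereas the per-page check would see page 1 versus page 0 and mark it again. Once the read position reaches 16384, it is compared with 16383, the folio index changes, and the next folio gets marked.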

2022-06-10 18:53:16

by Kent Overstreet

Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()

On 6/10/22 10:36, Matthew Wilcox wrote:
> On Fri, Jun 10, 2022 at 03:34:11PM +0100, Matthew Wilcox wrote:
>> On Mon, Jun 06, 2022 at 09:10:03AM +0800, Yu Kuai wrote:
>>> On 2022/06/03 2:30, Matthew Wilcox wrote:
>>>> On Thu, Jun 02, 2022 at 04:21:29PM +0800, Yu Kuai wrote:
>>>>> In filemap_read(), 'ra->prev_pos' is set to 'iocb->ki_pos + copied',
>>>>> while it should be 'iocb->ki_pos'.
>>>>
>>>> Can you walk me through your reasoning which leads you to believe that
>>>> it should be ki_pos instead of ki_pos + copied? As I understand it,
>>>> prev_pos is the end of the previous read, not the beginning of the
>>>> previous read.
>>>
>>> Hi, Matthew
>>>
>>> The main reason is the following check in filemap_read():
>>>
>>> if (iocb->ki_pos >> PAGE_SHIFT != -> current page
>>> ra->prev_pos >> PAGE_SHIFT) -> previous page
>>> folio_mark_accessed(fbatch.folios[0]);
>>>
>>> Which means if current page is the same as previous page, don't mark
>>> page accessed. However, prev_pos is set to 'ki_pos + copied' during last
>>> read, which will cause 'prev_pos >> PAGE_SHIFT' to be current page
>>> instead of previous page.
>>>
>>> I was thinking that if prev_pos is set to the beginning of the previous
>>> read, 'prev_pos >> PAGE_SHIFT' will be the previous page as expected.
>>> Setting it to the end of the previous read is ok too; however, I think
>>> the calculation of the previous page should then be
>>> '(prev_pos - 1) >> PAGE_SHIFT' instead.
>>
>> OK, I think Kent broke this in 723ef24b9b37 ("mm/filemap.c: break
>> generic_file_buffered_read up into multiple functions"). Before:
>>
>> - prev_index = ra->prev_pos >> PAGE_SHIFT;
>> - prev_offset = ra->prev_pos & (PAGE_SIZE-1);
>> ...
>> - if (prev_index != index || offset != prev_offset)
>> - mark_page_accessed(page);
>>
>> After:
>> + if (iocb->ki_pos >> PAGE_SHIFT != ra->prev_pos >> PAGE_SHIFT)
>> + mark_page_accessed(page);
>>
>> So surely this should have been:
>>
>> + if (iocb->ki_pos != ra->prev_pos)
>> + mark_page_accessed(page);
>>
>> Kent, do you recall why you changed it the way you did?

So the idea was that if we're reading from a different _page_ than the one
we read from previously, we should mark it accessed. But there's an
off-by-one error; it should have been

if (iocb->ki_pos >> PAGE_SHIFT != (ra->prev_pos - 1) >> PAGE_SHIFT)
folio_mark_accessed(fbatch.folios[0]);

It looks like this is what Yu Kuai was arriving at too when he was saying
ki_pos + copied - 1; this is just a cleaner way of writing it :)

2022-06-10 19:15:35

by Kent Overstreet

Subject: Re: [PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()

On 6/10/22 14:34, Matthew Wilcox wrote:
> On Fri, Jun 10, 2022 at 01:47:02PM -0400, Kent Overstreet wrote:
>> I think this is the fix we want - I think Yu basically had the right idea
>> and had the off by one fix, this should be clearer though:
>>
>> Yu, can you confirm the fix?
>>
>> -- >8 --
>> Subject: [PATCH] filemap: Fix off by one error when marking folios accessed
>>
>> In filemap_read() we mark pages accessed as we read them - but we don't
>> want to do so redundantly, if the previous read already did so.
>>
>> But there was an off by one error: we want to check if the current page
>> was the same as the last page we read from, but the last page we read
>> from was (ra->prev_pos - 1) >> PAGE_SHIFT.
>>
>> Reported-by: Yu Kuai <[email protected]>
>> Signed-off-by: Kent Overstreet <[email protected]>
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 9daeaab360..8d5c8043cb 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2704,7 +2704,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>> * mark it as accessed the first time.
>> */
>> if (iocb->ki_pos >> PAGE_SHIFT !=
>> - ra->prev_pos >> PAGE_SHIFT)
>> + (ra->prev_pos - 1) >> PAGE_SHIFT)
>> folio_mark_accessed(fbatch.folios[0]);
>>
>> for (i = 0; i < folio_batch_count(&fbatch); i++) {
>>
>
> This is going to mark the folio as accessed multiple times if it's
> a multi-page folio. How about this one?

I like that one - you can add my Reviewed-by

>
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 5f227b5420d7..a30587f2e598 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2599,6 +2599,13 @@ static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
> return err;
> }
>
> +static inline bool pos_same_folio(loff_t pos1, loff_t pos2, struct folio *folio)
> +{
> + unsigned int shift = folio_shift(folio);
> +
> + return (pos1 >> shift == pos2 >> shift);
> +}
> +
> /**
> * filemap_read - Read data from the page cache.
> * @iocb: The iocb to read.
> @@ -2670,11 +2677,11 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
> writably_mapped = mapping_writably_mapped(mapping);
>
> /*
> - * When a sequential read accesses a page several times, only
> + * When a read accesses the same folio several times, only
> * mark it as accessed the first time.
> */
> - if (iocb->ki_pos >> PAGE_SHIFT !=
> - ra->prev_pos >> PAGE_SHIFT)
> + if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
> + fbatch.folios[0]))
> folio_mark_accessed(fbatch.folios[0]);
>
> for (i = 0; i < folio_batch_count(&fbatch); i++) {