2020-10-08 09:53:27

by Christophe Leroy

Subject: mm: Question about the use of 'accessed' flags and pte_young() helper

In a 10-year-old commit
(https://github.com/linuxppc/linux/commit/d069cb4373fe0d451357c4d3769623a7564dfa9f), powerpc 8xx
made the handling of the PTE accessed bit conditional on CONFIG_SWAP.
Since then, this has been extended to some other powerpc variants.

That commit means that when CONFIG_SWAP is not selected, the accessed bit is not set by SW TLB miss
handlers, leading to pte_young() returning garbage, or should I say possibly returning false
although a page has been accessed since its accessed flag was reset.

Looking at various places in mm/, pte_young() is used independently of CONFIG_SWAP.

Is it still valid to not manage accessed flags when CONFIG_SWAP is not selected?
If yes, should pte_young() always return true in that case?
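
To make the question concrete, here is a minimal userspace sketch in C. It is not kernel code: `toy_pte_t`, the flag values, `toy_tlb_miss()` and its `manage_accessed` parameter are hypothetical names that stand in for the PTE, the accessed bit, the SW TLB miss handler and the CONFIG_SWAP condition. Only the idea behind pte_young() and pte_mkold() is taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE: a plain word with two illustrative flag bits (values are made up). */
typedef uint32_t toy_pte_t;
#define TOY_PAGE_PRESENT  0x1u
#define TOY_PAGE_ACCESSED 0x2u

/* Model of pte_young(): report whether the accessed bit is set. */
static bool toy_pte_young(toy_pte_t pte)
{
    return (pte & TOY_PAGE_ACCESSED) != 0;
}

/* Model of pte_mkold(): clear the accessed bit, as reclaim does when aging. */
static toy_pte_t toy_pte_mkold(toy_pte_t pte)
{
    return pte & ~TOY_PAGE_ACCESSED;
}

/*
 * Model of a software TLB miss handler. manage_accessed stands in for the
 * CONFIG_SWAP condition: when it is false, the accessed-bit update is skipped.
 */
static toy_pte_t toy_tlb_miss(toy_pte_t pte, bool manage_accessed)
{
    assert(pte & TOY_PAGE_PRESENT); /* the toy only models present pages */
    if (manage_accessed)
        pte |= TOY_PAGE_ACCESSED;
    return pte;
}
```

In this model, once toy_pte_mkold() has cleared the bit, a later toy_tlb_miss() with `manage_accessed` set to false leaves toy_pte_young() returning false even though the page was touched, which is the situation described above.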

While we are at it, I'm wondering whether powerpc should redefine arch_faults_on_old_pte().
On some variants of powerpc, the accessed flag is managed by HW. On others, it is managed by SW TLB
miss handlers via page fault handling.

Thanks
Christophe


2020-10-21 08:58:28

by Vlastimil Babka

Subject: Re: mm: Question about the use of 'accessed' flags and pte_young() helper

On 10/8/20 11:49 AM, Christophe Leroy wrote:
> In a 10-year-old commit
> (https://github.com/linuxppc/linux/commit/d069cb4373fe0d451357c4d3769623a7564dfa9f), powerpc 8xx
> made the handling of the PTE accessed bit conditional on CONFIG_SWAP.
> Since then, this has been extended to some other powerpc variants.
>
> That commit means that when CONFIG_SWAP is not selected, the accessed bit is not set by SW TLB miss
> handlers, leading to pte_young() returning garbage, or should I say possibly returning false
> although a page has been accessed since its accessed flag was reset.
>
> Looking at various places in mm/, pte_young() is used independently of CONFIG_SWAP.
>
> Is it still valid to not manage accessed flags when CONFIG_SWAP is not selected?

AFAIK it's wrong: reclaim needs it to detect accessed pages on the inactive list,
via page_referenced(), including file pages (page cache), where CONFIG_SWAP plays
no role. Maybe it was different 10 years ago.

> If yes, should pte_young() always return true in that case?

Ideally it should work as intended. If that is not possible, returning true is maybe
better, as returning false will lead to inactive file list thrashing.

> While we are at it, I'm wondering whether powerpc should redefine arch_faults_on_old_pte().
> On some variants of powerpc, the accessed flag is managed by HW. On others, it is managed by SW TLB
> miss handlers via page fault handling.
>
> Thanks
> Christophe
>

2020-10-21 09:12:45

by Johannes Weiner

Subject: Re: mm: Question about the use of 'accessed' flags and pte_young() helper

On Tue, Oct 20, 2020 at 05:52:07PM +0200, Vlastimil Babka wrote:
> On 10/8/20 11:49 AM, Christophe Leroy wrote:
> > In a 10-year-old commit
> > (https://github.com/linuxppc/linux/commit/d069cb4373fe0d451357c4d3769623a7564dfa9f), powerpc 8xx
> > made the handling of the PTE accessed bit conditional on CONFIG_SWAP.
> > Since then, this has been extended to some other powerpc variants.
> >
> > That commit means that when CONFIG_SWAP is not selected, the accessed bit is not set by SW TLB miss
> > handlers, leading to pte_young() returning garbage, or should I say possibly returning false
> > although a page has been accessed since its accessed flag was reset.
> >
> > Looking at various places in mm/, pte_young() is used independently of CONFIG_SWAP.
> >
> > Is it still valid to not manage accessed flags when CONFIG_SWAP is not selected?
>
> AFAIK it's wrong, reclaim needs it to detect accessed pages on inactive
> list, via page_referenced(), including file pages (page cache) where
> CONFIG_SWAP plays no role. Maybe it was different 10 years ago.

Yes, we require this bit for properly aging mmapped file pages. The
underlying assumption in the referenced commit is incorrect.

> > If yes, should pte_young() always return true in that case?
>
> It should best work as intended. If not possible, true is maybe better, as
> false will lead to inactive file list thrashing.

An unconditional true will cause mmapped file pages to be permanently
mlocked / unevictable.

Either way will break some workloads. The only good answer is the
truth :-)
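
The trade-off discussed in this thread can be illustrated with a small userspace sketch in C. This is a toy model under stated assumptions, not how the kernel's reclaim code is written: `toy_page`, `toy_policy`, `toy_referenced()` and `toy_scan_evicts()` are hypothetical names, and a single boolean stands in for the PTE accessed bit that page_referenced() would consult.

```c
#include <assert.h>
#include <stdbool.h>

/* How the toy answers "was this page referenced?" (hypothetical policies). */
enum toy_policy {
    TOY_TRUTH,        /* report the real accessed state */
    TOY_ALWAYS_FALSE, /* accessed bit never maintained */
    TOY_ALWAYS_TRUE   /* pte_young() hard-wired to true */
};

struct toy_page {
    bool accessed; /* real access since the last scan */
};

static bool toy_referenced(const struct toy_page *page, enum toy_policy policy)
{
    switch (policy) {
    case TOY_ALWAYS_FALSE:
        return false;
    case TOY_ALWAYS_TRUE:
        return true;
    default:
        return page->accessed;
    }
}

/*
 * One scan of an inactive page: returns true if the page would be evicted.
 * The real accessed state is cleared afterwards, like aging the PTE.
 */
static bool toy_scan_evicts(struct toy_page *page, enum toy_policy policy)
{
    assert(page); /* the toy requires a valid page */
    bool referenced = toy_referenced(page, policy);

    page->accessed = false;
    return !referenced;
}
```

In this model, TOY_ALWAYS_FALSE evicts a page that was just accessed (the thrashing case), TOY_ALWAYS_TRUE never evicts a page that nobody touches (the unevictable case), and only TOY_TRUTH keeps the hot page while evicting the cold one.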