2023-09-26 04:46:51

by Zach O'Keefe

[permalink] [raw]
Subject: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

The 6.0 commits:

commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")

merged "can we have THPs in this VMA?" logic that was previously done
separately by fault-path, khugepaged, and smaps "THPeligible" checks.

During the process, the semantics of the fault path check changed in two
ways:

1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
handler that could satisfy the fault. Previously, this check had been
done in create_huge_pud() and create_huge_pmd() routines, but after
the changes, we never reach those routines.

During the review of the above commits, it was determined that in-tree
users weren't affected by the change; most notably, since the only relevant
user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
explicitly approved early in approval logic. However, this was a bad
assumption to make as it assumes the only reason to support ->huge_fault
was for DAX (which is not true in general).

Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
any ->huge_fault handler a chance to handle the fault. Note that we
don't validate the file mode or mapping alignment, which is consistent
with the behavior before the aforementioned commits.

Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
Reported-by: Saurabh Singh Sengar <[email protected]>
Signed-off-by: Zach O'Keefe <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: David Hildenbrand <[email protected]>
---
I've updated the changelog to reflect discussions in [1] -- leaving
ack to David / Matthew on whether to take the patch.

Changed from v3[1]:
- [akpm / David H. / M. Wilcox] Updated log to capture email discussion
Changed from v2[2]:
- Fixed false negative in smaps check when !dax && ->huge_fault
Changed from v1[3]:
- [Saurabhi] Allow ->huge_fault handler to handle fault, if it exists

[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/[email protected]/
[3] https://lore.kernel.org/linux-mm/CAAa6QmQw+F=o6htOn=6ADD6mwvMO=Ow_67f3ifBv3GpXx9Xg_g@mail.gmail.com/

---
mm/huge_memory.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0f93a73115f73..797fe617e51ab 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
return in_pf;

/*
- * Special VMA and hugetlb VMA.
+ * khugepaged special VMA and hugetlb VMA.
* Must be checked after dax since some dax mappings may have
* VM_MIXEDMAP set.
*/
- if (vm_flags & VM_NO_KHUGEPAGED)
+ if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED))
return false;

/*
@@ -132,12 +132,18 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
!hugepage_flags_always())))
return false;

- /* Only regular file is valid */
- if (!in_pf && file_thp_enabled(vma))
- return true;
-
- if (!vma_is_anonymous(vma))
+ if (!vma_is_anonymous(vma)) {
+ /*
+ * Trust that ->huge_fault() handlers know what they are doing
+ * in fault path.
+ */
+ if (((in_pf || smaps)) && vma->vm_ops->huge_fault)
+ return true;
+ /* Only regular file is valid in collapse path */
+ if (((!in_pf || smaps)) && file_thp_enabled(vma))
+ return true;
return false;
+ }

if (vma_is_temporary_stack(vma))
return false;
--
2.42.0.515.g380fc7ccd1-goog


2023-09-27 01:50:07

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Mon, Sep 25, 2023 at 1:01 PM Zach O'Keefe <[email protected]> wrote:
>
> The 6.0 commits:
>
> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
>
> merged "can we have THPs in this VMA?" logic that was previously done
> separately by fault-path, khugepaged, and smaps "THPeligible" checks.
>
> During the process, the semantics of the fault path check changed in two
> ways:
>
> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
> handler that could satisfy the fault. Previously, this check had been
> done in create_huge_pud() and create_huge_pmd() routines, but after
> the changes, we never reach those routines.
>
> During the review of the above commits, it was determined that in-tree
> users weren't affected by the change; most notably, since the only relevant
> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> explicitly approved early in approval logic. However, this was a bad
> assumption to make as it assumes the only reason to support ->huge_fault
> was for DAX (which is not true in general).
>
> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> any ->huge_fault handler a chance to handle the fault. Note that we
> don't validate the file mode or mapping alignment, which is consistent
> with the behavior before the aforementioned commits.

Looks good to me. Reviewed-by: Yang Shi <[email protected]>

>
> Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> Reported-by: Saurabh Singh Sengar <[email protected]>
> Signed-off-by: Zach O'Keefe <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> ---
> I've updated the changelog to reflect discussions in [1] -- leaving
> ack to David / Matthew on whether to take the patch.
>
> Changed from v3[1]:
> - [akpm / David H. / M. Wilcox] Updated log to capture email discussion
> Changed from v2[2]:
> - Fixed false negative in smaps check when !dax && ->huge_fault
> Changed from v1[3]:
> - [Saurabhi] Allow ->huge_fault handler to handle fault, if it exists
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
> [2] https://lore.kernel.org/linux-mm/[email protected]/
> [3] https://lore.kernel.org/linux-mm/CAAa6QmQw+F=o6htOn=6ADD6mwvMO=Ow_67f3ifBv3GpXx9Xg_g@mail.gmail.com/
>
> ---
> mm/huge_memory.c | 20 +++++++++++++-------
> 1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0f93a73115f73..797fe617e51ab 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
> return in_pf;
>
> /*
> - * Special VMA and hugetlb VMA.
> + * khugepaged special VMA and hugetlb VMA.
> * Must be checked after dax since some dax mappings may have
> * VM_MIXEDMAP set.
> */
> - if (vm_flags & VM_NO_KHUGEPAGED)
> + if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED))
> return false;
>
> /*
> @@ -132,12 +132,18 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
> !hugepage_flags_always())))
> return false;
>
> - /* Only regular file is valid */
> - if (!in_pf && file_thp_enabled(vma))
> - return true;
> -
> - if (!vma_is_anonymous(vma))
> + if (!vma_is_anonymous(vma)) {
> + /*
> + * Trust that ->huge_fault() handlers know what they are doing
> + * in fault path.
> + */
> + if (((in_pf || smaps)) && vma->vm_ops->huge_fault)
> + return true;
> + /* Only regular file is valid in collapse path */
> + if (((!in_pf || smaps)) && file_thp_enabled(vma))
> + return true;
> return false;
> + }
>
> if (vma_is_temporary_stack(vma))
> return false;
> --
> 2.42.0.515.g380fc7ccd1-goog
>

2023-10-06 17:50:30

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <[email protected]> wrote:

> The 6.0 commits:
>
> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
>
> merged "can we have THPs in this VMA?" logic that was previously done
> separately by fault-path, khugepaged, and smaps "THPeligible" checks.
>
> During the process, the semantics of the fault path check changed in two
> ways:
>
> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
> handler that could satisfy the fault. Previously, this check had been
> done in create_huge_pud() and create_huge_pmd() routines, but after
> the changes, we never reach those routines.
>
> During the review of the above commits, it was determined that in-tree
> users weren't affected by the change; most notably, since the only relevant
> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> explicitly approved early in approval logic. However, this was a bad
> assumption to make as it assumes the only reason to support ->huge_fault
> was for DAX (which is not true in general).
>
> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> any ->huge_fault handler a chance to handle the fault. Note that we
> don't validate the file mode or mapping alignment, which is consistent
> with the behavior before the aforementioned commits.

It's unclear what are the userspace visible impacts of this change.
Which makes it hard for others to determine whether -stable kernels
should be patched.

> Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> Reported-by: Saurabh Singh Sengar <[email protected]>

It's nice to include a Closes: link after a Reported-by:. Then readers
are better able to answer the above question.

> Signed-off-by: Zach O'Keefe <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: David Hildenbrand <[email protected]>

2023-10-06 17:58:33

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <[email protected]> wrote:

> The 6.0 commits:
>
> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
>
> merged "can we have THPs in this VMA?" logic that was previously done
> separately by fault-path, khugepaged, and smaps "THPeligible" checks.
>
> During the process, the semantics of the fault path check changed in two
> ways:
>
> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
> handler that could satisfy the fault. Previously, this check had been
> done in create_huge_pud() and create_huge_pmd() routines, but after
> the changes, we never reach those routines.
>
> During the review of the above commits, it was determined that in-tree
> users weren't affected by the change; most notably, since the only relevant
> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> explicitly approved early in approval logic. However, this was a bad
> assumption to make as it assumes the only reason to support ->huge_fault
> was for DAX (which is not true in general).
>
> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> any ->huge_fault handler a chance to handle the fault. Note that we
> don't validate the file mode or mapping alignment, which is consistent
> with the behavior before the aforementioned commits.
>
> ...
>
> @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
> return in_pf;
>

Ryan's "mm: thp: introduce anon_orders and anon_always_mask sysfs
files" changes hugepage_vma_check() to return an unsigned int, so this
patch will need some rework to fit in after that.

However Ryan's overall series "variable-order, large folios for
anonymous memory" is in early days and might not make it.

And as I don't know what is the urgency of this patch ("mm/thp: fix
"mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which
patch needs to come first (thus requiring rework of the other patch).

Please discuss!

2023-10-06 18:54:46

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On 06.10.23 19:58, Andrew Morton wrote:
> On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <[email protected]> wrote:
>
>> The 6.0 commits:
>>
>> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
>> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
>>
>> merged "can we have THPs in this VMA?" logic that was previously done
>> separately by fault-path, khugepaged, and smaps "THPeligible" checks.
>>
>> During the process, the semantics of the fault path check changed in two
>> ways:
>>
>> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
>> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
>> handler that could satisfy the fault. Previously, this check had been
>> done in create_huge_pud() and create_huge_pmd() routines, but after
>> the changes, we never reach those routines.
>>
>> During the review of the above commits, it was determined that in-tree
>> users weren't affected by the change; most notably, since the only relevant
>> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
>> explicitly approved early in approval logic. However, this was a bad
>> assumption to make as it assumes the only reason to support ->huge_fault
>> was for DAX (which is not true in general).
>>
>> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
>> any ->huge_fault handler a chance to handle the fault. Note that we
>> don't validate the file mode or mapping alignment, which is consistent
>> with the behavior before the aforementioned commits.
>>
>> ...
>>
>> @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
>> return in_pf;
>>
>
> Ryan's "mm: thp: introduce anon_orders and anon_always_mask sysfs
> files" changes hugepage_vma_check() to return an unsigned int, so this
> patch will need some rework to fit in after that.
>
> However Ryan's overall series "variable-order, large folios for
> anonymous memory" is in early days and might not make it.
>
> And as I don't know what is the urgency of this patch ("mm/thp: fix
> "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which
> patch needs to come first (thus requiring rework of the other patch).
>
> Please discuss!

IMHO clearly this one.

--
Cheers,

David / dhildenb

2023-10-06 18:57:08

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On 25.09.23 22:01, Zach O'Keefe wrote:
> The 6.0 commits:
>
> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
>
> merged "can we have THPs in this VMA?" logic that was previously done
> separately by fault-path, khugepaged, and smaps "THPeligible" checks.
>
> During the process, the semantics of the fault path check changed in two
> ways:
>
> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
> handler that could satisfy the fault. Previously, this check had been
> done in create_huge_pud() and create_huge_pmd() routines, but after
> the changes, we never reach those routines.
>
> During the review of the above commits, it was determined that in-tree
> users weren't affected by the change; most notably, since the only relevant
> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> explicitly approved early in approval logic. However, this was a bad
> assumption to make as it assumes the only reason to support ->huge_fault
> was for DAX (which is not true in general).
>
> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> any ->huge_fault handler a chance to handle the fault. Note that we
> don't validate the file mode or mapping alignment, which is consistent
> with the behavior before the aforementioned commits.
>
> Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> Reported-by: Saurabh Singh Sengar <[email protected]>
> Signed-off-by: Zach O'Keefe <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> ---
> I've updated the changelog to reflect discussions in [1] -- leaving
> ack to David / Matthew on whether to take the patch.

Works for me.

Acked-by: David Hildenbrand <[email protected]>

--
Cheers,

David / dhildenb

2023-10-06 19:11:28

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <[email protected]> wrote:

> > And as I don't know what is the urgency of this patch ("mm/thp: fix
> > "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which
> > patch needs to come first (thus requiring rework of the other patch).
> >
> > Please discuss!
>
> IMHO clearly this one.

OK. I'll drop the "variable-order, large folios for anonymous memory" v6
series for now.

2023-10-06 21:28:24

by Ryan Roberts

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On 06/10/2023 20:11, Andrew Morton wrote:
> On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <[email protected]> wrote:
>
>>> And as I don't know what is the urgency of this patch ("mm/thp: fix
>>> "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which
>>> patch needs to come first (thus requiring rework of the other patch).
>>>
>>> Please discuss!
>>
>> IMHO clearly this one.
>
> OK. I'll drop the "variable-order, large folios for anonymous memory" v6
> series for now.

Yep, agreed!

2023-10-09 13:23:04

by Zach O'Keefe

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Fri, Oct 6, 2023 at 10:50 AM Andrew Morton <[email protected]> wrote:
>
> On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <[email protected]> wrote:
>
> > The 6.0 commits:
> >
> > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> >
> > merged "can we have THPs in this VMA?" logic that was previously done
> > separately by fault-path, khugepaged, and smaps "THPeligible" checks.
> >
> > During the process, the semantics of the fault path check changed in two
> > ways:
> >
> > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
> > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
> > handler that could satisfy the fault. Previously, this check had been
> > done in create_huge_pud() and create_huge_pmd() routines, but after
> > the changes, we never reach those routines.
> >
> > During the review of the above commits, it was determined that in-tree
> > users weren't affected by the change; most notably, since the only relevant
> > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> > explicitly approved early in approval logic. However, this was a bad
> > assumption to make as it assumes the only reason to support ->huge_fault
> > was for DAX (which is not true in general).
> >
> > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> > any ->huge_fault handler a chance to handle the fault. Note that we
> > don't validate the file mode or mapping alignment, which is consistent
> > with the behavior before the aforementioned commits.
>
> It's unclear what are the userspace visible impacts of this change.
> Which makes it hard for others to determine whether -stable kernels
> should be patched.

IMO, I don't think this change is suitable for -stable; the only users
that would have been affected are those that maintain out-of-tree
drivers / code that hooked into ->huge_fault() or used VM_MIXEDMAP +
THP. No users of the in-tree kernel would have been affected. It's
still a good "fix" to make going forward (and certainly happy to be
able to help Saurabh out).

+ greg k-h for vis / to confirm.

> > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> > Reported-by: Saurabh Singh Sengar <[email protected]>
>
> It's nice to include a Closes: link after a Reported-by:. Then readers
> are better able to answer the above question.

Ah, apologies, Andrew; I didn't know such a tag existed -- I'll be
sure to include it in the future.

> > Signed-off-by: Zach O'Keefe <[email protected]>
> > Cc: Yang Shi <[email protected]>
> > Cc: Matthew Wilcox <[email protected]>
> > Cc: David Hildenbrand <[email protected]>
>

2023-10-09 13:23:58

by Zach O'Keefe

[permalink] [raw]
Subject: Re: [PATCH v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Fri, Oct 6, 2023 at 2:28 PM Ryan Roberts <[email protected]> wrote:
>
> On 06/10/2023 20:11, Andrew Morton wrote:
> > On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <[email protected]> wrote:
> >
> >>> And as I don't know what is the urgency of this patch ("mm/thp: fix
> >>> "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which
> >>> patch needs to come first (thus requiring rework of the other patch).
> >>>
> >>> Please discuss!
> >>
> >> IMHO clearly this one.
> >
> > OK. I'll drop the "variable-order, large folios for anonymous memory" v6
> > series for now.
>
> Yep, agreed!

Thank all and sorry for the late response ; have been buried as of late.

Best,
Zach