2023-09-20 18:42:06

by Saurabh Singh Sengar

Subject: Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Tue, Sep 05, 2023 at 11:58:17PM -0700, Saurabh Singh Sengar wrote:
> On Fri, Aug 25, 2023 at 08:09:07AM -0700, Zach O'Keefe wrote:
> > On Fri, Aug 25, 2023 at 5:58 AM David Hildenbrand <[email protected]> wrote:
> > >
> > > On 25.08.23 14:49, Matthew Wilcox wrote:
> > > > On Fri, Aug 25, 2023 at 09:59:23AM +0200, David Hildenbrand wrote:
> > > >> Especially, we do have bigger ->huge_fault changes coming up:
> > > >>
> > > >> https://lkml.kernel.org/r/[email protected]
> >
> > FWIW, one of those patches updates the docs to read,
> >
> > "->huge_fault() is called when there is no PUD or PMD entry present. This
> > gives the filesystem the opportunity to install a PUD or PMD sized page.
> > Filesystems can also use the ->fault method to return a PMD sized page,
> > so implementing this function may not be necessary. In particular,
> > filesystems should not call filemap_fault() from ->huge_fault(). [..]"
> >
> > Which won't work (in the general case) without this patch (well, at
> > least the ->huge_fault() check part).
> >
> > So, if we're advertising this is the way it works, maybe that gives a
> > stronger argument for addressing it sooner vs when the first in-tree
> > user depends on it?
> >
> > > >> If the driver is not in the tree, people don't care.
> > > >>
> > > >> You really should try upstreaming that driver.
> > > >>
> > > >>
> > > >> So this patch here adds complexity (which I don't like) in order to keep an
> > > >> OOT driver working -- possibly for a short time. I'm tempted to say "please
> > > >> fix your driver to not use huge faults in that scenario, it is no longer
> > > >> supported".
> > > >>
> > > >> But I'm just about to vanish for 1.5 weeks into vacation :)
> > > >>
> > > >> @Willy, what are your thoughts?
> > > >
> > > > Fundamentally there was a bad assumption with the original patch --
> > > > it assumed that the only reason to support ->huge_fault was for DAX,
> > > > and that's not true. It's just that the only drivers in-tree which
> > > > support ->huge_fault do so in order to support DAX.
> > >
> > > Okay, and we are willing to continue supporting that then and it's
> > > nothing we want to stop OOT drivers from doing.
> > >
> > > Fine with me; we should probably reflect that in the patch description.
> >
> > I can change these paragraphs,
> >
> > "During the review of the above commits, it was determined that in-tree
> > users weren't affected by the change; most notably, since the only relevant
> > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> > explicitly approved early in approval logic. However, there is at least
> > one occurrence where an out-of-tree driver that used
> > VM_HUGEPAGE|VM_MIXEDMAP with a vm_ops->huge_fault handler, was broken.
> >
> > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> > any ->huge_fault handler a chance to handle the fault. Note that we
> > don't validate the file mode or mapping alignment, which is consistent
> > with the behavior before the aforementioned commits."
> >
> > To read,
> >
> > "The above commits, however, overfit the existing in-tree use cases
> > and assume that the only reason to support ->huge_fault is DAX (which
> > is explicitly approved early in the approval logic). This is a bad
> > assumption to make and unnecessarily prevents general support of
> > ->huge_fault by filesystems. Allow returning "true" if such a handler
> > exists, giving the fault path an opportunity to exercise it.
> >
> > Similarly, the rationale for including the VM_NO_KHUGEPAGED check
> > along the fault path was that it didn't alter any in-tree users, but
> > it was likewise unnecessarily restrictive (and reads oddly).
> > Remove the check from the fault path."
> >
>
>
> Any chance this can make it into the 6.6 kernel?

ping

>
> - Saurabh


2023-09-22 21:26:41

by Yang Shi

Subject: Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

On Tue, Sep 19, 2023 at 10:44 PM Saurabh Singh Sengar
<[email protected]> wrote:
>
> On Tue, Sep 05, 2023 at 11:58:17PM -0700, Saurabh Singh Sengar wrote:
> > On Fri, Aug 25, 2023 at 08:09:07AM -0700, Zach O'Keefe wrote:
> > > On Fri, Aug 25, 2023 at 5:58 AM David Hildenbrand <[email protected]> wrote:
> > > >
> > > > On 25.08.23 14:49, Matthew Wilcox wrote:
> > > > > On Fri, Aug 25, 2023 at 09:59:23AM +0200, David Hildenbrand wrote:
> > > > >> Especially, we do have bigger ->huge_fault changes coming up:
> > > > >>
> > > > >> https://lkml.kernel.org/r/[email protected]
> > >
> > > FWIW, one of those patches updates the docs to read,
> > >
> > > "->huge_fault() is called when there is no PUD or PMD entry present. This
> > > gives the filesystem the opportunity to install a PUD or PMD sized page.
> > > Filesystems can also use the ->fault method to return a PMD sized page,
> > > so implementing this function may not be necessary. In particular,
> > > filesystems should not call filemap_fault() from ->huge_fault(). [..]"
> > >
> > > Which won't work (in the general case) without this patch (well, at
> > > least the ->huge_fault() check part).
> > >
> > > So, if we're advertising this is the way it works, maybe that gives a
> > > stronger argument for addressing it sooner vs when the first in-tree
> > > user depends on it?
> > >
> > > > >> If the driver is not in the tree, people don't care.
> > > > >>
> > > > >> You really should try upstreaming that driver.
> > > > >>
> > > > >>
> > > > >> So this patch here adds complexity (which I don't like) in order to keep an
> > > > >> OOT driver working -- possibly for a short time. I'm tempted to say "please
> > > > >> fix your driver to not use huge faults in that scenario, it is no longer
> > > > >> supported".
> > > > >>
> > > > > >> But I'm just about to vanish for 1.5 weeks into vacation :)
> > > > >>
> > > > >> @Willy, what are your thoughts?
> > > > >
> > > > > Fundamentally there was a bad assumption with the original patch --
> > > > > it assumed that the only reason to support ->huge_fault was for DAX,
> > > > > and that's not true. It's just that the only drivers in-tree which
> > > > > support ->huge_fault do so in order to support DAX.
> > > >
> > > > Okay, and we are willing to continue supporting that then and it's
> > > > nothing we want to stop OOT drivers from doing.
> > > >
> > > > Fine with me; we should probably reflect that in the patch description.
> > >
> > > I can change these paragraphs,
> > >
> > > "During the review of the above commits, it was determined that in-tree
> > > users weren't affected by the change; most notably, since the only relevant
> > > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
> > > explicitly approved early in approval logic. However, there is at least
> > > one occurrence where an out-of-tree driver that used
> > > VM_HUGEPAGE|VM_MIXEDMAP with a vm_ops->huge_fault handler, was broken.
> > >
> > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give
> > > any ->huge_fault handler a chance to handle the fault. Note that we
> > > don't validate the file mode or mapping alignment, which is consistent
> > > with the behavior before the aforementioned commits."
> > >
> > > To read,
> > >
> > > "The above commits, however, overfit the existing in-tree use cases
> > > and assume that the only reason to support ->huge_fault is DAX (which
> > > is explicitly approved early in the approval logic). This is a bad
> > > assumption to make and unnecessarily prevents general support of
> > > ->huge_fault by filesystems. Allow returning "true" if such a handler
> > > exists, giving the fault path an opportunity to exercise it.
> > >
> > > Similarly, the rationale for including the VM_NO_KHUGEPAGED check
> > > along the fault path was that it didn't alter any in-tree users, but
> > > it was likewise unnecessarily restrictive (and reads oddly).
> > > Remove the check from the fault path."
> > >
> >
> >
> > Any chance this can make it into the 6.6 kernel?
>
> ping

I think we are inclined to merge this patch, but it is ultimately Andrew's call.
I have added Andrew to this thread.

>
> >
> > - Saurabh
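
For readers following the proposed commit message, the decision it describes can be modeled roughly as below. This is a userspace toy, not the kernel's code: struct toy_vma, its fields, and toy_thp_allowed() are hypothetical names made up for illustration, anonymous memory and the sysfs settings are left out, and the ordering of the checks is an assumption drawn only from the wording quoted in this thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a VMA; not the kernel's struct. */
struct toy_vma {
	bool is_dax;         /* DAX mapping, approved early per the thread */
	bool no_khugepaged;  /* models a VM_NO_KHUGEPAGED flag being set */
	bool has_huge_fault; /* models a non-NULL vm_ops->huge_fault */
	bool file_thp_ok;    /* models the regular-file checks (assumed) */
};

/*
 * Toy model of the approval decision for a file-backed mapping.
 * in_fault_path is true for a page fault, false for the collapse path.
 */
static bool toy_thp_allowed(const struct toy_vma *vma, bool in_fault_path)
{
	/* DAX is approved before anything else is looked at. */
	if (vma->is_dax)
		return true;

	/* The VM_NO_KHUGEPAGED check applies only outside the fault path. */
	if (!in_fault_path && vma->no_khugepaged)
		return false;

	/* In the fault path, any ->huge_fault handler gets its chance. */
	if (in_fault_path && vma->has_huge_fault)
		return true;

	/* Otherwise only the collapse path may proceed, on file checks. */
	return !in_fault_path && vma->file_thp_ok;
}
```

In this model, a mapping like the out-of-tree driver's (no_khugepaged and has_huge_fault both set, not DAX) is allowed in the fault path and still refused in the collapse path, which is the behavior the reworded description argues for.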