Should fix the recent syzbot report for:
https://lore.kernel.org/all/[email protected]/
Patch 1 is a small fixup for a crash I noticed on mm-unstable with the new
hugetlb memcg accounting while testing these changes.
Patch 2 should be the fix for the reported issue.
Patch 3 is a one-liner adding an assertion similar to the one that caught
the issue fixed in patch 2.
Only smoke tested against a bunch of hugetlb unit tests. Reviews are welcome.
Thanks,
Peter Xu (3):
fixup! mm: always initialise folio->_deferred_list
mm/hugetlb: Fix missing hugetlb_lock for resv uncharge
mm/hugetlb: Assert hugetlb_lock in __hugetlb_cgroup_commit_charge
mm/hugetlb.c | 5 ++++-
mm/hugetlb_cgroup.c | 2 +-
mm/memcontrol.c | 1 +
3 files changed, 6 insertions(+), 2 deletions(-)
--
2.44.0
Current mm-unstable will hit this when running test_hugetlb_memcg. This
fixes the crash for me.
Signed-off-by: Peter Xu <[email protected]>
---
mm/memcontrol.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1840ba4c355d..7703ced535a3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
+ !folio_test_hugetlb(folio) &&
!list_empty(&folio->_deferred_list), folio);
/*
--
2.44.0
This is similar to __hugetlb_cgroup_uncharge_folio(), which relies on
holding hugetlb_lock. Add the same kind of assertion here, since such
checks may help catch problems some day.
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb_cgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index aa4486bd3904..e20339a346b9 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
{
if (hugetlb_cgroup_disabled() || !h_cg)
return;
-
+ lockdep_assert_held(&hugetlb_lock);
__set_hugetlb_cgroup(folio, h_cg, rsvd);
if (!rsvd) {
unsigned long usage =
--
2.44.0
There is a recent report on UFFDIO_COPY over hugetlb:
https://lore.kernel.org/all/[email protected]/
350: lockdep_assert_held(&hugetlb_lock);
This should be an issue in hugetlb itself, but it triggered in a userfault
context: it goes into the unlikely path where two threads modify the resv
map together. Mike has a fix in that path for the resv uncharge, but it
looks like the locking requirement was overlooked:
hugetlb_cgroup_uncharge_folio_rsvd() updates the cgroup pointer, so it must
be called with the lock held.
Looks like stable material, so stable is copied.
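For reference, the assertion that fired (the "350:" line quoted above) is
presumably the one in the hugetlb cgroup uncharge helper in
mm/hugetlb_cgroup.c, which looks roughly like this (a trimmed sketch, not
exact upstream code):

        /*
         * Trimmed sketch: the helper touches the folio's hugetlb cgroup
         * pointer, so it asserts hugetlb_lock.
         */
        static void __hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
                                                    struct folio *folio, bool rsvd)
        {
                if (hugetlb_cgroup_disabled())
                        return;

                lockdep_assert_held(&hugetlb_lock);
                /* ... clear the folio's cgroup pointer, update counters ... */
        }

The fix below takes hugetlb_lock around that call accordingly.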
Reported-by: [email protected]
Cc: Mina Almasry <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: linux-stable <[email protected]>
Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 26ab9dfc7d63..3158a55ce567 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
rsv_adjust = hugepage_subpool_put_pages(spool, 1);
hugetlb_acct_memory(h, -rsv_adjust);
- if (deferred_reserve)
+ if (deferred_reserve) {
+ spin_lock_irq(&hugetlb_lock);
hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
pages_per_huge_page(h), folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
}
if (!memcg_charge_ret)
--
2.44.0
On Wed, Apr 17, 2024 at 05:18:34PM -0400, Peter Xu wrote:
> Current mm-unstable will hit this when running test_hugetlb_memcg. This
> fixes the crash for me.
>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/memcontrol.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1840ba4c355d..7703ced535a3 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
>
> VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
> VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
> + !folio_test_hugetlb(folio) &&
> !list_empty(&folio->_deferred_list), folio);
Hum. I thought we didn't get here for hugetlb folios. What
stacktrace did you get?
I'm basing it on comments like this:
/* hugetlb has its own memcg */
if (folio_test_hugetlb(folio)) {
if (lruvec) {
unlock_page_lruvec_irqrestore(lruvec, flags);
lruvec = NULL;
}
free_huge_folio(folio);
continue;
}
On Thu, Apr 18, 2024 at 12:46:39AM +0100, Matthew Wilcox wrote:
> On Wed, Apr 17, 2024 at 05:18:34PM -0400, Peter Xu wrote:
> > Current mm-unstable will hit this when running test_hugetlb_memcg. This
> > fixes the crash for me.
[1]
> >
> > Signed-off-by: Peter Xu <[email protected]>
> > ---
> > mm/memcontrol.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 1840ba4c355d..7703ced535a3 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -7529,6 +7529,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
> >
> > VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
> > VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
> > + !folio_test_hugetlb(folio) &&
> > !list_empty(&folio->_deferred_list), folio);
>
> Hum. I thought we didn't get here for hugetlb folios. What
> stacktrace did you get?
A normal hugetlb free path, IIRC. You can try the test case; I mentioned
the reproducer [1] above. It crashes consistently.
>
> I'm basing it on comments like this:
>
> /* hugetlb has its own memcg */
> if (folio_test_hugetlb(folio)) {
> if (lruvec) {
> unlock_page_lruvec_irqrestore(lruvec, flags);
> lruvec = NULL;
> }
> free_huge_folio(folio);
> continue;
> }
Hugetlb does have its own memcg, but I guess there's now more to it than
that; see the patch merged months ago:
https://lore.kernel.org/all/[email protected]/
Especially:
@@ -1902,6 +1902,7 @@ void free_huge_folio(struct folio *folio)
pages_per_huge_page(h), folio);
hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
pages_per_huge_page(h), folio);
+ mem_cgroup_uncharge(folio);
if (restore_reserve)
h->resv_huge_pages++;
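To spell out the path as I understand it (a rough sketch pieced together
from the hunk above, not exact code), the hugetlb free path now hands the
folio to the generic memcg uncharge, so uncharge_folio() can see hugetlb
folios too:

        /* Rough sketch of the flow; names as in the hunk above. */
        void free_huge_folio(struct folio *folio)
        {
                struct hstate *h = folio_hstate(folio);

                /* ... hugetlb bookkeeping (hugetlb_lock held here, IIRC) ... */
                hugetlb_cgroup_uncharge_folio(hstate_index(h),
                                              pages_per_huge_page(h), folio);
                hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
                                                   pages_per_huge_page(h), folio);
                mem_cgroup_uncharge(folio);     /* ends up in uncharge_folio() */
                /* ... restore_reserve handling, then free or re-enqueue ... */
        }

That's how a hugetlb folio can reach the VM_BUG_ON_FOLIO() in
uncharge_folio(), hence the exemption in the fixup.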
The comment above looks a bit confusing to me, as memcg is not the only
thing that is special about hugetlb pages. Explicitly mentioning it there
made me think hugetlb folios could be freed like normal compound pages once
memcg is out of the picture, but it looks like that's not the case yet?
--
Peter Xu
On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <[email protected]> wrote:
>
> This is similar to __hugetlb_cgroup_uncharge_folio(), which relies on
> holding hugetlb_lock. Add the same kind of assertion here, since such
> checks may help catch problems some day.
>
> Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
> ---
> mm/hugetlb_cgroup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index aa4486bd3904..e20339a346b9 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
> {
> if (hugetlb_cgroup_disabled() || !h_cg)
> return;
> -
> + lockdep_assert_held(&hugetlb_lock);
Maybe also remove the comment at the top of the function:
/* Should be called with hugetlb_lock held */
Now that the function asserts the lock, the comment seems redundant, but up to you.
> __set_hugetlb_cgroup(folio, h_cg, rsvd);
> if (!rsvd) {
> unsigned long usage =
> --
> 2.44.0
>
--
Thanks,
Mina
On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <[email protected]> wrote:
>
> There is a recent report on UFFDIO_COPY over hugetlb:
>
> https://lore.kernel.org/all/[email protected]/
>
> 350: lockdep_assert_held(&hugetlb_lock);
>
> This should be an issue in hugetlb itself, but it triggered in a userfault
> context: it goes into the unlikely path where two threads modify the resv
> map together. Mike has a fix in that path for the resv uncharge, but it
> looks like the locking requirement was overlooked:
> hugetlb_cgroup_uncharge_folio_rsvd() updates the cgroup pointer, so it must
> be called with the lock held.
>
> Looks like stable material, so stable is copied.
>
> Reported-by: [email protected]
> Cc: Mina Almasry <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: linux-stable <[email protected]>
> Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
> Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
> ---
> mm/hugetlb.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 26ab9dfc7d63..3158a55ce567 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>
> rsv_adjust = hugepage_subpool_put_pages(spool, 1);
> hugetlb_acct_memory(h, -rsv_adjust);
> - if (deferred_reserve)
> + if (deferred_reserve) {
> + spin_lock_irq(&hugetlb_lock);
> hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> pages_per_huge_page(h), folio);
> + spin_unlock_irq(&hugetlb_lock);
> + }
> }
>
> if (!memcg_charge_ret)
> --
> 2.44.0
>
--
Thanks,
Mina
On Fri, Apr 19, 2024 at 08:03:08AM -0700, Mina Almasry wrote:
> On Wed, Apr 17, 2024 at 2:18 PM Peter Xu <[email protected]> wrote:
> >
> > This is similar to __hugetlb_cgroup_uncharge_folio(), which relies on
> > holding hugetlb_lock. Add the same kind of assertion here, since such
> > checks may help catch problems some day.
> >
> > Signed-off-by: Peter Xu <[email protected]>
>
> Reviewed-by: Mina Almasry <[email protected]>
Thanks.
>
> > ---
> > mm/hugetlb_cgroup.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> > index aa4486bd3904..e20339a346b9 100644
> > --- a/mm/hugetlb_cgroup.c
> > +++ b/mm/hugetlb_cgroup.c
> > @@ -308,7 +308,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
> > {
> > if (hugetlb_cgroup_disabled() || !h_cg)
> > return;
> > -
> > + lockdep_assert_held(&hugetlb_lock);
>
> Maybe also remove the comment at the top of the function:
>
> /* Should be called with hugetlb_lock held */
>
> Now that the function asserts the lock, the comment seems redundant, but up to you.
IMHO there's no harm in being verbose in this case, so I'll just keep it
as-is.
Thanks,
--
Peter Xu