This series focuses on fixing cache maintenance.
v3:
- Collect Reviewed-by tag from Zi Yan.
- Fix hugetlb cache maintenance.
v2:
- Collect Reviewed-by tag from Zi Yan.
- Use a for loop instead of the folio variant for backportability.
Muchun Song (5):
mm: thp: fix wrong cache flush in remove_migration_pmd()
mm: fix missing cache flush for all tail pages of compound page
mm: hugetlb: fix missing cache flush in copy_huge_page_from_user()
mm: hugetlb: fix missing cache flush in hugetlb_mcopy_atomic_pte()
mm: replace multiple dcache flush with flush_dcache_folio()
mm/huge_memory.c | 3 ++-
mm/hugetlb.c | 1 +
mm/memory.c | 2 ++
mm/migrate.c | 3 +--
4 files changed, 6 insertions(+), 3 deletions(-)
--
2.11.0
The D-cache maintenance inside move_to_new_page() only considers one page;
there is still a D-cache maintenance issue for the tail pages of a compound
page (e.g. THP or HugeTLB). THP migration is only enabled on x86_64, ARM64
and powerpc; powerpc and arm64 rely on flush_dcache_page() to maintain
the consistency between I-cache and D-cache, so in theory the issue can
be found on arm64 and powerpc. HugeTLB migration is enabled on arm,
arm64, mips, parisc, powerpc, riscv, s390 and sh; arm already handles
the compound page cache flush in flush_dcache_page(), but most of the
others do not, so in theory the issue exists on many architectures.
Fix this by flushing every subpage with flush_dcache_page();
flush_dcache_folio() is not used since it is not backportable.
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
---
mm/migrate.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index c9296d63878d..c418e8d92b9c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -933,9 +933,12 @@ static int move_to_new_page(struct page *newpage, struct page *page,
if (!PageMappingFlags(page))
page->mapping = NULL;
- if (likely(!is_zone_device_page(newpage)))
- flush_dcache_page(newpage);
+ if (likely(!is_zone_device_page(newpage))) {
+ int i, nr = compound_nr(newpage);
+ for (i = 0; i < nr; i++)
+ flush_dcache_page(newpage + i);
+ }
}
out:
return rc;
--
2.11.0
folio_copy() copies the data from the source page to the target page,
and the target page is then mapped to a user space address, which might
alias the kernel address used for the copy. Fix this issue by flushing
the dcache; flush_dcache_folio() is not used since it is not
backportable.
Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY")
Signed-off-by: Muchun Song <[email protected]>
---
mm/hugetlb.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a1baa198519a..f1f1ab31dc8a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5804,6 +5804,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
goto out;
}
} else {
+ int i, nr;
+
if (vm_shared &&
hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
put_page(*pagep);
@@ -5819,6 +5821,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
goto out;
}
folio_copy(page_folio(page), page_folio(*pagep));
+ nr = compound_nr(page);
+ for (i = 0; i < nr; i++)
+ flush_dcache_page(page + i);
put_page(*pagep);
*pagep = NULL;
}
--
2.11.0
flush_cache_range() is only justified if the page is already present in
the process page table, but here the mapping is established right after
the flush. So using this interface is wrong. And there is no need to
invalidate the cache since the mapping was non-present before in
remove_migration_pmd(). So just remove the call.
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
---
mm/huge_memory.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f58524394dc1..45ede45b11f5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3207,7 +3207,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
if (pmd_swp_uffd_wp(*pvmw->pmd))
pmde = pmd_wrprotect(pmd_mkuffd_wp(pmde));
- flush_cache_range(vma, mmun_start, mmun_start + HPAGE_PMD_SIZE);
if (PageAnon(new))
page_add_anon_rmap(new, vma, mmun_start, true);
else
@@ -3215,6 +3214,8 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
mlock_vma_page(new);
+
+ /* No need to invalidate - it was non-present before */
update_mmu_cache_pmd(vma, address, pvmw->pmd);
}
#endif
--
2.11.0
userfaultfd calls copy_huge_page_from_user(), which does no cache
flushing for the target page. The target page will then be mapped into
user space at a different (user) address, which might alias the kernel
address used to copy the data from user space. Fix this issue by
flushing the dcache in copy_huge_page_from_user().
Fixes: fa4d75c1de13 ("userfaultfd: hugetlbfs: add copy_huge_page_from_user for hugetlb userfaultfd support")
Signed-off-by: Muchun Song <[email protected]>
---
mm/memory.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index e8ce066be5f2..eb027da68aa7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5405,6 +5405,8 @@ long copy_huge_page_from_user(struct page *dst_page,
if (rc)
break;
+ flush_dcache_page(subpage);
+
cond_resched();
}
return ret_val;
--
2.11.0
Simplify the code by using flush_dcache_folio().
Signed-off-by: Muchun Song <[email protected]>
---
mm/hugetlb.c | 6 +-----
mm/migrate.c | 8 ++------
2 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f1f1ab31dc8a..828240aee3f9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5804,8 +5804,6 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
goto out;
}
} else {
- int i, nr;
-
if (vm_shared &&
hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
put_page(*pagep);
@@ -5821,9 +5819,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
goto out;
}
folio_copy(page_folio(page), page_folio(*pagep));
- nr = compound_nr(page);
- for (i = 0; i < nr; i++)
- flush_dcache_page(page + i);
+ flush_dcache_folio(page_folio(page));
put_page(*pagep);
*pagep = NULL;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index c418e8d92b9c..daf2b3508670 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -933,12 +933,8 @@ static int move_to_new_page(struct page *newpage, struct page *page,
if (!PageMappingFlags(page))
page->mapping = NULL;
- if (likely(!is_zone_device_page(newpage))) {
- int i, nr = compound_nr(newpage);
-
- for (i = 0; i < nr; i++)
- flush_dcache_page(newpage + i);
- }
+ if (likely(!is_zone_device_page(newpage)))
+ flush_dcache_folio(page_folio(newpage));
}
out:
return rc;
--
2.11.0
On 1/31/22 08:02, Muchun Song wrote:
> The D-cache maintenance inside move_to_new_page() only consider one page,
> there is still D-cache maintenance issue for tail pages of compound page
> (e.g. THP or HugeTLB). THP migration is only enabled on x86_64, ARM64
> and powerpc, while powerpc and arm64 need to maintain the consistency
> between I-Cache and D-Cache, which depends on flush_dcache_page() to
> maintain the consistency between I-Cache and D-Cache. In theory, the
> issue can be found on arm64 and powerpc. HugeTLB migration is enabled
> on arm, arm64, mips, parisc, powerpc, riscv, s390 and sh, while arm
> has handled the compound page cache flush in flush_dcache_page(), but
> most others do not. In theory, the issue exists on many architectures.
> Fix this by not using flush_dcache_folio() since it is not backportable.
>
> Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
> Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> Signed-off-by: Muchun Song <[email protected]>
> Reviewed-by: Zi Yan <[email protected]>
> ---
> mm/migrate.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index c9296d63878d..c418e8d92b9c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -933,9 +933,12 @@ static int move_to_new_page(struct page *newpage, struct page *page,
> if (!PageMappingFlags(page))
> page->mapping = NULL;
>
> - if (likely(!is_zone_device_page(newpage)))
> - flush_dcache_page(newpage);
> + if (likely(!is_zone_device_page(newpage))) {
> + int i, nr = compound_nr(newpage);
>
> + for (i = 0; i < nr; i++)
> + flush_dcache_page(newpage + i);
> + }
As you have already noted elsewhere, the arm64 version of flush_dcache_page
seems to handle this and flushes the entire compound page. Is this going to
flush the entire compound page compound_nr times?
--
Mike Kravetz
> }
> out:
> return rc;
Cc: Matthew
On 1/31/22 08:02, Muchun Song wrote:
> folio_copy() will copy the data from one page to the target page, then
> the target page will be mapped to the user space address, which might
> have an alias issue with the kernel address used to copy the data from
> the page to. Fix this issue by flushing dcache but not use
> flush_dcache_folio() since it is not backportable.
>
> Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY")
> Signed-off-by: Muchun Song <[email protected]>
> ---
> mm/hugetlb.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a1baa198519a..f1f1ab31dc8a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5804,6 +5804,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> goto out;
> }
> } else {
> + int i, nr;
> +
> if (vm_shared &&
> hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
> put_page(*pagep);
> @@ -5819,6 +5821,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> goto out;
> }
> folio_copy(page_folio(page), page_folio(*pagep));
What if we changed that folio_copy() to:
copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
pages_per_huge_page(h));
Seems like that would take care of the flush_dcache_page and it would be
backportable.
Of course, Matthew may hate the idea. Not sure if there are any plans to
convert copy_user_huge_page to use folio type as it has some special hinting
logic.
--
Mike Kravetz
> + nr = compound_nr(page);
> + for (i = 0; i < nr; i++)
> + flush_dcache_page(page + i);
> put_page(*pagep);
> *pagep = NULL;
> }
On Wed, Feb 2, 2022 at 6:24 AM Mike Kravetz <[email protected]> wrote:
>
> Cc: Matthew
>
> On 1/31/22 08:02, Muchun Song wrote:
> > folio_copy() will copy the data from one page to the target page, then
> > the target page will be mapped to the user space address, which might
> > have an alias issue with the kernel address used to copy the data from
> > the page to. Fix this issue by flushing dcache but not use
> > flush_dcache_folio() since it is not backportable.
> >
> > Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY")
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
> > mm/hugetlb.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index a1baa198519a..f1f1ab31dc8a 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -5804,6 +5804,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > goto out;
> > }
> > } else {
> > + int i, nr;
> > +
> > if (vm_shared &&
> > hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
> > put_page(*pagep);
> > @@ -5819,6 +5821,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > goto out;
> > }
> > folio_copy(page_folio(page), page_folio(*pagep));
>
> What if we changed that folio_copy() to?
>
> copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
> pages_per_huge_page(h));
>
> Seems like that would take care of the flush_dcache_page and it would be
> backportable.
Agreed. I can also replace folio_copy() with copy_user_huge_page().
>
> Of course, Matthew may hate the idea. Not sure if there are any plans to
> convert copy_user_huge_page to use folio type as it has some special hinting
> logic.
On Wed, Feb 2, 2022 at 7:33 AM Mike Kravetz <[email protected]> wrote:
>
> On 1/31/22 08:02, Muchun Song wrote:
> > The D-cache maintenance inside move_to_new_page() only consider one page,
> > there is still D-cache maintenance issue for tail pages of compound page
> > (e.g. THP or HugeTLB). THP migration is only enabled on x86_64, ARM64
> > and powerpc, while powerpc and arm64 need to maintain the consistency
> > between I-Cache and D-Cache, which depends on flush_dcache_page() to
> > maintain the consistency between I-Cache and D-Cache. In theory, the
> > issue can be found on arm64 and powerpc. HugeTLB migration is enabled
> > on arm, arm64, mips, parisc, powerpc, riscv, s390 and sh, while arm
> > has handled the compound page cache flush in flush_dcache_page(), but
> > most others do not. In theory, the issue exists on many architectures.
> > Fix this by not using flush_dcache_folio() since it is not backportable.
> >
> > Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
> > Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> > Signed-off-by: Muchun Song <[email protected]>
> > Reviewed-by: Zi Yan <[email protected]>
> > ---
> > mm/migrate.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index c9296d63878d..c418e8d92b9c 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -933,9 +933,12 @@ static int move_to_new_page(struct page *newpage, struct page *page,
> > if (!PageMappingFlags(page))
> > page->mapping = NULL;
> >
> > - if (likely(!is_zone_device_page(newpage)))
> > - flush_dcache_page(newpage);
> > + if (likely(!is_zone_device_page(newpage))) {
> > + int i, nr = compound_nr(newpage);
> >
> > + for (i = 0; i < nr; i++)
> > + flush_dcache_page(newpage + i);
> > + }
>
> As you have already noted elsewhere, the arm64 version of flush_dcache_page
> seems to handle this and flush the entire compound_page. Is this going to
> flush the entire compound page compound_nr times?
>
It does not flush the cache compound_nr times; it just tests (and
clears) PG_dcache_clean in page->flags compound_nr times. The
definition of flush_dcache_page() on arm64 is as follows.
void flush_dcache_page(struct page *page)
{
if (test_bit(PG_dcache_clean, &page->flags))
clear_bit(PG_dcache_clean, &page->flags);
}
So I have a plan to improve this, e.g. implement flush_dcache_folio()
on arm64 and replace those multiple dcache flushes with
flush_dcache_folio().
void flush_dcache_folio(struct folio *folio)
{
if (test_bit(PG_dcache_clean, &folio->flags))
clear_bit(PG_dcache_clean, &folio->flags);
}
Thanks.
On Tue, Feb 1, 2022 at 12:04 AM Muchun Song <[email protected]> wrote:
>
> The D-cache maintenance inside move_to_new_page() only consider one page,
> there is still D-cache maintenance issue for tail pages of compound page
> (e.g. THP or HugeTLB). THP migration is only enabled on x86_64, ARM64
> and powerpc, while powerpc and arm64 need to maintain the consistency
> between I-Cache and D-Cache, which depends on flush_dcache_page() to
> maintain the consistency between I-Cache and D-Cache. In theory, the
> issue can be found on arm64 and powerpc.
My bad. I have looked at the code closely on arm64 and powerpc. There
should be no issue since their icache flushing functions already handle
compound pages. I'll update the commit log in the next version.
> HugeTLB migration is enabled
> on arm, arm64, mips, parisc, powerpc, riscv, s390 and sh, while arm
> has handled the compound page cache flush in flush_dcache_page(), but
> most others do not. In theory, the issue exists on many architectures.
> Fix this by not using flush_dcache_folio() since it is not backportable.
>
> Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
This Fixes tag will be removed in the next version.
> Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> Signed-off-by: Muchun Song <[email protected]>
> Reviewed-by: Zi Yan <[email protected]>