Nick Bowler reported another sparc64 breakage after the young/dirty
persistent work for page migration (per "Link:" below). That's after a
similar report [2].
It turns out page migration was overlooked, and it wasn't failing before
because page migration was not enabled in the initial report test environment.
David proposed another way [2] to fix this from sparc64 side, but that
patch didn't land somehow. Neither did I check whether there's any other
arch that has similar issues.
Let's fix it for now as simple as moving the write bit handling to be after
dirty, like what we did before.
Note: this is based on mm-unstable, because the breakage was since 6.1 and
we're at a very late stage of 6.2 (-rc8), so I assume for this specific
case we should target this at 6.3.
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
Cc: [email protected]
Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations")
Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com/
Reported-by: Nick Bowler <[email protected]>
Tested-by: Nick Bowler <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/huge_memory.c | 6 ++++--
mm/migrate.c | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1343a7d88299..4fc43859e59a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3274,8 +3274,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
pmde = mk_huge_pmd(new, READ_ONCE(vma->vm_page_prot));
if (pmd_swp_soft_dirty(*pvmw->pmd))
pmde = pmd_mksoft_dirty(pmde);
- if (is_writable_migration_entry(entry))
- pmde = maybe_pmd_mkwrite(pmde, vma);
if (pmd_swp_uffd_wp(*pvmw->pmd))
pmde = pmd_mkuffd_wp(pmde);
if (!is_migration_entry_young(entry))
@@ -3283,6 +3281,10 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
/* NOTE: this may contain setting soft-dirty on some archs */
if (PageDirty(new) && is_migration_entry_dirty(entry))
pmde = pmd_mkdirty(pmde);
+ if (is_writable_migration_entry(entry))
+ pmde = maybe_pmd_mkwrite(pmde, vma);
+ else
+ pmde = pmd_wrprotect(pmde);
if (PageAnon(new)) {
rmap_t rmap_flags = RMAP_COMPOUND;
diff --git a/mm/migrate.c b/mm/migrate.c
index ef68a1aff35c..40c63e77e91f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -225,6 +225,8 @@ static bool remove_migration_pte(struct folio *folio,
pte = maybe_mkwrite(pte, vma);
else if (pte_swp_uffd_wp(*pvmw.pte))
pte = pte_mkuffd_wp(pte);
+ else
+ pte = pte_wrprotect(pte);
if (folio_test_anon(folio) && !is_readable_migration_entry(entry))
rmap_flags |= RMAP_EXCLUSIVE;
--
2.39.1
On 16.02.23 16:30, Peter Xu wrote:
> Nick Bowler reported another sparc64 breakage after the young/dirty
> persistent work for page migration (per "Link:" below). That's after a
> similar report [2].
>
> It turns out page migration was overlooked, and it wasn't failing before
> because page migration was not enabled in the initial report test environment.
>
> David proposed another way [2] to fix this from sparc64 side, but that
> patch didn't land somehow.
I pinged another time and asked Andrew to pick it up eventually.
--
Thanks,
David / dhildenb
On 16.02.23 16:30, Peter Xu wrote:
> Nick Bowler reported another sparc64 breakage after the young/dirty
> persistent work for page migration (per "Link:" below). That's after a
> similar report [2].
Thx for handling this.
> [...]
>
> Note: this is based on mm-unstable, because the breakage was since 6.1 and
> we're at a very late stage of 6.2 (-rc8), so I assume for this specific
> case we should target this at 6.3.
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/
>
> Cc: [email protected]
Not that it matters much, but feel free to use this instead:
CC: [email protected]
Then things don't depend on me (in case I ever get help with my cat
herding job). And it also make it even more obvious that this patch
fixes a regression to anyone who handles it downstream.
> Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations")
That's a commit from 6.1, hence this should likely have:
Cc: [email protected] # 6.1.y
[no, a fixes tag alone does not suffice, see docs]
> Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com/
> Reported-by: Nick Bowler <[email protected]>
> [...]
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
On 16.02.23 16:30, Peter Xu wrote:
> Nick Bowler reported another sparc64 breakage after the young/dirty
> persistent work for page migration (per "Link:" below). That's after a
> similar report [2].
>
> It turns out page migration was overlooked, and it wasn't failing before
> because page migration was not enabled in the initial report test environment.
>
> David proposed another way [2] to fix this from sparc64 side, but that
> patch didn't land somehow. Neither did I check whether there's any other
> arch that has similar issues.
>
> Let's fix it for now as simple as moving the write bit handling to be after
> dirty, like what we did before.
>
> Note: this is based on mm-unstable, because the breakage was since 6.1 and
> we're at a very late stage of 6.2 (-rc8), so I assume for this specific
> case we should target this at 6.3.
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/
>
> Cc: [email protected]
> Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations")
> Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com/
> Reported-by: Nick Bowler <[email protected]>
> Tested-by: Nick Bowler <[email protected]>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/huge_memory.c | 6 ++++--
> mm/migrate.c | 2 ++
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 1343a7d88299..4fc43859e59a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3274,8 +3274,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
> pmde = mk_huge_pmd(new, READ_ONCE(vma->vm_page_prot));
> if (pmd_swp_soft_dirty(*pvmw->pmd))
> pmde = pmd_mksoft_dirty(pmde);
> - if (is_writable_migration_entry(entry))
> - pmde = maybe_pmd_mkwrite(pmde, vma);
> if (pmd_swp_uffd_wp(*pvmw->pmd))
> pmde = pmd_mkuffd_wp(pmde);
> if (!is_migration_entry_young(entry))
> @@ -3283,6 +3281,10 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
> /* NOTE: this may contain setting soft-dirty on some archs */
> if (PageDirty(new) && is_migration_entry_dirty(entry))
> pmde = pmd_mkdirty(pmde);
> + if (is_writable_migration_entry(entry))
> + pmde = maybe_pmd_mkwrite(pmde, vma);
> + else
> + pmde = pmd_wrprotect(pmde);
>
> if (PageAnon(new)) {
> rmap_t rmap_flags = RMAP_COMPOUND;
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ef68a1aff35c..40c63e77e91f 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -225,6 +225,8 @@ static bool remove_migration_pte(struct folio *folio,
> pte = maybe_mkwrite(pte, vma);
> else if (pte_swp_uffd_wp(*pvmw.pte))
> pte = pte_mkuffd_wp(pte);
> + else
> + pte = pte_wrprotect(pte);
>
> if (folio_test_anon(folio) && !is_readable_migration_entry(entry))
> rmap_flags |= RMAP_EXCLUSIVE;
I'd really rather focus on fixing the root cause instead, anyhow if my
patch won't make it:
Acked-by: David Hildenbrand <[email protected]>
--
Thanks,
David / dhildenb
On Thu, Feb 16, 2023 at 05:59:16PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.02.23 16:30, Peter Xu wrote:
> > Nick Bowler reported another sparc64 breakage after the young/dirty
> > persistent work for page migration (per "Link:" below). That's after a
> > similar report [2].
>
> Thx for handling this.
>
> > [...]
> >
> > Note: this is based on mm-unstable, because the breakage was since 6.1 and
> > we're at a very late stage of 6.2 (-rc8), so I assume for this specific
> > case we should target this at 6.3.
> >
> > [1] https://lore.kernel.org/all/[email protected]/
> > [2] https://lore.kernel.org/all/[email protected]/
> >
> > Cc: [email protected]
>
> Not that it matters much, but feel free to use this instead:
>
> CC: [email protected]
>
> Then things don't depend on me (in case I ever get help with my cat
> herding job). And it also make it even more obvious that this patch
> fixes a regression to anyone who handles it downstream.
Sure.
>
> > Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations")
>
> That's a commit from 6.1, hence this should likely have:
>
> Cc: [email protected] # 6.1.y
>
> [no, a fixes tag alone does not suffice, see docs]
Oops, I just forgot that. We definitely need to cc:stable. I'll make sure
it's there if there's a new version, or I hope Andrew could help to attach
otherwise.
Thanks,
--
Peter Xu
On Thu, 16 Feb 2023 16:37:19 +0100 David Hildenbrand <[email protected]> wrote:
> On 16.02.23 16:30, Peter Xu wrote:
> > Nick Bowler reported another sparc64 breakage after the young/dirty
> > persistent work for page migration (per "Link:" below). That's after a
> > similar report [2].
> >
> > It turns out page migration was overlooked, and it wasn't failing before
> > because page migration was not enabled in the initial report test environment.
> >
> > David proposed another way [2] to fix this from sparc64 side, but that
> > patch didn't land somehow.
>
> I pinged another time and asked Andrew to pick it up eventually.
>
urgh, sorry, I'd have filtered that out as a sparc patch.