This patch series reworks the logic that restores page metadata (such
as arm64 MTE tags) on swap-in, fixing a reported bug [1] where the
metadata could sometimes fail to be restored correctly after commit
c145e0b47c77 ("mm: streamline COW logic in do_swap_page()").
- Patch 1 fixes the bug itself, but still requires architectures
to restore metadata in both arch_swap_restore() and set_pte_at().
- Patch 2 makes it so that architectures only need to restore metadata
in arch_swap_restore().
- Patch 3 changes arm64 to remove support for restoring metadata
in set_pte_at().
[1] https://lore.kernel.org/all/[email protected]/
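For reference, the hook that patches 1 and 2 call is
arch_swap_restore(). A minimal sketch of the generic default and the
arm64 override is shown below; the bodies are paraphrased for
illustration and may differ in detail from include/linux/pgtable.h and
arch/arm64/mm/mteswap.c:

/* Generic default (sketch): a no-op unless the architecture overrides it. */
#ifndef __HAVE_ARCH_SWAP_RESTORE
static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
{
}
#endif

/*
 * arm64 override (simplified sketch): restore the MTE tags that were
 * saved at swap-out time, keyed by the swap entry.
 */
void arch_swap_restore(swp_entry_t entry, struct folio *folio)
{
	if (system_supports_mte())
		mte_restore_tags(entry, &folio->page);
}

After this series, this hook (rather than set_pte_at()) is the single
place where such metadata is restored on swap-in.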
v4:
- Rebased onto v6.4-rc3
- Reverted change to arch/arm64/mm/mteswap.c; this change was not
valid because swapcache pages can have arch_swap_restore() called
on them multiple times
v3:
- Added patch to call arch_swap_restore() from unuse_pte()
- Rebased onto arm64/for-next/fixes
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()
Peter Collingbourne (3):
mm: Call arch_swap_restore() from do_swap_page()
mm: Call arch_swap_restore() from unuse_pte()
arm64: mte: Simplify swap tag restoration logic
arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++----------
arch/arm64/kernel/mte.c | 37 ++++++--------------------------
mm/memory.c | 7 ++++++
mm/swapfile.c | 7 ++++++
5 files changed, 25 insertions(+), 44 deletions(-)
--
2.40.1.698.g37aff9b760-goog
As a result of the previous two patches, there are no circumstances
in which a swapped-in page is installed in a page table without first
having arch_swap_restore() called on it. The tag-restoring logic in
set_pte_at() is therefore no longer needed; remove it.
Signed-off-by: Peter Collingbourne <[email protected]>
Link: https://linux-review.googlesource.com/id/I8ad54476f3b2d0144ccd8ce0c1d7a2963e5ff6f3
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
v4:
- Rebased onto v6.4-rc3
- Reverted change to arch/arm64/mm/mteswap.c; this change was not
valid because swapcache pages can have arch_swap_restore() called
on them multiple times
v3:
- Rebased onto arm64/for-next/fixes, which already contains a fix
  for the previously tagged issue, so the Fixes: tag has been
  removed
arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++----------
arch/arm64/kernel/mte.c | 37 ++++++--------------------------
3 files changed, 11 insertions(+), 44 deletions(-)
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index c028afb1cd0b..4cedbaa16f41 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page)
}
void mte_zero_clear_page_tags(void *addr);
-void mte_sync_tags(pte_t old_pte, pte_t pte);
+void mte_sync_tags(pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void mte_thread_init_user(void);
void mte_thread_switch(struct task_struct *next);
@@ -122,7 +122,7 @@ static inline bool try_page_mte_tagging(struct page *page)
static inline void mte_zero_clear_page_tags(void *addr)
{
}
-static inline void mte_sync_tags(pte_t old_pte, pte_t pte)
+static inline void mte_sync_tags(pte_t pte)
{
}
static inline void mte_copy_page_tags(void *kto, const void *kfrom)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0bd18de9fd97..e8a252e62b12 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -337,18 +337,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
* don't expose tags (instruction fetches don't check tags).
*/
if (system_supports_mte() && pte_access_permitted(pte, false) &&
- !pte_special(pte)) {
- pte_t old_pte = READ_ONCE(*ptep);
- /*
- * We only need to synchronise if the new PTE has tags enabled
- * or if swapping in (in which case another mapping may have
- * set tags in the past even if this PTE isn't tagged).
- * (!pte_none() && !pte_present()) is an open coded version of
- * is_swap_pte()
- */
- if (pte_tagged(pte) || (!pte_none(old_pte) && !pte_present(old_pte)))
- mte_sync_tags(old_pte, pte);
- }
+ !pte_special(pte) && pte_tagged(pte))
+ mte_sync_tags(pte);
__check_safe_pte_update(mm, ptep, pte);
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 7e89968bd282..c40728046fed 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -35,41 +35,18 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode);
EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode);
#endif
-static void mte_sync_page_tags(struct page *page, pte_t old_pte,
- bool check_swap, bool pte_is_tagged)
-{
- if (check_swap && is_swap_pte(old_pte)) {
- swp_entry_t entry = pte_to_swp_entry(old_pte);
-
- if (!non_swap_entry(entry))
- mte_restore_tags(entry, page);
- }
-
- if (!pte_is_tagged)
- return;
-
- if (try_page_mte_tagging(page)) {
- mte_clear_page_tags(page_address(page));
- set_page_mte_tagged(page);
- }
-}
-
-void mte_sync_tags(pte_t old_pte, pte_t pte)
+void mte_sync_tags(pte_t pte)
{
struct page *page = pte_page(pte);
long i, nr_pages = compound_nr(page);
- bool check_swap = nr_pages == 1;
- bool pte_is_tagged = pte_tagged(pte);
-
- /* Early out if there's nothing to do */
- if (!check_swap && !pte_is_tagged)
- return;
/* if PG_mte_tagged is set, tags have already been initialised */
- for (i = 0; i < nr_pages; i++, page++)
- if (!page_mte_tagged(page))
- mte_sync_page_tags(page, old_pte, check_swap,
- pte_is_tagged);
+ for (i = 0; i < nr_pages; i++, page++) {
+ if (try_page_mte_tagging(page)) {
+ mte_clear_page_tags(page_address(page));
+ set_page_mte_tagged(page);
+ }
+ }
/* ensure the tags are visible before the PTE is set */
smp_wmb();
--
2.40.1.698.g37aff9b760-goog
We would like to move away from requiring architectures to restore
metadata from swap in the set_pte_at() implementation, as this is not
only error-prone but also adds complexity to the arch-specific code.
This requires
us to call arch_swap_restore() before calling swap_free() whenever pages
are restored from swap. We are currently doing so everywhere except in
unuse_pte(); do so there as well.
Signed-off-by: Peter Collingbourne <[email protected]>
Link: https://linux-review.googlesource.com/id/I68276653e612d64cde271ce1b5a99ae05d6bbc4f
Suggested-by: David Hildenbrand <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
---
mm/swapfile.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 274bbf797480..e9843fadecd6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1794,6 +1794,13 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
goto setpte;
}
+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, page_folio(page));
+
/* See do_swap_page() */
BUG_ON(!PageAnon(page) && PageMappedToDisk(page));
BUG_ON(PageAnon(page) && PageAnonExclusive(page));
--
2.40.1.698.g37aff9b760-goog
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
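For context, the resulting ordering in do_swap_page() is roughly the
following (heavily abridged, with error handling and surrounding code
omitted):

	/* Restore metadata (e.g. MTE tags) while the swap entry is still valid. */
	arch_swap_restore(entry, folio);

	/* May free the swap slot and, with it, the saved metadata. */
	swap_free(entry);
	...
	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);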
Signed-off-by: Peter Collingbourne <[email protected]>
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
Cc: <[email protected]> # 6.1
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Reported-by: Qun-wei Lin (林群崴) <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Acked-by: David Hildenbrand <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
---
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..fc25764016b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}
+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
/*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
--
2.40.1.698.g37aff9b760-goog
Hi Peter,
On Mon, May 22, 2023 at 05:43:08PM -0700, Peter Collingbourne wrote:
> Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
> the call to swap_free() before the call to set_pte_at(), which meant that
> the MTE tags could end up being freed before set_pte_at() had a chance
> to restore them. Fix it by adding a call to the arch_swap_restore() hook
> before the call to swap_free().
>
> Signed-off-by: Peter Collingbourne <[email protected]>
> Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
> Cc: <[email protected]> # 6.1
> Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
> Reported-by: Qun-wei Lin (林群崴) <[email protected]>
> Closes: https://lore.kernel.org/all/[email protected]/
> Acked-by: David Hildenbrand <[email protected]>
> Acked-by: "Huang, Ying" <[email protected]>
> Reviewed-by: Steven Price <[email protected]>
> Acked-by: Catalin Marinas <[email protected]>
> ---
> v2:
> - Call arch_swap_restore() directly instead of via arch_do_swap_page()
>
> mm/memory.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index f69fbc251198..fc25764016b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> }
> }
>
> + /*
> + * Some architectures may have to restore extra metadata to the page
> + * when reading from swap. This metadata may be indexed by swap entry
> + * so this must be called before swap_free().
> + */
> + arch_swap_restore(entry, folio);
> +
> /*
> * Remove the swap entry and conditionally try to free up the swapcache.
> * We're already holding a reference on the page but haven't mapped it
It looks like the intention is for this patch to land in 6.4, whereas the
other two in the series could go in later, right? If so, I was expecting
Andrew to pick this one up but he's not actually on CC. I've added him now,
but you may want to send this as a separate fix so it's obvious what needs
picking up for this cycle.
Cheers,
Will
On Mon, Jun 5, 2023 at 7:06 AM Will Deacon <[email protected]> wrote:
>
> Hi Peter,
>
> On Mon, May 22, 2023 at 05:43:08PM -0700, Peter Collingbourne wrote:
> > [...]
>
> It looks like the intention is for this patch to land in 6.4, whereas the
> other two in the series could go in later, right? If so, I was expecting
> Andrew to pick this one up but he's not actually on CC. I've added him now,
> but you may want to send this as a separate fix so it's obvious what needs
> picking up for this cycle.
I was expecting that this whole series could be picked up in mm. There
was a previous attempt to apply v3 of this series to mm, but that
failed because a dependent patch (commit c4c597f1b367 ("arm64: mte: Do
not set PG_mte_tagged if tags were not initialized")) hadn't been
merged into Linus's master branch yet. The series should be good to go
in now that that patch has been merged.
Peter
On Mon, Jun 05, 2023 at 10:41:12AM -0700, Peter Collingbourne wrote:
> On Mon, Jun 5, 2023 at 7:06 AM Will Deacon <[email protected]> wrote:
> > On Mon, May 22, 2023 at 05:43:08PM -0700, Peter Collingbourne wrote:
> > > [...]
> >
> > It looks like the intention is for this patch to land in 6.4, whereas the
> > other two in the series could go in later, right? If so, I was expecting
> > Andrew to pick this one up but he's not actually on CC. I've added him now,
> > but you may want to send this as a separate fix so it's obvious what needs
> > picking up for this cycle.
>
> I was expecting that this whole series could be picked up in mm. There
> was a previous attempt to apply v3 of this series to mm, but that
> failed because a dependent patch (commit c4c597f1b367 ("arm64: mte: Do
> not set PG_mte_tagged if tags were not initialized")) hadn't been
> merged into Linus's master branch yet. The series should be good to go
> in now that that patch has been merged.
Did this series fall through the cracks? I can't see it in linux-next
(or maybe my grep'ing failed). The commit mentioned above is in 6.4-rc3
AFAICT. Unfortunately Andrew was not cc'ed on the initial post, Will
added him later, so he likely missed it. For reference, the series is
here:
https://lore.kernel.org/r/[email protected]/
Andrew, what's your preference for this series? I'd like at least the
first patch to go into 6.5 as a fix. The second patch seems to be fairly
low risk and I'm happy for the third arm64 patch/cleanup to go in
6.5-rc1 (but it depends on the second patch). If you prefer, I can pick
them up and send a pull request to Linus next week before -rc1.
Otherwise you (or I) can queue the first patch and leave the other two
for 6.6.
Thanks.
--
Catalin
On Thu, 29 Jun 2023 10:57:14 +0100 Catalin Marinas <[email protected]> wrote:
> Andrew, what's your preference for this series? I'd like at least the
> first patch to go into 6.5 as a fix. The second patch seems to be fairly
> low risk and I'm happy for the third arm64 patch/cleanup to go in
> 6.5-rc1 (but it depends on the second patch). If you prefer, I can pick
> them up and send a pull request to Linus next week before -rc1.
> Otherwise you (or I) can queue the first patch and leave the other two
> for 6.6.
Thanks. I queued [1/3] for 6.5-rcX with a cc:stable. And I queued
[2/3] and [3/3] for 6.6-rc1.
If you wish to grab any/all of these then please do so - Stephen
will tell us of the duplicate and I'll drop the mm-git copy.
On Sun, Jul 02, 2023 at 12:38:21PM -0700, Andrew Morton wrote:
> On Thu, 29 Jun 2023 10:57:14 +0100 Catalin Marinas <[email protected]> wrote:
> > Andrew, what's your preference for this series? I'd like at least the
> > first patch to go into 6.5 as a fix. The second patch seems to be fairly
> > low risk and I'm happy for the third arm64 patch/cleanup to go in
> > 6.5-rc1 (but it depends on the second patch). If you prefer, I can pick
> > them up and send a pull request to Linus next week before -rc1.
> > Otherwise you (or I) can queue the first patch and leave the other two
> > for 6.6.
>
> Thanks. I queued [1/3] for 6.5-rcX with a cc:stable. And I queued
> [2/3] and [3/3] for 6.6-rc1.
>
> If you wish to grab any/all of these then please do so - Stephen
> will tell us of the duplicate and I'll drop the mm-git copy.
That's great, thanks. We'll let you know if there are any conflicts
during the 6.6 preparation.
--
Catalin