2023-07-27 12:34:27

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 0/4] A few fixup patches for mm

Hi everyone,
This series contains a few fixup patches to fix potential unexpected
return value, fix wrong swap entry type for hwpoisoned swapcache page
and so on. More details can be found in the respective changelogs.
Thanks!

Miaohe Lin (4):
mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page
mm: memory-failure: fix potential unexpected return value from
unpoison_memory()
mm: memory-failure: avoid false hwpoison page mapped error info
mm: memory-failure: add PageOffline() check
---
v2:
collect Reviewed-by and Acked-by tag per Matthew and Naoya.
1/4: a better fix per Matthew
2/4: fix a code smell per Naoya
Thanks!
---
mm/ksm.c | 2 ++
mm/memory-failure.c | 32 ++++++++++++++++++--------------
mm/swapfile.c | 8 ++++----
3 files changed, 24 insertions(+), 18 deletions(-)

--
2.33.0



2023-07-27 12:41:57

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 1/4] mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page

Hwpoisoned dirty swap cache page is kept in the swap cache and there's
simple interception code in do_swap_page() to catch it. But when trying
to swapoff, unuse_pte() will wrongly install a general sense of "future
accesses are invalid" swap entry for hwpoisoned swap cache page due to
unaware of such type of page. The user will receive SIGBUS signal without
expected BUS_MCEERR_AR payload. BTW, typo 'hwposioned' is fixed.

Fixes: 6b970599e807 ("mm: hwpoison: support recovery from ksm_might_need_to_copy()")
Signed-off-by: Miaohe Lin <[email protected]>
---
mm/ksm.c | 2 ++
mm/swapfile.c | 8 ++++----
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 97a9627116fa..74804158ee02 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2794,6 +2794,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
anon_vma->root == vma->anon_vma->root) {
return page; /* still no need to copy it */
}
+ if (PageHWPoison(page))
+ return ERR_PTR(-EHWPOISON);
if (!PageUptodate(page))
return page; /* let do_swap_page report the error */

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e04eb9c0482d..0df94c4000ea 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1744,7 +1744,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
struct page *swapcache;
spinlock_t *ptl;
pte_t *pte, new_pte, old_pte;
- bool hwposioned = false;
+ bool hwpoisoned = PageHWPoison(page);
int ret = 1;

swapcache = page;
@@ -1752,7 +1752,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
if (unlikely(!page))
return -ENOMEM;
else if (unlikely(PTR_ERR(page) == -EHWPOISON))
- hwposioned = true;
+ hwpoisoned = true;

pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
if (unlikely(!pte || !pte_same_as_swp(ptep_get(pte),
@@ -1763,11 +1763,11 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,

old_pte = ptep_get(pte);

- if (unlikely(hwposioned || !PageUptodate(page))) {
+ if (unlikely(hwpoisoned || !PageUptodate(page))) {
swp_entry_t swp_entry;

dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
- if (hwposioned) {
+ if (hwpoisoned) {
swp_entry = make_hwpoison_entry(swapcache);
page = swapcache;
} else {
--
2.33.0


2023-07-27 13:06:20

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 2/4] mm: memory-failure: fix potential unexpected return value from unpoison_memory()

If unpoison_memory() fails to clear page hwpoisoned flag, return value
ret is expected to be -EBUSY. But when get_hwpoison_page() returns 1
and fails to clear page hwpoisoned flag due to races, return value will
be unexpected 1 leading to users being confused. And there's a code smell
that the variable "ret" is used not only to save the return value of
unpoison_memory(), but also the return value from get_hwpoison_page().
Make a further cleanup by using another auto-variable solely to save the
return value of get_hwpoison_page() as suggested by Naoya.

Fixes: bf181c582588 ("mm/hwpoison: fix unpoison_memory()")
Signed-off-by: Miaohe Lin <[email protected]>
---
mm/memory-failure.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a114c8c3039c..4a3e88c15631 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2502,7 +2502,7 @@ int unpoison_memory(unsigned long pfn)
{
struct folio *folio;
struct page *p;
- int ret = -EBUSY;
+ int ret = -EBUSY, ghp;
unsigned long count = 1;
bool huge = false;
static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
@@ -2550,29 +2550,28 @@ int unpoison_memory(unsigned long pfn)
if (folio_test_slab(folio) || PageTable(&folio->page) || folio_test_reserved(folio))
goto unlock_mutex;

- ret = get_hwpoison_page(p, MF_UNPOISON);
- if (!ret) {
+ ghp = get_hwpoison_page(p, MF_UNPOISON);
+ if (!ghp) {
if (PageHuge(p)) {
huge = true;
count = folio_free_raw_hwp(folio, false);
- if (count == 0) {
- ret = -EBUSY;
+ if (count == 0)
goto unlock_mutex;
- }
}
ret = folio_test_clear_hwpoison(folio) ? 0 : -EBUSY;
- } else if (ret < 0) {
- if (ret == -EHWPOISON) {
+ } else if (ghp < 0) {
+ if (ghp == -EHWPOISON) {
ret = put_page_back_buddy(p) ? 0 : -EBUSY;
- } else
+ } else {
+ ret = ghp;
unpoison_pr_info("Unpoison: failed to grab page %#lx\n",
pfn, &unpoison_rs);
+ }
} else {
if (PageHuge(p)) {
huge = true;
count = folio_free_raw_hwp(folio, false);
if (count == 0) {
- ret = -EBUSY;
folio_put(folio);
goto unlock_mutex;
}
--
2.33.0


2023-07-27 13:09:37

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 3/4] mm: memory-failure: avoid false hwpoison page mapped error info

folio->_mapcount is overloaded in SLAB, so folio_mapped() has to be done
after folio_test_slab() is checked. Otherwise slab folio might be treated
as a mapped folio leading to false 'Someone maps the hwpoison page' error
info.

Fixes: 230ac719c500 ("mm/hwpoison: don't try to unpoison containment-failed pages")
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 4a3e88c15631..d975a6b224f7 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2535,6 +2535,13 @@ int unpoison_memory(unsigned long pfn)
goto unlock_mutex;
}

+ if (folio_test_slab(folio) || PageTable(&folio->page) || folio_test_reserved(folio))
+ goto unlock_mutex;
+
+ /*
+ * Note that folio->_mapcount is overloaded in SLAB, so the simple test
+ * in folio_mapped() has to be done after folio_test_slab() is checked.
+ */
if (folio_mapped(folio)) {
unpoison_pr_info("Unpoison: Someone maps the hwpoison page %#lx\n",
pfn, &unpoison_rs);
@@ -2547,9 +2554,6 @@ int unpoison_memory(unsigned long pfn)
goto unlock_mutex;
}

- if (folio_test_slab(folio) || PageTable(&folio->page) || folio_test_reserved(folio))
- goto unlock_mutex;
-
ghp = get_hwpoison_page(p, MF_UNPOISON);
if (!ghp) {
if (PageHuge(p)) {
--
2.33.0


2023-07-27 17:32:11

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] A few fixup patches for mm

On Thu, 27 Jul 2023 19:56:39 +0800 Miaohe Lin <[email protected]> wrote:

> This series contains a few fixup patches to fix potential unexpected
> return value, fix wrong swap entry type for hwpoisoned swapcache page
> and so on. More details can be found in the respective changelogs.

I'm thinking that patches 1-3 should be backported into -stable kernels.
Thoughts on this?

2023-07-28 02:41:17

by Miaohe Lin

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] A few fixup patches for mm

On 2023/7/28 0:57, Andrew Morton wrote:
> On Thu, 27 Jul 2023 19:56:39 +0800 Miaohe Lin <[email protected]> wrote:
>
>> This series contains a few fixup patches to fix potential unexpected
>> return value, fix wrong swap entry type for hwpoisoned swapcache page
>> and so on. More details can be found in the respective changelogs.
>
> I'm thinking that patches 1-3 should be backported into -stable kernels.
> Thoughts on this?

I tend to backport the patches 1-3. They helps.

Thanks Andrew.