2022-08-23 03:30:38

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 0/6] A few fixup patches for memory-failure

Hi everyone,
This series contains a few fixup patches to fix incorrect update of page
refcnt, fix possible use-after-free issue and so on. More details can be
found in the respective changelogs.
Thanks!

---
v2:
Collect Acked-by tag per Naoya. Thanks!
Change to add list_del() in kill_procs in 4/6
Remove deprecated comment in 5/6
---
Miaohe Lin (6):
mm, hwpoison: fix page refcnt leaking in try_memory_failure_hugetlb()
mm, hwpoison: fix page refcnt leaking in unpoison_memory()
mm, hwpoison: fix extra put_page() in soft_offline_page()
mm, hwpoison: fix possible use-after-free in mf_dax_kill_procs()
mm, hwpoison: kill procs if unmap fails
mm, hwpoison: avoid trying to unpoison reserved page

mm/memory-failure.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)

--
2.23.0


2022-08-23 03:31:38

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 2/6] mm, hwpoison: fix page refcnt leaking in unpoison_memory()

When free_raw_hwp_pages() fails its work, the refcnt of the hugetlb page
would have been incremented if ret > 0. Using put_page() to fix refcnt
leaking in this case.

Fixes: debb6b9c3fdd ("mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage")
Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9d1ebfef04ee..ecd42d717c6f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2378,6 +2378,7 @@ int unpoison_memory(unsigned long pfn)
count = free_raw_hwp_pages(page, false);
if (count == 0) {
ret = -EBUSY;
+ put_page(page);
goto unlock_mutex;
}
}
--
2.23.0

2022-08-23 03:32:28

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 4/6] mm, hwpoison: fix possible use-after-free in mf_dax_kill_procs()

After kill_procs(), tk will be freed without being removed from the to_kill
list. In the next iteration, the freed list entry in the to_kill list will
be accessed, thus leading to use-after-free issue. Adding list_del() in
kill_procs() to fix the issue.

Fixes: c36e20249571 ("mm: introduce mf_dax_kill_procs() for fsdax case")
Signed-off-by: Miaohe Lin <[email protected]>
---
mm/memory-failure.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1d79e693f1b9..f8262f577baf 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -413,7 +413,7 @@ static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
{
struct to_kill *tk, *next;

- list_for_each_entry_safe (tk, next, to_kill, nd) {
+ list_for_each_entry_safe(tk, next, to_kill, nd) {
if (forcekill) {
/*
* In case something went wrong with munmapping
@@ -437,6 +437,7 @@ static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
pr_err("%#lx: Cannot send advisory machine check signal to %s:%d\n",
pfn, tk->tsk->comm, tk->tsk->pid);
}
+ list_del(&tk->nd);
put_task_struct(tk->tsk);
kfree(tk);
}
--
2.23.0

2022-08-23 03:32:50

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 3/6] mm, hwpoison: fix extra put_page() in soft_offline_page()

When hwpoison_filter() refuses to soft offline a page, the page refcnt
incremented previously by MF_COUNT_INCREASED would have been consumed
via get_hwpoison_page() if ret <= 0. So the put_ref_page() here will
put the extra one. Remove it to fix the issue.

Fixes: 9113eaf331bf ("mm/memory-failure.c: add hwpoison_filter for soft offline")
Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ecd42d717c6f..1d79e693f1b9 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2575,8 +2575,6 @@ int soft_offline_page(unsigned long pfn, int flags)
if (hwpoison_filter(page)) {
if (ret > 0)
put_page(page);
- else
- put_ref_page(ref_page);

mutex_unlock(&mf_mutex);
return -EOPNOTSUPP;
--
2.23.0

2022-08-23 03:48:23

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 5/6] mm, hwpoison: kill procs if unmap fails

If try_to_unmap() fails, the hwpoisoned page still resides in the address
space of some processes. We should kill these processes or the hwpoisoned
page might be consumed later. collect_procs() is always called to collect
relevant processes now so they can be killed later if unmap fails.

Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f8262f577baf..c2910f9af1d4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1397,7 +1397,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
struct address_space *mapping;
LIST_HEAD(tokill);
bool unmap_success;
- int kill = 1, forcekill;
+ int forcekill;
bool mlocked = PageMlocked(hpage);

/*
@@ -1438,7 +1438,6 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
if (page_mkclean(hpage)) {
SetPageDirty(hpage);
} else {
- kill = 0;
ttu |= TTU_IGNORE_HWPOISON;
pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
pfn);
@@ -1449,12 +1448,8 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
* First collect all the processes that have the page
* mapped in dirty form. This has to be done before try_to_unmap,
* because ttu takes the rmap data structures down.
- *
- * Error handling: We ignore errors here because
- * there's nothing that can be done.
*/
- if (kill)
- collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);
+ collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);

if (PageHuge(hpage) && !PageAnon(hpage)) {
/*
@@ -1496,7 +1491,8 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
* use a more force-full uncatchable kill to prevent
* any accesses to the poisoned memory.
*/
- forcekill = PageDirty(hpage) || (flags & MF_MUST_KILL);
+ forcekill = PageDirty(hpage) || (flags & MF_MUST_KILL) ||
+ !unmap_success;
kill_procs(&tokill, forcekill, !unmap_success, pfn, flags);

return unmap_success;
--
2.23.0

2022-08-23 04:25:20

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH v2 6/6] mm, hwpoison: avoid trying to unpoison reserved page

For reserved pages, HWPoison flag will be set without increasing the page
refcnt. So we shouldn't even try to unpoison these pages and thus decrease
the page refcnt unexpectly. Add a PageReserved() check to filter this case
out and remove the below unneeded zero page (zero page is reserved) check.

Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index c2910f9af1d4..f3ff2515ccc6 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2351,7 +2351,7 @@ int unpoison_memory(unsigned long pfn)
goto unlock_mutex;
}

- if (PageSlab(page) || PageTable(page))
+ if (PageSlab(page) || PageTable(page) || PageReserved(page))
goto unlock_mutex;

ret = get_hwpoison_page(p, MF_UNPOISON);
@@ -2382,7 +2382,7 @@ int unpoison_memory(unsigned long pfn)
freeit = !!TestClearPageHWPoison(p);

put_page(page);
- if (freeit && !(pfn == my_zero_pfn(0) && page_count(p) == 1)) {
+ if (freeit) {
put_page(page);
ret = 0;
}
--
2.23.0

Subject: Re: [PATCH v2 4/6] mm, hwpoison: fix possible use-after-free in mf_dax_kill_procs()

On Tue, Aug 23, 2022 at 11:23:44AM +0800, Miaohe Lin wrote:
> After kill_procs(), tk will be freed without being removed from the to_kill
> list. In the next iteration, the freed list entry in the to_kill list will
> be accessed, thus leading to use-after-free issue. Adding list_del() in
> kill_procs() to fix the issue.
>
> Fixes: c36e20249571 ("mm: introduce mf_dax_kill_procs() for fsdax case")
> Signed-off-by: Miaohe Lin <[email protected]>

Thank you for the update.

Acked-by: Naoya Horiguchi <[email protected]>

> ---
> mm/memory-failure.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 1d79e693f1b9..f8262f577baf 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -413,7 +413,7 @@ static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
> {
> struct to_kill *tk, *next;
>
> - list_for_each_entry_safe (tk, next, to_kill, nd) {
> + list_for_each_entry_safe(tk, next, to_kill, nd) {
> if (forcekill) {
> /*
> * In case something went wrong with munmapping
> @@ -437,6 +437,7 @@ static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
> pr_err("%#lx: Cannot send advisory machine check signal to %s:%d\n",
> pfn, tk->tsk->comm, tk->tsk->pid);
> }
> + list_del(&tk->nd);
> put_task_struct(tk->tsk);
> kfree(tk);
> }
> --
> 2.23.0