In our efforts to remove uses of PG_private, we have found folios with
the private flag clear and folio->private not-NULL. That is the root
cause behind 642d51fb0775 ("ceph: check folio PG_private bit instead
of folio->private"). It can also affect a few other filesystems that
haven't yet reported a problem.

compaction_alloc() can return a page with uninitialised page->private,
and rather than checking all the callers of migrate_pages(), just zero
page->private after calling get_new_page(). Similarly, the tail pages
from split_huge_page() may also have an uninitialised page->private.

Reported-by: Xiubo Li <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
mm/huge_memory.c | 1 +
mm/migrate.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f7248002dad9..9b31a50217b5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2377,6 +2377,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
page_tail);
page_tail->mapping = head->mapping;
page_tail->index = head->index + tail;
+ page_tail->private = NULL;
/* Page flags must be visible before we make the page non-compound. */
smp_wmb();
diff --git a/mm/migrate.c b/mm/migrate.c
index e51588e95f57..6c1ea61f39d8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1106,6 +1106,7 @@ static int unmap_and_move(new_page_t get_new_page,
if (!newpage)
return -ENOMEM;
+ newpage->private = 0;
rc = __unmap_and_move(page, newpage, force, mode);
if (rc == MIGRATEPAGE_SUCCESS)
set_page_owner_migrate_reason(newpage, reason);
--
2.35.1
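
The mismatch described in the commit message can be illustrated with a small userspace sketch. Everything here is hypothetical: struct mock_page, mock_get_new_page() and the two has_private_*() helpers are stand-ins invented for illustration, not kernel code. The sketch assumes an allocator that resets the flag but leaves the private field stale, and shows that a flag-based check and a field-based check only agree once the field is zeroed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two fields that matter. */
struct mock_page {
	bool pg_private;        /* models the PG_private flag bit */
	unsigned long private;  /* models page->private */
};

/* Models an allocator that recycles a page without clearing ->private. */
static struct mock_page *mock_get_new_page(struct mock_page *recycled)
{
	recycled->pg_private = false;  /* the flag is reset ... */
	return recycled;               /* ... but ->private is left stale */
}

/* Two ways a caller might ask "does this page carry private data?" */
static bool has_private_by_flag(const struct mock_page *p)
{
	return p->pg_private;
}

static bool has_private_by_field(const struct mock_page *p)
{
	return p->private != 0;
}

/* Returns 0 if the checks disagree before zeroing and agree afterwards. */
int run_stale_private_demo(void)
{
	struct mock_page page = { .pg_private = true, .private = 0xdeadbeefUL };
	struct mock_page *newpage = mock_get_new_page(&page);

	/* Stale state: flag clear, field still non-zero. */
	if (has_private_by_flag(newpage) || !has_private_by_field(newpage))
		return -1;

	/* Same idea as the patch: zero ->private right after getting the page. */
	newpage->private = 0;

	if (has_private_by_flag(newpage) != has_private_by_field(newpage))
		return -1;
	return 0;
}
```

Zeroing at the single point where the new page is obtained keeps the sketch in line with the approach above: the callers do not need to be audited individually.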
On 6/19/22 11:11 PM, Matthew Wilcox (Oracle) wrote:
> In our efforts to remove uses of PG_private, we have found folios with
> the private flag clear and folio->private not-NULL. That is the root
> cause behind 642d51fb0775 ("ceph: check folio PG_private bit instead
> of folio->private"). It can also affect a few other filesystems that
> haven't yet reported a problem.
>
> compaction_alloc() can return a page with uninitialised page->private,
> and rather than checking all the callers of migrate_pages(), just zero
> page->private after calling get_new_page(). Similarly, the tail pages
> from split_huge_page() may also have an uninitialised page->private.
>
> Reported-by: Xiubo Li <[email protected]>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> ---
> mm/huge_memory.c | 1 +
> mm/migrate.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f7248002dad9..9b31a50217b5 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2377,6 +2377,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
> page_tail);
> page_tail->mapping = head->mapping;
> page_tail->index = head->index + tail;
> + page_tail->private = NULL;
There is a warning when compiling it:
mm/huge_memory.c: In function ‘__split_huge_page_tail’:
mm/huge_memory.c:2380:21: warning: assignment to ‘long unsigned int’
from ‘void *’ makes integer from pointer without a cast [-Wint-conversion]
page_tail->private = NULL;
^
AR mm/built-in.a
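
The warning comes from the field's type: as the compiler output says, private is a 'long unsigned int', so assigning NULL converts a pointer to an integer. A minimal sketch of one possible way to avoid it follows, assuming a mock struct; struct mock_page and the mock_*() functions are hypothetical names, with mock_set_page_private() only modelled on the shape of the kernel's set_page_private() helper. Assigning plain 0, as the mm/migrate.c hunk already does, would work equally well.

```c
#include <assert.h>

/* Hypothetical stand-in for struct page; ->private is an unsigned long. */
struct mock_page {
	unsigned long private;
};

/* Modelled on the shape of the kernel's set_page_private() helper. */
static inline void mock_set_page_private(struct mock_page *page,
					 unsigned long private)
{
	page->private = private;
}

/* Clears ->private with an integer constant, so no cast is involved. */
void mock_clear_tail_private(struct mock_page *page_tail)
{
	/* "page_tail->private = NULL;" here would trigger -Wint-conversion. */
	mock_set_page_private(page_tail, 0);
}
```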
>
> /* Page flags must be visible before we make the page non-compound. */
> smp_wmb();
> diff --git a/mm/migrate.c b/mm/migrate.c
> index e51588e95f57..6c1ea61f39d8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1106,6 +1106,7 @@ static int unmap_and_move(new_page_t get_new_page,
> if (!newpage)
> return -ENOMEM;
>
> + newpage->private = 0;
> rc = __unmap_and_move(page, newpage, force, mode);
> if (rc == MIGRATEPAGE_SUCCESS)
> set_page_owner_migrate_reason(newpage, reason);
On 6/19/22 11:11 PM, Matthew Wilcox (Oracle) wrote:
I tested this patch many times yesterday with my previous patch reverted,
and it has worked well for me so far. I will test it more to see whether
there are other cases that could cause the crash.
Tested-by: Xiubo Li <[email protected]>
-- Xiubo