2023-08-26 09:59:46

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v2 0/3] Fixes and cleanups to break_down_buddy_pages

Hi all, this series contains one bugfix and two cleanups to
break_down_buddy_pages in page_alloc.
Patch 1 corrects page forward in case guard page debug is enabled. Patch
2 and 3 remove unnecessary check and local variable.
More details can be found in respective patches. Thanks!

v1->v2:
-Add Fixes tag and impact to patch 1

Kemeng Shi (3):
mm/page_alloc: correct start page when guard page debug is enabled
mm/page_alloc: remove unnecessary check in break_down_buddy_pages
mm/page_alloc: remove unnecessary next_page in break_down_buddy_pages

mm/page_alloc.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

--
2.30.0



2023-08-26 13:56:39

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled

When guard page debug is enabled and set_page_guard returns success, we
miss to forward page to point to start of next split range and we will do
split unexpectedly in page range without target page. Move start page
update before set_page_guard to fix this.

As we split to wrong target page, then splited pages are not able to merge
back to original order when target page is put back and splited pages
except target page is not usable. To be specific:

Consider target page is the third page in buddy page with order 2.
| buddy-2 | Page | Target | Page |

After break down to target page, we will only set first page to Guard
because of bug.
| Guard | Page | Target | Page |

When we try put_page_back_buddy with target page, the buddy page of target
if neither guard nor buddy, Then it's not able to construct original page
with order 2
| Guard | Page | buddy-0 | Page |

All pages except target page is not in free list and is not usable.

Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
Signed-off-by: Kemeng Shi <[email protected]>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fefc4074d9d0..88c5f5aea9b0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6505,6 +6505,7 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
next_page = page;
current_buddy = page + size;
}
+ page = next_page;

if (set_page_guard(zone, current_buddy, high, migratetype))
continue;
@@ -6512,7 +6513,6 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
if (current_buddy != target) {
add_to_free_list(current_buddy, zone, high, migratetype);
set_buddy_order(current_buddy, high);
- page = next_page;
}
}
}
--
2.30.0


2023-08-26 14:05:41

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH v2 3/3] mm/page_alloc: remove unnecessary next_page in break_down_buddy_pages

The next_page is only used to forward page in case target is in second
harf range. Move forward page directly to remove unnecessary next_page.

Signed-off-by: Kemeng Shi <[email protected]>
---
mm/page_alloc.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bb74b40dc195..1c35ee022dde 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6492,20 +6492,18 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
int migratetype)
{
unsigned long size = 1 << high;
- struct page *current_buddy, *next_page;
+ struct page *current_buddy;

while (high > low) {
high--;
size >>= 1;

if (target >= &page[size]) {
- next_page = page + size;
current_buddy = page;
+ page = page + size;
} else {
- next_page = page;
current_buddy = page + size;
}
- page = next_page;

if (set_page_guard(zone, current_buddy, high, migratetype))
continue;
--
2.30.0


2023-08-30 18:37:23

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled



on 8/28/2023 11:21 PM, Naoya Horiguchi wrote:
> On Sat, Aug 26, 2023 at 11:47:43PM +0800, Kemeng Shi wrote:
>> When guard page debug is enabled and set_page_guard returns success, we
>> miss to forward page to point to start of next split range and we will do
>> split unexpectedly in page range without target page. Move start page
>> update before set_page_guard to fix this.
>>
>> As we split to wrong target page, then splited pages are not able to merge
>> back to original order when target page is put back and splited pages
>> except target page is not usable. To be specific:
>>
>> Consider target page is the third page in buddy page with order 2.
>> | buddy-2 | Page | Target | Page |
>>
>> After break down to target page, we will only set first page to Guard
>> because of bug.
>> | Guard | Page | Target | Page |
>>
>> When we try put_page_back_buddy with target page, the buddy page of target
>> if neither guard nor buddy, Then it's not able to construct original page
>> with order 2
>> | Guard | Page | buddy-0 | Page |
>>
>> All pages except target page is not in free list and is not usable.
>>
>> Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>> Signed-off-by: Kemeng Shi <[email protected]>
>
> Thank you for finding the problem and writing patches. I think the patch
> fixes the reported problem, But I wonder that we really need guard page
> mechanism in break_down_buddy_pages() which is only called from memory_failure.
> As stated in Documentation/admin-guide/kernel-parameters.txt, this is a
> debugging feature to detect memory corruption due to buggy kernel or drivers
> code. So if HW memory failrue seems to be out of the scope, and I feel that
> we could simply remove it from break_down_buddy_pages().
>
> debug_guardpage_minorder=
> [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
> parameter allows control of the order of pages that will
> be intentionally kept free (and hence protected) by the
> buddy allocator. Bigger value increase the probability
> of catching random memory corruption, but reduce the
> amount of memory for normal system use. The maximum
> possible value is MAX_ORDER/2. Setting this parameter
> to 1 or 2 should be enough to identify most random
> memory corruption problems caused by bugs in kernel or
> driver code when a CPU writes to (or reads from) a
> random memory location. Note that there exists a class
> of memory corruptions problems caused by buggy H/W or
> F/W or by drivers badly programming DMA (basically when
> memory is written at bus level and the CPU MMU is
> bypassed) which are not detectable by
> CONFIG_DEBUG_PAGEALLOC, hence this option will not help
> tracking down these problems.
>
> If you have any idea about how guard page mechanism helps memory_failrue,
> could you share it?
>
Hi Naoya, thanks for feedback. Commit c0a32fc5a2e47 ("mm: more intensive
memory corruption debugging") menthioned we konw that with
CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception on
access (read,write) to an unallocated page, which permits us to catch code
which corrupts memory; Guard page aims to keep more free/protected pages
and to interlace free/protected and allocated pages to increase the
probability of catching corruption. Keep guard page around failrue looks
helpful to catch random access. Wish this can help.

> Thanks,
> Naoya Horiguchi
>
>> ---
>> mm/page_alloc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index fefc4074d9d0..88c5f5aea9b0 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6505,6 +6505,7 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>> next_page = page;
>> current_buddy = page + size;
>> }
>> + page = next_page;
>>
>> if (set_page_guard(zone, current_buddy, high, migratetype))
>> continue;
>> @@ -6512,7 +6513,6 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>> if (current_buddy != target) {
>> add_to_free_list(current_buddy, zone, high, migratetype);
>> set_buddy_order(current_buddy, high);
>> - page = next_page;
>> }
>> }
>> }
>> --
>> 2.30.0
>>
>>
>>
>


2023-09-26 11:39:22

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm/page_alloc: remove unnecessary next_page in break_down_buddy_pages

On Sat, Aug 26, 2023 at 11:47:45PM +0800, Kemeng Shi wrote:
> The next_page is only used to forward page in case target is in second
> harf range. Move forward page directly to remove unnecessary next_page.

s/harf/half/

>
> Signed-off-by: Kemeng Shi <[email protected]>

Acked-by: Naoya Horiguchi <[email protected]>

> ---
> mm/page_alloc.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bb74b40dc195..1c35ee022dde 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6492,20 +6492,18 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
> int migratetype)
> {
> unsigned long size = 1 << high;
> - struct page *current_buddy, *next_page;
> + struct page *current_buddy;
>
> while (high > low) {
> high--;
> size >>= 1;
>
> if (target >= &page[size]) {
> - next_page = page + size;
> current_buddy = page;
> + page = page + size;
> } else {
> - next_page = page;
> current_buddy = page + size;
> }
> - page = next_page;
>
> if (set_page_guard(zone, current_buddy, high, migratetype))
> continue;
> --
> 2.30.0
>
>
>

2023-09-26 15:44:56

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled

On Wed, Aug 30, 2023 at 02:27:33PM +0800, Kemeng Shi wrote:
>
>
> on 8/28/2023 11:21 PM, Naoya Horiguchi wrote:
> > On Sat, Aug 26, 2023 at 11:47:43PM +0800, Kemeng Shi wrote:
> >> When guard page debug is enabled and set_page_guard returns success, we
> >> miss to forward page to point to start of next split range and we will do
> >> split unexpectedly in page range without target page. Move start page
> >> update before set_page_guard to fix this.
> >>
> >> As we split to wrong target page, then splited pages are not able to merge
> >> back to original order when target page is put back and splited pages
> >> except target page is not usable. To be specific:
> >>
> >> Consider target page is the third page in buddy page with order 2.
> >> | buddy-2 | Page | Target | Page |
> >>
> >> After break down to target page, we will only set first page to Guard
> >> because of bug.
> >> | Guard | Page | Target | Page |
> >>
> >> When we try put_page_back_buddy with target page, the buddy page of target
> >> if neither guard nor buddy, Then it's not able to construct original page
> >> with order 2
> >> | Guard | Page | buddy-0 | Page |
> >>
> >> All pages except target page is not in free list and is not usable.
> >>
> >> Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
> >> Signed-off-by: Kemeng Shi <[email protected]>
> >
> > Thank you for finding the problem and writing patches. I think the patch
> > fixes the reported problem, But I wonder that we really need guard page
> > mechanism in break_down_buddy_pages() which is only called from memory_failure.
> > As stated in Documentation/admin-guide/kernel-parameters.txt, this is a
> > debugging feature to detect memory corruption due to buggy kernel or drivers
> > code. So if HW memory failrue seems to be out of the scope, and I feel that
> > we could simply remove it from break_down_buddy_pages().
> >
> > debug_guardpage_minorder=
> > [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
> > parameter allows control of the order of pages that will
> > be intentionally kept free (and hence protected) by the
> > buddy allocator. Bigger value increase the probability
> > of catching random memory corruption, but reduce the
> > amount of memory for normal system use. The maximum
> > possible value is MAX_ORDER/2. Setting this parameter
> > to 1 or 2 should be enough to identify most random
> > memory corruption problems caused by bugs in kernel or
> > driver code when a CPU writes to (or reads from) a
> > random memory location. Note that there exists a class
> > of memory corruptions problems caused by buggy H/W or
> > F/W or by drivers badly programming DMA (basically when
> > memory is written at bus level and the CPU MMU is
> > bypassed) which are not detectable by
> > CONFIG_DEBUG_PAGEALLOC, hence this option will not help
> > tracking down these problems.
> >
> > If you have any idea about how guard page mechanism helps memory_failrue,
> > could you share it?
> >
> Hi Naoya, thanks for feedback. Commit c0a32fc5a2e47 ("mm: more intensive
> memory corruption debugging") menthioned we konw that with
> CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception on
> access (read,write) to an unallocated page, which permits us to catch code
> which corrupts memory; Guard page aims to keep more free/protected pages
> and to interlace free/protected and allocated pages to increase the
> probability of catching corruption. Keep guard page around failrue looks
> helpful to catch random access. Wish this can help.

Sorry for my late response.
I'm OK with keeping guardpage stuff in this code path as long as it properly works.
And the patch looks good to me.

Acked-by: Naoya Horiguchi <[email protected]>

Do you think of sending this patch (only patch 1/3) to -stable?
If so, please add "Cc: [email protected]" tag.

Thanks,
Naoya Horiguchi

2023-09-27 02:03:46

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled



on 9/26/2023 7:33 PM, Naoya Horiguchi wrote:
> On Wed, Aug 30, 2023 at 02:27:33PM +0800, Kemeng Shi wrote:
>>
>>
>> on 8/28/2023 11:21 PM, Naoya Horiguchi wrote:
>>> On Sat, Aug 26, 2023 at 11:47:43PM +0800, Kemeng Shi wrote:
>>>> When guard page debug is enabled and set_page_guard returns success, we
>>>> miss to forward page to point to start of next split range and we will do
>>>> split unexpectedly in page range without target page. Move start page
>>>> update before set_page_guard to fix this.
>>>>
>>>> As we split to wrong target page, then splited pages are not able to merge
>>>> back to original order when target page is put back and splited pages
>>>> except target page is not usable. To be specific:
>>>>
>>>> Consider target page is the third page in buddy page with order 2.
>>>> | buddy-2 | Page | Target | Page |
>>>>
>>>> After break down to target page, we will only set first page to Guard
>>>> because of bug.
>>>> | Guard | Page | Target | Page |
>>>>
>>>> When we try put_page_back_buddy with target page, the buddy page of target
>>>> if neither guard nor buddy, Then it's not able to construct original page
>>>> with order 2
>>>> | Guard | Page | buddy-0 | Page |
>>>>
>>>> All pages except target page is not in free list and is not usable.
>>>>
>>>> Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>>>> Signed-off-by: Kemeng Shi <[email protected]>
>>>
>>> Thank you for finding the problem and writing patches. I think the patch
>>> fixes the reported problem, But I wonder that we really need guard page
>>> mechanism in break_down_buddy_pages() which is only called from memory_failure.
>>> As stated in Documentation/admin-guide/kernel-parameters.txt, this is a
>>> debugging feature to detect memory corruption due to buggy kernel or drivers
>>> code. So if HW memory failrue seems to be out of the scope, and I feel that
>>> we could simply remove it from break_down_buddy_pages().
>>>
>>> debug_guardpage_minorder=
>>> [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
>>> parameter allows control of the order of pages that will
>>> be intentionally kept free (and hence protected) by the
>>> buddy allocator. Bigger value increase the probability
>>> of catching random memory corruption, but reduce the
>>> amount of memory for normal system use. The maximum
>>> possible value is MAX_ORDER/2. Setting this parameter
>>> to 1 or 2 should be enough to identify most random
>>> memory corruption problems caused by bugs in kernel or
>>> driver code when a CPU writes to (or reads from) a
>>> random memory location. Note that there exists a class
>>> of memory corruptions problems caused by buggy H/W or
>>> F/W or by drivers badly programming DMA (basically when
>>> memory is written at bus level and the CPU MMU is
>>> bypassed) which are not detectable by
>>> CONFIG_DEBUG_PAGEALLOC, hence this option will not help
>>> tracking down these problems.
>>>
>>> If you have any idea about how guard page mechanism helps memory_failrue,
>>> could you share it?
>>>
>> Hi Naoya, thanks for feedback. Commit c0a32fc5a2e47 ("mm: more intensive
>> memory corruption debugging") menthioned we konw that with
>> CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception on
>> access (read,write) to an unallocated page, which permits us to catch code
>> which corrupts memory; Guard page aims to keep more free/protected pages
>> and to interlace free/protected and allocated pages to increase the
>> probability of catching corruption. Keep guard page around failrue looks
>> helpful to catch random access. Wish this can help.
>
> Sorry for my late response.
> I'm OK with keeping guardpage stuff in this code path as long as it properly works.
> And the patch looks good to me.
>
> Acked-by: Naoya Horiguchi <[email protected]>
>
> Do you think of sending this patch (only patch 1/3) to -stable?
> If so, please add "Cc: [email protected]" tag.
>
Thanks for reply. Will cc stable in next version. Thanks!
> Thanks,
> Naoya Horiguchi
>