By defination, COMPACT[STALL|FAIL] events needs to be counted when there
is 'At least in one zone compaction wasn't deferred or skipped from the
direct compaction'. And when compaction is skipped or deferred,
COMPACT_SKIPPED will be returned but it will still go and update these
compaction events which is wrong in the sense that COMPACT[STALL|FAIL]
is counted without even trying the compaction.
Correct this by skipping the counting of these events when
COMPACT_SKIPPED is returned for compaction. This indirectly also avoid
the unnecessary try into the get_page_from_freelist() when compaction is
not even tried.
Signed-off-by: Charan Teja Reddy <[email protected]>
---
mm/page_alloc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 519a60d..531f244 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
memalloc_noreclaim_restore(noreclaim_flag);
psi_memstall_leave(&pflags);
+ if (*compact_result == COMPACT_SKIPPED)
+ return NULL;
/*
* At least in one zone compaction wasn't deferred or skipped, so let's
* count a compaction stall
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
On Mon, 1 Feb 2021, Charan Teja Reddy wrote:
> By defination, COMPACT[STALL|FAIL] events needs to be counted when there
s/defination/definition/
> is 'At least in one zone compaction wasn't deferred or skipped from the
> direct compaction'. And when compaction is skipped or deferred,
> COMPACT_SKIPPED will be returned but it will still go and update these
> compaction events which is wrong in the sense that COMPACT[STALL|FAIL]
> is counted without even trying the compaction.
>
> Correct this by skipping the counting of these events when
> COMPACT_SKIPPED is returned for compaction. This indirectly also avoid
> the unnecessary try into the get_page_from_freelist() when compaction is
> not even tried.
>
> Signed-off-by: Charan Teja Reddy <[email protected]>
> ---
> mm/page_alloc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 519a60d..531f244 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> memalloc_noreclaim_restore(noreclaim_flag);
> psi_memstall_leave(&pflags);
>
> + if (*compact_result == COMPACT_SKIPPED)
> + return NULL;
> /*
> * At least in one zone compaction wasn't deferred or skipped, so let's
> * count a compaction stall
This makes sense, I wonder if it would also be useful to check that
page == NULL, either in try_to_compact_pages() or here for
COMPACT_SKIPPED?
Thanks David for the review!!
On 2/2/2021 2:54 AM, David Rientjes wrote:
> On Mon, 1 Feb 2021, Charan Teja Reddy wrote:
>
>> By defination, COMPACT[STALL|FAIL] events needs to be counted when there
>
> s/defination/definition/\
Done.
>
>> is 'At least in one zone compaction wasn't deferred or skipped from the
>> direct compaction'. And when compaction is skipped or deferred,
>> COMPACT_SKIPPED will be returned but it will still go and update these
>> compaction events which is wrong in the sense that COMPACT[STALL|FAIL]
>> is counted without even trying the compaction.
>>
>> Correct this by skipping the counting of these events when
>> COMPACT_SKIPPED is returned for compaction. This indirectly also avoid
>> the unnecessary try into the get_page_from_freelist() when compaction is
>> not even tried.
>>
>> Signed-off-by: Charan Teja Reddy <[email protected]>
>> ---
>> mm/page_alloc.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 519a60d..531f244 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>> memalloc_noreclaim_restore(noreclaim_flag);
>> psi_memstall_leave(&pflags);
>>
>> + if (*compact_result == COMPACT_SKIPPED)
>> + return NULL;
>> /*
>> * At least in one zone compaction wasn't deferred or skipped, so let's
>> * count a compaction stall
>
> This makes sense, I wonder if it would also be useful to check that
> page == NULL, either in try_to_compact_pages() or here for
> COMPACT_SKIPPED?
In the code, when COMPACT_SKIPPED is being returned, the page will
always be NULL. So, I'm not sure how much useful it is for the page ==
NULL check here. Or I failed to understand your point here?
As, till now, COMPACTFAIL also presents page allocation failures because
of the direct compaction is skipped, creating a separate COMPACTSKIPFAIL
event which indicates that 'failed to get the free page as direct
compaction is skipped' is useful?
>
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project
On Tue, 2 Feb 2021, Charan Teja Kalla wrote:
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 519a60d..531f244 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> >> memalloc_noreclaim_restore(noreclaim_flag);
> >> psi_memstall_leave(&pflags);
> >>
> >> + if (*compact_result == COMPACT_SKIPPED)
> >> + return NULL;
> >> /*
> >> * At least in one zone compaction wasn't deferred or skipped, so let's
> >> * count a compaction stall
> >
> > This makes sense, I wonder if it would also be useful to check that
> > page == NULL, either in try_to_compact_pages() or here for
> > COMPACT_SKIPPED?
>
> In the code, when COMPACT_SKIPPED is being returned, the page will
> always be NULL. So, I'm not sure how much useful it is for the page ==
> NULL check here. Or I failed to understand your point here?
>
Your code is short-circuiting the rest of __alloc_pages_direct_compact()
where the return value is dictated by whether page is NULL or non-NULL.
We can't leak a captured page if we are testing for it being NULL or
non-NULL, which is what the rest of __alloc_pages_direct_compact() does
*before* your change. So the idea was to add a check the page is actually
NULL here since you are now relying on the return value of
compact_zone_order() to be COMPACT_SKIPPED to infer page == NULL.
I agree that's currently true in the code, I was trying to catch any
errors where current->capture_control.page was non-NULL but
try_to_compact_pages() returns COMPACT_SKIPPED. There's some complexity
here.
So my idea was the expand this out to:
if (*compact_result == COMPACT_SKIPPED) {
VM_BUG_ON(page);
return NULL;
}
On 2/6/2021 3:58 AM, David Rientjes wrote:
>> In the code, when COMPACT_SKIPPED is being returned, the page will
>> always be NULL. So, I'm not sure how much useful it is for the page ==
>> NULL check here. Or I failed to understand your point here?
>>
> Your code is short-circuiting the rest of __alloc_pages_direct_compact()
> where the return value is dictated by whether page is NULL or non-NULL.
> We can't leak a captured page if we are testing for it being NULL or
> non-NULL, which is what the rest of __alloc_pages_direct_compact() does
> *before* your change. So the idea was to add a check the page is actually
> NULL here since you are now relying on the return value of
> compact_zone_order() to be COMPACT_SKIPPED to infer page == NULL.
>
> I agree that's currently true in the code, I was trying to catch any
> errors where current->capture_control.page was non-NULL but
> try_to_compact_pages() returns COMPACT_SKIPPED. There's some complexity
> here.
Thanks for the detailed explanation. This looks fine to me. I will send
V2 with this information in the commit log.
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project
On 2/5/21 11:28 PM, David Rientjes wrote:
> On Tue, 2 Feb 2021, Charan Teja Kalla wrote:
>
>> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >> index 519a60d..531f244 100644
>> >> --- a/mm/page_alloc.c
>> >> +++ b/mm/page_alloc.c
>> >> @@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>> >> memalloc_noreclaim_restore(noreclaim_flag);
>> >> psi_memstall_leave(&pflags);
>> >>
>> >> + if (*compact_result == COMPACT_SKIPPED)
>> >> + return NULL;
>> >> /*
>> >> * At least in one zone compaction wasn't deferred or skipped, so let's
>> >> * count a compaction stall
>> >
>> > This makes sense, I wonder if it would also be useful to check that
>> > page == NULL, either in try_to_compact_pages() or here for
>> > COMPACT_SKIPPED?
>>
>> In the code, when COMPACT_SKIPPED is being returned, the page will
>> always be NULL. So, I'm not sure how much useful it is for the page ==
>> NULL check here. Or I failed to understand your point here?
>>
>
> Your code is short-circuiting the rest of __alloc_pages_direct_compact()
> where the return value is dictated by whether page is NULL or non-NULL.
> We can't leak a captured page if we are testing for it being NULL or
> non-NULL, which is what the rest of __alloc_pages_direct_compact() does
> *before* your change. So the idea was to add a check the page is actually
> NULL here since you are now relying on the return value of
> compact_zone_order() to be COMPACT_SKIPPED to infer page == NULL.
>
> I agree that's currently true in the code, I was trying to catch any
> errors where current->capture_control.page was non-NULL but
> try_to_compact_pages() returns COMPACT_SKIPPED. There's some complexity
> here.
>
> So my idea was the expand this out to:
>
> if (*compact_result == COMPACT_SKIPPED) {
> VM_BUG_ON(page);
> return NULL;
> }
Note that this may indeed actually trigger due to free page capture, when an IRQ
handler frees the page. See commit b9e20f0da1f5 ("mm, compaction: make capture
control handling safe wrt interrupts") describing how this was happening for
Hugh. So, this VM_BUG_ON() would sooner or later trigger.
It's because while compact_zone() does detect a successful capture and return
COMPACT_SUCCESS, the IRQ-capture can also happen later without being detected -
at any moment until compact_zone_order() resets the current->capture_control to
NULL. And at that point it may be already poised to return COMPACT_SKIPPED.
It might be cleanest to check *capture at the end of compact_zone_order() and
return COMPACT_SUCCESS when non-NULL. Technically it might be not true that
compaction was successful (we were just lucky that IRQ came and freed the page),
but not much harm in that. Better than e.g. the danger of leaking the captured
page which the proposed patch would do due to the shortcut.
The minor downside is that you would count a stall that wasn't really a stall in
case we skipped compaction, but captured a page by luck, but it would be very rare.