2014-10-23 08:14:48

by Joonsoo Kim

Subject: Re: + mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch added to -mm tree

On Tue, Oct 14, 2014 at 01:53:44PM -0700, [email protected] wrote:
>
> The patch titled
> Subject: mm/compaction.c: avoid premature range skip in isolate_migratepages_range
> has been added to the -mm tree. Its filename is
> mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch
>
> This patch should soon appear at
> http://ozlabs.org/~akpm/mmots/broken-out/mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch
> and later at
> http://ozlabs.org/~akpm/mmotm/broken-out/mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
>
> ------------------------------------------------------
> From: Joonsoo Kim <[email protected]>
> Subject: mm/compaction.c: avoid premature range skip in isolate_migratepages_range
>
> commit edc2ca612496 ("mm, compaction: move pageblock checks up from
> isolate_migratepages_range()") commonizes isolate_migratepages variants
> and makes them use isolate_migratepages_block().
>
> isolate_migratepages_block() can stop early when enough pages are
> isolated, but there is no code in isolate_migratepages_range() to
> handle this case. As a result, even if isolate_migratepages_block()
> returns prematurely without checking all pages in the range,
> isolate_migratepages_block() is called again on the following
> pageblock, and the remaining pages in the previous range are never
> checked. CMA then fails frequently because of this.
>
> To fix this problem, this patch lets isolate_migratepages_range() detect
> that enough pages are isolated and stop the isolation in that case.
>
> Note that isolate_migratepages() has no such problem, because it always
> stops the isolation after just one call of isolate_migratepages_block().
>
> Signed-off-by: Joonsoo Kim <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Michal Nazarewicz <[email protected]>
> Cc: Naoya Horiguchi <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Rik van Riel <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Zhang Yanfei <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>

Hello, Andrew.

I forgot to mention that this should be merged for v3.18. :)

Thanks.
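
For anyone following the thread without the source at hand, the failure
mode described in the changelog can be modelled in userspace roughly as
follows. This is only a toy sketch under stated assumptions, not the kernel
code: every toy_* name, the block size of 8, the cluster limit of 4 and the
32-page range are made up for illustration, every page is assumed to be
isolatable, and the caller simply pretends that isolated pages get migrated
between calls.

```c
#include <string.h>

#define TOY_BLOCK_PAGES 8UL   /* toy stand-in for pageblock_nr_pages */
#define TOY_CLUSTER_MAX 4UL   /* toy stand-in for COMPACT_CLUSTER_MAX */
#define TOY_RANGE_PAGES 32UL  /* size of the range we try to isolate */

struct toy_cc {
	unsigned long nr_migratepages;
	unsigned char scanned[TOY_RANGE_PAGES]; /* 1 if the pfn was looked at */
};

/*
 * Scan [pfn, end) and "isolate" every page. Stop early once exactly
 * TOY_CLUSTER_MAX pages are isolated. Returns the next pfn to scan.
 */
static unsigned long toy_isolate_block(struct toy_cc *cc, unsigned long pfn,
				       unsigned long end)
{
	for (; pfn < end; pfn++) {
		cc->scanned[pfn] = 1;
		cc->nr_migratepages++;
		if (cc->nr_migratepages == TOY_CLUSTER_MAX) {
			pfn++;
			break;
		}
	}
	return pfn;
}

/* Walk the range block by block; with_fix enables the early stop. */
static unsigned long toy_isolate_range(struct toy_cc *cc, unsigned long pfn,
				       unsigned long end, int with_fix)
{
	unsigned long block_end = (pfn / TOY_BLOCK_PAGES + 1) * TOY_BLOCK_PAGES;

	for (; pfn < end; pfn = block_end, block_end += TOY_BLOCK_PAGES) {
		if (block_end > end)
			block_end = end;

		pfn = toy_isolate_block(cc, pfn, block_end);

		/* Without this, the loop jumps to block_end and skips pages. */
		if (with_fix && cc->nr_migratepages == TOY_CLUSTER_MAX)
			break;
	}
	return pfn;
}

/* Drive the range walk like a caller would; count never-scanned pages. */
static unsigned int toy_count_skipped(int with_fix)
{
	struct toy_cc cc;
	unsigned long pfn = 0;
	unsigned int skipped = 0;

	memset(&cc, 0, sizeof(cc));
	while (pfn < TOY_RANGE_PAGES) {
		cc.nr_migratepages = 0; /* pretend the pages were migrated */
		pfn = toy_isolate_range(&cc, pfn, TOY_RANGE_PAGES, with_fix);
	}
	for (pfn = 0; pfn < TOY_RANGE_PAGES; pfn++)
		if (!cc.scanned[pfn])
			skipped++;
	return skipped;
}
```

In this model, toy_count_skipped(0) reports 4 pages that were never looked
at (the tail of the first block), while toy_count_skipped(1) reports 0,
because the range walk returns to the caller at the pfn where the block
scan stopped.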


2014-10-23 08:39:54

by Vlastimil Babka

Subject: Re: + mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch added to -mm tree

On 10/23/2014 10:15 AM, Joonsoo Kim wrote:

Acked-by: Vlastimil Babka <[email protected]>

Sorry for the trouble. But I think a more robust and future-proof fix
would be a check such as: if (pfn < block_end_pfn) break;
This should catch any case where isolate_migratepages_block() did not
finish the whole pageblock for a reason that was not fatal enough to
return pfn==0.
However, currently this seems to happen only due to isolating too much,
so your patch should work.
So it's up to you whether you want to make the check more generic now, or
later after this bug is fixed for 3.18.

Vlastimil

> Hello, Andrew.
>
> I forgot to mention that this should be merged for v3.18. :)
>
> Thanks.
>

2014-10-24 02:59:06

by Joonsoo Kim

Subject: Re: + mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch added to -mm tree

On Thu, Oct 23, 2014 at 10:39:45AM +0200, Vlastimil Babka wrote:
> Acked-by: Vlastimil Babka <[email protected]>
>
> Sorry for the trouble. But I think a more robust and future-proof fix
> would be a check such as: if (pfn < block_end_pfn) break;
> This should catch any case where isolate_migratepages_block() did not
> finish the whole pageblock for a reason that was not fatal enough to
> return pfn==0.
> However, currently this seems to happen only due to isolating too much,
> so your patch should work.
> So it's up to you whether you want to make the check more generic now, or
> later after this bug is fixed for 3.18.

'if (pfn < block_end_pfn) break;' has one problem.
If we have isolated enough pages exactly when we reach block_end_pfn,
the check above does not stop the loop.

A more complete check might be the following:
'if (pfn < block_end_pfn ||
cc->nr_migratepages == COMPACT_CLUSTER_MAX) break;'
But, as you mentioned, there is currently no other case that leaves
'pfn < block_end_pfn', so I'd like to keep the patch as is.
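
To spell out the corner case, here is a small userspace sketch of the two
conditions. It is not kernel code: the toy_* helpers and the limit of 4 are
made up for illustration, and only the shape of the conditions is taken
from the checks quoted above.

```c
#include <stdbool.h>

#define TOY_CLUSTER_MAX 4UL /* toy stand-in for COMPACT_CLUSTER_MAX */

/* Generic check alone: stop only if the block scan ended before its end. */
static bool toy_stop_generic(unsigned long pfn, unsigned long block_end_pfn)
{
	return pfn < block_end_pfn;
}

/* Combined check: also stop when the isolation limit has been reached. */
static bool toy_stop_combined(unsigned long pfn, unsigned long block_end_pfn,
			      unsigned long nr_migratepages)
{
	return pfn < block_end_pfn || nr_migratepages == TOY_CLUSTER_MAX;
}
```

With pfn == block_end_pfn == 8 and 4 pages isolated, toy_stop_generic()
says "keep going" while toy_stop_combined() says "stop", which is the case
the generic check misses.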

Thanks.

2014-10-29 14:04:10

by Vlastimil Babka

Subject: Re: + mm-compaction-avoid-premature-range-skip-in-isolate_migratepages_range.patch added to -mm tree

On 10/24/2014 05:00 AM, Joonsoo Kim wrote:
> On Thu, Oct 23, 2014 at 10:39:45AM +0200, Vlastimil Babka wrote:
>> Acked-by: Vlastimil Babka <[email protected]>
>>
>> Sorry for the trouble. But I think a more robust and future-proof fix
>> would be a check such as: if (pfn < block_end_pfn) break;
>> This should catch any case where isolate_migratepages_block() did not
>> finish the whole pageblock for a reason that was not fatal enough to
>> return pfn==0.
>> However, currently this seems to happen only due to isolating too much,
>> so your patch should work.
>> So it's up to you whether you want to make the check more generic now, or
>> later after this bug is fixed for 3.18.
>
> 'if (pfn < block_end_pfn) break;' has one problem.
> If we have isolated enough pages exactly when we reach block_end_pfn,
> the check above does not stop the loop.

Oh, right.

> A more complete check might be the following:
> 'if (pfn < block_end_pfn ||
> cc->nr_migratepages == COMPACT_CLUSTER_MAX) break;'
> But, as you mentioned, there is currently no other case that leaves
> 'pfn < block_end_pfn', so I'd like to keep the patch as is.

Yep, that would only make things uglier :/ So I'm for merging your patch
in the 3.18 cycle to prevent breakage.

> Thanks.
>