2015-12-14 05:02:35

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn

free_pfn and compact_cached_free_pfn are the pointer that remember
restart position of freepage scanner. When they are reset or invalid,
we set them to zone_end_pfn because freepage scanner works in reverse
direction. But, because zone range is defined as [zone_start_pfn,
zone_end_pfn), zone_end_pfn is invalid to access. Therefore, we should
not store it to free_pfn and compact_cached_free_pfn. Instead, we need
to store zone_end_pfn - 1 to them. There is one more thing we should
consider. Freepage scanner scan reversely by pageblock unit. If free_pfn
and compact_cached_free_pfn are set to middle of pageblock, it regards
that sitiation as that it already scans front part of pageblock so we
lose opportunity to scan there. To fix-up, this patch do round_down()
to guarantee that reset position will be pageblock aligned.

Note that thanks to the current pageblock_pfn_to_page() implementation,
actual access to zone_end_pfn doesn't happen until now. But, following
patch will change pageblock_pfn_to_page() so this patch is needed
from now on.

Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/compaction.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 585de54..56fa321 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
{
zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
- zone->compact_cached_free_pfn = zone_end_pfn(zone);
+ zone->compact_cached_free_pfn =
+ round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
}

/*
@@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
*/
cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
cc->free_pfn = zone->compact_cached_free_pfn;
- if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
- cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
+ if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
+ cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
zone->compact_cached_free_pfn = cc->free_pfn;
}
- if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
+ if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
cc->migrate_pfn = start_pfn;
zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
--
1.9.1


2015-12-14 05:02:39

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

There is a performance drop report due to hugepage allocation and in there
half of cpu time are spent on pageblock_pfn_to_page() in compaction [1].
In that workload, compaction is triggered to make hugepage but most of
pageblocks are un-available for compaction due to pageblock type and
skip bit so compaction usually fails. Most costly operations in this case
is to find valid pageblock while scanning whole zone range. To check
if pageblock is valid to compact, valid pfn within pageblock is required
and we can obtain it by calling pageblock_pfn_to_page(). This function
checks whether pageblock is in a single zone and return valid pfn
if possible. Problem is that we need to check it every time before
scanning pageblock even if we re-visit it and this turns out to
be very expensive in this workload.

Although we have no way to skip this pageblock check in the system
where hole exists at arbitrary position, we can use cached value for
zone continuity and just do pfn_to_page() in the system where hole doesn't
exist. This optimization considerably speeds up in above workload.

Before vs After
Max: 1096 MB/s vs 1325 MB/s
Min: 635 MB/s 1015 MB/s
Avg: 899 MB/s 1194 MB/s

Avg is improved by roughly 30% [2].

Not to disturb the system where compaction isn't triggered, checking will
be done at first compaction invocation.

[1]: http://www.spinics.net/lists/linux-mm/msg97378.html
[2]: https://lkml.org/lkml/2015/12/9/23

Signed-off-by: Joonsoo Kim <[email protected]>
---
include/linux/mmzone.h | 1 +
mm/compaction.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 68cc063..cd3736e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -521,6 +521,7 @@ struct zone {
#if defined CONFIG_COMPACTION || defined CONFIG_CMA
/* Set to true when the PG_migrate_skip bits should be cleared */
bool compact_blockskip_flush;
+ bool contiguous;
#endif

ZONE_PADDING(_pad3_)
diff --git a/mm/compaction.c b/mm/compaction.c
index 56fa321..ce60b38 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int migratetype)
* the first and last page of a pageblock and avoid checking each individual
* page in a pageblock.
*/
-static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
+static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone)
{
struct page *start_page;
@@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
return start_page;
}

+static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
+ unsigned long end_pfn, struct zone *zone)
+{
+ if (zone->contiguous)
+ return pfn_to_page(start_pfn);
+
+ return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
+}
+
+static void check_zone_contiguous(struct zone *zone)
+{
+ unsigned long block_start_pfn = zone->zone_start_pfn;
+ unsigned long block_end_pfn;
+ unsigned long pfn;
+
+ /* Already initialized if cached pfn is non-zero */
+ if (zone->compact_cached_migrate_pfn[0] ||
+ zone->compact_cached_free_pfn)
+ return;
+
+ /* Mark that checking is in progress */
+ zone->compact_cached_free_pfn = ULONG_MAX;
+
+ block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
+ for (; block_start_pfn < zone_end_pfn(zone);
+ block_start_pfn = block_end_pfn,
+ block_end_pfn += pageblock_nr_pages) {
+
+ block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
+
+ if (!__pageblock_pfn_to_page(block_start_pfn,
+ block_end_pfn, zone))
+ return;
+
+ /* Check validity of pfn within pageblock */
+ for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
+ if (!pfn_valid_within(pfn))
+ return;
+ }
+ }
+
+ /* We confirm that there is no hole */
+ zone->contiguous = true;
+}
+
#ifdef CONFIG_COMPACTION

/* Do not skip compaction more than 64 times */
@@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
;
}

+ check_zone_contiguous(zone);
+
/*
* Clear pageblock skip if there were failures recently and compaction
* is about to be retried after being deferred. kswapd does not do
--
1.9.1

2015-12-14 10:07:08

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn

On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> free_pfn and compact_cached_free_pfn are the pointer that remember
> restart position of freepage scanner. When they are reset or invalid,
> we set them to zone_end_pfn because freepage scanner works in reverse
> direction. But, because zone range is defined as [zone_start_pfn,
> zone_end_pfn), zone_end_pfn is invalid to access. Therefore, we should
> not store it to free_pfn and compact_cached_free_pfn. Instead, we need
> to store zone_end_pfn - 1 to them. There is one more thing we should
> consider. Freepage scanner scan reversely by pageblock unit. If free_pfn
> and compact_cached_free_pfn are set to middle of pageblock, it regards
> that sitiation as that it already scans front part of pageblock so we
> lose opportunity to scan there. To fix-up, this patch do round_down()
> to guarantee that reset position will be pageblock aligned.
>
> Note that thanks to the current pageblock_pfn_to_page() implementation,
> actual access to zone_end_pfn doesn't happen until now. But, following
> patch will change pageblock_pfn_to_page() so this patch is needed
> from now on.
>
> Signed-off-by: Joonsoo Kim <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

Note that until now in compaction we've used basically an open-coded
round_down(), and ALIGN() for rounding up. You introduce a first use of
round_down(), and it would be nice to standardize on round_down() and
round_up() everywhere. I think it's more obvious than open-coding and
ALIGN() (which doesn't tell the reader if it's aligning up or down).
Hopefully they really do the same thing and there are no caveats...

> ---
> mm/compaction.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 585de54..56fa321 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
> {
> zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
> zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
> - zone->compact_cached_free_pfn = zone_end_pfn(zone);
> + zone->compact_cached_free_pfn =
> + round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
> }
>
> /*
> @@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
> */
> cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
> cc->free_pfn = zone->compact_cached_free_pfn;
> - if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
> - cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
> + if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
> + cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
> zone->compact_cached_free_pfn = cc->free_pfn;
> }
> - if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
> + if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
> cc->migrate_pfn = start_pfn;
> zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
> zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
>

2015-12-14 10:29:57

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> There is a performance drop report due to hugepage allocation and in there
> half of cpu time are spent on pageblock_pfn_to_page() in compaction [1].
> In that workload, compaction is triggered to make hugepage but most of
> pageblocks are un-available for compaction due to pageblock type and
> skip bit so compaction usually fails. Most costly operations in this case
> is to find valid pageblock while scanning whole zone range. To check
> if pageblock is valid to compact, valid pfn within pageblock is required
> and we can obtain it by calling pageblock_pfn_to_page(). This function
> checks whether pageblock is in a single zone and return valid pfn
> if possible. Problem is that we need to check it every time before
> scanning pageblock even if we re-visit it and this turns out to
> be very expensive in this workload.

Hm I wonder if this is safe wrt memory hotplug? Shouldn't there be a
rechecking plugged into the appropriate hotplug add/remove callbacks?
Which would make the whole thing generic too, zone->contiguous
information doesn't have to be limited to compaction. And it would
remove the rather ugly part where cached pfn info is used as an
indication of zone->contiguous being already set...

> Although we have no way to skip this pageblock check in the system
> where hole exists at arbitrary position, we can use cached value for
> zone continuity and just do pfn_to_page() in the system where hole doesn't
> exist. This optimization considerably speeds up in above workload.
>
> Before vs After
> Max: 1096 MB/s vs 1325 MB/s
> Min: 635 MB/s 1015 MB/s
> Avg: 899 MB/s 1194 MB/s
>
> Avg is improved by roughly 30% [2].

Unless I'm mistaken, these results also include my RFC series (Aaron can
you clarify?). These patches should better be tested standalone on top
of base, as being simpler they will probably be included sooner (the RFC
series needs reviews at the very least :) - although the memory hotplug
concerns might make the "sooner" here relative too.

Anyway it's interesting that this patch improved "Min", and variance in
general (on top of my RFC) so much. I would expect the overhead of
pageblock_pfn_to_page() to be quite stable, hmm.

> Not to disturb the system where compaction isn't triggered, checking will
> be done at first compaction invocation.
>
> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
> [2]: https://lkml.org/lkml/2015/12/9/23
>
> Signed-off-by: Joonsoo Kim <[email protected]>
> ---
> include/linux/mmzone.h | 1 +
> mm/compaction.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 68cc063..cd3736e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -521,6 +521,7 @@ struct zone {
> #if defined CONFIG_COMPACTION || defined CONFIG_CMA
> /* Set to true when the PG_migrate_skip bits should be cleared */
> bool compact_blockskip_flush;
> + bool contiguous;
> #endif
>
> ZONE_PADDING(_pad3_)
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 56fa321..ce60b38 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int migratetype)
> * the first and last page of a pageblock and avoid checking each individual
> * page in a pageblock.
> */
> -static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> +static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> unsigned long end_pfn, struct zone *zone)
> {
> struct page *start_page;
> @@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> return start_page;
> }
>
> +static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> + unsigned long end_pfn, struct zone *zone)
> +{
> + if (zone->contiguous)
> + return pfn_to_page(start_pfn);
> +
> + return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
> +}
> +
> +static void check_zone_contiguous(struct zone *zone)
> +{
> + unsigned long block_start_pfn = zone->zone_start_pfn;
> + unsigned long block_end_pfn;
> + unsigned long pfn;
> +
> + /* Already initialized if cached pfn is non-zero */
> + if (zone->compact_cached_migrate_pfn[0] ||
> + zone->compact_cached_free_pfn)
> + return;
> +
> + /* Mark that checking is in progress */
> + zone->compact_cached_free_pfn = ULONG_MAX;
> +
> + block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
> + for (; block_start_pfn < zone_end_pfn(zone);
> + block_start_pfn = block_end_pfn,
> + block_end_pfn += pageblock_nr_pages) {
> +
> + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> +
> + if (!__pageblock_pfn_to_page(block_start_pfn,
> + block_end_pfn, zone))
> + return;
> +
> + /* Check validity of pfn within pageblock */
> + for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
> + if (!pfn_valid_within(pfn))
> + return;
> + }
> + }
> +
> + /* We confirm that there is no hole */
> + zone->contiguous = true;
> +}
> +
> #ifdef CONFIG_COMPACTION
>
> /* Do not skip compaction more than 64 times */
> @@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
> ;
> }
>
> + check_zone_contiguous(zone);
> +
> /*
> * Clear pageblock skip if there were failures recently and compaction
> * is about to be retried after being deferred. kswapd does not do
>

2015-12-14 15:25:47

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

2015-12-14 19:29 GMT+09:00 Vlastimil Babka <[email protected]>:
> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>
>> There is a performance drop report due to hugepage allocation and in there
>> half of cpu time are spent on pageblock_pfn_to_page() in compaction [1].
>> In that workload, compaction is triggered to make hugepage but most of
>> pageblocks are un-available for compaction due to pageblock type and
>> skip bit so compaction usually fails. Most costly operations in this case
>> is to find valid pageblock while scanning whole zone range. To check
>> if pageblock is valid to compact, valid pfn within pageblock is required
>> and we can obtain it by calling pageblock_pfn_to_page(). This function
>> checks whether pageblock is in a single zone and return valid pfn
>> if possible. Problem is that we need to check it every time before
>> scanning pageblock even if we re-visit it and this turns out to
>> be very expensive in this workload.
>
>
> Hm I wonder if this is safe wrt memory hotplug? Shouldn't there be a
> rechecking plugged into the appropriate hotplug add/remove callbacks? Which
> would make the whole thing generic too, zone->contiguous information doesn't
> have to be limited to compaction. And it would remove the rather ugly part
> where cached pfn info is used as an indication of zone->contiguous being
> already set...

Will check it.

>> Although we have no way to skip this pageblock check in the system
>> where hole exists at arbitrary position, we can use cached value for
>> zone continuity and just do pfn_to_page() in the system where hole doesn't
>> exist. This optimization considerably speeds up in above workload.
>>
>> Before vs After
>> Max: 1096 MB/s vs 1325 MB/s
>> Min: 635 MB/s 1015 MB/s
>> Avg: 899 MB/s 1194 MB/s
>>
>> Avg is improved by roughly 30% [2].
>
>
> Unless I'm mistaken, these results also include my RFC series (Aaron can you
> clarify?). These patches should better be tested standalone on top of base,
> as being simpler they will probably be included sooner (the RFC series needs
> reviews at the very least :) - although the memory hotplug concerns might
> make the "sooner" here relative too.

AFAIK, these patches are tested standalone on top of base. When I sent it,
I asked to Aaron to test it on top of base.

Btw, I missed adding Reported/Tested-by tag for Aaron. I will add it
on next spin.

> Anyway it's interesting that this patch improved "Min", and variance in
> general (on top of my RFC) so much. I would expect the overhead of
> pageblock_pfn_to_page() to be quite stable, hmm.

Perhaps, pageblock_pfn_to_page() would be stable. Combination of
slow scanning and kswapd's skip bit flushing would result in unstable result.

Thanks.

>
>> Not to disturb the system where compaction isn't triggered, checking will
>> be done at first compaction invocation.
>>
>> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
>> [2]: https://lkml.org/lkml/2015/12/9/23
>>
>> Signed-off-by: Joonsoo Kim <[email protected]>
>> ---
>> include/linux/mmzone.h | 1 +
>> mm/compaction.c | 49
>> ++++++++++++++++++++++++++++++++++++++++++++++++-
>> 2 files changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 68cc063..cd3736e 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -521,6 +521,7 @@ struct zone {
>> #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>> /* Set to true when the PG_migrate_skip bits should be cleared */
>> bool compact_blockskip_flush;
>> + bool contiguous;
>> #endif
>>
>> ZONE_PADDING(_pad3_)
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 56fa321..ce60b38 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int
>> migratetype)
>> * the first and last page of a pageblock and avoid checking each
>> individual
>> * page in a pageblock.
>> */
>> -static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>> +static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>> unsigned long end_pfn, struct zone *zone)
>> {
>> struct page *start_page;
>> @@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned
>> long start_pfn,
>> return start_page;
>> }
>>
>> +static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>> + unsigned long end_pfn, struct zone *zone)
>> +{
>> + if (zone->contiguous)
>> + return pfn_to_page(start_pfn);
>> +
>> + return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
>> +}
>> +
>> +static void check_zone_contiguous(struct zone *zone)
>> +{
>> + unsigned long block_start_pfn = zone->zone_start_pfn;
>> + unsigned long block_end_pfn;
>> + unsigned long pfn;
>> +
>> + /* Already initialized if cached pfn is non-zero */
>> + if (zone->compact_cached_migrate_pfn[0] ||
>> + zone->compact_cached_free_pfn)
>> + return;
>> +
>> + /* Mark that checking is in progress */
>> + zone->compact_cached_free_pfn = ULONG_MAX;
>> +
>> + block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
>> + for (; block_start_pfn < zone_end_pfn(zone);
>> + block_start_pfn = block_end_pfn,
>> + block_end_pfn += pageblock_nr_pages) {
>> +
>> + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>> +
>> + if (!__pageblock_pfn_to_page(block_start_pfn,
>> + block_end_pfn, zone))
>> + return;
>> +
>> + /* Check validity of pfn within pageblock */
>> + for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
>> + if (!pfn_valid_within(pfn))
>> + return;
>> + }
>> + }
>> +
>> + /* We confirm that there is no hole */
>> + zone->contiguous = true;
>> +}
>> +
>> #ifdef CONFIG_COMPACTION
>>
>> /* Do not skip compaction more than 64 times */
>> @@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct
>> compact_control *cc)
>> ;
>> }
>>
>> + check_zone_contiguous(zone);
>> +
>> /*
>> * Clear pageblock skip if there were failures recently and
>> compaction
>> * is about to be retried after being deferred. kswapd does not do
>>
>

2015-12-14 15:26:59

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn

2015-12-14 19:07 GMT+09:00 Vlastimil Babka <[email protected]>:
> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>
>> free_pfn and compact_cached_free_pfn are the pointer that remember
>> restart position of freepage scanner. When they are reset or invalid,
>> we set them to zone_end_pfn because freepage scanner works in reverse
>> direction. But, because zone range is defined as [zone_start_pfn,
>> zone_end_pfn), zone_end_pfn is invalid to access. Therefore, we should
>> not store it to free_pfn and compact_cached_free_pfn. Instead, we need
>> to store zone_end_pfn - 1 to them. There is one more thing we should
>> consider. Freepage scanner scan reversely by pageblock unit. If free_pfn
>> and compact_cached_free_pfn are set to middle of pageblock, it regards
>> that sitiation as that it already scans front part of pageblock so we
>> lose opportunity to scan there. To fix-up, this patch do round_down()
>> to guarantee that reset position will be pageblock aligned.
>>
>> Note that thanks to the current pageblock_pfn_to_page() implementation,
>> actual access to zone_end_pfn doesn't happen until now. But, following
>> patch will change pageblock_pfn_to_page() so this patch is needed
>> from now on.
>>
>> Signed-off-by: Joonsoo Kim <[email protected]>
>
>
> Acked-by: Vlastimil Babka <[email protected]>
>
> Note that until now in compaction we've used basically an open-coded
> round_down(), and ALIGN() for rounding up. You introduce a first use of
> round_down(), and it would be nice to standardize on round_down() and
> round_up() everywhere. I think it's more obvious than open-coding and
> ALIGN() (which doesn't tell the reader if it's aligning up or down).
> Hopefully they really do the same thing and there are no caveats...

Okay. Will send another patch for this clean-up on next spin.

Thanks.

>
>> ---
>> mm/compaction.c | 9 +++++----
>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 585de54..56fa321 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
>> {
>> zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
>> zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
>> - zone->compact_cached_free_pfn = zone_end_pfn(zone);
>> + zone->compact_cached_free_pfn =
>> + round_down(zone_end_pfn(zone) - 1,
>> pageblock_nr_pages);
>> }
>>
>> /*
>> @@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct
>> compact_control *cc)
>> */
>> cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
>> cc->free_pfn = zone->compact_cached_free_pfn;
>> - if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
>> - cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
>> + if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
>> + cc->free_pfn = round_down(end_pfn - 1,
>> pageblock_nr_pages);
>> zone->compact_cached_free_pfn = cc->free_pfn;
>> }
>> - if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
>> + if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
>> cc->migrate_pfn = start_pfn;
>> zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
>> zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
>>
>

2015-12-15 01:06:29

by Aaron Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

On 12/14/2015 11:25 PM, Joonsoo Kim wrote:
> 2015-12-14 19:29 GMT+09:00 Vlastimil Babka <[email protected]>:
>> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>> Before vs After
>>> Max: 1096 MB/s vs 1325 MB/s
>>> Min: 635 MB/s 1015 MB/s
>>> Avg: 899 MB/s 1194 MB/s
>>>
>>> Avg is improved by roughly 30% [2].
>>
>>
>> Unless I'm mistaken, these results also include my RFC series (Aaron can you
>> clarify?). These patches should better be tested standalone on top of base,
>> as being simpler they will probably be included sooner (the RFC series needs
>> reviews at the very least :) - although the memory hotplug concerns might
>> make the "sooner" here relative too.
>
> AFAIK, these patches are tested standalone on top of base. When I sent it,
> I asked to Aaron to test it on top of base.

Right, it is tested standalone on top of base.

2015-12-15 08:24:21

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

On 12/15/2015 02:06 AM, Aaron Lu wrote:
> On 12/14/2015 11:25 PM, Joonsoo Kim wrote:
>> 2015-12-14 19:29 GMT+09:00 Vlastimil Babka <[email protected]>:
>>> Unless I'm mistaken, these results also include my RFC series (Aaron can you
>>> clarify?). These patches should better be tested standalone on top of base,
>>> as being simpler they will probably be included sooner (the RFC series needs
>>> reviews at the very least :) - although the memory hotplug concerns might
>>> make the "sooner" here relative too.
>>
>> AFAIK, these patches are tested standalone on top of base. When I sent it,
>> I asked to Aaron to test it on top of base.
>
> Right, it is tested standalone on top of base.

Thanks, sorry about the noise then.

2015-12-15 08:31:43

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn

On 12/14/2015 04:26 PM, Joonsoo Kim wrote:
> 2015-12-14 19:07 GMT+09:00 Vlastimil Babka <[email protected]>:
>> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>>
>>
>> Acked-by: Vlastimil Babka <[email protected]>
>>
>> Note that until now in compaction we've used basically an open-coded
>> round_down(), and ALIGN() for rounding up. You introduce a first use of
>> round_down(), and it would be nice to standardize on round_down() and
>> round_up() everywhere. I think it's more obvious than open-coding and
>> ALIGN() (which doesn't tell the reader if it's aligning up or down).
>> Hopefully they really do the same thing and there are no caveats...
>
> Okay. Will send another patch for this clean-up on next spin.

Great, I didn't mean that the cleanup is needed right now, but whether
we agree on an idiom to use whenever doing any changes from now on.
Maybe it would be best to add some defines in the top of compaction.c
that would also hide away the repeated pageblock_nr_pages everywhere?
Something like:

#define pageblock_start(pfn) round_down(pfn, pageblock_nr_pages)
#define pageblock_end(pfn) round_up((pfn)+1, pageblock_nr_pages)

2015-12-16 05:43:07

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn

On Tue, Dec 15, 2015 at 09:31:39AM +0100, Vlastimil Babka wrote:
> On 12/14/2015 04:26 PM, Joonsoo Kim wrote:
> >2015-12-14 19:07 GMT+09:00 Vlastimil Babka <[email protected]>:
> >>On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> >>>
> >>
> >>Acked-by: Vlastimil Babka <[email protected]>
> >>
> >>Note that until now in compaction we've used basically an open-coded
> >>round_down(), and ALIGN() for rounding up. You introduce a first use of
> >>round_down(), and it would be nice to standardize on round_down() and
> >>round_up() everywhere. I think it's more obvious than open-coding and
> >>ALIGN() (which doesn't tell the reader if it's aligning up or down).
> >>Hopefully they really do the same thing and there are no caveats...
> >
> >Okay. Will send another patch for this clean-up on next spin.
>
> Great, I didn't mean that the cleanup is needed right now, but
> whether we agree on an idiom to use whenever doing any changes from
> now on.

Okay.

> Maybe it would be best to add some defines in the top of
> compaction.c that would also hide away the repeated
> pageblock_nr_pages everywhere? Something like:
>
> #define pageblock_start(pfn) round_down(pfn, pageblock_nr_pages)
> #define pageblock_end(pfn) round_up((pfn)+1, pageblock_nr_pages)

Quick grep shows that there are much more places this new define or
some variant can be used. It would be good clean-up. I will try it
separately.

Thanks.