Objective
----
The motivation for this series of patches is to use unsigned int for
"order" in compaction.c, just like in other memory subsystems.
In addition, some cleanup was done around "order" in page_alloc
and vmscan.
Description
----
Directly modifying the type of "order" to unsigned int is ok in most
places, because "order" is always non-negative.
But there are two special places: one is next_search_order() and the
other is compact_node().
In next_search_order(), "order" may go negative. This can be avoided
with some small modifications.
In compact_node(), order = -1 means performing manual compaction. This
can be avoided by specifying order = MAX_ORDER instead, as sketched
below.
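As a minimal sketch (the complete change, including the rename of
is_via_compact_memory(), is in patch 5):

/*
 * No real allocation request can use MAX_ORDER pages, so the value
 * is free to serve as the "manual compaction" flag that -1 used to be.
 */
static inline bool is_manual_compaction(struct compact_control *cc)
{
	return cc->order == MAX_ORDER;	/* was: cc->order == -1 */
}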
The key changes are in [PATCH 05/10] "mm/compaction: make "order" and
"search_order" unsigned int in struct compact_control".
More information can be found in the individual commit messages.
Test
----
I have done some stress testing locally and have not found any problems.
In addition, local tests indicate no performance impact.
Pengfei Li (10):
mm/page_alloc: use unsigned int for "order" in should_compact_retry()
mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()
mm/page_alloc: use unsigned int for "order" in compaction_capture()
mm/page_alloc: remove never used "order" in alloc_contig_range()
mm/compaction: make "order" and "search_order" unsigned int in struct
compact_control
mm/compaction: make "order" unsigned int in compaction.c
trace/events/compaction: make "order" unsigned int
mm/compaction: use unsigned int for "compact_order_failed" in struct
zone
mm/compaction: use unsigned int for "kcompactd_max_order" in struct
pglist_data
mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data
include/linux/compaction.h | 30 +++----
include/linux/mmzone.h | 8 +-
include/trace/events/compaction.h | 40 +++++-----
include/trace/events/kmem.h | 6 +-
include/trace/events/oom.h | 6 +-
include/trace/events/vmscan.h | 4 +-
mm/compaction.c | 127 +++++++++++++++---------------
mm/internal.h | 6 +-
mm/page_alloc.c | 16 ++--
mm/vmscan.c | 6 +-
10 files changed, 126 insertions(+), 123 deletions(-)
--
2.21.0
Like the other variant of should_compact_retry(), use unsigned int for
"order", and modify trace_compact_retry() accordingly.
Signed-off-by: Pengfei Li <[email protected]>
---
include/trace/events/oom.h | 6 +++---
mm/page_alloc.c | 7 +++----
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/include/trace/events/oom.h b/include/trace/events/oom.h
index 26a11e4a2c36..b7fa989349c7 100644
--- a/include/trace/events/oom.h
+++ b/include/trace/events/oom.h
@@ -154,7 +154,7 @@ TRACE_EVENT(skip_task_reaping,
#ifdef CONFIG_COMPACTION
TRACE_EVENT(compact_retry,
- TP_PROTO(int order,
+ TP_PROTO(unsigned int order,
enum compact_priority priority,
enum compact_result result,
int retries,
@@ -164,7 +164,7 @@ TRACE_EVENT(compact_retry,
TP_ARGS(order, priority, result, retries, max_retries, ret),
TP_STRUCT__entry(
- __field( int, order)
+ __field(unsigned int, order)
__field( int, priority)
__field( int, result)
__field( int, retries)
@@ -181,7 +181,7 @@ TRACE_EVENT(compact_retry,
__entry->ret = ret;
),
- TP_printk("order=%d priority=%s compaction_result=%s retries=%d max_retries=%d should_retry=%d",
+ TP_printk("order=%u priority=%s compaction_result=%s retries=%d max_retries=%d should_retry=%d",
__entry->order,
__print_symbolic(__entry->priority, COMPACTION_PRIORITY),
__print_symbolic(__entry->result, COMPACTION_FEEDBACK),
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 272c6de1bf4e..75c18f4fd66a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3940,10 +3940,9 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
}
static inline bool
-should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
- enum compact_result compact_result,
- enum compact_priority *compact_priority,
- int *compaction_retries)
+should_compact_retry(struct alloc_context *ac, unsigned int order,
+ int alloc_flags, enum compact_result compact_result,
+ enum compact_priority *compact_priority, int *compaction_retries)
{
int max_retries = MAX_COMPACT_RETRIES;
int min_priority;
--
2.21.0
Because "order" will never be negative in __rmqueue_fallback(),
so just make "order" unsigned int.
And modify trace_mm_page_alloc_extfrag() accordingly.
Signed-off-by: Pengfei Li <[email protected]>
---
include/trace/events/kmem.h | 6 +++---
mm/page_alloc.c | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index eb57e3037deb..31f4d09aa31f 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -277,7 +277,7 @@ TRACE_EVENT(mm_page_pcpu_drain,
TRACE_EVENT(mm_page_alloc_extfrag,
TP_PROTO(struct page *page,
- int alloc_order, int fallback_order,
+ unsigned int alloc_order, int fallback_order,
int alloc_migratetype, int fallback_migratetype),
TP_ARGS(page,
@@ -286,7 +286,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
TP_STRUCT__entry(
__field( unsigned long, pfn )
- __field( int, alloc_order )
+ __field( unsigned int, alloc_order )
__field( int, fallback_order )
__field( int, alloc_migratetype )
__field( int, fallback_migratetype )
@@ -303,7 +303,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
get_pageblock_migratetype(page));
),
- TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
+ TP_printk("page=%p pfn=%lu alloc_order=%u fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
pfn_to_page(__entry->pfn),
__entry->pfn,
__entry->alloc_order,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 75c18f4fd66a..1432cbcd87cd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2631,8 +2631,8 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
* condition simpler.
*/
static __always_inline bool
-__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
- unsigned int alloc_flags)
+__rmqueue_fallback(struct zone *zone, unsigned int order,
+ int start_migratetype, unsigned int alloc_flags)
{
struct free_area *area;
int current_order;
--
2.21.0
Because "order" will never be negative in should_compact_retry(),
so just make "order" unsigned int.
Signed-off-by: Pengfei Li <[email protected]>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1432cbcd87cd..7d47af09461f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -839,7 +839,7 @@ static inline struct capture_control *task_capc(struct zone *zone)
static inline bool
compaction_capture(struct capture_control *capc, struct page *page,
- int order, int migratetype)
+ unsigned int order, int migratetype)
{
if (!capc || order != capc->cc->order)
return false;
@@ -870,7 +870,7 @@ static inline struct capture_control *task_capc(struct zone *zone)
static inline bool
compaction_capture(struct capture_control *capc, struct page *page,
- int order, int migratetype)
+ unsigned int order, int migratetype)
{
return false;
}
--
2.21.0
Objective
----
The "order"and "search_order" is int in struct compact_control. This
commit is aim to make "order" is unsigned int like other mm subsystem.
Change
----
1) Change "order" and "search_order" to unsigned int
2) Make is_via_compact_memory() return true when "order" is equal to
MAX_ORDER instead of -1, and rename it to is_manual_compaction() for
a clearer meaning.
3) Modify next_search_order() to work with an unsigned order (see the
sketch after this list).
4) Restore fast_search_fail to unsigned int.
This is ok because search_order is already unsigned int, and after
reverting fast_search_fail to unsigned int, compact_control still
fits within two cache lines.
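For illustration, the pitfall that change 3) works around is the usual
unsigned wrap-around: decrementing an unsigned order below zero wraps
to UINT_MAX instead of going negative, so loops can no longer exit on
"order >= 0". A minimal sketch of the resulting loop shape, simplified
from the fast_isolate_freepages() hunk below:

	unsigned int order;

	/*
	 * next_search_order() returns UINT_MAX when the round-robin
	 * search wraps around; UINT_MAX >= MAX_ORDER ends the loop.
	 */
	for (order = cc->search_order;
	     !page && order < MAX_ORDER;
	     order = next_search_order(cc, order)) {
		/* ... scan cc->zone->free_area[order] ... */
	}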
Signed-off-by: Pengfei Li <[email protected]>
---
mm/compaction.c | 96 +++++++++++++++++++++++++------------------------
mm/internal.h | 6 ++--
2 files changed, 53 insertions(+), 49 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 952dc2fb24e5..e47d8fa943a6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -75,7 +75,7 @@ static void split_map_pages(struct list_head *list)
list_for_each_entry_safe(page, next, list, lru) {
list_del(&page->lru);
- order = page_private(page);
+ order = page_order(page);
nr_pages = 1 << order;
post_alloc_hook(page, order, __GFP_MOVABLE);
@@ -879,7 +879,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
* potential isolation targets.
*/
if (PageBuddy(page)) {
- unsigned long freepage_order = page_order_unsafe(page);
+ unsigned int freepage_order = page_order_unsafe(page);
/*
* Without lock, we cannot be sure that what we got is
@@ -1119,6 +1119,15 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
#endif /* CONFIG_COMPACTION || CONFIG_CMA */
#ifdef CONFIG_COMPACTION
+/*
+ * order == MAX_ORDER is expected when compacting via
+ * /proc/sys/vm/compact_memory
+ */
+static inline bool is_manual_compaction(struct compact_control *cc)
+{
+ return cc->order == MAX_ORDER;
+}
+
static bool suitable_migration_source(struct compact_control *cc,
struct page *page)
{
@@ -1167,7 +1176,7 @@ static bool suitable_migration_target(struct compact_control *cc,
static inline unsigned int
freelist_scan_limit(struct compact_control *cc)
{
- unsigned short shift = BITS_PER_LONG - 1;
+ unsigned int shift = BITS_PER_LONG - 1;
return (COMPACT_CLUSTER_MAX >> min(shift, cc->fast_search_fail)) + 1;
}
@@ -1253,21 +1262,24 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn, unsigned long
}
/* Search orders in round-robin fashion */
-static int next_search_order(struct compact_control *cc, int order)
+static unsigned int
+next_search_order(struct compact_control *cc, unsigned int order)
{
- order--;
- if (order < 0)
- order = cc->order - 1;
+ unsigned int next_order = order - 1;
- /* Search wrapped around? */
- if (order == cc->search_order) {
- cc->search_order--;
- if (cc->search_order < 0)
+ if (order == 0)
+ next_order = cc->order - 1;
+
+ if (next_order == cc->search_order) {
+ next_order = UINT_MAX;
+
+ order = cc->search_order;
+ cc->search_order -= 1;
+ if (order == 0)
cc->search_order = cc->order - 1;
- return -1;
}
- return order;
+ return next_order;
}
static unsigned long
@@ -1280,10 +1292,10 @@ fast_isolate_freepages(struct compact_control *cc)
unsigned long distance;
struct page *page = NULL;
bool scan_start = false;
- int order;
+ unsigned int order;
- /* Full compaction passes in a negative order */
- if (cc->order <= 0)
+ /* Full compaction when manual compaction */
+ if (is_manual_compaction(cc))
return cc->free_pfn;
/*
@@ -1310,10 +1322,10 @@ fast_isolate_freepages(struct compact_control *cc)
* Search starts from the last successful isolation order or the next
* order to search after a previous failure
*/
- cc->search_order = min_t(unsigned int, cc->order - 1, cc->search_order);
+ cc->search_order = min(cc->order - 1, cc->search_order);
for (order = cc->search_order;
- !page && order >= 0;
+ !page && order < MAX_ORDER;
order = next_search_order(cc, order)) {
struct free_area *area = &cc->zone->free_area[order];
struct list_head *freelist;
@@ -1837,15 +1849,6 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
}
-/*
- * order == -1 is expected when compacting via
- * /proc/sys/vm/compact_memory
- */
-static inline bool is_via_compact_memory(int order)
-{
- return order == -1;
-}
-
static enum compact_result __compact_finished(struct compact_control *cc)
{
unsigned int order;
@@ -1872,7 +1875,7 @@ static enum compact_result __compact_finished(struct compact_control *cc)
return COMPACT_PARTIAL_SKIPPED;
}
- if (is_via_compact_memory(cc->order))
+ if (is_manual_compaction(cc))
return COMPACT_CONTINUE;
/*
@@ -1962,9 +1965,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
{
unsigned long watermark;
- if (is_via_compact_memory(order))
- return COMPACT_CONTINUE;
-
watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
/*
* If watermarks for high-order allocation are already met, there
@@ -2071,7 +2071,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
static enum compact_result
compact_zone(struct compact_control *cc, struct capture_control *capc)
{
- enum compact_result ret;
+ enum compact_result ret = COMPACT_CONTINUE;
unsigned long start_pfn = cc->zone->zone_start_pfn;
unsigned long end_pfn = zone_end_pfn(cc->zone);
unsigned long last_migrated_pfn;
@@ -2079,21 +2079,25 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
bool update_cached;
cc->migratetype = gfpflags_to_migratetype(cc->gfp_mask);
- ret = compaction_suitable(cc->zone, cc->order, cc->alloc_flags,
- cc->classzone_idx);
- /* Compaction is likely to fail */
- if (ret == COMPACT_SUCCESS || ret == COMPACT_SKIPPED)
- return ret;
- /* huh, compaction_suitable is returning something unexpected */
- VM_BUG_ON(ret != COMPACT_CONTINUE);
+ if (!is_manual_compaction(cc)) {
+ ret = compaction_suitable(cc->zone, cc->order,
+ cc->alloc_flags, cc->classzone_idx);
- /*
- * Clear pageblock skip if there were failures recently and compaction
- * is about to be retried after being deferred.
- */
- if (compaction_restarting(cc->zone, cc->order))
- __reset_isolation_suitable(cc->zone);
+ /* Compaction is likely to fail */
+ if (ret == COMPACT_SUCCESS || ret == COMPACT_SKIPPED)
+ return ret;
+
+ /* huh, compaction_suitable is returning something unexpected */
+ VM_BUG_ON(ret != COMPACT_CONTINUE);
+
+ /*
+ * Clear pageblock skip if there were failures recently and
+ * compaction is about to be retried after being deferred.
+ */
+ if (compaction_restarting(cc->zone, cc->order))
+ __reset_isolation_suitable(cc->zone);
+ }
/*
* Setup to move all movable pages to the end of the zone. Used cached
@@ -2407,7 +2411,7 @@ static void compact_node(int nid)
int zoneid;
struct zone *zone;
struct compact_control cc = {
- .order = -1,
+ .order = MAX_ORDER, /* is manual compaction */
.total_migrate_scanned = 0,
.total_free_scanned = 0,
.mode = MIGRATE_SYNC,
diff --git a/mm/internal.h b/mm/internal.h
index e32390802fd3..4e0ab641fb6c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -188,10 +188,10 @@ struct compact_control {
struct zone *zone;
unsigned long total_migrate_scanned;
unsigned long total_free_scanned;
- unsigned short fast_search_fail;/* failures to use free list searches */
- short search_order; /* order to start a fast search at */
+ unsigned int fast_search_fail; /* failures to use free list searches */
const gfp_t gfp_mask; /* gfp mask of a direct compactor */
- int order; /* order a direct compactor needs */
+ unsigned int order; /* order a direct compactor needs */
+ unsigned int search_order; /* order to start a fast search at */
int migratetype; /* migratetype of direct compactor */
const unsigned int alloc_flags; /* alloc flags of a direct compactor */
const int classzone_idx; /* zone index of a direct compactor */
--
2.21.0
Make "order" in these compaction trace events the same type as
compact_control->order (unsigned int).
Signed-off-by: Pengfei Li <[email protected]>
---
include/trace/events/compaction.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index e5bf6ee4e814..1e1e74f6d128 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -170,14 +170,14 @@ TRACE_EVENT(mm_compaction_end,
TRACE_EVENT(mm_compaction_try_to_compact_pages,
TP_PROTO(
- int order,
+ unsigned int order,
gfp_t gfp_mask,
int prio),
TP_ARGS(order, gfp_mask, prio),
TP_STRUCT__entry(
- __field(int, order)
+ __field(unsigned int, order)
__field(gfp_t, gfp_mask)
__field(int, prio)
),
@@ -188,7 +188,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
__entry->prio = prio;
),
- TP_printk("order=%d gfp_mask=%s priority=%d",
+ TP_printk("order=%u gfp_mask=%s priority=%d",
__entry->order,
show_gfp_flags(__entry->gfp_mask),
__entry->prio)
@@ -197,7 +197,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
TP_PROTO(struct zone *zone,
- int order,
+ unsigned int order,
int ret),
TP_ARGS(zone, order, ret),
@@ -205,7 +205,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
TP_STRUCT__entry(
__field(int, nid)
__field(enum zone_type, idx)
- __field(int, order)
+ __field(unsigned int, order)
__field(int, ret)
),
@@ -216,7 +216,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
__entry->ret = ret;
),
- TP_printk("node=%d zone=%-8s order=%d ret=%s",
+ TP_printk("node=%d zone=%-8s order=%u ret=%s",
__entry->nid,
__print_symbolic(__entry->idx, ZONE_TYPE),
__entry->order,
@@ -226,7 +226,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_finished,
TP_PROTO(struct zone *zone,
- int order,
+ unsigned int order,
int ret),
TP_ARGS(zone, order, ret)
@@ -235,7 +235,7 @@ DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_finished,
DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
TP_PROTO(struct zone *zone,
- int order,
+ unsigned int order,
int ret),
TP_ARGS(zone, order, ret)
--
2.21.0
Because "kswapd_order" will never be negative, so just make it
unsigned int. And modify wakeup_kswapd(), kswapd_try_to_sleep()
and trace_mm_vmscan_kswapd_wake() accordingly.
Besides, make "order" unsigned int in two related trace functions.
Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/mmzone.h | 4 ++--
include/trace/events/compaction.h | 10 +++++-----
include/trace/events/vmscan.h | 4 ++--
mm/vmscan.c | 6 +++---
4 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 60bebdf47661..1196ed0cee67 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -717,7 +717,7 @@ typedef struct pglist_data {
wait_queue_head_t pfmemalloc_wait;
struct task_struct *kswapd; /* Protected by
mem_hotplug_begin/end() */
- int kswapd_order;
+ unsigned int kswapd_order;
enum zone_type kswapd_classzone_idx;
int kswapd_failures; /* Number of 'reclaimed == 0' runs */
@@ -802,7 +802,7 @@ static inline bool pgdat_is_empty(pg_data_t *pgdat)
#include <linux/memory_hotplug.h>
void build_all_zonelists(pg_data_t *pgdat);
-void wakeup_kswapd(struct zone *zone, gfp_t gfp_mask, int order,
+void wakeup_kswapd(struct zone *zone, gfp_t gfp_mask, unsigned int order,
enum zone_type classzone_idx);
bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
int classzone_idx, unsigned int alloc_flags,
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index f83ba40f9614..34a9fac3b4d6 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -314,13 +314,13 @@ TRACE_EVENT(mm_compaction_kcompactd_sleep,
DECLARE_EVENT_CLASS(kcompactd_wake_template,
- TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+ TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
TP_ARGS(nid, order, classzone_idx),
TP_STRUCT__entry(
__field(int, nid)
- __field(int, order)
+ __field(unsigned int, order)
__field(enum zone_type, classzone_idx)
),
@@ -330,7 +330,7 @@ DECLARE_EVENT_CLASS(kcompactd_wake_template,
__entry->classzone_idx = classzone_idx;
),
- TP_printk("nid=%d order=%d classzone_idx=%-8s",
+ TP_printk("nid=%d order=%u classzone_idx=%-8s",
__entry->nid,
__entry->order,
__print_symbolic(__entry->classzone_idx, ZONE_TYPE))
@@ -338,14 +338,14 @@ DECLARE_EVENT_CLASS(kcompactd_wake_template,
DEFINE_EVENT(kcompactd_wake_template, mm_compaction_wakeup_kcompactd,
- TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+ TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
TP_ARGS(nid, order, classzone_idx)
);
DEFINE_EVENT(kcompactd_wake_template, mm_compaction_kcompactd_wake,
- TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+ TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
TP_ARGS(nid, order, classzone_idx)
);
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c37e2280e6dd..13c214f3750b 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -74,7 +74,7 @@ TRACE_EVENT(mm_vmscan_kswapd_wake,
TRACE_EVENT(mm_vmscan_wakeup_kswapd,
- TP_PROTO(int nid, int zid, int order, gfp_t gfp_flags),
+ TP_PROTO(int nid, int zid, unsigned int order, gfp_t gfp_flags),
TP_ARGS(nid, zid, order, gfp_flags),
@@ -92,7 +92,7 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
__entry->gfp_flags = gfp_flags;
),
- TP_printk("nid=%d order=%d gfp_flags=%s",
+ TP_printk("nid=%d order=%u gfp_flags=%s",
__entry->nid,
__entry->order,
show_gfp_flags(__entry->gfp_flags))
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f4fd02ae233e..9d98a2e5f736 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3781,8 +3781,8 @@ static enum zone_type kswapd_classzone_idx(pg_data_t *pgdat,
return pgdat->kswapd_classzone_idx;
}
-static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order,
- unsigned int classzone_idx)
+static void kswapd_try_to_sleep(pg_data_t *pgdat, unsigned int alloc_order,
+ unsigned int reclaim_order, unsigned int classzone_idx)
{
long remaining = 0;
DEFINE_WAIT(wait);
@@ -3956,7 +3956,7 @@ static int kswapd(void *p)
* has failed or is not needed, still wake up kcompactd if only compaction is
* needed.
*/
-void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
+void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, unsigned int order,
enum zone_type classzone_idx)
{
pg_data_t *pgdat;
--
2.21.0
The "order" will never be used in alloc_contig_range(), and "order"
is a negative number is very strange. So just remove it.
Signed-off-by: Pengfei Li <[email protected]>
---
mm/page_alloc.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d47af09461f..6208ebfac980 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8347,7 +8347,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
struct compact_control cc = {
.nr_migratepages = 0,
- .order = -1,
.zone = page_zone(pfn_to_page(start)),
.mode = MIGRATE_SYNC,
.ignore_skip_hint = true,
--
2.21.0
Since compact_control->order and compact_control->search_order
were made unsigned int in the previous commit, modify the related
functions in compaction.c accordingly.
Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/compaction.h | 12 ++++++------
mm/compaction.c | 21 ++++++++++-----------
2 files changed, 16 insertions(+), 17 deletions(-)
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 9569e7c786d3..0201dfa57d44 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -96,8 +96,8 @@ extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
const struct alloc_context *ac, enum compact_priority prio,
struct page **page);
extern void reset_isolation_suitable(pg_data_t *pgdat);
-extern enum compact_result compaction_suitable(struct zone *zone, int order,
- unsigned int alloc_flags, int classzone_idx);
+extern enum compact_result compaction_suitable(struct zone *zone,
+ unsigned int order, unsigned int alloc_flags, int classzone_idx);
extern void defer_compaction(struct zone *zone, int order);
extern bool compaction_deferred(struct zone *zone, int order);
@@ -170,8 +170,8 @@ static inline bool compaction_withdrawn(enum compact_result result)
}
-bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
- int alloc_flags);
+bool compaction_zonelist_suitable(struct alloc_context *ac,
+ unsigned int order, int alloc_flags);
extern int kcompactd_run(int nid);
extern void kcompactd_stop(int nid);
@@ -182,8 +182,8 @@ static inline void reset_isolation_suitable(pg_data_t *pgdat)
{
}
-static inline enum compact_result compaction_suitable(struct zone *zone, int order,
- int alloc_flags, int classzone_idx)
+static inline enum compact_result compaction_suitable(struct zone *zone,
+ unsigned int order, int alloc_flags, int classzone_idx)
{
return COMPACT_SKIPPED;
}
diff --git a/mm/compaction.c b/mm/compaction.c
index e47d8fa943a6..ac5df82d46e0 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1639,7 +1639,7 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
unsigned long distance;
unsigned long pfn = cc->migrate_pfn;
unsigned long high_pfn;
- int order;
+ unsigned int order;
/* Skip hints are relied on to avoid repeats on the fast search */
if (cc->ignore_skip_hint)
@@ -1958,10 +1958,9 @@ static enum compact_result compact_finished(struct compact_control *cc)
* COMPACT_SUCCESS - If the allocation would succeed without compaction
* COMPACT_CONTINUE - If compaction should run now
*/
-static enum compact_result __compaction_suitable(struct zone *zone, int order,
- unsigned int alloc_flags,
- int classzone_idx,
- unsigned long wmark_target)
+static enum compact_result __compaction_suitable(struct zone *zone,
+ unsigned int order, unsigned int alloc_flags,
+ int classzone_idx, unsigned long wmark_target)
{
unsigned long watermark;
@@ -1998,7 +1997,7 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
return COMPACT_CONTINUE;
}
-enum compact_result compaction_suitable(struct zone *zone, int order,
+enum compact_result compaction_suitable(struct zone *zone, unsigned int order,
unsigned int alloc_flags,
int classzone_idx)
{
@@ -2036,7 +2035,7 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
return ret;
}
-bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
+bool compaction_zonelist_suitable(struct alloc_context *ac, unsigned int order,
int alloc_flags)
{
struct zone *zone;
@@ -2278,10 +2277,10 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
return ret;
}
-static enum compact_result compact_zone_order(struct zone *zone, int order,
- gfp_t gfp_mask, enum compact_priority prio,
- unsigned int alloc_flags, int classzone_idx,
- struct page **capture)
+static enum compact_result compact_zone_order(struct zone *zone,
+ unsigned int order, gfp_t gfp_mask,
+ enum compact_priority prio, unsigned int alloc_flags,
+ int classzone_idx, struct page **capture)
{
enum compact_result ret;
struct compact_control cc = {
--
2.21.0
Because "compact_order_failed" will never be negative, so just
make it unsigned int. And modify three related trace functions
accordingly.
Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/compaction.h | 12 ++++++------
include/linux/mmzone.h | 2 +-
include/trace/events/compaction.h | 14 +++++++-------
mm/compaction.c | 8 ++++----
4 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 0201dfa57d44..a8049d582265 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -99,11 +99,11 @@ extern void reset_isolation_suitable(pg_data_t *pgdat);
extern enum compact_result compaction_suitable(struct zone *zone,
unsigned int order, unsigned int alloc_flags, int classzone_idx);
-extern void defer_compaction(struct zone *zone, int order);
-extern bool compaction_deferred(struct zone *zone, int order);
-extern void compaction_defer_reset(struct zone *zone, int order,
+extern void defer_compaction(struct zone *zone, unsigned int order);
+extern bool compaction_deferred(struct zone *zone, unsigned int order);
+extern void compaction_defer_reset(struct zone *zone, unsigned int order,
bool alloc_success);
-extern bool compaction_restarting(struct zone *zone, int order);
+extern bool compaction_restarting(struct zone *zone, unsigned int order);
/* Compaction has made some progress and retrying makes sense */
static inline bool compaction_made_progress(enum compact_result result)
@@ -188,11 +188,11 @@ static inline enum compact_result compaction_suitable(struct zone *zone,
return COMPACT_SKIPPED;
}
-static inline void defer_compaction(struct zone *zone, int order)
+static inline void defer_compaction(struct zone *zone, unsigned int order)
{
}
-static inline bool compaction_deferred(struct zone *zone, int order)
+static inline bool compaction_deferred(struct zone *zone, unsigned int order)
{
return true;
}
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d77d717c620c..0947e7cb4214 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -545,7 +545,7 @@ struct zone {
*/
unsigned int compact_considered;
unsigned int compact_defer_shift;
- int compact_order_failed;
+ unsigned int compact_order_failed;
#endif
#if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 1e1e74f6d128..f83ba40f9614 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -243,17 +243,17 @@ DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
DECLARE_EVENT_CLASS(mm_compaction_defer_template,
- TP_PROTO(struct zone *zone, int order),
+ TP_PROTO(struct zone *zone, unsigned int order),
TP_ARGS(zone, order),
TP_STRUCT__entry(
__field(int, nid)
__field(enum zone_type, idx)
- __field(int, order)
+ __field(unsigned int, order)
__field(unsigned int, considered)
__field(unsigned int, defer_shift)
- __field(int, order_failed)
+ __field(unsigned int, order_failed)
),
TP_fast_assign(
@@ -265,7 +265,7 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
__entry->order_failed = zone->compact_order_failed;
),
- TP_printk("node=%d zone=%-8s order=%d order_failed=%d consider=%u limit=%lu",
+ TP_printk("node=%d zone=%-8s order=%u order_failed=%u consider=%u limit=%lu",
__entry->nid,
__print_symbolic(__entry->idx, ZONE_TYPE),
__entry->order,
@@ -276,21 +276,21 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_deferred,
- TP_PROTO(struct zone *zone, int order),
+ TP_PROTO(struct zone *zone, unsigned int order),
TP_ARGS(zone, order)
);
DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_compaction,
- TP_PROTO(struct zone *zone, int order),
+ TP_PROTO(struct zone *zone, unsigned int order),
TP_ARGS(zone, order)
);
DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_reset,
- TP_PROTO(struct zone *zone, int order),
+ TP_PROTO(struct zone *zone, unsigned int order),
TP_ARGS(zone, order)
);
diff --git a/mm/compaction.c b/mm/compaction.c
index ac5df82d46e0..aad638ad2cc6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -139,7 +139,7 @@ EXPORT_SYMBOL(__ClearPageMovable);
* allocation success. 1 << compact_defer_limit compactions are skipped up
* to a limit of 1 << COMPACT_MAX_DEFER_SHIFT
*/
-void defer_compaction(struct zone *zone, int order)
+void defer_compaction(struct zone *zone, unsigned int order)
{
zone->compact_considered = 0;
zone->compact_defer_shift++;
@@ -154,7 +154,7 @@ void defer_compaction(struct zone *zone, int order)
}
/* Returns true if compaction should be skipped this time */
-bool compaction_deferred(struct zone *zone, int order)
+bool compaction_deferred(struct zone *zone, unsigned int order)
{
unsigned long defer_limit = 1UL << zone->compact_defer_shift;
@@ -178,7 +178,7 @@ bool compaction_deferred(struct zone *zone, int order)
* which means an allocation either succeeded (alloc_success == true) or is
* expected to succeed.
*/
-void compaction_defer_reset(struct zone *zone, int order,
+void compaction_defer_reset(struct zone *zone, unsigned int order,
bool alloc_success)
{
if (alloc_success) {
@@ -192,7 +192,7 @@ void compaction_defer_reset(struct zone *zone, int order,
}
/* Returns true if restarting compaction after many failures */
-bool compaction_restarting(struct zone *zone, int order)
+bool compaction_restarting(struct zone *zone, unsigned int order)
{
if (order < zone->compact_order_failed)
return false;
--
2.21.0
Because "kcompactd_max_order" will never be negative, so just
make it unsigned int.
Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/compaction.h | 6 ++++--
include/linux/mmzone.h | 2 +-
mm/compaction.c | 2 +-
3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index a8049d582265..1b296de6efef 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -175,7 +175,8 @@ bool compaction_zonelist_suitable(struct alloc_context *ac,
extern int kcompactd_run(int nid);
extern void kcompactd_stop(int nid);
-extern void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx);
+extern void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order,
+ int classzone_idx);
#else
static inline void reset_isolation_suitable(pg_data_t *pgdat)
@@ -220,7 +221,8 @@ static inline void kcompactd_stop(int nid)
{
}
-static inline void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx)
+static inline void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order,
+ int classzone_idx)
{
}
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0947e7cb4214..60bebdf47661 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -723,7 +723,7 @@ typedef struct pglist_data {
int kswapd_failures; /* Number of 'reclaimed == 0' runs */
#ifdef CONFIG_COMPACTION
- int kcompactd_max_order;
+ unsigned int kcompactd_max_order;
enum zone_type kcompactd_classzone_idx;
wait_queue_head_t kcompactd_wait;
struct task_struct *kcompactd;
diff --git a/mm/compaction.c b/mm/compaction.c
index aad638ad2cc6..909ead244cff 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2607,7 +2607,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
pgdat->kcompactd_classzone_idx = pgdat->nr_zones - 1;
}
-void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx)
+void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order, int classzone_idx)
{
if (!order)
return;
--
2.21.0
On Fri, 2019-07-26 at 02:42 +0800, Pengfei Li wrote:
> Objective
> ----
> The motivation for this series of patches is to use unsigned int for
> "order" in compaction.c, just like in other memory subsystems.
I suppose you will need more justification for this change. Right now, I don't
see much real benefit apart from possibly introducing more regressions in those
tricky areas of the code. Also, your testing seems quite lightweight.
>
> In addition, some cleanup was done around "order" in page_alloc
> and vmscan.
>
>
> Description
> ----
> Directly modifying the type of "order" to unsigned int is ok in most
> places, because "order" is always non-negative.
>
> But there are two places that are special, one is next_search_order()
> and the other is compact_node().
>
> For next_search_order(), order may be negative. It can be avoided by
> some modifications.
>
> For compact_node(), order = -1 means performing manual compaction.
> It can be avoided by specifying order = MAX_ORDER.
>
> Key changes in [PATCH 05/10] mm/compaction: make "order" and
> "search_order" unsigned.
>
> More information can be obtained from commit messages.
>
>
> Test
> ----
> I have done some stress testing locally and have not found any problems.
>
> In addition, local tests indicate no performance impact.
>
>
> Pengfei Li (10):
> mm/page_alloc: use unsigned int for "order" in should_compact_retry()
> mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()
> mm/page_alloc: use unsigned int for "order" in compaction_capture()
> mm/page_alloc: remove never used "order" in alloc_contig_range()
> mm/compaction: make "order" and "search_order" unsigned int in struct
> compact_control
> mm/compaction: make "order" unsigned int in compaction.c
> trace/events/compaction: make "order" unsigned int
> mm/compaction: use unsigned int for "compact_order_failed" in struct
> zone
> mm/compaction: use unsigned int for "kcompactd_max_order" in struct
> pglist_data
> mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data
>
> include/linux/compaction.h | 30 +++----
> include/linux/mmzone.h | 8 +-
> include/trace/events/compaction.h | 40 +++++-----
> include/trace/events/kmem.h | 6 +-
> include/trace/events/oom.h | 6 +-
> include/trace/events/vmscan.h | 4 +-
> mm/compaction.c | 127 +++++++++++++++---------------
> mm/internal.h | 6 +-
> mm/page_alloc.c | 16 ++--
> mm/vmscan.c | 6 +-
> 10 files changed, 126 insertions(+), 123 deletions(-)
>
On Fri, Jul 26, 2019 at 02:42:44AM +0800, Pengfei Li wrote:
> static inline bool
> -should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> - enum compact_result compact_result,
> - enum compact_priority *compact_priority,
> - int *compaction_retries)
> +should_compact_retry(struct alloc_context *ac, unsigned int order,
> + int alloc_flags, enum compact_result compact_result,
> + enum compact_priority *compact_priority, int *compaction_retries)
> {
> int max_retries = MAX_COMPACT_RETRIES;
One tab here is insufficient indentation. It should be at least two.
Some parts of the kernel insist on lining up arguments with the opening
parenthesis of the function; I don't know if mm really obeys this rule,
but you're indenting function arguments to the same level as the opening
variables of the function, which is confusing.
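For illustration (a sketch, not from the patch itself), the same
prototype indented with two tabs would look like:

static inline bool
should_compact_retry(struct alloc_context *ac, unsigned int order,
		int alloc_flags, enum compact_result compact_result,
		enum compact_priority *compact_priority,
		int *compaction_retries)
{
	int max_retries = MAX_COMPACT_RETRIES;
	/* ... */
}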
On Fri, Jul 26, 2019 at 2:58 AM Matthew Wilcox <[email protected]> wrote:
>
> On Fri, Jul 26, 2019 at 02:42:44AM +0800, Pengfei Li wrote:
> > static inline bool
> > -should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> > - enum compact_result compact_result,
> > - enum compact_priority *compact_priority,
> > - int *compaction_retries)
> > +should_compact_retry(struct alloc_context *ac, unsigned int order,
> > + int alloc_flags, enum compact_result compact_result,
> > + enum compact_priority *compact_priority, int *compaction_retries)
> > {
> > int max_retries = MAX_COMPACT_RETRIES;
>
> One tab here is insufficient indentation. It should be at least two.
Thanks for your comments.
> Some parts of the kernel insist on lining up arguments with the opening
> parenthesis of the function; I don't know if mm really obeys this rule,
> but you're indenting function arguments to the same level as the opening
> variables of the function, which is confusing.
I will use two tabs in the next version.
--
Pengfei
On Fri, Jul 26, 2019 at 2:52 AM Qian Cai <[email protected]> wrote:
>
> On Fri, 2019-07-26 at 02:42 +0800, Pengfei Li wrote:
> > Objective
> > ----
> > The motivation for this series of patches is to use unsigned int for
> > "order" in compaction.c, just like in other memory subsystems.
>
> I suppose you will need more justification for this change. Right now, I don't
Thanks for your comments.
> see much real benefit apart from possibly introducing more regressions in those
As you can see, except for patch [05/10], the other commits only modify
the type of "order", so the change is not big.
For the benefit, "order" may be negative, which is confusing and weird.
There is no good reason not to do this, since the negative value can be avoided.
> tricky areas of the code. Also, your testing seems quite lightweight.
>
Yes, you are right.
I use "stress" for stress testing, and made some small code coverage testing.
As you said, I need more ideas and comments about testing.
Any suggestions for testing?
Thanks again.
--
Pengfei
> >
> > In addition, some cleanup was done around "order" in page_alloc
> > and vmscan.
> >
> >
> > Description
> > ----
> > Directly modifying the type of "order" to unsigned int is ok in most
> > places, because "order" is always non-negative.
> >
> > But there are two places that are special, one is next_search_order()
> > and the other is compact_node().
> >
> > For next_search_order(), order may be negative. It can be avoided by
> > some modifications.
> >
> > For compact_node(), order = -1 means performing manual compaction.
> > It can be avoided by specifying order = MAX_ORDER.
> >
> > Key changes in [PATCH 05/10] mm/compaction: make "order" and
> > "search_order" unsigned.
> >
> > More information can be obtained from commit messages.
> >
> >
> > Test
> > ----
> > I have done some stress testing locally and have not found any problems.
> >
> > In addition, local tests indicate no performance impact.
> >
> >
> > Pengfei Li (10):
> > mm/page_alloc: use unsigned int for "order" in should_compact_retry()
> > mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()
> > mm/page_alloc: use unsigned int for "order" in compaction_capture()
> > mm/page_alloc: remove never used "order" in alloc_contig_range()
> > mm/compaction: make "order" and "search_order" unsigned int in struct
> > compact_control
> > mm/compaction: make "order" unsigned int in compaction.c
> > trace/events/compaction: make "order" unsigned int
> > mm/compaction: use unsigned int for "compact_order_failed" in struct
> > zone
> > mm/compaction: use unsigned int for "kcompactd_max_order" in struct
> > pglist_data
> > mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data
> >
> > include/linux/compaction.h | 30 +++----
> > include/linux/mmzone.h | 8 +-
> > include/trace/events/compaction.h | 40 +++++-----
> > include/trace/events/kmem.h | 6 +-
> > include/trace/events/oom.h | 6 +-
> > include/trace/events/vmscan.h | 4 +-
> > mm/compaction.c | 127 +++++++++++++++---------------
> > mm/internal.h | 6 +-
> > mm/page_alloc.c | 16 ++--
> > mm/vmscan.c | 6 +-
> > 10 files changed, 126 insertions(+), 123 deletions(-)
> >
On Fri 26-07-19 07:48:36, Pengfei Li wrote:
[...]
> For the benefit, "order" may be negative, which is confusing and weird.
order = -1 has a special meaning.
> There is no good reason not to do this, since the negative value can be avoided.
"This is good because we can do it" doesn't really sound like a
convincing argument to me. I would understand if this reduced a
generated code, made an overall code readability much better or
something along those lines. Also we only use MAX_ORDER range of values
so I could argue that a smaller data type (e.g. short) should be
sufficient for this data type.
Please note that _any_ change, albeit seemingly small, can introduce a
subtle bug. Also, each patch requires manpower to review, so you have
to understand that "just because we can" is not a strong motivation for
people to spend their time on such a patch.
--
Michal Hocko
SUSE Labs
On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> Objective
> ----
> The motivation for this series of patches is to use unsigned int for
> "order" in compaction.c, just like in other memory subsystems.
>
Why? The series is relatively subtle in parts, particularly patch 5.
There have been places where it was important for order to be able to
go negative due to loop exit conditions. If there was a gain from this
or it was a cleanup in the context of another major body of work, I
could understand the justification but that does not appear to be the
case here.
--
Mel Gorman
SUSE Labs
On 25/07/2019 20.42, Pengfei Li wrote:
> Because "order" will never be negative in __rmqueue_fallback(),
> so just make "order" unsigned int.
> And modify trace_mm_page_alloc_extfrag() accordingly.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 75c18f4fd66a..1432cbcd87cd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2631,8 +2631,8 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
> * condition simpler.
> */
> static __always_inline bool
> -__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
> - unsigned int alloc_flags)
> +__rmqueue_fallback(struct zone *zone, unsigned int order,
> + int start_migratetype, unsigned int alloc_flags)
> {
Please read the last paragraph of the comment above this function, run
git blame to figure out when that was introduced, and then read the full
commit description. Here be dragons. At the very least, this patch is
wrong in that it makes that comment inaccurate.
Rasmus
On Fri, Jul 26, 2019 at 5:36 PM Rasmus Villemoes
<[email protected]> wrote:
>
> On 25/07/2019 20.42, Pengfei Li wrote:
> > Because "order" will never be negative in __rmqueue_fallback(),
> > so just make "order" unsigned int.
> > And modify trace_mm_page_alloc_extfrag() accordingly.
> >
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 75c18f4fd66a..1432cbcd87cd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2631,8 +2631,8 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
> > * condition simpler.
> > */
> > static __always_inline bool
> > -__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
> > - unsigned int alloc_flags)
> > +__rmqueue_fallback(struct zone *zone, unsigned int order,
> > + int start_migratetype, unsigned int alloc_flags)
> > {
>
> Please read the last paragraph of the comment above this function, run
> git blame to figure out when that was introduced, and then read the full
> commit description.
Thanks for your comments.
I have read the commit info of commit b002529d2563 ("mm/page_alloc.c:
eliminate unsigned confusion in __rmqueue_fallback").
And I looked at the discussion at https://lkml.org/lkml/2017/6/21/684 in detail.
> Here be dragons. At the very least, this patch is
> wrong in that it makes that comment inaccurate.
I wonder if you noticed the commit 6bb154504f8b ("mm, page_alloc: spread
allocations across zones before introducing fragmentation").
Commit 6bb154504f8b introduces a local variable min_order in
__rmqueue_fallback().
And you can see
for (current_order = MAX_ORDER - 1; current_order >= min_order;
--current_order) {
Both "current_order" and "min_order" are int, so this is ok here.
Since __rmqueue_fallback() is only called by __rmqueue(), where "order"
is unsigned int, I think making "order" unsigned int here is good as
well.
Maybe I should also update the comment here?
>
> Rasmus
Thank you again for your review.
--
Pengfei
On Fri, Jul 26, 2019 at 3:26 PM Mel Gorman <[email protected]> wrote:
>
Thank you for your comments.
> On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> > Objective
> > ----
> > The motivation for this series of patches is to use unsigned int for
> > "order" in compaction.c, just like in other memory subsystems.
> >
>
> Why? The series is relatively subtle in parts, particularly patch 5.
Before I sent this series of patches, I took a close look at the
git log for compaction.c.
Here is a short history; please bear with me.
1) At first, "order" is _unsigned int_
The commit 56de7263fcf3 ("mm: compaction: direct compact when a
high-order allocation fails") introduced the "order" in
compact_control and its type is unsigned int.
Besides, you specify that order == -1 is the flag that triggers
compaction via proc.
2) Next, because order equals -1 is special, it causes an error.
The commit 7be62de99adc ("vmscan: kswapd carefully call compaction")
determines if "order" is less than 0.
This condition is always false because the type of "order" is
_unsigned int_.
- compact_zone(zone, &cc);
+ if (cc->order < 0 || !compaction_deferred(zone))
3) Finally, in order to fix the above error, the type of "order"
was modified to _int_.
This was done by commit aad6ec3777bf ("mm: compaction: make
compact_control order signed").
The reason I mention this is that the type of "order" was originally
_unsigned int_, and it was changed to _int_ because of the special
value -1.
If the special value of "order" is not a negative number (for
example, -1), but a number greater than MAX_ORDER - 1 (for example,
MAX_ORDER), then the "order" may still be _unsigned int_ now.
> There have been places where it was important for order to be able to
> go negative due to loop exit conditions.
I think that even if "cc->order" is _unsigned int_, it can be done
with a local temporary variable easily.
Like this,
function(...)
{
for(int tmp_order = cc->order; tmp_order >= 0; tmp_order--) {
...
}
}
> If there was a gain from this
> or it was a cleanup in the context of another major body of work, I
> could understand the justification but that does not appear to be the
> case here.
>
My final conclusion:
Why "order" is _int_ instead of unsigned int?
=> Because order == -1 is used as the flag.
=> So what about making "order" greater than MAX_ORDER - 1?
=> The "order" can be _unsigned int_ just like in most places.
(Can we only pick -1 as this special value?)
This series of patches makes sense because,
1) It guarantees that "order" remains the same type.
No one likes to see this
__alloc_pages_slowpath(unsigned int order, ...)
=> should_compact_retry(int order, ...) /* The type changed */
=> compaction_zonelist_suitable(int order, ...)
=> __compaction_suitable(int order, ...)
=> zone_watermark_ok(unsigned int order, ...) /* The type
changed again! */
2) It eliminates the evil "order == -1".
If "order" is specified as any positive number greater than
MAX_ORDER - 1 in commit 56de7263fcf3, perhaps no int order will
appear in compact.c until now.
> --
> Mel Gorman
Thank you again for your comments, and sincerely thank you for
your patience in reading such a long email.
> SUSE Labs
On Fri, Jul 26, 2019 at 3:12 PM Michal Hocko <[email protected]> wrote:
>
Thank you for your comments.
> On Fri 26-07-19 07:48:36, Pengfei Li wrote:
> [...]
> > For the benefit, "order" may be negative, which is confusing and weird.
>
> order = -1 has a special meaning.
>
Yes. But I mean -1 can be replaced by any number greater than
MAX_ORDER - 1 and there is no reason to be negative.
> > There is no good reason not to do this, since the negative value can be avoided.
>
> "This is good because we can do it" doesn't really sound like a
> convincing argument to me. I would understand if this reduced a
> generated code, made an overall code readability much better or
> something along those lines. Also we only use MAX_ORDER range of values
> so I could argue that a smaller data type (e.g. short) should be
> sufficient for this data type.
>
I have sent another email explaining the motivation of this series, and
I would be very grateful if you could post some comments on it.
> Please note that _any_ change, alebit seemingly small, can introduce a
> subtle bug. Also each patch requires a man power to review so you have
> to understand that "just because we can" is not a strong motivation for
> people to spend their time on such a patch.
Sincerely thank you, I will keep these in mind.
> --
> Michal Hocko
> SUSE Labs
--
Pengfei
On Sun, Jul 28, 2019 at 12:44:36AM +0800, Pengfei Li wrote:
> On Fri, Jul 26, 2019 at 3:26 PM Mel Gorman <[email protected]> wrote:
> >
>
> Thank you for your comments.
>
> > On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> > > Objective
> > > ----
> > > The motivation for this series of patches is to use unsigned int for
> > > "order" in compaction.c, just like in other memory subsystems.
> > >
> >
> > Why? The series is relatively subtle in parts, particularly patch 5.
>
> Before I sent this series of patches, I took a close look at the
> git log for compaction.c.
>
> Here is a short history; please bear with me.
>
> 1) At first, "order" is _unsigned int_
>
> The commit 56de7263fcf3 ("mm: compaction: direct compact when a
> high-order allocation fails") introduced the "order" in
> compact_control and its type is unsigned int.
>
> Besides, you specify that order == -1 is the flag that triggers
> compaction via proc.
>
Yes, specifying that compaction in that context is for the entire zone
without any specific allocation context or request.
> 2) Next, because order equals -1 is special, it causes an error.
>
> The commit 7be62de99adc ("vmscan: kswapd carefully call compaction")
> determines if "order" is less than 0.
>
> This condition is always false because the type of "order" is
> _unsigned int_.
>
> - compact_zone(zone, &cc);
> + if (cc->order < 0 || !compaction_deferred(zone))
>
> 3) Finally, in order to fix the above error, the type of the order
> is modified to _int_
>
> It is done by commit: aad6ec3777bf ("mm: compaction: make
> compact_control order signed").
>
> The reason I mention this is because I want to express that the
> type of "order" is originally _unsigned int_. And "order" is
> modified to _int_ because of the special value of -1.
>
And in itself, why does that matter?
> If the special value of "order" is not a negative number (for
> example, -1), but a number greater than MAX_ORDER - 1 (for example,
> MAX_ORDER), then the "order" may still be _unsigned int_ now.
>
Sure, but then you have to check that every check on order understands
the new special value.
> > > There have been places where it was important for order to be able to
> > go negative due to loop exit conditions.
>
> I think that even if "cc->order" is _unsigned int_, it can be done
> with a local temporary variable easily.
>
> Like this,
>
> function(...)
> {
> for(int tmp_order = cc->order; tmp_order >= 0; tmp_order--) {
> ...
> }
> }
>
Yes, it can be expressed as unsigned but in itself why does that justify
the review of a large series? There is limited to no performance gain
and functionally it's equivalent.
> > If there was a gain from this
> > or it was a cleanup in the context of another major body of work, I
> > could understand the justification but that does not appear to be the
> > case here.
> >
>
> My final conclusion:
>
> Why "order" is _int_ instead of unsigned int?
> => Because order == -1 is used as the flag.
> => So what about making "order" greater than MAX_ORDER - 1?
> => The "order" can be _unsigned int_ just like in most places.
>
> (Can we only pick -1 as this special value?)
>
No, but the existing code did make that choice and has been debugged
with that decision.
> This series of patches makes sense because,
>
> 1) It guarantees that "order" remains the same type.
>
And the advantage is?
> No one likes to see this
>
> __alloc_pages_slowpath(unsigned int order, ...)
> => should_compact_retry(int order, ...) /* The type changed */
> => compaction_zonelist_suitable(int order, ...)
> => __compaction_suitable(int order, ...)
> => zone_watermark_ok(unsigned int order, ...) /* The type
> changed again! */
>
> 2) It eliminates the evil "order == -1".
>
> If "order" is specified as any positive number greater than
> MAX_ORDER - 1 in commit 56de7263fcf3, perhaps no int order will
> appear in compact.c until now.
>
So, while it is possible, the point still holds. There is marginal to no
performance advantage (some CPUs perform fractionally better when an
index variable is unsigned rather than signed but it's difficult to
measure even when you're looking for it). It'll take time to review and
then debug the entire series. If this was in the context of a larger
functional enablement or performance optimisation then it would be
worthwhile but as it is, it looks more like churn for the sake of it.
--
Mel Gorman
SUSE Labs
On Mon, Jul 29, 2019 at 4:34 PM Mel Gorman <[email protected]> wrote:
>
> On Sun, Jul 28, 2019 at 12:44:36AM +0800, Pengfei Li wrote:
> > On Fri, Jul 26, 2019 at 3:26 PM Mel Gorman <[email protected]> wrote:
> > >
> >
> > Thank you for your comments.
> >
> > > On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> > > > Objective
> > > > ----
> > > > The motivation for this series of patches is to use unsigned int for
> > > > "order" in compaction.c, just like in other memory subsystems.
> > > >
> > >
> > > Why? The series is relatively subtle in parts, particularly patch 5.
> >
> > Before I sent this series of patches, I took a close look at the
> > git log for compaction.c.
> >
> > Here is a short history; please bear with me.
> >
> > 1) At first, "order" is _unsigned int_
> >
> > The commit 56de7263fcf3 ("mm: compaction: direct compact when a
> > high-order allocation fails") introduced the "order" in
> > compact_control and its type is unsigned int.
> >
> > Besides, you specify that order == -1 is the flag that triggers
> > compaction via proc.
> >
>
> Yes, specifying that compaction in that context is for the entire zone
> without any specific allocation context or request.
Yes
>
> > 2) Next, because order equals -1 is special, it causes an error.
> >
> > The commit 7be62de99adc ("vmscan: kswapd carefully call compaction")
> > determines if "order" is less than 0.
> >
> > This condition is always false because the type of "order" is
> > _unsigned int_.
> >
> > - compact_zone(zone, &cc);
> > + if (cc->order < 0 || !compaction_deferred(zone))
> >
> > 3) Finally, in order to fix the above error, the type of the order
> > is modified to _int_
> >
> > It is done by commit: aad6ec3777bf ("mm: compaction: make
> > compact_control order signed").
> >
> > The reason I mention this is because I want to express that the
> > type of "order" is originally _unsigned int_. And "order" is
> > modified to _int_ because of the special value of -1.
> >
>
> And in itself, why does that matter?
The -1 forces "order" to be int, which breaks the consistency of the "order" type.
>
> > If the special value of "order" is not a negative number (for
> > example, -1), but a number greater than MAX_ORDER - 1 (for example,
> > MAX_ORDER), then the "order" may still be _unsigned int_ now.
> >
>
> Sure, but then you have to check that every check on order understands
> the new special value.
>
Since this check is done by is_via_compact_memory(), it is easy to
change the special value being checked.
I have gone over every check many times, and now I need some review
from the community.
> > > There have been places where it was important for order to be able to
> > > go negative due to loop exit conditions.
> >
> > I think that even if "cc->order" is _unsigned int_, it can be done
> > with a local temporary variable easily.
> >
> > Like this,
> >
> > function(...)
> > {
> > for(int tmp_order = cc->order; tmp_order >= 0; tmp_order--) {
> > ...
> > }
> > }
> >
>
> Yes, it can be expressed as unsigned but in itself why does that justify
> the review of a large series?
At first glance this series of patches seems large, but at least half
of it modifies the corresponding trace functions, and except for patch
4 and patch 5, the patches only change the type of "order".
Even patch 5, which modifies function logic, changes only about 50
lines of code.
> There is limited to no performance gain
> and functionally it's equivalent.
This is just cleanup, and others have done similar work before.
commit d00181b96eb8 ("mm: use 'unsigned int' for page order")
commit 7aeb09f9104b ("mm: page_alloc: use unsigned int for order in
more places")
>
> > > If there was a gain from this
> > > or it was a cleanup in the context of another major body of work, I
> > > could understand the justification but that does not appear to be the
> > > case here.
> > >
> >
> > My final conclusion:
> >
> > Why "order" is _int_ instead of unsigned int?
> > => Because order == -1 is used as the flag.
> > => So what about making "order" greater than MAX_ORDER - 1?
> > => The "order" can be _unsigned int_ just like in most places.
> >
> > (Can we only pick -1 as this special value?)
> >
>
> No, but the existing code did make that choice and has been debugged
> with that decision.
But this choice breaks the consistency of the "order" type, doesn't it?
Because the check is done in is_via_compact_memory(), you don't need to
worry too much about the modification.
>
> > This series of patches makes sense because,
> >
> > 1) It guarantees that "order" remains the same type.
> >
>
> And the advantage is?
As I mentioned earlier, it maintains the consistency of the "order" type.
Do you really want to see a call stack like this?
__alloc_pages_slowpath(unsigned int order, ...)
/* The type has changed! */
=> should_compact_retry(int order, ...)
=> compaction_zonelist_suitable(int order, ...)
=> __compaction_suitable(int order, ...)
/* The type has changed again! */
=> zone_watermark_ok(unsigned int order, ...)
The type of "order" changes twice!
According to commit d00181b96eb8 and commit 7aeb09f9104b, we
want the "order" type to be consistent.
There are currently only five functions in page_alloc.c that use int
order, and three of them are related to compaction.
>
> > No one likes to see this
> >
> > __alloc_pages_slowpath(unsigned int order, ...)
> > => should_compact_retry(int order, ...) /* The type changed */
> > => compaction_zonelist_suitable(int order, ...)
> > => __compaction_suitable(int order, ...)
> > => zone_watermark_ok(unsigned int order, ...) /* The type
> > changed again! */
> >
> > 2) It eliminates the evil "order == -1".
> >
> > If "order" is specified as any positive number greater than
> > MAX_ORDER - 1 in commit 56de7263fcf3, perhaps no int order will
> > appear in compact.c until now.
> >
>
> So, while it is possible, the point still holds. There is marginal to no
> performance advantage (some CPUs perform fractionally better when an
> index variable is unsigned rather than signed but it's difficult to
> measure even when you're looking for it). It'll take time to review and
> then debug the entire series. If this was in the context of a larger
> functional enablement or performance optimisation then it would be
> worthwhile but as it is, it looks more like churn for the sake of it.
My summary:
1) This is just cleanup, and others have done similar work before.
commit d00181b96eb8 ("mm: use 'unsigned int' for page order")
commit 7aeb09f9104b ("mm: page_alloc: use unsigned int for order in
more places")
2) Thanks to is_via_compact_memory(), it is easy to change the
special value from -1 to another value.
3) Not much is modified.
Only about 50 lines of function code change, all in patch 4 and
patch 5; at least half of the series just updates the corresponding
trace functions.
>
> --
> Mel Gorman
> SUSE Labs
Sincere thanks for your detailed comments.
--
Pengfei