2017-08-24 06:36:47

by Joonsoo Kim

Subject: [PATCH 0/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

From: Joonsoo Kim <[email protected]>

This patchset is a follow-up to the discussion of
"Introduce ZONE_CMA (v7)" [1]. Please refer to it if more information
is needed.

In this patchset, the memory of the CMA area is managed by using
ZONE_MOVABLE. Since this zone also holds another type of memory,
we need to keep a migratetype for the CMA memory in order to account
for the number of CMA pages. So, unlike the previous patchset, less
code is deleted.

Otherwise, there is no big change.

The motivation for this patchset is described in the commit description
of the patch "mm/cma: manage the memory of the CMA area by using
the ZONE_MOVABLE". Please refer to it for more information.

This patchset is based on linux-next-20170822 plus
"mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE".

Thanks.

[1]: lkml.kernel.org/r/[email protected]

Joonsoo Kim (3):
mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE
mm/cma: remove ALLOC_CMA
ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM = y

arch/arm/mm/dma-mapping.c | 8 +++-
include/linux/memory_hotplug.h | 3 --
include/linux/mm.h | 1 +
mm/cma.c | 83 ++++++++++++++++++++++++++++++++++++------
mm/compaction.c | 4 +-
mm/internal.h | 4 +-
mm/page_alloc.c | 83 +++++++++++++++++++++++++++---------------
7 files changed, 137 insertions(+), 49 deletions(-)

--
2.7.4


2017-08-24 06:36:58

by Joonsoo Kim

Subject: [PATCH 2/3] mm/cma: remove ALLOC_CMA

From: Joonsoo Kim <[email protected]>

Now, all reserved pages for the CMA region belong to ZONE_MOVABLE,
which only serves requests with GFP_HIGHMEM && GFP_MOVABLE.
Therefore, we don't need to maintain ALLOC_CMA at all.
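
As a rough illustration of what dropping ALLOC_CMA means for the order-0
fast path, the sketch below mirrors the before/after condition in
zone_watermark_fast() from the diff. The struct and function names are
hypothetical, and the real code reads these values from struct zone. Note
that after the series the CMA pages live in ZONE_MOVABLE, so a non-movable
zone's free page count no longer contains them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the values the fast check looks at. */
struct wmark_input {
	long free_pages;	/* NR_FREE_PAGES of the zone */
	long free_cma_pages;	/* NR_FREE_CMA_PAGES of the zone */
	long mark;		/* watermark being tested */
	long lowmem_reserve;	/* lowmem_reserve[classzone_idx] */
};

/* Before: free CMA pages are subtracted unless ALLOC_CMA is set. */
static bool fast_check_before(const struct wmark_input *in, bool alloc_cma)
{
	long cma_pages = alloc_cma ? 0 : in->free_cma_pages;

	assert(in->free_cma_pages <= in->free_pages);
	return (in->free_pages - cma_pages) > in->mark + in->lowmem_reserve;
}

/* After: no CMA special case is left in the check. */
static bool fast_check_after(const struct wmark_input *in)
{
	return in->free_pages > in->mark + in->lowmem_reserve;
}
```

With 100 free pages of which 80 are CMA and a mark of 30, the old check
fails for a request without ALLOC_CMA and passes for one with it, while
the new check no longer depends on the flag.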

Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/compaction.c | 4 +---
mm/internal.h | 1 -
mm/page_alloc.c | 28 +++-------------------------
3 files changed, 4 insertions(+), 29 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index bf018d8..ee16d4a 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1457,14 +1457,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
* if compaction succeeds.
* For costly orders, we require low watermark instead of min for
* compaction to proceed to increase its chances.
- * ALLOC_CMA is used, as pages in CMA pageblocks are considered
- * suitable migration targets
*/
watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
low_wmark_pages(zone) : min_wmark_pages(zone);
watermark += compact_gap(order);
if (!__zone_watermark_ok(zone, 0, watermark, classzone_idx,
- ALLOC_CMA, wmark_target))
+ 0, wmark_target))
return COMPACT_SKIPPED;

return COMPACT_CONTINUE;
diff --git a/mm/internal.h b/mm/internal.h
index b4f9ebc..0aaa05a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -497,7 +497,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
#define ALLOC_HARDER 0x10 /* try to alloc harder */
#define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
#define ALLOC_CPUSET 0x40 /* check for correct cpuset */
-#define ALLOC_CMA 0x80 /* allow allocations from CMA areas */

enum ttu_flags;
struct tlbflush_unmap_batch;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bbd00f1..b74bd78 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2710,7 +2710,7 @@ int __isolate_free_page(struct page *page, unsigned int order)
* exists.
*/
watermark = min_wmark_pages(zone) + (1UL << order);
- if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
+ if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
return 0;

__mod_zone_freepage_state(zone, -(1UL << order), mt);
@@ -2987,12 +2987,6 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
}


-#ifdef CONFIG_CMA
- /* If allocation can't use CMA areas don't use free CMA pages */
- if (!(alloc_flags & ALLOC_CMA))
- free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif
-
/*
* Check watermarks for an order-0 allocation request. If these
* are not met, then a high-order request also cannot go ahead
@@ -3022,10 +3016,8 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
}

#ifdef CONFIG_CMA
- if ((alloc_flags & ALLOC_CMA) &&
- !list_empty(&area->free_list[MIGRATE_CMA])) {
+ if (!list_empty(&area->free_list[MIGRATE_CMA]))
return true;
- }
#endif
}
return false;
@@ -3042,13 +3034,6 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx, unsigned int alloc_flags)
{
long free_pages = zone_page_state(z, NR_FREE_PAGES);
- long cma_pages = 0;
-
-#ifdef CONFIG_CMA
- /* If allocation can't use CMA areas don't use free CMA pages */
- if (!(alloc_flags & ALLOC_CMA))
- cma_pages = zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif

/*
* Fast check for order-0 only. If this fails then the reserves
@@ -3057,7 +3042,7 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
* the caller is !atomic then it'll uselessly search the free
* list. That corner case is then slower but it is harmless.
*/
- if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve[classzone_idx])
+ if (!order && free_pages > mark + z->lowmem_reserve[classzone_idx])
return true;

return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
@@ -3676,10 +3661,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;

-#ifdef CONFIG_CMA
- if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
return alloc_flags;
}

@@ -4156,9 +4137,6 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return false;

- if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
- *alloc_flags |= ALLOC_CMA;
-
return true;
}

--
2.7.4

2017-08-24 06:37:03

by Joonsoo Kim

Subject: [PATCH 3/3] ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM = y

From: Joonsoo Kim <[email protected]>

The CMA area is now managed by a separate zone, ZONE_MOVABLE,
to fix many MM-related problems. In this implementation, if
CONFIG_HIGHMEM = y, then ZONE_MOVABLE is considered HIGHMEM and
the memory of the CMA area is also considered HIGHMEM.
That means its pages are treated as pages without a direct mapping.
However, the CMA area could be in lowmem, in which case the memory
does have a direct mapping.

On ARM, when establishing a new mapping for DMA, the direct mapping
should be cleared, since two mappings with different cache policies
could cause unknown problems. With this patchset, PageHighmem() returns
true for CMA memory located in lowmem, so the DMA mapping function
cannot correctly tell whether it needs to clear the direct mapping.
To handle this situation, this patch always clears the direct mapping
for such CMA memory.

Signed-off-by: Joonsoo Kim <[email protected]>
---
arch/arm/mm/dma-mapping.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index fcf1473..38f0fde 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -513,7 +513,13 @@ void __init dma_contiguous_remap(void)
flush_tlb_kernel_range(__phys_to_virt(start),
__phys_to_virt(end));

- iotable_init(&map, 1);
+ /*
+ * For highmem system, all the memory in CMA region will be
+ * considered as highmem even if it's physical address belong
+ * to lowmem. Therefore, re-mapping isn't required.
+ */
+ if (!IS_ENABLED(CONFIG_HIGHMEM))
+ iotable_init(&map, 1);
}
}

--
2.7.4

2017-08-24 06:36:55

by Joonsoo Kim

Subject: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

From: Joonsoo Kim <[email protected]>

0. History

This patchset is a follow-up to the discussion of
"Introduce ZONE_CMA (v7)" [1]. Please refer to it if more information
is needed.

1. What does this patch do?

This patch changes how the memory of the CMA area is managed
in the MM subsystem. Currently, the memory of the CMA area is managed
by the zone that its pfn belongs to. However, this approach has
some problems, since the MM subsystem doesn't have enough logic to handle
a situation where memory with different characteristics is in a single
zone. To solve this issue, this patch tries to manage all the memory of
the CMA area by using the MOVABLE zone. From the MM subsystem's point of
view, the characteristics of the memory in the MOVABLE zone and of the
memory of the CMA area are the same. So, managing the memory of the CMA
area by using the MOVABLE zone will not cause any problem.

2. Motivation

There are some problems with the current approach; see below.
Although these problems are not inherent and could be fixed without
this conceptual change, doing so would require adding many hooks to
various code paths, which would be intrusive to core MM and really
error-prone. Therefore, I try to solve them with this new approach.
The problems of the current implementation are as follows.

o CMA memory utilization

First, the following is the freepage calculation logic in MM.

- For movable allocation: freepage = total freepage
- For unmovable allocation: freepage = total freepage - CMA freepage

Freepages in the CMA area are used only after the normal freepages in the
zone that the memory of the CMA area belongs to are exhausted. At that
moment, the number of normal freepages is zero, so

- For movable allocation: freepage = total freepage = CMA freepage
- For unmovable allocation: freepage = 0

If an unmovable allocation comes at this moment, the allocation request
fails the watermark check and reclaim is started. After reclaim,
normal freepages exist again, so the freepages in the CMA area
are not used.
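
The accounting described above can be sketched in a few lines of C. This
is only an illustration of the rule as stated in this description, not the
kernel code: struct zone_counts and both helpers are hypothetical names,
and the real watermark check also considers lowmem reserves and the
allocation order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified zone counters; not the kernel's struct zone. */
struct zone_counts {
	long total_free;	/* all free pages in the zone, CMA included */
	long cma_free;		/* free pages that sit in CMA pageblocks */
};

/* Free pages visible to a request under the current (pre-patch) rule. */
static long usable_free_pages(const struct zone_counts *z, bool movable)
{
	assert(z->cma_free >= 0 && z->cma_free <= z->total_free);
	return movable ? z->total_free : z->total_free - z->cma_free;
}

/* Simplified watermark test: strictly more usable pages than the mark. */
static bool watermark_ok(const struct zone_counts *z, bool movable, long mark)
{
	return usable_free_pages(z, movable) > mark;
}
```

When every remaining free page is a CMA page, a movable request still sees
all of them, while an unmovable request sees zero and fails any positive
watermark, which is the point where reclaim kicks in.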

FYI, there is another attempt [2] on lkml to solve this problem.
And, as far as I know, Qualcomm also has an out-of-tree solution for this
problem.

o useless reclaim

There is no logic to distinguish CMA pages in the reclaim path. Hence,
a CMA page is reclaimed even if the system just needs a page that is
usable for kernel allocation.

o atomic allocation failure

This is also related to the fallback allocation policy for the memory
of the CMA area. Consider the situation where the number of normal
freepages is *zero* because a bunch of movable allocation requests
came in. Kswapd would not be woken up, due to the following freepage
calculation logic.

- For movable allocation: freepage = total freepage = CMA freepage

If an atomic unmovable allocation request comes at this moment, it would
fail due to the following logic.

- For unmovable allocation: freepage = total freepage - CMA freepage = 0

It was reported by Aneesh [3].

o useless compaction

The usual high-order allocation request is an unmovable allocation
request, and it cannot be served from the memory of the CMA area. In
compaction, the migration scanner tries to migrate pages in the CMA area
and make a high-order page there. As mentioned above, that page cannot be
used for the unmovable allocation request, so it's just a waste.

3. Current approach and new approach

The current approach is that the memory of the CMA area is managed by the
zone that its pfn belongs to. However, this memory should be
distinguishable since it has a strong limitation. So, it is marked
as MIGRATE_CMA in the pageblock flags and handled specially. However,
as mentioned in section 2, the MM subsystem doesn't have enough logic
to deal with this special pageblock, so many problems arise.

The new approach is that the memory of the CMA area is managed by
the MOVABLE zone. MM already has enough logic to deal with special zones
such as the HIGHMEM and MOVABLE zones. So, managing the memory of the CMA
area by the MOVABLE zone just naturally works well, because the constraint
on the memory of the CMA area, that the memory should always be
migratable, is the same as the constraint on the MOVABLE zone.

There is one side effect on the usability of the memory of the CMA area.
Use of the MOVABLE zone is only allowed for a request with GFP_HIGHMEM &&
GFP_MOVABLE, so now the memory of the CMA area is also only available to
requests with these gfp flags. Before this patchset, any request with
GFP_MOVABLE could use it. IMO, it would not be a big issue since most
GFP_MOVABLE requests also have the GFP_HIGHMEM flag, for example, file
cache pages and anonymous pages. However, file cache pages for a blockdev
file are an exception: requests for them have no GFP_HIGHMEM flag. There
are pros and cons to this exception. In my experience, blockdev file cache
pages are one of the top reasons that cause cma_alloc() to fail
temporarily. So, we get a stronger guarantee of cma_alloc() success by
excluding this case.
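
The change in eligibility can be written down as two small predicates.
This is a sketch under assumptions: the SK_GFP_* bits are hypothetical
stand-ins for the kernel's highmem and movable gfp bits, and the real
allocator derives the usable zone through its gfp-to-zone mapping rather
than through helpers like these.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's highmem and movable gfp bits. */
#define SK_GFP_HIGHMEM	0x1u
#define SK_GFP_MOVABLE	0x2u
#define SK_GFP_MASK	(SK_GFP_HIGHMEM | SK_GFP_MOVABLE)

/* Before the patchset: any movable request may use CMA memory. */
static bool may_use_cma_before(unsigned int gfp)
{
	assert((gfp & ~SK_GFP_MASK) == 0);
	return (gfp & SK_GFP_MOVABLE) != 0;
}

/* After: CMA memory sits in the MOVABLE zone, which needs both bits. */
static bool may_use_cma_after(unsigned int gfp)
{
	assert((gfp & ~SK_GFP_MASK) == 0);
	return (gfp & SK_GFP_MASK) == SK_GFP_MASK;
}
```

A blockdev page cache request corresponds to SK_GFP_MOVABLE alone: it
passes the first predicate but not the second, which is the exception
discussed above.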

Note that there is no change from the admin's POV, since this patchset is
just an internal implementation change in the MM subsystem. The one minor
difference for the admin is that the memory stats for the CMA area will be
printed in the MOVABLE zone. That's all.

4. Result

The following is the experimental result related to the utilization problem.

8 CPUs, 1024 MB, VIRTUAL MACHINE
make -j16

<Before>
CMA area:        0 MB     512 MB
Elapsed-time:    92.4     186.5
pswpin:          82       18647
pswpout:         160      69839

<After>
CMA area:        0 MB     512 MB
Elapsed-time:    93.1     93.4
pswpin:          84       46
pswpout:         183      92

[1]: lkml.kernel.org/r/[email protected]
[2]: https://lkml.org/lkml/2014/10/15/623
[3]: http://www.spinics.net/lists/linux-mm/msg100562.html

Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
include/linux/memory_hotplug.h | 3 --
include/linux/mm.h | 1 +
mm/cma.c | 83 ++++++++++++++++++++++++++++++++++++------
mm/internal.h | 3 ++
mm/page_alloc.c | 55 +++++++++++++++++++++++++---
5 files changed, 126 insertions(+), 19 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 0995e1a..fb94608 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -230,9 +230,6 @@ void put_online_mems(void);
void mem_hotplug_begin(void);
void mem_hotplug_done(void);

-extern void set_zone_contiguous(struct zone *zone);
-extern void clear_zone_contiguous(struct zone *zone);
-
#else /* ! CONFIG_MEMORY_HOTPLUG */
#define pfn_to_online_page(pfn) \
({ \
diff --git a/include/linux/mm.h b/include/linux/mm.h
index deb9c70..d4daadc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2049,6 +2049,7 @@ extern void setup_per_cpu_pageset(void);

extern void zone_pcp_update(struct zone *zone);
extern void zone_pcp_reset(struct zone *zone);
+extern void setup_zone_pageset(struct zone *zone);

/* page_alloc.c */
extern int min_free_kbytes;
diff --git a/mm/cma.c b/mm/cma.c
index c0da318..a8ababb 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,6 +38,7 @@
#include <trace/events/cma.h>

#include "cma.h"
+#include "internal.h"

struct cma cma_areas[MAX_CMA_AREAS];
unsigned cma_area_count;
@@ -108,23 +109,25 @@ static int __init cma_activate_area(struct cma *cma)
if (!cma->bitmap)
return -ENOMEM;

- WARN_ON_ONCE(!pfn_valid(pfn));
- zone = page_zone(pfn_to_page(pfn));
-
do {
unsigned j;

base_pfn = pfn;
+ if (!pfn_valid(base_pfn))
+ goto err;
+
+ zone = page_zone(pfn_to_page(base_pfn));
for (j = pageblock_nr_pages; j; --j, pfn++) {
- WARN_ON_ONCE(!pfn_valid(pfn));
+ if (!pfn_valid(pfn))
+ goto err;
+
/*
- * alloc_contig_range requires the pfn range
- * specified to be in the same zone. Make this
- * simple by forcing the entire CMA resv range
- * to be in the same zone.
+ * In init_cma_reserved_pageblock(), present_pages
+ * is adjusted with assumption that all pages in
+ * the pageblock come from a single zone.
*/
if (page_zone(pfn_to_page(pfn)) != zone)
- goto not_in_zone;
+ goto err;
}
init_cma_reserved_pageblock(pfn_to_page(base_pfn));
} while (--i);
@@ -138,7 +141,7 @@ static int __init cma_activate_area(struct cma *cma)

return 0;

-not_in_zone:
+err:
pr_err("CMA area %s could not be activated\n", cma->name);
kfree(cma->bitmap);
cma->count = 0;
@@ -148,6 +151,41 @@ static int __init cma_activate_area(struct cma *cma)
static int __init cma_init_reserved_areas(void)
{
int i;
+ struct zone *zone;
+ pg_data_t *pgdat;
+
+ if (!cma_area_count)
+ return 0;
+
+ for_each_online_pgdat(pgdat) {
+ unsigned long start_pfn = UINT_MAX, end_pfn = 0;
+
+ zone = &pgdat->node_zones[ZONE_MOVABLE];
+
+ /*
+ * In this case, we cannot adjust the zone range
+ * since it is now maximum node span and we don't
+ * know original zone range.
+ */
+ if (populated_zone(zone))
+ continue;
+
+ for (i = 0; i < cma_area_count; i++) {
+ if (pfn_to_nid(cma_areas[i].base_pfn) !=
+ pgdat->node_id)
+ continue;
+
+ start_pfn = min(start_pfn, cma_areas[i].base_pfn);
+ end_pfn = max(end_pfn, cma_areas[i].base_pfn +
+ cma_areas[i].count);
+ }
+
+ if (!end_pfn)
+ continue;
+
+ zone->zone_start_pfn = start_pfn;
+ zone->spanned_pages = end_pfn - start_pfn;
+ }

for (i = 0; i < cma_area_count; i++) {
int ret = cma_activate_area(&cma_areas[i]);
@@ -156,9 +194,32 @@ static int __init cma_init_reserved_areas(void)
return ret;
}

+ /*
+ * Reserved pages for ZONE_MOVABLE are now activated and
+ * this would change ZONE_MOVABLE's managed page counter and
+ * the other zones' present counter. We need to re-calculate
+ * various zone information that depends on this initialization.
+ */
+ build_all_zonelists(NULL);
+ for_each_populated_zone(zone) {
+ if (zone_idx(zone) == ZONE_MOVABLE) {
+ zone_pcp_reset(zone);
+ setup_zone_pageset(zone);
+ } else
+ zone_pcp_update(zone);
+
+ set_zone_contiguous(zone);
+ }
+
+ /*
+ * We need to re-init per zone wmark by calling
+ * init_per_zone_wmark_min() but doesn't call here because it is
+ * registered on core_initcall and it will be called later than us.
+ */
+
return 0;
}
-core_initcall(cma_init_reserved_areas);
+pure_initcall(cma_init_reserved_areas);

/**
* cma_init_reserved_mem() - create custom contiguous area from reserved memory
diff --git a/mm/internal.h b/mm/internal.h
index 1df011f..b4f9ebc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -168,6 +168,9 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
gfp_t gfp_flags);
extern int user_min_free_kbytes;

+extern void set_zone_contiguous(struct zone *zone);
+extern void clear_zone_contiguous(struct zone *zone);
+
#if defined CONFIG_COMPACTION || defined CONFIG_CMA

/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eb094b1..bbd00f1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1596,16 +1596,38 @@ void __init page_alloc_init_late(void)
}

#ifdef CONFIG_CMA
+static void __init adjust_present_page_count(struct page *page, long count)
+{
+ struct zone *zone = page_zone(page);
+
+ /* We don't need to hold a lock since it is boot-up process */
+ zone->present_pages += count;
+}
+
/* Free whole pageblock and set its migration type to MIGRATE_CMA. */
void __init init_cma_reserved_pageblock(struct page *page)
{
unsigned i = pageblock_nr_pages;
+ unsigned long pfn = page_to_pfn(page);
struct page *p = page;
+ int nid = page_to_nid(page);
+
+ /*
+ * ZONE_MOVABLE will steal present pages from other zones by
+ * changing page links so page_zone() is changed. Before that,
+ * we need to adjust previous zone's page count first.
+ */
+ adjust_present_page_count(page, -pageblock_nr_pages);

do {
__ClearPageReserved(p);
set_page_count(p, 0);
- } while (++p, --i);
+
+ /* Steal pages from other zones */
+ set_page_links(p, ZONE_MOVABLE, nid, pfn);
+ } while (++p, ++pfn, --i);
+
+ adjust_present_page_count(page, pageblock_nr_pages);

set_pageblock_migratetype(page, MIGRATE_CMA);

@@ -6023,6 +6045,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
{
enum zone_type j;
int nid = pgdat->node_id;
+ unsigned long node_end_pfn = 0;

pgdat_resize_init(pgdat);
#ifdef CONFIG_NUMA_BALANCING
@@ -6050,9 +6073,13 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
struct zone *zone = pgdat->node_zones + j;
unsigned long size, realsize, freesize, memmap_pages;
unsigned long zone_start_pfn = zone->zone_start_pfn;
+ unsigned long movable_size = 0;

size = zone->spanned_pages;
realsize = freesize = zone->present_pages;
+ if (zone_end_pfn(zone) > node_end_pfn)
+ node_end_pfn = zone_end_pfn(zone);
+

/*
* Adjust freesize so that it accounts for how much memory
@@ -6101,12 +6128,30 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
zone_seqlock_init(zone);
zone_pcp_init(zone);

- if (!size)
+ /*
+ * The size of the CMA area is unknown now so we need to
+ * prepare the memory for the usemap at maximum.
+ */
+ if (IS_ENABLED(CONFIG_CMA) && j == ZONE_MOVABLE &&
+ pgdat->node_spanned_pages) {
+ movable_size = node_end_pfn - pgdat->node_start_pfn;
+ }
+
+ if (!size && !movable_size)
continue;

set_pageblock_order();
- setup_usemap(pgdat, zone, zone_start_pfn, size);
- init_currently_empty_zone(zone, zone_start_pfn, size);
+ if (movable_size) {
+ zone->zone_start_pfn = pgdat->node_start_pfn;
+ zone->spanned_pages = movable_size;
+ setup_usemap(pgdat, zone,
+ pgdat->node_start_pfn, movable_size);
+ init_currently_empty_zone(zone,
+ pgdat->node_start_pfn, movable_size);
+ } else {
+ setup_usemap(pgdat, zone, zone_start_pfn, size);
+ init_currently_empty_zone(zone, zone_start_pfn, size);
+ }
memmap_init(size, nid, j, zone_start_pfn);
}
}
@@ -7657,7 +7702,7 @@ void free_contig_range(unsigned long pfn, unsigned nr_pages)
}
#endif

-#ifdef CONFIG_MEMORY_HOTPLUG
+#if defined CONFIG_MEMORY_HOTPLUG || defined CONFIG_CMA
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalulated.
--
2.7.4

2017-08-25 21:32:16

by Andrew Morton

Subject: Re: [PATCH 0/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On Thu, 24 Aug 2017 15:36:30 +0900 [email protected] wrote:

> From: Joonsoo Kim <[email protected]>
>
> This patchset is the follow-up of the discussion about the
> "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
> is needed.
>
> In this patchset, the memory of the CMA area is managed by using
> the ZONE_MOVABLE. Since there is another type of the memory in this zone,
> we need to maintain a migratetype for the CMA memory to account
> the number of the CMA memory. So, unlike previous patchset, there is
> less deletion of the code.
>
> Otherwise, there is no big change.
>
> Motivation of this patchset is described in the commit description of
> the patch "mm/cma: manage the memory of the CMA area by using
> the ZONE_MOVABLE". Please refer it for more information.
>
> This patchset is based on linux-next-20170822 plus
> "mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE".
>

But "mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE" did
not do very well at review - both Michal and Vlastimil are looking for
changes. So we're not ready for a patch series which depends upon that
one?


2017-08-28 00:30:45

by Joonsoo Kim

Subject: Re: [PATCH 0/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On Fri, Aug 25, 2017 at 02:32:13PM -0700, Andrew Morton wrote:
> On Thu, 24 Aug 2017 15:36:30 +0900 [email protected] wrote:
>
> > From: Joonsoo Kim <[email protected]>
> >
> > This patchset is the follow-up of the discussion about the
> > "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
> > is needed.
> >
> > In this patchset, the memory of the CMA area is managed by using
> > the ZONE_MOVABLE. Since there is another type of the memory in this zone,
> > we need to maintain a migratetype for the CMA memory to account
> > the number of the CMA memory. So, unlike previous patchset, there is
> > less deletion of the code.
> >
> > Otherwise, there is no big change.
> >
> > Motivation of this patchset is described in the commit description of
> > the patch "mm/cma: manage the memory of the CMA area by using
> > the ZONE_MOVABLE". Please refer it for more information.
> >
> > This patchset is based on linux-next-20170822 plus
> > "mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE".
> >
>
> But "mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE" did
> not do very well at review - both Michal and Vlastimil are looking for
> changes. So we're not ready for a patch series which depends upon that
> one?

Oops. I checked again and found that this patchset does not depend
on that patch. It's just a leftover from the ZONE_CMA patchset.

Thanks.

2017-08-29 09:16:24

by Vlastimil Babka

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On 08/24/2017 08:36 AM, [email protected] wrote:
> From: Joonsoo Kim <[email protected]>
>
> 0. History
>
> This patchset is the follow-up of the discussion about the
> "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
> is needed.
>

[...]

>
> [1]: lkml.kernel.org/r/[email protected]
> [2]: https://lkml.org/lkml/2014/10/15/623
> [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
>
> Reviewed-by: Aneesh Kumar K.V <[email protected]>
> Acked-by: Vlastimil Babka <[email protected]>

The previous version has introduced ZONE_CMA, so I would think switching
to ZONE_MOVABLE is enough to drop previous reviews. Perhaps most of the
code involved is basically the same, though?

Anyway I checked the current patch and did some basic tests with qemu,
so you can keep my ack.

BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
related hooks? Is that counter really that useful as it works right now?
It will decrease both by CMA allocations (which have to be explicitly
freed) and by movable allocations (which can be migrated). What if only
CMA alloc/release touched it?

2017-08-31 01:40:05

by Joonsoo Kim

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On Tue, Aug 29, 2017 at 11:16:18AM +0200, Vlastimil Babka wrote:
> On 08/24/2017 08:36 AM, [email protected] wrote:
> > From: Joonsoo Kim <[email protected]>
> >
> > 0. History
> >
> > This patchset is the follow-up of the discussion about the
> > "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
> > is needed.
> >
>
> [...]
>
> >
> > [1]: lkml.kernel.org/r/[email protected]
> > [2]: https://lkml.org/lkml/2014/10/15/623
> > [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
> >
> > Reviewed-by: Aneesh Kumar K.V <[email protected]>
> > Acked-by: Vlastimil Babka <[email protected]>
>
> The previous version has introduced ZONE_CMA, so I would think switching
> to ZONE_MOVABLE is enough to drop previous reviews. Perhaps most of the
> code involved is basically the same, though?

Yes, most of the code involved is the same. I considered dropping the
previous review tags, but most of the code and the concept are the same,
so I decided to keep them. I should have mentioned it in the cover letter
but I forgot. Sorry about that.

> Anyway I checked the current patch and did some basic tests with qemu,
> so you can keep my ack.

Thanks!

>
> BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
> related hooks? Is that counter really that useful as it works right now?
> It will decrease both by CMA allocations (which has to be explicitly
> freed) and by movable allocations (which can be migrated). What if only
> CMA alloc/release touched it?

I think that NR_FREE_CMA_PAGES would not be as useful as before. We
can remove it.

However, removing MIGRATE_CMA has a problem. There is a use case that
checks whether a page comes from the CMA area or not. See
check_page_span() in mm/usercopy.c. I can implement it differently by
iterating over the whole CMA area and finding the match, but I'm not sure
about its performance effect. I guess that it would be marginal.

Anyway, I'd like not to cause any side effects now. After the patches
have settled down in mainline, I will try to remove them as you suggested.

Thanks.

2017-08-31 11:35:32

by Vlastimil Babka

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On 08/31/2017 03:40 AM, Joonsoo Kim wrote:
> On Tue, Aug 29, 2017 at 11:16:18AM +0200, Vlastimil Babka wrote:
>> On 08/24/2017 08:36 AM, [email protected] wrote:
>>> From: Joonsoo Kim <[email protected]>
>>>
>>> 0. History
>>>
>>> This patchset is the follow-up of the discussion about the
>>> "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
>>> is needed.
>>>
>>
>> [...]
>>
>>>
>>> [1]: lkml.kernel.org/r/[email protected]
>>> [2]: https://lkml.org/lkml/2014/10/15/623
>>> [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
>>>
>>> Reviewed-by: Aneesh Kumar K.V <[email protected]>
>>> Acked-by: Vlastimil Babka <[email protected]>
>>
>> The previous version has introduced ZONE_CMA, so I would think switching
>> to ZONE_MOVABLE is enough to drop previous reviews. Perhaps most of the
>> code involved is basically the same, though?
>
> Yes, most of the code involved is the same. I considered to drop
> previous review tags but most of the code and concept is the same so I
> decide to keep review tags. I should mention it in cover-letter but I
> forgot to mention it. Sorry about that.
>
>> Anyway I checked the current patch and did some basic tests with qemu,
>> so you can keep my ack.
>
> Thanks!
>
>>
>> BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
>> related hooks? Is that counter really that useful as it works right now?
>> It will decrease both by CMA allocations (which has to be explicitly
>> freed) and by movable allocations (which can be migrated). What if only
>> CMA alloc/release touched it?
>
> I think that NR_FREE_CMA_PAGES would not be as useful as previous. We
> can remove it.
>
> However, removing MIGRATE_CMA has a problem. There is an usecase to
> check if the page comes from the CMA area or not. See
> check_page_span() in mm/usercopy.c. I can implement it differently by
> iterating whole CMA area and finding the match, but I'm not sure it's
> performance effect. I guess that it would be marginal.

+CC Kees Cook

Hmm, seems like this check is to make sure we don't copy from/to parts
of kernel memory we're not supposed to? Then I believe checking that
pages are in ZONE_MOVABLE should give the same guarantees as
MIGRATE_CMA.

BTW the comment says "Reject if range is entirely either Reserved or
CMA" but the code does the opposite thing. I assume the comment is wrong?

> Anyway, I'd like not to cause any side-effect now. After patches are
> settle down on mainline, I will try to remove them as you suggested.
>
> Thanks.
>

2017-08-31 15:08:02

by Laura Abbott

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On 08/31/2017 07:32 AM, Vlastimil Babka wrote:
> On 08/31/2017 03:40 AM, Joonsoo Kim wrote:
>> On Tue, Aug 29, 2017 at 11:16:18AM +0200, Vlastimil Babka wrote:
>>> On 08/24/2017 08:36 AM, [email protected] wrote:
>>>> From: Joonsoo Kim <[email protected]>
>>>>
>>>> 0. History
>>>>
>>>> This patchset is the follow-up of the discussion about the
>>>> "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
>>>> is needed.
>>>>
>>>
>>> [...]
>>>
>>>>
>>>> [1]: lkml.kernel.org/r/[email protected]
>>>> [2]: https://lkml.org/lkml/2014/10/15/623
>>>> [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
>>>>
>>>> Reviewed-by: Aneesh Kumar K.V <[email protected]>
>>>> Acked-by: Vlastimil Babka <[email protected]>
>>>
>>> The previous version has introduced ZONE_CMA, so I would think switching
>>> to ZONE_MOVABLE is enough to drop previous reviews. Perhaps most of the
>>> code involved is basically the same, though?
>>
>> Yes, most of the code involved is the same. I considered to drop
>> previous review tags but most of the code and concept is the same so I
>> decide to keep review tags. I should mention it in cover-letter but I
>> forgot to mention it. Sorry about that.
>>
>>> Anyway I checked the current patch and did some basic tests with qemu,
>>> so you can keep my ack.
>>
>> Thanks!
>>
>>>
>>> BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
>>> related hooks? Is that counter really that useful as it works right now?
>>> It will decrease both by CMA allocations (which has to be explicitly
>>> freed) and by movable allocations (which can be migrated). What if only
>>> CMA alloc/release touched it?
>>
>> I think that NR_FREE_CMA_PAGES would not be as useful as previous. We
>> can remove it.
>>
>> However, removing MIGRATE_CMA has a problem. There is an usecase to
>> check if the page comes from the CMA area or not. See
>> check_page_span() in mm/usercopy.c. I can implement it differently by
>> iterating whole CMA area and finding the match, but I'm not sure it's
>> performance effect. I guess that it would be marginal.
>
> +CC Kees Cook
>
> Hmm, seems like this check is to make sure we don't copy from/to parts
> of kernel memory we're not supposed to? Then I believe checking that
> pages are in ZONE_MOVABLE should then give the same guarantees as
> MIGRATE_CMA.
>

The check is to make sure we are copying only to a single page unless
that page is allocated with __GFP_COMP. CMA needs extra checks since
its allocations have nothing to do with compound pages. Checking
ZONE_MOVABLE might cause us to miss some cases of copying to vanilla
ZONE_MOVABLE pages.
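
To illustrate the rule described above, here is a simplified userspace
model in C. It is only a sketch under stated assumptions, not the code in
mm/usercopy.c: struct model_page and span_allowed() are hypothetical
names, and the real check works on struct page and kernel addresses
rather than array indices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page descriptor; this is NOT the kernel's struct page. */
struct model_page {
	size_t head;    /* index of the compound head page, or own index */
	bool reserved;  /* models PageReserved() */
	bool cma;       /* models is_migrate_cma_page() */
};

/*
 * Returns true if a copy touching pages[first..last] would be allowed
 * by this model of the page-span rule.
 */
static bool span_allowed(const struct model_page *pages,
			 size_t first, size_t last)
{
	bool is_reserved, is_cma;
	size_t i;

	assert(first <= last);

	/* A copy that stays inside one base page is always fine. */
	if (first == last)
		return true;

	/* So is a copy that stays inside one compound (__GFP_COMP) page. */
	if (pages[first].head == pages[last].head)
		return true;

	/* Otherwise only Reserved or CMA ranges may span several pages. */
	is_reserved = pages[first].reserved;
	is_cma = pages[first].cma;
	if (!is_reserved && !is_cma)
		return false;

	/* The whole range must be consistently Reserved or consistently CMA. */
	for (i = first + 1; i <= last; i++) {
		if (is_reserved && !pages[i].reserved)
			return false;
		if (is_cma && !pages[i].cma)
			return false;
	}
	return true;
}
```

In this model, a copy across two separately allocated order-0 pages is
rejected, while the same copy across two CMA pages is accepted. That
exception is what would be affected if the CMA test were replaced by a
ZONE_MOVABLE test.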

> BTW the comment says "Reject if range is entirely either Reserved or
> CMA" but the code does the opposite thing. I assume the comment is wrong?
>

Yes, I think that needs clarification.

Thanks,
Laura

2017-09-01 07:31:48

by Vlastimil Babka

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On 08/31/2017 05:07 PM, Laura Abbott wrote:
> On 08/31/2017 07:32 AM, Vlastimil Babka wrote:
>> On 08/31/2017 03:40 AM, Joonsoo Kim wrote:
>>> On Tue, Aug 29, 2017 at 11:16:18AM +0200, Vlastimil Babka wrote:
>>>>
>>>> BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
>>>> related hooks? Is that counter really that useful as it works right now?
>>>> It will decrease both by CMA allocations (which has to be explicitly
>>>> freed) and by movable allocations (which can be migrated). What if only
>>>> CMA alloc/release touched it?
>>>
>>> I think that NR_FREE_CMA_PAGES would not be as useful as previous. We
>>> can remove it.
>>>
>>> However, removing MIGRATE_CMA has a problem. There is an usecase to
>>> check if the page comes from the CMA area or not. See
>>> check_page_span() in mm/usercopy.c. I can implement it differently by
>>> iterating whole CMA area and finding the match, but I'm not sure it's
>>> performance effect. I guess that it would be marginal.
>>
>> +CC Kees Cook
>>
>> Hmm, seems like this check is to make sure we don't copy from/to parts
>> of kernel memory we're not supposed to? Then I believe checking that
>> pages are in ZONE_MOVABLE should then give the same guarantees as
>> MIGRATE_CMA.
>>
>
> The check is to make sure we are copying only to a single page unless
> that page is allocated with __GFP_COMP. CMA needs extra checks since
> its allocations have nothing to do with compound pages. Checking
> ZONE_MOVABLE might cause us to miss some cases of copying to vanilla
> ZONE_MOVABLE pages.

How big a problem is that? ZONE_MOVABLE should not contain kernel pages,
so from the kernel protection side we are OK? I expect there's another
check somewhere that the pages are not userspace pages, as that would be
unexpected on the wrong side of copy_to/from_user, no?

Also, you can already miss some cases with the is_migrate_cma check,
because pages might be in the CMA pageblocks but not be allocated by CMA
itself - movable page allocations can fall back there.
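
A small userspace model in C may make that distinction concrete. It is a
sketch only: every type and function name below is hypothetical, and the
per-page "owner" field is an assumption made for illustration. The kernel
tracks the migratetype per pageblock, which is exactly why a pageblock
test cannot tell the two kinds of page apart.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel concepts. */
enum model_migratetype { MODEL_MOVABLE, MODEL_CMA };
enum model_owner { OWNER_FREE, OWNER_CMA_ALLOC, OWNER_MOVABLE_FALLBACK };

#define MODEL_PAGES_PER_BLOCK 4

struct model_pageblock {
	enum model_migratetype migratetype;          /* one per pageblock */
	enum model_owner owner[MODEL_PAGES_PER_BLOCK]; /* illustration only */
};

/* Pageblock-granular test, in the spirit of the is_migrate_cma check. */
static bool model_in_cma_pageblock(const struct model_pageblock *b)
{
	return b->migratetype == MODEL_CMA;
}

/* What the span check would ideally know: was page i handed out by CMA? */
static bool model_allocated_by_cma(const struct model_pageblock *b, size_t i)
{
	return b->owner[i] == OWNER_CMA_ALLOC;
}
```

For a CMA pageblock in which one page came from a CMA allocation and
another from a movable fallback, model_in_cma_pageblock() is true for
both, while model_allocated_by_cma() is true only for the first.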

>> BTW the comment says "Reject if range is entirely either Reserved or
>> CMA" but the code does the opposite thing. I assume the comment is wrong?
>>
>
> Yes, I think that needs clarification.
>
> Thanks,
> Laura
>

2017-09-01 21:03:10

by Kees Cook

Subject: Re: [PATCH 1/3] mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE

On Thu, Aug 31, 2017 at 4:32 AM, Vlastimil Babka <[email protected]> wrote:
> On 08/31/2017 03:40 AM, Joonsoo Kim wrote:
>> On Tue, Aug 29, 2017 at 11:16:18AM +0200, Vlastimil Babka wrote:
>>> On 08/24/2017 08:36 AM, [email protected] wrote:
>>>> From: Joonsoo Kim <[email protected]>
>>>>
>>>> 0. History
>>>>
>>>> This patchset is the follow-up of the discussion about the
>>>> "Introduce ZONE_CMA (v7)" [1]. Please reference it if more information
>>>> is needed.
>>>>
>>>
>>> [...]
>>>
>>>>
>>>> [1]: lkml.kernel.org/r/[email protected]
>>>> [2]: https://lkml.org/lkml/2014/10/15/623
>>>> [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
>>>>
>>>> Reviewed-by: Aneesh Kumar K.V <[email protected]>
>>>> Acked-by: Vlastimil Babka <[email protected]>
>>>
>>> The previous version has introduced ZONE_CMA, so I would think switching
>>> to ZONE_MOVABLE is enough to drop previous reviews. Perhaps most of the
>>> code involved is basically the same, though?
>>
>> Yes, most of the code involved is the same. I considered to drop
>> previous review tags but most of the code and concept is the same so I
>> decide to keep review tags. I should mention it in cover-letter but I
>> forgot to mention it. Sorry about that.
>>
>>> Anyway I checked the current patch and did some basic tests with qemu,
>>> so you can keep my ack.
>>
>> Thanks!
>>
>>>
>>> BTW, if we dropped NR_FREE_CMA_PAGES, could we also drop MIGRATE_CMA and
>>> related hooks? Is that counter really that useful as it works right now?
>>> It will decrease both by CMA allocations (which has to be explicitly
>>> freed) and by movable allocations (which can be migrated). What if only
>>> CMA alloc/release touched it?
>>
>> I think that NR_FREE_CMA_PAGES would not be as useful as previous. We
>> can remove it.
>>
>> However, removing MIGRATE_CMA has a problem. There is an usecase to
>> check if the page comes from the CMA area or not. See
>> check_page_span() in mm/usercopy.c. I can implement it differently by
>> iterating whole CMA area and finding the match, but I'm not sure it's
>> performance effect. I guess that it would be marginal.
>
> +CC Kees Cook
>
> Hmm, seems like this check is to make sure we don't copy from/to parts
> of kernel memory we're not supposed to? Then I believe checking that
> pages are in ZONE_MOVABLE should then give the same guarantees as
> MIGRATE_CMA.

Yeah, as Laura said, the idea is to make sure that a copy doesn't
exceed the bounds of the allocation (and that means a single page when
the page is neither __GFP_COMP, CMA, nor Reserved).

The trouble with this check, which I'd like to see fixed, is that there
are portions of the kernel that make separate adjacent page
allocations and then copy across individual allocations in a single
usercopy. It's not clear to me whether that is fixable just by adding
__GFP_COMP, though.

-Kees

--
Kees Cook
Pixel Security