2020-04-07 13:55:19

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v1 0/2] mm/memory_hotplug: remove is_mem_section_removable()

This is the follow-up of "[PATCH v1] drivers/base/memory.c: indicate all
memory blocks as removable" [1], which gets rid of
is_mem_section_removable().

More details can be found in [1] and [2]

[1] https://lkml.kernel.org/r/[email protected]
[2] https://lkml.kernel.org/r/[email protected]

David Hildenbrand (2):
powerpc/pseries/hotplug-memory: stop checking
is_mem_section_removable()
mm/memory_hotplug: remove is_mem_section_removable()

.../platforms/pseries/hotplug-memory.c | 26 +------
include/linux/memory_hotplug.h | 7 --
mm/memory_hotplug.c | 75 -------------------
3 files changed, 3 insertions(+), 105 deletions(-)

--
2.25.1


2020-04-07 13:55:27

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()

Fortunately, all users of is_mem_section_removable() are gone. Get rid of
it, including some now unnecessary functions.

Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
include/linux/memory_hotplug.h | 7 ----
mm/memory_hotplug.c | 75 ----------------------------------
2 files changed, 82 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 93d9ada74ddd..7dca9cd6076b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}

#ifdef CONFIG_MEMORY_HOTREMOVE

-extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
extern void try_offline_node(int nid);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern int remove_memory(int nid, u64 start, u64 size);
extern void __remove_memory(int nid, u64 start, u64 size);

#else
-static inline bool is_mem_section_removable(unsigned long pfn,
- unsigned long nr_pages)
-{
- return false;
-}
-
static inline void try_offline_node(int nid) {}

static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 47cf6036eb31..4d338d546d52 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
EXPORT_SYMBOL_GPL(add_memory);

#ifdef CONFIG_MEMORY_HOTREMOVE
-/*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
- */
-static inline int pageblock_free(struct page *page)
-{
- return PageBuddy(page) && page_order(page) >= pageblock_order;
-}
-
-/* Return the pfn of the start of the next active pageblock after a given pfn */
-static unsigned long next_active_pageblock(unsigned long pfn)
-{
- struct page *page = pfn_to_page(pfn);
-
- /* Ensure the starting page is pageblock-aligned */
- BUG_ON(pfn & (pageblock_nr_pages - 1));
-
- /* If the entire pageblock is free, move to the end of free page */
- if (pageblock_free(page)) {
- int order;
- /* be careful. we don't have locks, page_order can be changed.*/
- order = page_order(page);
- if ((order < MAX_ORDER) && (order >= pageblock_order))
- return pfn + (1 << order);
- }
-
- return pfn + pageblock_nr_pages;
-}
-
-static bool is_pageblock_removable_nolock(unsigned long pfn)
-{
- struct page *page = pfn_to_page(pfn);
- struct zone *zone;
-
- /*
- * We have to be careful here because we are iterating over memory
- * sections which are not zone aware so we might end up outside of
- * the zone but still within the section.
- * We have to take care about the node as well. If the node is offline
- * its NODE_DATA will be NULL - see page_zone.
- */
- if (!node_online(page_to_nid(page)))
- return false;
-
- zone = page_zone(page);
- pfn = page_to_pfn(page);
- if (!zone_spans_pfn(zone, pfn))
- return false;
-
- return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
- MEMORY_OFFLINE);
-}
-
-/* Checks if this range of memory is likely to be hot-removable. */
-bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
-{
- unsigned long end_pfn, pfn;
-
- end_pfn = min(start_pfn + nr_pages,
- zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
-
- /* Check the starting page of each pageblock within the range */
- for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
- if (!is_pageblock_removable_nolock(pfn))
- return false;
- cond_resched();
- }
-
- /* All pageblocks in the memory block are likely to be hot-removable */
- return true;
-}
-
/*
* Confirm all pages in a range [start, end) belong to the same zone (skipping
* memory holes). When true, return the zone.
--
2.25.1

2020-04-07 13:56:53

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
blocks as removable"), the user space interface to compute whether a memory
block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) has effectively been
deprecated. We want to remove the leftovers of the kernel implementation.

When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
we'll start by:
1. Testing if it contains any holes, and reject if so
2. Testing if pages belong to different zones, and reject if so
3. Isolating the page range, checking if it contains any unmovable pages

Using is_mem_section_removable() before trying to offline is not only racy,
it can easily result in false positives/negatives. Let's stop manually
checking is_mem_section_removable(), and let device_offline() handle it
completely instead. We can remove the racy is_mem_section_removable()
implementation next.

We now take more locks (e.g., memory hotplug lock when offlining and the
zone lock when isolating), but maybe we should optimize that
implementation instead if this ever becomes a real problem (after all,
memory unplug is already an expensive operation). We started using
is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
Implement memory hotplug remove in the kernel"), with the initial
hotremove support of lmbs.

Cc: Nathan Fontenot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
.../platforms/pseries/hotplug-memory.c | 26 +++----------------
1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b2cde1732301..5ace2f9a277e 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)

static bool lmb_is_removable(struct drmem_lmb *lmb)
{
- int i, scns_per_block;
- bool rc = true;
- unsigned long pfn, block_sz;
- u64 phys_addr;
-
if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
return false;

- block_sz = memory_block_size_bytes();
- scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
- phys_addr = lmb->base_addr;
-
#ifdef CONFIG_FA_DUMP
/*
* Don't hot-remove memory that falls in fadump boot memory area
* and memory that is reserved for capturing old kernel memory.
*/
- if (is_fadump_memory_area(phys_addr, block_sz))
+ if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
return false;
#endif
-
- for (i = 0; i < scns_per_block; i++) {
- pfn = PFN_DOWN(phys_addr);
- if (!pfn_in_present_section(pfn)) {
- phys_addr += MIN_MEMORY_BLOCK_SIZE;
- continue;
- }
-
- rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
- phys_addr += MIN_MEMORY_BLOCK_SIZE;
- }
-
- return rc;
+ /* device_offline() will determine if we can actually remove this lmb */
+ return true;
}

static int dlpar_add_lmb(struct drmem_lmb *);
--
2.25.1

2020-04-07 13:59:58

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On Tue 07-04-20 15:54:15, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
>
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
>
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
>
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

I am not familiar with this code but it makes sense to make it sync with
the global behavior.

> Cc: Nathan Fontenot <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
> 1 file changed, 3 insertions(+), 23 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>
> static bool lmb_is_removable(struct drmem_lmb *lmb)
> {
> - int i, scns_per_block;
> - bool rc = true;
> - unsigned long pfn, block_sz;
> - u64 phys_addr;
> -
> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> return false;
>
> - block_sz = memory_block_size_bytes();
> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> - phys_addr = lmb->base_addr;
> -
> #ifdef CONFIG_FA_DUMP
> /*
> * Don't hot-remove memory that falls in fadump boot memory area
> * and memory that is reserved for capturing old kernel memory.
> */
> - if (is_fadump_memory_area(phys_addr, block_sz))
> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
> return false;
> #endif
> -
> - for (i = 0; i < scns_per_block; i++) {
> - pfn = PFN_DOWN(phys_addr);
> - if (!pfn_in_present_section(pfn)) {
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - continue;
> - }
> -
> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - }
> -
> - return rc;
> + /* device_offline() will determine if we can actually remove this lmb */
> + return true;
> }
>
> static int dlpar_add_lmb(struct drmem_lmb *);
> --
> 2.25.1

--
Michal Hocko
SUSE Labs

2020-04-07 14:02:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()

On Tue 07-04-20 15:54:16, David Hildenbrand wrote:
> Fortunately, all users of is_mem_section_removable() are gone. Get rid of
> it, including some now unnecessary functions.
>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>

Acked-by: Michal Hocko <[email protected]>

> ---
> include/linux/memory_hotplug.h | 7 ----
> mm/memory_hotplug.c | 75 ----------------------------------

\o/

Thanks!

> 2 files changed, 82 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 93d9ada74ddd..7dca9cd6076b 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
>
> -extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
> extern void try_offline_node(int nid);
> extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
> extern int remove_memory(int nid, u64 start, u64 size);
> extern void __remove_memory(int nid, u64 start, u64 size);
>
> #else
> -static inline bool is_mem_section_removable(unsigned long pfn,
> - unsigned long nr_pages)
> -{
> - return false;
> -}
> -
> static inline void try_offline_node(int nid) {}
>
> static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 47cf6036eb31..4d338d546d52 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
> EXPORT_SYMBOL_GPL(add_memory);
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> -/*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> - */
> -static inline int pageblock_free(struct page *page)
> -{
> - return PageBuddy(page) && page_order(page) >= pageblock_order;
> -}
> -
> -/* Return the pfn of the start of the next active pageblock after a given pfn */
> -static unsigned long next_active_pageblock(unsigned long pfn)
> -{
> - struct page *page = pfn_to_page(pfn);
> -
> - /* Ensure the starting page is pageblock-aligned */
> - BUG_ON(pfn & (pageblock_nr_pages - 1));
> -
> - /* If the entire pageblock is free, move to the end of free page */
> - if (pageblock_free(page)) {
> - int order;
> - /* be careful. we don't have locks, page_order can be changed.*/
> - order = page_order(page);
> - if ((order < MAX_ORDER) && (order >= pageblock_order))
> - return pfn + (1 << order);
> - }
> -
> - return pfn + pageblock_nr_pages;
> -}
> -
> -static bool is_pageblock_removable_nolock(unsigned long pfn)
> -{
> - struct page *page = pfn_to_page(pfn);
> - struct zone *zone;
> -
> - /*
> - * We have to be careful here because we are iterating over memory
> - * sections which are not zone aware so we might end up outside of
> - * the zone but still within the section.
> - * We have to take care about the node as well. If the node is offline
> - * its NODE_DATA will be NULL - see page_zone.
> - */
> - if (!node_online(page_to_nid(page)))
> - return false;
> -
> - zone = page_zone(page);
> - pfn = page_to_pfn(page);
> - if (!zone_spans_pfn(zone, pfn))
> - return false;
> -
> - return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
> - MEMORY_OFFLINE);
> -}
> -
> -/* Checks if this range of memory is likely to be hot-removable. */
> -bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> -{
> - unsigned long end_pfn, pfn;
> -
> - end_pfn = min(start_pfn + nr_pages,
> - zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
> -
> - /* Check the starting page of each pageblock within the range */
> - for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
> - if (!is_pageblock_removable_nolock(pfn))
> - return false;
> - cond_resched();
> - }
> -
> - /* All pageblocks in the memory block are likely to be hot-removable */
> - return true;
> -}
> -
> /*
> * Confirm all pages in a range [start, end) belong to the same zone (skipping
> * memory holes). When true, return the zone.
> --
> 2.25.1
>

--
Michal Hocko
SUSE Labs

2020-04-07 21:31:03

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()

On Tue, Apr 07, 2020 at 03:54:16PM +0200, David Hildenbrand wrote:
>Fortunately, all users of is_mem_section_removable() are gone. Get rid of
>it, including some now unnecessary functions.
>
>Cc: Michael Ellerman <[email protected]>
>Cc: Benjamin Herrenschmidt <[email protected]>
>Cc: Michal Hocko <[email protected]>
>Cc: Andrew Morton <[email protected]>
>Cc: Oscar Salvador <[email protected]>
>Cc: Baoquan He <[email protected]>
>Cc: Wei Yang <[email protected]>
>Signed-off-by: David Hildenbrand <[email protected]>

Nice.

Reviewed-by: Wei Yang <[email protected]>

--
Wei Yang
Help you, Help me

2020-04-08 02:49:10

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

Add Pingfan to CC since he usually handles ppc related bugs for RHEL.

On 04/07/20 at 03:54pm, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.

Pingfan, can you have a look at this change on PPC? Please feel free to
give comments if any concern, or offer ack if it's OK to you.

>
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
>
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
>
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.
>
> Cc: Nathan Fontenot <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
> 1 file changed, 3 insertions(+), 23 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>
> static bool lmb_is_removable(struct drmem_lmb *lmb)
> {
> - int i, scns_per_block;
> - bool rc = true;
> - unsigned long pfn, block_sz;
> - u64 phys_addr;
> -
> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> return false;
>
> - block_sz = memory_block_size_bytes();
> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> - phys_addr = lmb->base_addr;
> -
> #ifdef CONFIG_FA_DUMP
> /*
> * Don't hot-remove memory that falls in fadump boot memory area
> * and memory that is reserved for capturing old kernel memory.
> */
> - if (is_fadump_memory_area(phys_addr, block_sz))
> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
> return false;
> #endif
> -
> - for (i = 0; i < scns_per_block; i++) {
> - pfn = PFN_DOWN(phys_addr);
> - if (!pfn_in_present_section(pfn)) {
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - continue;
> - }
> -
> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - }
> -
> - return rc;
> + /* device_offline() will determine if we can actually remove this lmb */
> + return true;
> }
>
> static int dlpar_add_lmb(struct drmem_lmb *);
> --
> 2.25.1
>

2020-04-08 02:49:39

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()

On 04/07/20 at 03:54pm, David Hildenbrand wrote:
> Fortunately, all users of is_mem_section_removable() are gone. Get rid of
> it, including some now unnecessary functions.
>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>

Assuming no issue to patch 1, this one looks good.

Reviewed-by: Baoquan He <[email protected]>

> ---
> include/linux/memory_hotplug.h | 7 ----
> mm/memory_hotplug.c | 75 ----------------------------------
> 2 files changed, 82 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 93d9ada74ddd..7dca9cd6076b 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
>
> -extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
> extern void try_offline_node(int nid);
> extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
> extern int remove_memory(int nid, u64 start, u64 size);
> extern void __remove_memory(int nid, u64 start, u64 size);
>
> #else
> -static inline bool is_mem_section_removable(unsigned long pfn,
> - unsigned long nr_pages)
> -{
> - return false;
> -}
> -
> static inline void try_offline_node(int nid) {}
>
> static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 47cf6036eb31..4d338d546d52 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
> EXPORT_SYMBOL_GPL(add_memory);
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> -/*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> - */
> -static inline int pageblock_free(struct page *page)
> -{
> - return PageBuddy(page) && page_order(page) >= pageblock_order;
> -}
> -
> -/* Return the pfn of the start of the next active pageblock after a given pfn */
> -static unsigned long next_active_pageblock(unsigned long pfn)
> -{
> - struct page *page = pfn_to_page(pfn);
> -
> - /* Ensure the starting page is pageblock-aligned */
> - BUG_ON(pfn & (pageblock_nr_pages - 1));
> -
> - /* If the entire pageblock is free, move to the end of free page */
> - if (pageblock_free(page)) {
> - int order;
> - /* be careful. we don't have locks, page_order can be changed.*/
> - order = page_order(page);
> - if ((order < MAX_ORDER) && (order >= pageblock_order))
> - return pfn + (1 << order);
> - }
> -
> - return pfn + pageblock_nr_pages;
> -}
> -
> -static bool is_pageblock_removable_nolock(unsigned long pfn)
> -{
> - struct page *page = pfn_to_page(pfn);
> - struct zone *zone;
> -
> - /*
> - * We have to be careful here because we are iterating over memory
> - * sections which are not zone aware so we might end up outside of
> - * the zone but still within the section.
> - * We have to take care about the node as well. If the node is offline
> - * its NODE_DATA will be NULL - see page_zone.
> - */
> - if (!node_online(page_to_nid(page)))
> - return false;
> -
> - zone = page_zone(page);
> - pfn = page_to_pfn(page);
> - if (!zone_spans_pfn(zone, pfn))
> - return false;
> -
> - return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
> - MEMORY_OFFLINE);
> -}
> -
> -/* Checks if this range of memory is likely to be hot-removable. */
> -bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> -{
> - unsigned long end_pfn, pfn;
> -
> - end_pfn = min(start_pfn + nr_pages,
> - zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
> -
> - /* Check the starting page of each pageblock within the range */
> - for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
> - if (!is_pageblock_removable_nolock(pfn))
> - return false;
> - cond_resched();
> - }
> -
> - /* All pageblocks in the memory block are likely to be hot-removable */
> - return true;
> -}
> -
> /*
> * Confirm all pages in a range [start, end) belong to the same zone (skipping
> * memory holes). When true, return the zone.
> --
> 2.25.1
>

2020-04-09 03:02:16

by Pingfan Liu

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()



On 04/08/2020 10:46 AM, Baoquan He wrote:
> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>
> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>> blocks as removable"), the user space interface to compute whether a memory
>> block can be offlined (exposed via
>> /sys/devices/system/memory/memoryX/removable) has effectively been
>> deprecated. We want to remove the leftovers of the kernel implementation.
>
> Pingfan, can you have a look at this change on PPC? Please feel free to
> give comments if any concern, or offer ack if it's OK to you.
>
>>
>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>> we'll start by:
>> 1. Testing if it contains any holes, and reject if so
>> 2. Testing if pages belong to different zones, and reject if so
>> 3. Isolating the page range, checking if it contains any unmovable pages
>>
>> Using is_mem_section_removable() before trying to offline is not only racy,
>> it can easily result in false positives/negatives. Let's stop manually
>> checking is_mem_section_removable(), and let device_offline() handle it
>> completely instead. We can remove the racy is_mem_section_removable()
>> implementation next.
>>
>> We now take more locks (e.g., memory hotplug lock when offlining and the
>> zone lock when isolating), but maybe we should optimize that
>> implementation instead if this ever becomes a real problem (after all,
>> memory unplug is already an expensive operation). We started using
>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>> Implement memory hotplug remove in the kernel"), with the initial
>> hotremove support of lmbs.
>>
>> Cc: Nathan Fontenot <[email protected]>
>> Cc: Michael Ellerman <[email protected]>
>> Cc: Benjamin Herrenschmidt <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Oscar Salvador <[email protected]>
>> Cc: Baoquan He <[email protected]>
>> Cc: Wei Yang <[email protected]>
>> Signed-off-by: David Hildenbrand <[email protected]>
>> ---
>> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
>> 1 file changed, 3 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index b2cde1732301..5ace2f9a277e 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>
>> static bool lmb_is_removable(struct drmem_lmb *lmb)
>> {
>> - int i, scns_per_block;
>> - bool rc = true;
>> - unsigned long pfn, block_sz;
>> - u64 phys_addr;
>> -
>> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>> return false;
>>
>> - block_sz = memory_block_size_bytes();
>> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>> - phys_addr = lmb->base_addr;
>> -
>> #ifdef CONFIG_FA_DUMP
>> /*
>> * Don't hot-remove memory that falls in fadump boot memory area
>> * and memory that is reserved for capturing old kernel memory.
>> */
>> - if (is_fadump_memory_area(phys_addr, block_sz))
>> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>> return false;
>> #endif
>> -
>> - for (i = 0; i < scns_per_block; i++) {
>> - pfn = PFN_DOWN(phys_addr);
>> - if (!pfn_in_present_section(pfn)) {
>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> - continue;
>> - }
>> -
>> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> - }
>> -
>> - return rc;
>> + /* device_offline() will determine if we can actually remove this lmb */
>> + return true;
So I think here swaps the check and do sequence. At least it breaks
dlpar_memory_remove_by_count(). It is doable to remove
is_mem_section_removable(), but here should be more effort to re-arrange
the code.

Thanks,
Pingfan
>> }
>>
>> static int dlpar_add_lmb(struct drmem_lmb *);
>> --
>> 2.25.1
>>

2020-04-09 07:28:39

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

David Hildenbrand <[email protected]> writes:

> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
>
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
>
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
>
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

It's also not very pretty in dmesg.

Before:

pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
dlpar: Could not handle DLPAR request "memory add count 10"

After:

pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
page:c00c000001ca8200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000072a080180
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 c00c000001cffd48 c000000781c51410 c000000793327580
raw: c00000072a080180 0000000001550001 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001cc4a80 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000079b110080
flags: 0x7ffff000000000()
raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: c00000079b110080 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001d08200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000074208ff00
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 c000000781c5f150 c00c000001d37f88 c000000798a24880
raw: c00000074208ff00 0000000001550002 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001d40140 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000750059c00
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 c00c000001dfcfc8 c00c000001d3c288 c0000007851c2d00
raw: c000000750059c00 0000000001000003 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001d9c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
flags: 0x7ffff000000000()
raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001dc0200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
flags: 0x7ffff000000000()
raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001e00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801f500
raw: 0000000000000000 0000000008000800 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001e40440 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801e380
raw: 0000000000000000 0000000000400040 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000001e80000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a0000640
flags: 0x7ffff000000200(slab)
raw: 007ffff000000200 c00c000001e5af48 c00c000001e80408 c000000f42d00a00
raw: c0000007a0000640 00000000066600a2 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000003c89d40 refcount:2 mapcount:1 mapping:0000000018c4a547 index:0x10a41
anon flags: 0x17ffff000080024(uptodate|active|swapbacked)
raw: 017ffff000080024 5deadbeef0000100 5deadbeef0000122 c0000007986b03c9
raw: 0000000000010a41 0000000000000000 0000000200000000 c00000000340b000
page dumped because: unmovable page
page->mem_cgroup:c00000000340b000
page:c00c000003cc0000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f3000fd38
flags: 0x17ffff000000200(slab)
raw: 017ffff000000200 c000000f3c911890 c000000f3c911890 c00000079fffd980
raw: c000000f3000fd38 0000000000700003 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000003d00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a2ec0000
flags: 0x17ffff000000000()
raw: 017ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: c0000007a2ec0000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000003e2c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f8b008400
flags: 0x27ffff000000200(slab)
raw: 027ffff000000200 c000000f8e000190 c000000f8e000190 c0000007a801e380
raw: c000000f8b008400 0000000000400038 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:c00c000003fec000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
flags: 0x37ffff000000000()
raw: 037ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
pseries-hotplug-mem: Memory hot-remove failed, adding LMB's back
dlpar: Could not handle DLPAR request "memory remove count 10"



It looks like set_migratetype_isolate() and start_isolate_page_range()
can be told not to report those warnings, but we're just calling
device_offline() which doesn't let us specify that.

cheers

2020-04-09 07:28:42

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On 09.04.20 04:59, piliu wrote:
>
>
> On 04/08/2020 10:46 AM, Baoquan He wrote:
>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>
>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>
>> Pingfan, can you have a look at this change on PPC? Please feel free to
>> give comments if any concern, or offer ack if it's OK to you.
>>
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>>
>>> Cc: Nathan Fontenot <[email protected]>
>>> Cc: Michael Ellerman <[email protected]>
>>> Cc: Benjamin Herrenschmidt <[email protected]>
>>> Cc: Paul Mackerras <[email protected]>
>>> Cc: Michal Hocko <[email protected]>
>>> Cc: Andrew Morton <[email protected]>
>>> Cc: Oscar Salvador <[email protected]>
>>> Cc: Baoquan He <[email protected]>
>>> Cc: Wei Yang <[email protected]>
>>> Signed-off-by: David Hildenbrand <[email protected]>
>>> ---
>>> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
>>> 1 file changed, 3 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index b2cde1732301..5ace2f9a277e 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>
>>> static bool lmb_is_removable(struct drmem_lmb *lmb)
>>> {
>>> - int i, scns_per_block;
>>> - bool rc = true;
>>> - unsigned long pfn, block_sz;
>>> - u64 phys_addr;
>>> -
>>> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>> return false;
>>>
>>> - block_sz = memory_block_size_bytes();
>>> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>> - phys_addr = lmb->base_addr;
>>> -
>>> #ifdef CONFIG_FA_DUMP
>>> /*
>>> * Don't hot-remove memory that falls in fadump boot memory area
>>> * and memory that is reserved for capturing old kernel memory.
>>> */
>>> - if (is_fadump_memory_area(phys_addr, block_sz))
>>> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>> return false;
>>> #endif
>>> -
>>> - for (i = 0; i < scns_per_block; i++) {
>>> - pfn = PFN_DOWN(phys_addr);
>>> - if (!pfn_in_present_section(pfn)) {
>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> - continue;
>>> - }
>>> -
>>> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> - }
>>> -
>>> - return rc;
>>> + /* device_offline() will determine if we can actually remove this lmb */
>>> + return true;
> So I think here swaps the check and do sequence. At least it breaks
> dlpar_memory_remove_by_count(). It is doable to remove
> is_mem_section_removable(), but here should be more effort to re-arrange
> the code.
>

Thanks Pingfan,

1. "swaps the check and do sequence":

Partially. Any caller of dlpar_remove_lmb() already has to deal with
false positives. device_offline() can easily fail after
dlpar_remove_lmb() == true. It's inherently racy.

2. "breaks dlpar_memory_remove_by_count()"

Can you elaborate why it "breaks" it? It will simply try to
offline+remove lmbs, detect that it wasn't able to offline+remove as
much as it wanted (which could happen before as well easily), and re-add
the already offlined+removed ones.

3. "more effort to re-arrange the code"

What would be your suggestion?

We would rip out that racy check if we can remove as much memory as
requested in dlpar_memory_remove_by_count() and simply always try to
remove + recover.


--
Thanks,

David / dhildenb

2020-04-09 07:33:50

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

> It's also not very pretty in dmesg.
>
> Before:
>
> pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> dlpar: Could not handle DLPAR request "memory add count 10"
>

Thanks for running it through the mill.

Here you test "hotadd", below you test "hot-remove". But yeah, there is
a change in the amount of dmesg.

> After:
>
> pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
> page:c00c000001ca8200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000072a080180
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 c00c000001cffd48 c000000781c51410 c000000793327580
> raw: c00000072a080180 0000000001550001 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001cc4a80 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000079b110080
> flags: 0x7ffff000000000()
> raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
> raw: c00000079b110080 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001d08200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000074208ff00
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 c000000781c5f150 c00c000001d37f88 c000000798a24880
> raw: c00000074208ff00 0000000001550002 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001d40140 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000750059c00
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 c00c000001dfcfc8 c00c000001d3c288 c0000007851c2d00
> raw: c000000750059c00 0000000001000003 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001d9c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
> flags: 0x7ffff000000000()
> raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
> raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001dc0200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
> flags: 0x7ffff000000000()
> raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
> raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001e00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801f500
> raw: 0000000000000000 0000000008000800 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001e40440 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801e380
> raw: 0000000000000000 0000000000400040 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000001e80000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a0000640
> flags: 0x7ffff000000200(slab)
> raw: 007ffff000000200 c00c000001e5af48 c00c000001e80408 c000000f42d00a00
> raw: c0000007a0000640 00000000066600a2 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000003c89d40 refcount:2 mapcount:1 mapping:0000000018c4a547 index:0x10a41
> anon flags: 0x17ffff000080024(uptodate|active|swapbacked)
> raw: 017ffff000080024 5deadbeef0000100 5deadbeef0000122 c0000007986b03c9
> raw: 0000000000010a41 0000000000000000 0000000200000000 c00000000340b000
> page dumped because: unmovable page
> page->mem_cgroup:c00000000340b000
> page:c00c000003cc0000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f3000fd38
> flags: 0x17ffff000000200(slab)
> raw: 017ffff000000200 c000000f3c911890 c000000f3c911890 c00000079fffd980
> raw: c000000f3000fd38 0000000000700003 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000003d00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a2ec0000
> flags: 0x17ffff000000000()
> raw: 017ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
> raw: c0000007a2ec0000 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000003e2c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f8b008400
> flags: 0x27ffff000000200(slab)
> raw: 027ffff000000200 c000000f8e000190 c000000f8e000190 c0000007a801e380
> raw: c000000f8b008400 0000000000400038 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> page:c00c000003fec000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
> flags: 0x37ffff000000000()
> raw: 037ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
> raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: unmovable page
> pseries-hotplug-mem: Memory hot-remove failed, adding LMB's back
> dlpar: Could not handle DLPAR request "memory remove count 10"
>
>
>
> It looks like set_migratetype_isolate() and start_isolate_page_range()
> can be told not to report those warnings, but we're just calling
> device_offline() which doesn't let us specify that.

Yeah, but these messages can easily pop up (to a more limited degree)
with the current code as well, as checking for movable pages without
isolating the page range gives you no guarantees that no unmovable data
will end up on the lmb until you offline it. It's simply racy.

I discussed this output with Michal when we changed the
/sys/devices/system/memory/memoryX/removable behavior, and he had the
opinion that dmesg (debug) output should not really be an issue.

We could make this output

a) configurable at runtime and let powerpc code disable it while calling
device_offline(). So user space attempts will still trigger the messages

b) configurable at compile time


--
Thanks,

David / dhildenb

2020-04-09 08:01:45

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> David Hildenbrand <[email protected]> writes:
>
> > In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> > blocks as removable"), the user space interface to compute whether a memory
> > block can be offlined (exposed via
> > /sys/devices/system/memory/memoryX/removable) has effectively been
> > deprecated. We want to remove the leftovers of the kernel implementation.
> >
> > When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> > we'll start by:
> > 1. Testing if it contains any holes, and reject if so
> > 2. Testing if pages belong to different zones, and reject if so
> > 3. Isolating the page range, checking if it contains any unmovable pages
> >
> > Using is_mem_section_removable() before trying to offline is not only racy,
> > it can easily result in false positives/negatives. Let's stop manually
> > checking is_mem_section_removable(), and let device_offline() handle it
> > completely instead. We can remove the racy is_mem_section_removable()
> > implementation next.
> >
> > We now take more locks (e.g., memory hotplug lock when offlining and the
> > zone lock when isolating), but maybe we should optimize that
> > implementation instead if this ever becomes a real problem (after all,
> > memory unplug is already an expensive operation). We started using
> > is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> > Implement memory hotplug remove in the kernel"), with the initial
> > hotremove support of lmbs.
>
> It's also not very pretty in dmesg.
>
> Before:
>
> pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> dlpar: Could not handle DLPAR request "memory add count 10"

Yeah, there is more output but isn't that useful? Or put it differently
what is the actual problem from having those messages in the kernel log?

From the below you can clearly tell that there are kernel allocations
which prevent hot remove from happening.

If the overall size of the debugging output is a concern then we can
think of a way to reduce it. E.g. once you have a couple of pages
reported then all others from the same block are likely not interesting
much.
--
Michal Hocko
SUSE Labs

2020-04-09 08:13:28

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On 09.04.20 09:59, Michal Hocko wrote:
> On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
>> David Hildenbrand <[email protected]> writes:
>>
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>
>> It's also not very pretty in dmesg.
>>
>> Before:
>>
>> pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>> pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>> dlpar: Could not handle DLPAR request "memory add count 10"
>
> Yeah, there is more output but isn't that useful? Or put it differently
> what is the actual problem from having those messages in the kernel log?
>
> From the below you can clearly tell that there are kernel allocations
> which prevent hot remove from happening.
>
> If the overall size of the debugging output is a concern then we can
> think of a way to reduce it. E.g. once you have a couple of pages
> reported then all others from the same block are likely not interesting
> much.
>

IIRC, we only report one page per block already. (and stop, as we
detected something unmovable)

--
Thanks,

David / dhildenb

2020-04-09 08:50:37

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On Thu 09-04-20 10:12:20, David Hildenbrand wrote:
> On 09.04.20 09:59, Michal Hocko wrote:
> > On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> >> David Hildenbrand <[email protected]> writes:
> >>
> >>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> >>> blocks as removable"), the user space interface to compute whether a memory
> >>> block can be offlined (exposed via
> >>> /sys/devices/system/memory/memoryX/removable) has effectively been
> >>> deprecated. We want to remove the leftovers of the kernel implementation.
> >>>
> >>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> >>> we'll start by:
> >>> 1. Testing if it contains any holes, and reject if so
> >>> 2. Testing if pages belong to different zones, and reject if so
> >>> 3. Isolating the page range, checking if it contains any unmovable pages
> >>>
> >>> Using is_mem_section_removable() before trying to offline is not only racy,
> >>> it can easily result in false positives/negatives. Let's stop manually
> >>> checking is_mem_section_removable(), and let device_offline() handle it
> >>> completely instead. We can remove the racy is_mem_section_removable()
> >>> implementation next.
> >>>
> >>> We now take more locks (e.g., memory hotplug lock when offlining and the
> >>> zone lock when isolating), but maybe we should optimize that
> >>> implementation instead if this ever becomes a real problem (after all,
> >>> memory unplug is already an expensive operation). We started using
> >>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> >>> Implement memory hotplug remove in the kernel"), with the initial
> >>> hotremove support of lmbs.
> >>
> >> It's also not very pretty in dmesg.
> >>
> >> Before:
> >>
> >> pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> >> pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> >> dlpar: Could not handle DLPAR request "memory add count 10"
> >
> > Yeah, there is more output but isn't that useful? Or put it differently
> > what is the actual problem from having those messages in the kernel log?
> >
> > From the below you can clearly tell that there are kernel allocations
> > which prevent hot remove from happening.
> >
> > If the overall size of the debugging output is a concern then we can
> > think of a way to reduce it. E.g. once you have a couple of pages
> > reported then all others from the same block are likely not interesting
> > much.
> >
>
> IIRC, we only report one page per block already. (and stop, as we
> detected something unmovable)

You are right.
--
Michal Hocko
SUSE Labs

2020-04-09 08:58:12

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

On 09.04.20 09:26, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>>> blocks as removable"), the user space interface to compute whether a memory
>>>> block can be offlined (exposed via
>>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC? Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>
>>>>
>>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>>> we'll start by:
>>>> 1. Testing if it contains any holes, and reject if so
>>>> 2. Testing if pages belong to different zones, and reject if so
>>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>>
>>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>>> it can easily result in false positives/negatives. Let's stop manually
>>>> checking is_mem_section_removable(), and let device_offline() handle it
>>>> completely instead. We can remove the racy is_mem_section_removable()
>>>> implementation next.
>>>>
>>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>>> zone lock when isolating), but maybe we should optimize that
>>>> implementation instead if this ever becomes a real problem (after all,
>>>> memory unplug is already an expensive operation). We started using
>>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>>> Implement memory hotplug remove in the kernel"), with the initial
>>>> hotremove support of lmbs.
>>>>
>>>> Cc: Nathan Fontenot <[email protected]>
>>>> Cc: Michael Ellerman <[email protected]>
>>>> Cc: Benjamin Herrenschmidt <[email protected]>
>>>> Cc: Paul Mackerras <[email protected]>
>>>> Cc: Michal Hocko <[email protected]>
>>>> Cc: Andrew Morton <[email protected]>
>>>> Cc: Oscar Salvador <[email protected]>
>>>> Cc: Baoquan He <[email protected]>
>>>> Cc: Wei Yang <[email protected]>
>>>> Signed-off-by: David Hildenbrand <[email protected]>
>>>> ---
>>>> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
>>>> 1 file changed, 3 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> index b2cde1732301..5ace2f9a277e 100644
>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>>
>>>> static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>> {
>>>> - int i, scns_per_block;
>>>> - bool rc = true;
>>>> - unsigned long pfn, block_sz;
>>>> - u64 phys_addr;
>>>> -
>>>> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>> return false;
>>>>
>>>> - block_sz = memory_block_size_bytes();
>>>> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>>> - phys_addr = lmb->base_addr;
>>>> -
>>>> #ifdef CONFIG_FA_DUMP
>>>> /*
>>>> * Don't hot-remove memory that falls in fadump boot memory area
>>>> * and memory that is reserved for capturing old kernel memory.
>>>> */
>>>> - if (is_fadump_memory_area(phys_addr, block_sz))
>>>> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>> return false;
>>>> #endif
>>>> -
>>>> - for (i = 0; i < scns_per_block; i++) {
>>>> - pfn = PFN_DOWN(phys_addr);
>>>> - if (!pfn_in_present_section(pfn)) {
>>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> - continue;
>>>> - }
>>>> -
>>>> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> - }
>>>> -
>>>> - return rc;
>>>> + /* device_offline() will determine if we can actually remove this lmb */
>>>> + return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
>
> Thanks Pingfan,
>
> 1. "swaps the check and do sequence":
>
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.

Sorry, s/dlpar_remove_lmb/lmb_is_removable/


--
Thanks,

David / dhildenb

2020-04-09 16:27:43

by Pingfan Liu

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()



On 04/09/2020 03:26 PM, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>>> blocks as removable"), the user space interface to compute whether a memory
>>>> block can be offlined (exposed via
>>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC? Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>
>>>>
>>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>>> we'll start by:
>>>> 1. Testing if it contains any holes, and reject if so
>>>> 2. Testing if pages belong to different zones, and reject if so
>>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>>
>>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>>> it can easily result in false positives/negatives. Let's stop manually
>>>> checking is_mem_section_removable(), and let device_offline() handle it
>>>> completely instead. We can remove the racy is_mem_section_removable()
>>>> implementation next.
>>>>
>>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>>> zone lock when isolating), but maybe we should optimize that
>>>> implementation instead if this ever becomes a real problem (after all,
>>>> memory unplug is already an expensive operation). We started using
>>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>>> Implement memory hotplug remove in the kernel"), with the initial
>>>> hotremove support of lmbs.
>>>>
>>>> Cc: Nathan Fontenot <[email protected]>
>>>> Cc: Michael Ellerman <[email protected]>
>>>> Cc: Benjamin Herrenschmidt <[email protected]>
>>>> Cc: Paul Mackerras <[email protected]>
>>>> Cc: Michal Hocko <[email protected]>
>>>> Cc: Andrew Morton <[email protected]>
>>>> Cc: Oscar Salvador <[email protected]>
>>>> Cc: Baoquan He <[email protected]>
>>>> Cc: Wei Yang <[email protected]>
>>>> Signed-off-by: David Hildenbrand <[email protected]>
>>>> ---
>>>> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
>>>> 1 file changed, 3 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> index b2cde1732301..5ace2f9a277e 100644
>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>>
>>>> static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>> {
>>>> - int i, scns_per_block;
>>>> - bool rc = true;
>>>> - unsigned long pfn, block_sz;
>>>> - u64 phys_addr;
>>>> -
>>>> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>> return false;
>>>>
>>>> - block_sz = memory_block_size_bytes();
>>>> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>>> - phys_addr = lmb->base_addr;
>>>> -
>>>> #ifdef CONFIG_FA_DUMP
>>>> /*
>>>> * Don't hot-remove memory that falls in fadump boot memory area
>>>> * and memory that is reserved for capturing old kernel memory.
>>>> */
>>>> - if (is_fadump_memory_area(phys_addr, block_sz))
>>>> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>> return false;
>>>> #endif
>>>> -
>>>> - for (i = 0; i < scns_per_block; i++) {
>>>> - pfn = PFN_DOWN(phys_addr);
>>>> - if (!pfn_in_present_section(pfn)) {
>>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> - continue;
>>>> - }
>>>> -
>>>> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>>> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> - }
>>>> -
>>>> - return rc;
>>>> + /* device_offline() will determine if we can actually remove this lmb */
>>>> + return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
>
> Thanks Pingfan,
>
> 1. "swaps the check and do sequence":
>
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.
>
> 2. "breaks dlpar_memory_remove_by_count()"
>
> Can you elaborate why it "breaks" it? It will simply try to
> offline+remove lmbs, detect that it wasn't able to offline+remove as
> much as it wanted (which could happen before as well easily), and re-add
> the already offlined+removed ones.
>
I overlooked the re-add logic. Then I think
dlpar_memory_remove_by_count() is OK with this patch.
> 3. "more effort to re-arrange the code"
>
> What would be your suggestion?
>
I had thought about merging the two loop "for_each_drmem_lmb()", and do
check inside the loop. But now it is needless.

The only concerned left is "if (lmbs_available < lmbs_to_remove)" fails
to alarm due to the weaken checking in lmb_is_removable(). Then after
heavy migration in offline_pages, we encounters this limit, and need to
re-add them back.

But I think it is a rare case plus hot-remove is also not a quite
frequent event. So it is worth to simplify the code by this patch.

Thanks for your classification.

For [1/2]
Reviewed-by: Pingfan Liu <[email protected]>