Hi,
The previous version of this series was posted here [1]. It has since seen
some more serious testing (thanks to Reza Arbab) and fixes for the issues
that were found. I have also decided to drop patch 1 [2] because it turned
out to be more complicated than I initially thought [3]. A few more patches
were added to deal with expectations on zone/node initialization.
I have rebased on top of the current mmotm-2017-04-07-15-53. It
conflicts with HMM because that series touches memory hotplug as
well. Jérôme and I have discussed this [4] and he agreed to
rebase on top of this rework [5], so I have reverted his series
before applying mine. I will help him resolve the resulting
conflicts. You can find the whole series, including the HMM reverts, in
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git branch
attempts/rewrite-mem_hotplug
Motivation:
Movable onlining is a real hack with many downsides - mainly the
reintroduction of the lowmem/highmem issues we used to have on 32b systems -
but it is the only way to make memory hotremove more reliable, which
is something people are asking for.
The current semantic of movable memory onlining is really cumbersome,
however. The main reason is that the udev-driven approach is basically
unusable because udev races with the memory probing while only the last
memory block, or the one adjacent to the existing zone_movable, is
allowed to be onlined movable. In short, the criterion for a successful
online_movable changes under udev's feet. A reliable udev approach would
require two phases where the first successful movable online would have
to check all the previous blocks and online them in descending order.
This can hardly be considered sane.
This patchset aims at making the onlining semantic more usable. First of
all, it allows memory to be onlined movable as long as it doesn't clash
with the existing ZONE_NORMAL; in other words, ZONE_NORMAL and ZONE_MOVABLE
cannot overlap. Currently I preserve the original ordering semantic so
ZONE_NORMAL always precedes ZONE_MOVABLE, but I plan to remove this
restriction in the future because it is not really necessary.
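To make the rule concrete, here is a tiny standalone C sketch of the check I
have in mind (illustrative only - may_online_movable/may_online_kernel are
made-up helpers, the pfn values are hypothetical, and this is not the code
from patch 6):

#include <stdbool.h>
#include <stdio.h>

/* a range may go movable only if it starts at or above the end of ZONE_NORMAL */
static bool may_online_movable(unsigned long normal_end_pfn, unsigned long start_pfn)
{
	return start_pfn >= normal_end_pfn;
}

/* a range may go normal only if it ends at or below the start of ZONE_MOVABLE */
static bool may_online_kernel(bool movable_empty, unsigned long movable_start_pfn,
			      unsigned long start_pfn, unsigned long nr_pages)
{
	return movable_empty || start_pfn + nr_pages <= movable_start_pfn;
}

int main(void)
{
	/* hypothetical layout: ZONE_NORMAL ends at pfn 0x148000, ZONE_MOVABLE is empty */
	unsigned long normal_end = 0x148000, block_start = 0x148000, block_pages = 0x8000;

	printf("movable allowed: %d\n", may_online_movable(normal_end, block_start));
	printf("kernel  allowed: %d\n", may_online_kernel(true, 0, block_start, block_pages));
	return 0;
}

As long as both checks are expressed against the current zone boundaries,
every newly probed block initially offers both options, which is what the
valid_zones output below demonstrates.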
The first 3 patches are cleanups which should be ready to be merged right
away (unless I have missed something subtle, of course).
Patch 4 deals with ZONE_DEVICE dependencies down the __add_pages path.
Patch 5 deals with implicit assumptions of register_one_node on pgdat
initialization.
Patch 6 is the core of the change. In order to make it easier to review
I have tried to keep it as minimalistic as possible and moved the large
code removal to patch 9.
Patch 7 is a trivial follow up cleanup. Patch 8 fixes sparse warnings
and finally patch 9 removes the unused code.
I have tested the patches in kvm:
# qemu-system-x86_64 -enable-kvm -monitor pty -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G ...
and then probed the additional memory by
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
Then I have used this simple script to probe memory blocks by hand:
# cat probe_memblock.sh
#!/bin/sh
BLOCK_NR=$1
echo $((0x100000000+$BLOCK_NR*(128<<20))) > /sys/devices/system/memory/probe
# for i in $(seq 10); do sh probe_memblock.sh $i; done
# grep . /sys/devices/system/memory/memory3?/valid_zones 2>/dev/null
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory35/valid_zones:Normal Movable
/sys/devices/system/memory/memory36/valid_zones:Normal Movable
/sys/devices/system/memory/memory37/valid_zones:Normal Movable
/sys/devices/system/memory/memory38/valid_zones:Normal Movable
/sys/devices/system/memory/memory39/valid_zones:Normal Movable
The main difference from the original implementation is that all new
memblocks can initially be onlined both online_kernel and online_movable
because there is obviously no clash. For comparison, the original
implementation would have:
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal
/sys/devices/system/memory/memory35/valid_zones:Normal
/sys/devices/system/memory/memory36/valid_zones:Normal
/sys/devices/system/memory/memory37/valid_zones:Normal
/sys/devices/system/memory/memory38/valid_zones:Normal
/sys/devices/system/memory/memory39/valid_zones:Normal Movable
Now
# echo online_movable > /sys/devices/system/memory/memory34/state
# grep . /sys/devices/system/memory/memory3?/valid_zones 2>/dev/null
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable
/sys/devices/system/memory/memory35/valid_zones:Movable
/sys/devices/system/memory/memory36/valid_zones:Movable
/sys/devices/system/memory/memory37/valid_zones:Movable
/sys/devices/system/memory/memory38/valid_zones:Movable
/sys/devices/system/memory/memory39/valid_zones:Movable
Block 33 can still be onlined both kernel and movable while all
the remaining blocks can only go movable.
/proc/zoneinfo now says:
Node 0, zone Normal
pages free 0
min 0
low 0
high 0
spanned 0
present 0
--
Node 0, zone Movable
pages free 32753
min 85
low 117
high 149
spanned 32768
present 32768
Probing a block at a lower address will result in a new memblock (32)
which will still allow both Normal and Movable.
# sh probe_memblock.sh 0
# grep . /sys/devices/system/memory/memory3[2-5]/valid_zones 2>/dev/null
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable
/sys/devices/system/memory/memory35/valid_zones:Movable
and online_kernel will properly convert it to ZONE_NORMAL
while block 33 can still be onlined both ways.
# echo online_kernel > /sys/devices/system/memory/memory32/state
# grep . /sys/devices/system/memory/memory3[2-5]/valid_zones 2>/dev/null
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable
/sys/devices/system/memory/memory35/valid_zones:Movable
/proc/zoneinfo now shows:
Node 0, zone Normal
pages free 65441
min 165
low 230
high 295
spanned 65536
present 65536
--
Node 0, zone Movable
pages free 32740
min 82
low 114
high 146
spanned 32768
present 32768
so both zones have one memblock spanned and present.
Onlining 39 should associate this block with the movable zone:
# echo online > /sys/devices/system/memory/memory39/state
/proc/zoneinfo now shows:
Node 0, zone Normal
pages free 32765
min 80
low 112
high 144
spanned 32768
present 32768
--
Node 0, zone Movable
pages free 65501
min 160
low 225
high 290
spanned 196608
present 65536
so we will have a movable zone which spans 6 memblocks, 2 of them present
and 4 representing a hole.
Offlining both movable blocks will leave the zone with no present
pages, which I believe is the expected behavior.
# echo offline > /sys/devices/system/memory/memory39/state
# echo offline > /sys/devices/system/memory/memory34/state
# grep -A6 "Movable\|Normal" /proc/zoneinfo
Node 0, zone Normal
pages free 32735
min 90
low 122
high 154
spanned 32768
present 32768
--
Node 0, zone Movable
pages free 0
min 0
low 0
high 0
spanned 196608
present 0
Any thoughts, complaints, suggestions?
As a bonus we will get a nice cleanup in the memory hotplug codebase:
arch/ia64/mm/init.c | 11 +-
arch/powerpc/mm/mem.c | 12 +-
arch/s390/mm/init.c | 32 +--
arch/sh/mm/init.c | 10 +-
arch/x86/mm/init_32.c | 7 +-
arch/x86/mm/init_64.c | 11 +-
drivers/base/memory.c | 74 ++++---
drivers/base/node.c | 58 ++----
include/linux/memory_hotplug.h | 19 +-
include/linux/mmzone.h | 16 +-
include/linux/node.h | 35 +++-
kernel/memremap.c | 6 +-
mm/memory_hotplug.c | 451 ++++++++++++++---------------------------
mm/page_alloc.c | 8 +-
mm/sparse.c | 3 +-
15 files changed, 284 insertions(+), 469 deletions(-)
Shortlog says:
Michal Hocko (9):
mm: remove return value from init_currently_empty_zone
mm, memory_hotplug: use node instead of zone in can_online_high_movable
mm: drop page_initialized check from get_nid_for_pfn
mm, memory_hotplug: get rid of is_zone_device_section
mm, memory_hotplug: split up register_one_node
mm, memory_hotplug: do not associate hotadded memory to zones until online
mm, memory_hotplug: replace for_device by want_memblock in arch_add_memory
mm, memory_hotplug: fix the section mismatch warning
mm, memory_hotplug: remove unused cruft after memory hotplug rework
[1] http://lkml.kernel.org/r/[email protected]
[2] http://lkml.kernel.org/r/[email protected]
[3] http://lkml.kernel.org/r/[email protected]
[4] http://lkml.kernel.org/r/[email protected]
[5] http://lkml.kernel.org/r/[email protected]
From: Michal Hocko <[email protected]>
The primary purpose of this helper is to query the node state, so use
the node id directly. This is a preparatory patch for later changes.
This shouldn't introduce any functional change.
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memory_hotplug.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9ed251811ec3..342332f29364 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -940,15 +940,15 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
* When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have
* normal memory.
*/
-static bool can_online_high_movable(struct zone *zone)
+static bool can_online_high_movable(int nid)
{
return true;
}
#else /* CONFIG_MOVABLE_NODE */
/* ensure every online node has NORMAL memory */
-static bool can_online_high_movable(struct zone *zone)
+static bool can_online_high_movable(int nid)
{
- return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
+ return node_state(nid, N_NORMAL_MEMORY);
}
#endif /* CONFIG_MOVABLE_NODE */
@@ -1082,7 +1082,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
if ((zone_idx(zone) > ZONE_NORMAL ||
online_type == MMOP_ONLINE_MOVABLE) &&
- !can_online_high_movable(zone))
+ !can_online_high_movable(pfn_to_nid(pfn)))
return -EINVAL;
if (online_type == MMOP_ONLINE_KERNEL) {
--
2.11.0
From: Michal Hocko <[email protected]>
c04fc586c1a4 ("mm: show node to memory section relationship with
symlinks in sysfs") has added the means to export the memblock<->node
association into sysfs. It has also introduced get_nid_for_pfn, a rather
confusing counterpart of pfn_to_nid which also checks whether the pfn's
page is already initialized (page_initialized). This is done by checking
page::lru != NULL, which doesn't make any sense at all. Nothing in this
path really relies on the lru list being used or initialized. Just remove
the check because it will become a problem with later patches.
Thanks to Reza Arbab for testing which revealed this to be a problem
(http://lkml.kernel.org/r/[email protected])
Signed-off-by: Michal Hocko <[email protected]>
---
drivers/base/node.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5548f9686016..06294d69779b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -368,21 +368,14 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
-#define page_initialized(page) (page->lru.next)
-
static int __ref get_nid_for_pfn(unsigned long pfn)
{
- struct page *page;
-
if (!pfn_valid_within(pfn))
return -1;
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
if (system_state == SYSTEM_BOOTING)
return early_pfn_to_nid(pfn);
#endif
- page = pfn_to_page(pfn);
- if (!page_initialized(page))
- return -1;
return pfn_to_nid(pfn);
}
--
2.11.0
From: Michal Hocko <[email protected]>
zone_for_memory doesn't have any users anymore, and neither does the whole
zone shifting infrastructure, so drop them all.
This shouldn't introduce any functional changes.
Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/memory_hotplug.h | 2 -
mm/memory_hotplug.c | 207 -----------------------------------------
2 files changed, 209 deletions(-)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c28d0aba7525..a9985f6c460a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -274,8 +274,6 @@ extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
void *arg, int (*func)(struct memory_block *, void *));
extern int add_memory(int nid, u64 start, u64 size);
extern int add_memory_resource(int nid, struct resource *resource, bool online);
-extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
- bool for_device);
extern int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock);
extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
unsigned long nr_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index be8be844d340..94e96ca790f6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -299,180 +299,6 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat)
}
#endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */
-static void __meminit grow_zone_span(struct zone *zone, unsigned long start_pfn,
- unsigned long end_pfn)
-{
- unsigned long old_zone_end_pfn;
-
- zone_span_writelock(zone);
-
- old_zone_end_pfn = zone_end_pfn(zone);
- if (zone_is_empty(zone) || start_pfn < zone->zone_start_pfn)
- zone->zone_start_pfn = start_pfn;
-
- zone->spanned_pages = max(old_zone_end_pfn, end_pfn) -
- zone->zone_start_pfn;
-
- zone_span_writeunlock(zone);
-}
-
-static void resize_zone(struct zone *zone, unsigned long start_pfn,
- unsigned long end_pfn)
-{
- zone_span_writelock(zone);
-
- if (end_pfn - start_pfn) {
- zone->zone_start_pfn = start_pfn;
- zone->spanned_pages = end_pfn - start_pfn;
- } else {
- /*
- * make it consist as free_area_init_core(),
- * if spanned_pages = 0, then keep start_pfn = 0
- */
- zone->zone_start_pfn = 0;
- zone->spanned_pages = 0;
- }
-
- zone_span_writeunlock(zone);
-}
-
-static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
- unsigned long end_pfn)
-{
- enum zone_type zid = zone_idx(zone);
- int nid = zone->zone_pgdat->node_id;
- unsigned long pfn;
-
- for (pfn = start_pfn; pfn < end_pfn; pfn++)
- set_page_links(pfn_to_page(pfn), zid, nid, pfn);
-}
-
-static void __ref ensure_zone_is_initialized(struct zone *zone,
- unsigned long start_pfn, unsigned long num_pages)
-{
- if (!zone_is_initialized(zone))
- init_currently_empty_zone(zone, start_pfn, num_pages);
-}
-
-static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
- unsigned long start_pfn, unsigned long end_pfn)
-{
- unsigned long flags;
- unsigned long z1_start_pfn;
-
- ensure_zone_is_initialized(z1, start_pfn, end_pfn - start_pfn);
-
- pgdat_resize_lock(z1->zone_pgdat, &flags);
-
- /* can't move pfns which are higher than @z2 */
- if (end_pfn > zone_end_pfn(z2))
- goto out_fail;
- /* the move out part must be at the left most of @z2 */
- if (start_pfn > z2->zone_start_pfn)
- goto out_fail;
- /* must included/overlap */
- if (end_pfn <= z2->zone_start_pfn)
- goto out_fail;
-
- /* use start_pfn for z1's start_pfn if z1 is empty */
- if (!zone_is_empty(z1))
- z1_start_pfn = z1->zone_start_pfn;
- else
- z1_start_pfn = start_pfn;
-
- resize_zone(z1, z1_start_pfn, end_pfn);
- resize_zone(z2, end_pfn, zone_end_pfn(z2));
-
- pgdat_resize_unlock(z1->zone_pgdat, &flags);
-
- fix_zone_id(z1, start_pfn, end_pfn);
-
- return 0;
-out_fail:
- pgdat_resize_unlock(z1->zone_pgdat, &flags);
- return -1;
-}
-
-static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
- unsigned long start_pfn, unsigned long end_pfn)
-{
- unsigned long flags;
- unsigned long z2_end_pfn;
-
- ensure_zone_is_initialized(z2, start_pfn, end_pfn - start_pfn);
-
- pgdat_resize_lock(z1->zone_pgdat, &flags);
-
- /* can't move pfns which are lower than @z1 */
- if (z1->zone_start_pfn > start_pfn)
- goto out_fail;
- /* the move out part mast at the right most of @z1 */
- if (zone_end_pfn(z1) > end_pfn)
- goto out_fail;
- /* must included/overlap */
- if (start_pfn >= zone_end_pfn(z1))
- goto out_fail;
-
- /* use end_pfn for z2's end_pfn if z2 is empty */
- if (!zone_is_empty(z2))
- z2_end_pfn = zone_end_pfn(z2);
- else
- z2_end_pfn = end_pfn;
-
- resize_zone(z1, z1->zone_start_pfn, start_pfn);
- resize_zone(z2, start_pfn, z2_end_pfn);
-
- pgdat_resize_unlock(z1->zone_pgdat, &flags);
-
- fix_zone_id(z2, start_pfn, end_pfn);
-
- return 0;
-out_fail:
- pgdat_resize_unlock(z1->zone_pgdat, &flags);
- return -1;
-}
-
-static void __meminit grow_pgdat_span(struct pglist_data *pgdat, unsigned long start_pfn,
- unsigned long end_pfn)
-{
- unsigned long old_pgdat_end_pfn = pgdat_end_pfn(pgdat);
-
- if (!pgdat->node_spanned_pages || start_pfn < pgdat->node_start_pfn)
- pgdat->node_start_pfn = start_pfn;
-
- pgdat->node_spanned_pages = max(old_pgdat_end_pfn, end_pfn) -
- pgdat->node_start_pfn;
-}
-
-static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
-{
- struct pglist_data *pgdat = zone->zone_pgdat;
- int nr_pages = PAGES_PER_SECTION;
- int nid = pgdat->node_id;
- int zone_type;
- unsigned long flags, pfn;
-
- zone_type = zone - pgdat->node_zones;
- ensure_zone_is_initialized(zone, phys_start_pfn, nr_pages);
-
- pgdat_resize_lock(zone->zone_pgdat, &flags);
- grow_zone_span(zone, phys_start_pfn, phys_start_pfn + nr_pages);
- grow_pgdat_span(zone->zone_pgdat, phys_start_pfn,
- phys_start_pfn + nr_pages);
- pgdat_resize_unlock(zone->zone_pgdat, &flags);
- memmap_init_zone(nr_pages, nid, zone_type,
- phys_start_pfn, MEMMAP_HOTPLUG);
-
- /* online_page_range is called later and expects pages reserved */
- for (pfn = phys_start_pfn; pfn < phys_start_pfn + nr_pages; pfn++) {
- if (!pfn_valid(pfn))
- continue;
-
- SetPageReserved(pfn_to_page(pfn));
- }
- return 0;
-}
-
static int __meminit __add_section(int nid, unsigned long phys_start_pfn, bool want_memblock)
{
int ret;
@@ -1349,39 +1175,6 @@ static int check_hotplug_memory_range(u64 start, u64 size)
return 0;
}
-/*
- * If movable zone has already been setup, newly added memory should be check.
- * If its address is higher than movable zone, it should be added as movable.
- * Without this check, movable zone may overlap with other zone.
- */
-static int should_add_memory_movable(int nid, u64 start, u64 size)
-{
- unsigned long start_pfn = start >> PAGE_SHIFT;
- pg_data_t *pgdat = NODE_DATA(nid);
- struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
-
- if (zone_is_empty(movable_zone))
- return 0;
-
- if (movable_zone->zone_start_pfn <= start_pfn)
- return 1;
-
- return 0;
-}
-
-int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
- bool for_device)
-{
-#ifdef CONFIG_ZONE_DEVICE
- if (for_device)
- return ZONE_DEVICE;
-#endif
- if (should_add_memory_movable(nid, start, size))
- return ZONE_MOVABLE;
-
- return zone_default;
-}
-
static int online_memory_block(struct memory_block *mem, void *arg)
{
return device_online(&mem->dev);
--
2.11.0
From: Michal Hocko <[email protected]>
Tobias has reported the following section mismatches introduced by "mm,
memory_hotplug: do not associate hotadded memory to zones until online".
WARNING: mm/built-in.o(.text+0x5a1c2): Section mismatch in reference from the function move_pfn_range_to_zone() to the function .meminit.text:memmap_init_zone()
The function move_pfn_range_to_zone() references
the function __meminit memmap_init_zone().
This is often because move_pfn_range_to_zone lacks a __meminit
annotation or the annotation of memmap_init_zone is wrong.
WARNING: mm/built-in.o(.text+0x5a25b): Section mismatch in reference from the function move_pfn_range_to_zone() to the function .meminit.text:init_currently_empty_zone()
The function move_pfn_range_to_zone() references
the function __meminit init_currently_empty_zone().
This is often because move_pfn_range_to_zone lacks a __meminit
annotation or the annotation of init_currently_empty_zone is wrong.
WARNING: vmlinux.o(.text+0x188aa2): Section mismatch in reference from the function move_pfn_range_to_zone() to the function .meminit.text:memmap_init_zone()
The function move_pfn_range_to_zone() references
the function __meminit memmap_init_zone().
This is often because move_pfn_range_to_zone lacks a __meminit
annotation or the annotation of memmap_init_zone is wrong.
WARNING: vmlinux.o(.text+0x188b3b): Section mismatch in reference from the function move_pfn_range_to_zone() to the function .meminit.text:init_currently_empty_zone()
The function move_pfn_range_to_zone() references
the function __meminit init_currently_empty_zone().
This is often because move_pfn_range_to_zone lacks a __meminit
annotation or the annotation of init_currently_empty_zone is wrong.
Both memmap_init_zone and init_currently_empty_zone are marked __meminit
but move_pfn_range_to_zone is used outside of __meminit sections (e.g.
devm_memremap_pages) so we have to hide it from the checker with a __ref
annotation.
Reported-by: Tobias Regnery <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memory_hotplug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 43e84758057b..be8be844d340 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1065,7 +1065,7 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
pgdat->node_spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - pgdat->node_start_pfn;
}
-void move_pfn_range_to_zone(struct zone *zone,
+void __ref move_pfn_range_to_zone(struct zone *zone,
unsigned long start_pfn, unsigned long nr_pages)
{
struct pglist_data *pgdat = zone->zone_pgdat;
--
2.11.0
From: Michal Hocko <[email protected]>
Memory hotplug (add_memory_resource) has to reinitialize the node
infrastructure if the node is offline (i.e. one which went through a
complete add_memory(); remove_memory() cycle). That involves node
registration with the kobj infrastructure (register_node), the proper
association with cpus (register_cpu_under_node) and finally the creation
of node<->memblock symlinks (link_mem_sections).
The last part requires knowing node_start_pfn and node_spanned_pages,
which we currently have, but a later patch will postpone this
initialization to the onlining phase which happens later. In fact we do
not need to rely on the early pgdat initialization even now because the
hot-added pfn range is already known.
Split register_one_node into a core part (__register_one_node) which does
all the common work for both boot time NUMA initialization and hotplug.
register_one_node keeps the full initialization while the hotplug path
calls __register_one_node and then calls link_mem_sections manually for
the proper range.
This shouldn't introduce any functional change.
Signed-off-by: Michal Hocko <[email protected]>
---
drivers/base/node.c | 51 ++++++++++++++++++++-------------------------------
include/linux/node.h | 35 ++++++++++++++++++++++++++++++++++-
mm/memory_hotplug.c | 17 ++++++++++++++++-
3 files changed, 70 insertions(+), 33 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 06294d69779b..dff5b53f7905 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -461,10 +461,9 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
return 0;
}
-static int link_mem_sections(int nid)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages)
{
- unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
- unsigned long end_pfn = start_pfn + NODE_DATA(nid)->node_spanned_pages;
+ unsigned long end_pfn = start_pfn + nr_pages;
unsigned long pfn;
struct memory_block *mem_blk = NULL;
int err = 0;
@@ -552,10 +551,7 @@ static int node_memory_callback(struct notifier_block *self,
return NOTIFY_OK;
}
#endif /* CONFIG_HUGETLBFS */
-#else /* !CONFIG_MEMORY_HOTPLUG_SPARSE */
-
-static int link_mem_sections(int nid) { return 0; }
-#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
+#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
#if !defined(CONFIG_MEMORY_HOTPLUG_SPARSE) || \
!defined(CONFIG_HUGETLBFS)
@@ -569,39 +565,32 @@ static void init_node_hugetlb_work(int nid) { }
#endif
-int register_one_node(int nid)
+int __register_one_node(int nid)
{
- int error = 0;
+ int p_node = parent_node(nid);
+ struct node *parent = NULL;
+ int error;
int cpu;
- if (node_online(nid)) {
- int p_node = parent_node(nid);
- struct node *parent = NULL;
-
- if (p_node != nid)
- parent = node_devices[p_node];
-
- node_devices[nid] = kzalloc(sizeof(struct node), GFP_KERNEL);
- if (!node_devices[nid])
- return -ENOMEM;
-
- error = register_node(node_devices[nid], nid, parent);
+ if (p_node != nid)
+ parent = node_devices[p_node];
- /* link cpu under this node */
- for_each_present_cpu(cpu) {
- if (cpu_to_node(cpu) == nid)
- register_cpu_under_node(cpu, nid);
- }
+ node_devices[nid] = kzalloc(sizeof(struct node), GFP_KERNEL);
+ if (!node_devices[nid])
+ return -ENOMEM;
- /* link memory sections under this node */
- error = link_mem_sections(nid);
+ error = register_node(node_devices[nid], nid, parent);
- /* initialize work queue for memory hot plug */
- init_node_hugetlb_work(nid);
+ /* link cpu under this node */
+ for_each_present_cpu(cpu) {
+ if (cpu_to_node(cpu) == nid)
+ register_cpu_under_node(cpu, nid);
}
- return error;
+ /* initialize work queue for memory hot plug */
+ init_node_hugetlb_work(nid);
+ return error;
}
void unregister_one_node(int nid)
diff --git a/include/linux/node.h b/include/linux/node.h
index 2115ad5d6f19..d1751beb462c 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -30,9 +30,38 @@ struct memory_block;
extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
+#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
+extern int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages);
+#else
+static inline int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages)
+{
+ return 0;
+}
+#endif
+
extern void unregister_node(struct node *node);
#ifdef CONFIG_NUMA
-extern int register_one_node(int nid);
+/* Core of the node registration - only memory hotplug should use this */
+extern int __register_one_node(int nid);
+
+/* Registers an online node */
+static inline int register_one_node(int nid)
+{
+ int error = 0;
+
+ if (node_online(nid)) {
+ struct pglist_data *pgdat = NODE_DATA(nid);
+
+ error = __register_one_node(nid);
+ if (error)
+ return error;
+ /* link memory sections under this node */
+ error = link_mem_sections(nid, pgdat->node_start_pfn, pgdat->node_spanned_pages);
+ }
+
+ return error;
+}
+
extern void unregister_one_node(int nid);
extern int register_cpu_under_node(unsigned int cpu, unsigned int nid);
extern int unregister_cpu_under_node(unsigned int cpu, unsigned int nid);
@@ -46,6 +75,10 @@ extern void register_hugetlbfs_with_node(node_registration_func_t doregister,
node_registration_func_t unregister);
#endif
#else
+static inline int __register_one_node(int nid)
+{
+ return 0;
+}
static inline int register_one_node(int nid)
{
return 0;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1570b3eea493..f5df0fe15ddf 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1387,7 +1387,22 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
node_set_online(nid);
if (new_node) {
- ret = register_one_node(nid);
+ unsigned long start_pfn = start >> PAGE_SHIFT;
+ unsigned long nr_pages = size >> PAGE_SHIFT;
+
+ ret = __register_one_node(nid);
+ if (ret)
+ goto register_fail;
+
+ /*
+ * link memory sections under this node. This is already
+ * done when creatig memory section in register_new_memory
+ * but that depends to have the node registered so offline
+ * nodes have to go through register_node.
+ * TODO clean up this mess.
+ */
+ ret = link_mem_sections(nid, start_pfn, nr_pages);
+register_fail:
/*
* If sysfs file of new node can't create, cpu on the node
* can't be hot-added. There is no rollback way now.
--
2.11.0
From: Michal Hocko <[email protected]>
arch_add_memory gets a for_device argument which then controls whether we
want to create memblocks for the created memory sections. Simplify the
logic by saying whether we want memblocks directly rather than going
through a pointless negation. This also makes the API easier to understand
because it is clear what we want, unlike the non-descriptive for_device
which could mean anything.
This shouldn't introduce any functional change.
Cc: Dan Williams <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
arch/ia64/mm/init.c | 4 ++--
arch/powerpc/mm/mem.c | 4 ++--
arch/s390/mm/init.c | 4 ++--
arch/sh/mm/init.c | 4 ++--
arch/x86/mm/init_32.c | 4 ++--
arch/x86/mm/init_64.c | 4 ++--
include/linux/memory_hotplug.h | 2 +-
kernel/memremap.c | 2 +-
mm/memory_hotplug.c | 2 +-
9 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index efe46742905a..b02c789e3c86 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -645,13 +645,13 @@ mem_init (void)
}
#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
- ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, want_memblock);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
__func__, ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index d3decea056a0..2f2e5eaa10e3 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -126,7 +126,7 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
return -ENODEV;
}
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -141,7 +141,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
return -EFAULT;
}
- return __add_pages(nid, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, want_memblock);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 2d9f3f91b08d..597aad4e05d4 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -153,7 +153,7 @@ void __init free_initrd_mem(unsigned long start, unsigned long end)
#endif
#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = PFN_DOWN(start);
unsigned long size_pages = PFN_DOWN(size);
@@ -163,7 +163,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
if (rc)
return rc;
- rc = __add_pages(nid, start_pfn, size_pages, !for_device);
+ rc = __add_pages(nid, start_pfn, size_pages, want_memblock);
if (rc)
vmem_remove_mapping(start, size);
return rc;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 3813a610a2bb..bf726af5f1a5 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -485,14 +485,14 @@ void free_initrd_mem(unsigned long start, unsigned long end)
#endif
#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = PFN_DOWN(start);
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
/* We only have ZONE_NORMAL, so this is easy.. */
- ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, want_memblock);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3c66da076053..3423bb4156e5 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -826,12 +826,12 @@ void __init mem_init(void)
}
#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
- return __add_pages(nid, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, want_memblock);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 07dbd32f6583..754d47cb2847 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -637,7 +637,7 @@ static void update_end_of_memory_vars(u64 start, u64 size)
}
}
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -645,7 +645,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
init_memory_mapping(start, start + size);
- ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, want_memblock);
WARN_ON_ONCE(ret);
/* update max_pfn, max_low_pfn and high_memory */
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 98470ea5536b..c28d0aba7525 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -276,7 +276,7 @@ extern int add_memory(int nid, u64 start, u64 size);
extern int add_memory_resource(int nid, struct resource *resource, bool online);
extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
bool for_device);
-extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
+extern int arch_add_memory(int nid, u64 start, u64 size, bool want_memblock);
extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
unsigned long nr_pages);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 61aaa41f4e18..ea714eee029c 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -363,7 +363,7 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
goto err_pfn_remap;
mem_hotplug_begin();
- error = arch_add_memory(nid, align_start, align_size, true);
+ error = arch_add_memory(nid, align_start, align_size, false);
if (!error)
move_pfn_range_to_zone(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
align_start >> PAGE_SHIFT,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b7351de3978c..43e84758057b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1427,7 +1427,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
}
/* call arch's memory hotadd */
- ret = arch_add_memory(nid, start, size, false);
+ ret = arch_add_memory(nid, start, size, true);
if (ret < 0)
goto error;
--
2.11.0
From: Michal Hocko <[email protected]>
Device memory hotplug hooks into regular memory hotplug only half way.
It needs memory sections to track struct pages but there is no
need/desire to associate those sections with memory blocks and export
them to userspace via sysfs because they cannot be onlined anyway. This
is currently expressed by the for_device argument to arch_add_memory,
which then makes sure to associate the given memory range with
ZONE_DEVICE. register_new_memory then relies on is_zone_device_section
to distinguish special memory hotplug from the regular one. While this
works now, later patches in this series want to move __add_zone outside
of the arch_add_memory path, so we have to come up with something else.
Pass want_memblock down the __add_pages path and use it to control
whether the section->memblock association should be done. arch_add_memory
then simply wants a memblock for everything but for_device hotplug.
remove_memory_section doesn't need is_zone_device_section either. We can
simply skip all the memblock-specific cleanup if there is no memblock
for the given section.
This shouldn't introduce any functional change.
Cc: Dan Williams <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
arch/ia64/mm/init.c | 2 +-
arch/powerpc/mm/mem.c | 2 +-
arch/s390/mm/init.c | 2 +-
arch/sh/mm/init.c | 2 +-
arch/x86/mm/init_32.c | 2 +-
arch/x86/mm/init_64.c | 2 +-
drivers/base/memory.c | 22 ++++++++--------------
include/linux/memory_hotplug.h | 2 +-
mm/memory_hotplug.c | 11 +++++++----
9 files changed, 22 insertions(+), 25 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 06cdaef54b2e..62085fd902e6 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -657,7 +657,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
zone = pgdat->node_zones +
zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
- ret = __add_pages(nid, zone, start_pfn, nr_pages);
+ ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 5f844337de21..ea3e09a62f38 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -149,7 +149,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
zone = pgdata->node_zones +
zone_for_memory(nid, start, size, 0, for_device);
- return __add_pages(nid, zone, start_pfn, nr_pages);
+ return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index bf5b8a0c4ff7..5c84346e5211 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -182,7 +182,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
continue;
nr_pages = (start_pfn + size_pages > zone_end_pfn) ?
zone_end_pfn - start_pfn : size_pages;
- rc = __add_pages(nid, zone, start_pfn, nr_pages);
+ rc = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
if (rc)
break;
start_pfn += nr_pages;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 75491862d900..a9d57f75ae8c 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -498,7 +498,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
ret = __add_pages(nid, pgdat->node_zones +
zone_for_memory(nid, start, size, ZONE_NORMAL,
for_device),
- start_pfn, nr_pages);
+ start_pfn, nr_pages, !for_device);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index c68078fd06fd..4b0f05328af0 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -834,7 +834,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
- return __add_pages(nid, zone, start_pfn, nr_pages);
+ return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 7eef17239378..39cfaee93975 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -652,7 +652,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
init_memory_mapping(start, start + size);
- ret = __add_pages(nid, zone, start_pfn, nr_pages);
+ ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
WARN_ON_ONCE(ret);
/* update max_pfn, max_low_pfn and high_memory */
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index cc4f1d0cbffe..89c15e942852 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -685,14 +685,6 @@ static int add_memory_block(int base_section_nr)
return 0;
}
-static bool is_zone_device_section(struct mem_section *ms)
-{
- struct page *page;
-
- page = sparse_decode_mem_map(ms->section_mem_map, __section_nr(ms));
- return is_zone_device_page(page);
-}
-
/*
* need an interface for the VM to add new memory regions,
* but without onlining it.
@@ -702,9 +694,6 @@ int register_new_memory(int nid, struct mem_section *section)
int ret = 0;
struct memory_block *mem;
- if (is_zone_device_section(section))
- return 0;
-
mutex_lock(&mem_sysfs_mutex);
mem = find_memory_block(section);
@@ -741,11 +730,16 @@ static int remove_memory_section(unsigned long node_id,
{
struct memory_block *mem;
- if (is_zone_device_section(section))
- return 0;
-
mutex_lock(&mem_sysfs_mutex);
+
+ /*
+ * Some users of the memory hotplug do not want/need memblock to
+ * track all sections. Skip over those.
+ */
mem = find_memory_block(section);
+ if (!mem)
+ return 0;
+
unregister_mem_sect_under_nodes(mem, __section_nr(section));
mem->section_count--;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 134a2f69c21a..3c8cf86201c3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -111,7 +111,7 @@ extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
/* reasonably generic interface to expand the physical pages in a zone */
extern int __add_pages(int nid, struct zone *zone, unsigned long start_pfn,
- unsigned long nr_pages);
+ unsigned long nr_pages, bool want_memblock);
#ifdef CONFIG_NUMA
extern int memory_add_physaddr_to_nid(u64 start);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 342332f29364..1570b3eea493 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -493,7 +493,7 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
}
static int __meminit __add_section(int nid, struct zone *zone,
- unsigned long phys_start_pfn)
+ unsigned long phys_start_pfn, bool want_memblock)
{
int ret;
@@ -510,7 +510,10 @@ static int __meminit __add_section(int nid, struct zone *zone,
if (ret < 0)
return ret;
- return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
+ if (want_memblock)
+ ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
+
+ return ret;
}
/*
@@ -520,7 +523,7 @@ static int __meminit __add_section(int nid, struct zone *zone,
* add the new pages.
*/
int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
- unsigned long nr_pages)
+ unsigned long nr_pages, bool want_memblock)
{
unsigned long i;
int err = 0;
@@ -548,7 +551,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
}
for (i = start_sec; i <= end_sec; i++) {
- err = __add_section(nid, zone, section_nr_to_pfn(i));
+ err = __add_section(nid, zone, section_nr_to_pfn(i), want_memblock);
/*
* EEXIST is finally dealt with by ioresource collision
--
2.11.0
From: Michal Hocko <[email protected]>
The current memory hotplug implementation relies on having all the
struct pages associated with a zone/node during the physical hotplug phase
(arch_add_memory->__add_pages->__add_section->__add_zone). In the vast
majority of cases this means that they are added to ZONE_NORMAL. This
has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory hotadd
without sparsemem") and it wasn't a big deal back then because movable
onlining didn't exist yet.
Much later, memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
onlining in 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
memory and portion memory") and then things got more complicated. Rather
than reconsidering the zone association, which was no longer needed
(because memory hotplug already depended on SPARSEMEM), a convoluted
semantic of zone shifting was developed. Only the currently last
memblock or the one adjacent to the zone_movable can be onlined movable.
This essentially means that the online type changes as new memblocks
are added.
Let's simulate memory hot online manually:

(probe memory block 32)
Normal Movable

(probe memory block 33)
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

(probe memory block 34)
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

(online memory block 34 as movable)
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal
This is an awkward semantic because a udev event is sent as soon as the
block is onlined and a udev handler might want to online it based on
some policy (e.g. association with a node) but it will inherently race
with new blocks showing up.
This patch changes the physical online phase to not associate pages
with any zone at all. All the pages are just marked reserved and wait
for the onlining phase to associate them with a zone as per the online
request. There are only two requirements:
- existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
- ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses
The latter is not an inherent requirement; it merely preserves the current
behavior and keeps the code slightly simpler, and it may be lifted in the
future.
This means that the same physical online steps as above will lead to the
following state:

(probe memory block 32)
Normal Movable

(probe memory block 33)
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

(probe memory block 34)
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

(online memory block 34 as movable)
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable
Implementation:
The current move_pfn_range is reimplemented to check the above
requirements (allow_online_pfn_range) and then to update the respective
zone (move_pfn_range_to_zone) and the pgdat and to link all the pages in
the pfn range with the zone/node. __add_pages is updated to not require
the zone and only initializes sections in the range. This allows
simplifying the arch_add_memory code (s390 could get rid of quite some
of it).
devm_memremap_pages is the only user of arch_add_memory which relies
on the zone association because it hooks into memory hotplug only half
way. It uses arch_add_memory to associate the new memory with ZONE_DEVICE
but doesn't allow it to be {on,off}lined via sysfs. This means that this
particular code path has to call move_pfn_range_to_zone explicitly.
The original zone shifting code is kept in place and will be removed in
the follow up patch for an easier review.
Changes since v1
- we have to associate the page with the node early (in __add_section),
because pfn_to_node depends on struct page containing this
information - based on testing by Reza Arbab
- resize_{zone,pgdat}_range has to check whether they are populated -
Reza Arbab
- fix devm_memremap_pages to use pfn rather than physical address -
Jérôme Glisse
- move_pfn_range has to check for intersection with zone_movable rather
than to rely on allow_online_pfn_range(MMOP_ONLINE_MOVABLE) for
MMOP_ONLINE_KEEP
Cc: Lai Jiangshan <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: [email protected]
Acked-by: Heiko Carstens <[email protected]> # For s390 bits
Signed-off-by: Michal Hocko <[email protected]>
---
arch/ia64/mm/init.c | 9 +-
arch/powerpc/mm/mem.c | 10 +-
arch/s390/mm/init.c | 30 +-----
arch/sh/mm/init.c | 8 +-
arch/x86/mm/init_32.c | 5 +-
arch/x86/mm/init_64.c | 9 +-
drivers/base/memory.c | 52 ++++++-----
include/linux/memory_hotplug.h | 13 +--
include/linux/mmzone.h | 14 +++
kernel/memremap.c | 4 +
mm/memory_hotplug.c | 201 +++++++++++++++++++++++++----------------
mm/sparse.c | 3 +-
12 files changed, 186 insertions(+), 172 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 62085fd902e6..efe46742905a 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -647,18 +647,11 @@ mem_init (void)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- pg_data_t *pgdat;
- struct zone *zone;
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
- pgdat = NODE_DATA(nid);
-
- zone = pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
- ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
-
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
__func__, ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index ea3e09a62f38..d3decea056a0 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -128,14 +128,10 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdata;
- struct zone *zone;
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int rc;
- pgdata = NODE_DATA(nid);
-
start = (unsigned long)__va(start);
rc = create_section_mapping(start, start + size);
if (rc) {
@@ -145,11 +141,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
return -EFAULT;
}
- /* this should work for most non-highmem platforms */
- zone = pgdata->node_zones +
- zone_for_memory(nid, start, size, 0, for_device);
-
- return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 5c84346e5211..2d9f3f91b08d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -155,41 +155,15 @@ void __init free_initrd_mem(unsigned long start, unsigned long end)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- unsigned long zone_start_pfn, zone_end_pfn, nr_pages;
unsigned long start_pfn = PFN_DOWN(start);
unsigned long size_pages = PFN_DOWN(size);
- pg_data_t *pgdat = NODE_DATA(nid);
- struct zone *zone;
- int rc, i;
+ int rc;
rc = vmem_add_mapping(start, size);
if (rc)
return rc;
- for (i = 0; i < MAX_NR_ZONES; i++) {
- zone = pgdat->node_zones + i;
- if (zone_idx(zone) != ZONE_MOVABLE) {
- /* Add range within existing zone limits, if possible */
- zone_start_pfn = zone->zone_start_pfn;
- zone_end_pfn = zone->zone_start_pfn +
- zone->spanned_pages;
- } else {
- /* Add remaining range to ZONE_MOVABLE */
- zone_start_pfn = start_pfn;
- zone_end_pfn = start_pfn + size_pages;
- }
- if (start_pfn < zone_start_pfn || start_pfn >= zone_end_pfn)
- continue;
- nr_pages = (start_pfn + size_pages > zone_end_pfn) ?
- zone_end_pfn - start_pfn : size_pages;
- rc = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
- if (rc)
- break;
- start_pfn += nr_pages;
- size_pages -= nr_pages;
- if (!size_pages)
- break;
- }
+ rc = __add_pages(nid, start_pfn, size_pages, !for_device);
if (rc)
vmem_remove_mapping(start, size);
return rc;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index a9d57f75ae8c..3813a610a2bb 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -487,18 +487,12 @@ void free_initrd_mem(unsigned long start, unsigned long end)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- pg_data_t *pgdat;
unsigned long start_pfn = PFN_DOWN(start);
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
- pgdat = NODE_DATA(nid);
-
/* We only have ZONE_NORMAL, so this is easy.. */
- ret = __add_pages(nid, pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL,
- for_device),
- start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 4b0f05328af0..3c66da076053 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -828,13 +828,10 @@ void __init mem_init(void)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdata = NODE_DATA(nid);
- struct zone *zone = pgdata->node_zones +
- zone_for_memory(nid, start, size, ZONE_HIGHMEM, for_device);
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
- return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 39cfaee93975..07dbd32f6583 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -637,22 +637,15 @@ static void update_end_of_memory_vars(u64 start, u64 size)
}
}
-/*
- * Memory is added always to NORMAL zone. This means you will never get
- * additional DMA/DMA32 memory.
- */
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdat = NODE_DATA(nid);
- struct zone *zone = pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
init_memory_mapping(start, start + size);
- ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
WARN_ON_ONCE(ret);
/* update max_pfn, max_low_pfn and high_memory */
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 89c15e942852..1c6fdacbccd3 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -388,39 +388,43 @@ static ssize_t show_valid_zones(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct memory_block *mem = to_memory_block(dev);
- unsigned long start_pfn, end_pfn;
- unsigned long valid_start, valid_end, valid_pages;
+ unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
- struct zone *zone;
- int zone_shift = 0;
+ unsigned long valid_start_pfn, valid_end_pfn;
+ bool append = false;
+ int nid;
- start_pfn = section_nr_to_pfn(mem->start_section_nr);
- end_pfn = start_pfn + nr_pages;
-
- /* The block contains more than one zone can not be offlined. */
- if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
+ /*
+ * The block contains more than one zone can not be offlined.
+ * This can happen e.g. for ZONE_DMA and ZONE_DMA32
+ */
+ if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages, &valid_start_pfn, &valid_end_pfn))
return sprintf(buf, "none\n");
- zone = page_zone(pfn_to_page(valid_start));
- valid_pages = valid_end - valid_start;
-
- /* MMOP_ONLINE_KEEP */
- sprintf(buf, "%s", zone->name);
+ start_pfn = valid_start_pfn;
+ nr_pages = valid_end_pfn - valid_end_pfn;
- /* MMOP_ONLINE_KERNEL */
- zone_can_shift(valid_start, valid_pages, ZONE_NORMAL, &zone_shift);
- if (zone_shift) {
- strcat(buf, " ");
- strcat(buf, (zone + zone_shift)->name);
+ /*
+ * Check the existing zone. Make sure that we do that only on the
+ * online nodes otherwise the page_zone is not reliable
+ */
+ if (mem->state == MEM_ONLINE) {
+ strcat(buf, page_zone(pfn_to_page(start_pfn))->name);
+ goto out;
}
- /* MMOP_ONLINE_MOVABLE */
- zone_can_shift(valid_start, valid_pages, ZONE_MOVABLE, &zone_shift);
- if (zone_shift) {
- strcat(buf, " ");
- strcat(buf, (zone + zone_shift)->name);
+ nid = pfn_to_nid(start_pfn);
+ if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL)) {
+ strcat(buf, NODE_DATA(nid)->node_zones[ZONE_NORMAL].name);
+ append = true;
}
+ if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE)) {
+ if (append)
+ strcat(buf, " ");
+ strcat(buf, NODE_DATA(nid)->node_zones[ZONE_MOVABLE].name);
+ }
+out:
strcat(buf, "\n");
return strlen(buf);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3c8cf86201c3..98470ea5536b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -109,8 +109,8 @@ extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
unsigned long nr_pages);
#endif /* CONFIG_MEMORY_HOTREMOVE */
-/* reasonably generic interface to expand the physical pages in a zone */
-extern int __add_pages(int nid, struct zone *zone, unsigned long start_pfn,
+/* reasonably generic interface to expand the physical pages */
+extern int __add_pages(int nid, unsigned long start_pfn,
unsigned long nr_pages, bool want_memblock);
#ifdef CONFIG_NUMA
@@ -277,15 +277,16 @@ extern int add_memory_resource(int nid, struct resource *resource, bool online);
extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
bool for_device);
extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
+extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
+ unsigned long nr_pages);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern bool is_memblock_offlined(struct memory_block *mem);
extern void remove_memory(int nid, u64 start, u64 size);
-extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn);
+extern int sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn);
extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
unsigned long map_offset);
extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
unsigned long pnum);
-extern bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
- enum zone_type target, int *zone_shift);
-
+extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
+ int online_type);
#endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0fc121bbf4ff..ec2f987ec549 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -533,6 +533,20 @@ static inline bool zone_is_empty(struct zone *zone)
}
/*
+ * Return true if [start_pfn, start_pfn + nr_pages) range has a non-empty
+ * intersection with the given zone
+ */
+static inline bool zone_intersects(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
+ return true;
+ if (start_pfn + nr_pages > start_pfn && !zone_is_empty(zone))
+ return true;
+ return false;
+}
+
+/*
* The "priority" of VM scanning is how much of the queues we will scan in one
* go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
* queues ("queue_length >> 12") during an aging round.
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 07e85e5229da..61aaa41f4e18 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -364,6 +364,10 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
mem_hotplug_begin();
error = arch_add_memory(nid, align_start, align_size, true);
+ if (!error)
+ move_pfn_range_to_zone(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
+ align_start >> PAGE_SHIFT,
+ align_size >> PAGE_SHIFT);
mem_hotplug_done();
if (error)
goto err_add_memory;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f5df0fe15ddf..b7351de3978c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -432,25 +432,6 @@ static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
return -1;
}
-static struct zone * __meminit move_pfn_range(int zone_shift,
- unsigned long start_pfn, unsigned long end_pfn)
-{
- struct zone *zone = page_zone(pfn_to_page(start_pfn));
- int ret = 0;
-
- if (zone_shift < 0)
- ret = move_pfn_range_left(zone + zone_shift, zone,
- start_pfn, end_pfn);
- else if (zone_shift)
- ret = move_pfn_range_right(zone, zone + zone_shift,
- start_pfn, end_pfn);
-
- if (ret)
- return NULL;
-
- return zone + zone_shift;
-}
-
static void __meminit grow_pgdat_span(struct pglist_data *pgdat, unsigned long start_pfn,
unsigned long end_pfn)
{
@@ -492,23 +473,34 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
return 0;
}
-static int __meminit __add_section(int nid, struct zone *zone,
- unsigned long phys_start_pfn, bool want_memblock)
+static int __meminit __add_section(int nid, unsigned long phys_start_pfn, bool want_memblock)
{
int ret;
+ int i;
if (pfn_valid(phys_start_pfn))
return -EEXIST;
- ret = sparse_add_one_section(zone, phys_start_pfn);
-
+ ret = sparse_add_one_section(NODE_DATA(nid), phys_start_pfn);
if (ret < 0)
return ret;
- ret = __add_zone(zone, phys_start_pfn);
+ /*
+ * Make all the pages reserved so that nobody will stumble over half
+ * initialized state.
+ * FIXME: We also have to associate it with a node because pfn_to_node
+ * relies on having page with the proper node.
+ */
+ for (i = 0; i < PAGES_PER_SECTION; i++) {
+ unsigned long pfn = phys_start_pfn + i;
+ struct page *page;
+ if (!pfn_valid(pfn))
+ continue;
- if (ret < 0)
- return ret;
+ page = pfn_to_page(pfn);
+ set_page_node(page, nid);
+ SetPageReserved(page);
+ }
if (want_memblock)
ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
@@ -522,7 +514,7 @@ static int __meminit __add_section(int nid, struct zone *zone,
* call this function after deciding the zone to which to
* add the new pages.
*/
-int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
+int __ref __add_pages(int nid, unsigned long phys_start_pfn,
unsigned long nr_pages, bool want_memblock)
{
unsigned long i;
@@ -530,8 +522,6 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
int start_sec, end_sec;
struct vmem_altmap *altmap;
- clear_zone_contiguous(zone);
-
/* during initialize mem_map, align hot-added range to section */
start_sec = pfn_to_section_nr(phys_start_pfn);
end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
@@ -551,7 +541,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
}
for (i = start_sec; i <= end_sec; i++) {
- err = __add_section(nid, zone, section_nr_to_pfn(i), want_memblock);
+ err = __add_section(nid, section_nr_to_pfn(i), want_memblock);
/*
* EEXIST is finally dealt with by ioresource collision
@@ -564,7 +554,6 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
}
vmemmap_populate_print_last();
out:
- set_zone_contiguous(zone);
return err;
}
EXPORT_SYMBOL_GPL(__add_pages);
@@ -1029,39 +1018,114 @@ static void node_states_set_node(int node, struct memory_notify *arg)
node_set_state(node, N_MEMORY);
}
-bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
- enum zone_type target, int *zone_shift)
+bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
{
- struct zone *zone = page_zone(pfn_to_page(pfn));
- enum zone_type idx = zone_idx(zone);
- int i;
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
+ struct zone *normal_zone = &pgdat->node_zones[ZONE_NORMAL];
- *zone_shift = 0;
+ /*
+ * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
+ * physically before ZONE_MOVABLE. All we need is they do not
+ * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
+ * though so let's stick with it for simplicity for now.
+ * TODO make sure we do not overlap with ZONE_DEVICE
+ */
+ if (online_type == MMOP_ONLINE_KERNEL) {
+ if (!populated_zone(movable_zone))
+ return true;
+ return movable_zone->zone_start_pfn >= pfn + nr_pages;
+ } else if (online_type == MMOP_ONLINE_MOVABLE) {
+ return zone_end_pfn(normal_zone) <= pfn;
+ }
- if (idx < target) {
- /* pages must be at end of current zone */
- if (pfn + nr_pages != zone_end_pfn(zone))
- return false;
+ /* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
+ return online_type == MMOP_ONLINE_KEEP;
+}
+
+static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
+ unsigned long nr_pages)
+{
+ unsigned long old_end_pfn = zone_end_pfn(zone);
+
+ if (zone_is_empty(zone) || start_pfn < zone->zone_start_pfn)
+ zone->zone_start_pfn = start_pfn;
+
+ zone->spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - zone->zone_start_pfn;
+}
+
+static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned long start_pfn,
+ unsigned long nr_pages)
+{
+ unsigned long old_end_pfn = pgdat_end_pfn(pgdat);
- /* no zones in use between current zone and target */
- for (i = idx + 1; i < target; i++)
- if (zone_is_initialized(zone - idx + i))
- return false;
+ if (!pgdat->node_spanned_pages || start_pfn < pgdat->node_start_pfn)
+ pgdat->node_start_pfn = start_pfn;
+
+ pgdat->node_spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - pgdat->node_start_pfn;
+}
+
+void move_pfn_range_to_zone(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ struct pglist_data *pgdat = zone->zone_pgdat;
+ int nid = pgdat->node_id;
+ unsigned long flags;
+ unsigned long i;
+
+ if (zone_is_empty(zone))
+ init_currently_empty_zone(zone, start_pfn, nr_pages);
+
+ clear_zone_contiguous(zone);
+
+ /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
+ pgdat_resize_lock(pgdat, &flags);
+ zone_span_writelock(zone);
+ resize_zone_range(zone, start_pfn, nr_pages);
+ zone_span_writeunlock(zone);
+ resize_pgdat_range(pgdat, start_pfn, nr_pages);
+ pgdat_resize_unlock(pgdat, &flags);
+
+ /*
+ * TODO now we have a visible range of pages which are not associated
+ * with their zone properly. Not nice but set_pfnblock_flags_mask
+ * expects the zone spans the pfn range. All the pages in the range
+ * are reserved so nobody should be touching them so we should be safe
+ */
+ memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn, MEMMAP_HOTPLUG);
+ for (i = 0; i < nr_pages; i++) {
+ unsigned long pfn = start_pfn + i;
+ set_page_links(pfn_to_page(pfn), zone_idx(zone), nid, pfn);
}
- if (target < idx) {
- /* pages must be at beginning of current zone */
- if (pfn != zone->zone_start_pfn)
- return false;
+ set_zone_contiguous(zone);
+}
+
+/*
+ * Associates the given pfn range with the given node and the zone appropriate
+ * for the given online type.
+ */
+static struct zone * __meminit move_pfn_range(int online_type, int nid,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct zone *zone = &pgdat->node_zones[ZONE_NORMAL];
- /* no zones in use between current zone and target */
- for (i = target + 1; i < idx; i++)
- if (zone_is_initialized(zone - idx + i))
- return false;
+ if (online_type == MMOP_ONLINE_KEEP) {
+ struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
+ /*
+ * MMOP_ONLINE_KEEP inherits the current zone which is
+ * ZONE_NORMAL by default but we might be within ZONE_MOVABLE
+ * already.
+ */
+ if (zone_intersects(movable_zone, start_pfn, nr_pages))
+ zone = movable_zone;
+ } else if (online_type == MMOP_ONLINE_MOVABLE) {
+ zone = &pgdat->node_zones[ZONE_MOVABLE];
}
- *zone_shift = target - idx;
- return true;
+ move_pfn_range_to_zone(zone, start_pfn, nr_pages);
+ return zone;
}
/* Must be protected by mem_hotplug_begin() */
@@ -1074,29 +1138,16 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
int nid;
int ret;
struct memory_notify arg;
- int zone_shift = 0;
- /*
- * This doesn't need a lock to do pfn_to_page().
- * The section can't be removed here because of the
- * memory_block->state_mutex.
- */
- zone = page_zone(pfn_to_page(pfn));
-
- if ((zone_idx(zone) > ZONE_NORMAL ||
- online_type == MMOP_ONLINE_MOVABLE) &&
- !can_online_high_movable(pfn_to_nid(pfn)))
+ nid = pfn_to_nid(pfn);
+ if (!allow_online_pfn_range(nid, pfn, nr_pages, online_type))
return -EINVAL;
- if (online_type == MMOP_ONLINE_KERNEL) {
- if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift))
- return -EINVAL;
- } else if (online_type == MMOP_ONLINE_MOVABLE) {
- if (!zone_can_shift(pfn, nr_pages, ZONE_MOVABLE, &zone_shift))
- return -EINVAL;
- }
+ if (online_type == MMOP_ONLINE_MOVABLE && !can_online_high_movable(nid))
+ return -EINVAL;
- zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
+ /* associate pfn range with the zone */
+ zone = move_pfn_range(online_type, nid, pfn, nr_pages);
if (!zone)
return -EINVAL;
@@ -1104,8 +1155,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
arg.nr_pages = nr_pages;
node_states_check_changes_online(nr_pages, zone, &arg);
- nid = zone_to_nid(zone);
-
ret = memory_notify(MEM_GOING_ONLINE, &arg);
ret = notifier_to_errno(ret);
if (ret)
diff --git a/mm/sparse.c b/mm/sparse.c
index 6903c8fc3085..d75407882598 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -686,10 +686,9 @@ static void free_map_bootmem(struct page *memmap)
* set. If this is <=0, then that means that the passed-in
* map was not consumed and must be freed.
*/
-int __meminit sparse_add_one_section(struct zone *zone, unsigned long start_pfn)
+int __meminit sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn)
{
unsigned long section_nr = pfn_to_section_nr(start_pfn);
- struct pglist_data *pgdat = zone->zone_pgdat;
struct mem_section *ms;
struct page *memmap;
unsigned long *usemap;
--
2.11.0
From: Michal Hocko <[email protected]>
init_currently_empty_zone doesn't have any error to return yet it is
still an int and callers try to be defensive and handle a potential
error. Remove this nonsense and simplify all callers.
This patch shouldn't have any visible effect.
Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/mmzone.h | 2 +-
mm/memory_hotplug.c | 23 +++++------------------
mm/page_alloc.c | 8 ++------
3 files changed, 8 insertions(+), 25 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ebaccd4e7d8c..0fc121bbf4ff 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -771,7 +771,7 @@ enum memmap_context {
MEMMAP_EARLY,
MEMMAP_HOTPLUG,
};
-extern int init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
+extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
extern void lruvec_init(struct lruvec *lruvec);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 257166ebdff0..9ed251811ec3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -347,27 +347,20 @@ static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
set_page_links(pfn_to_page(pfn), zid, nid, pfn);
}
-/* Can fail with -ENOMEM from allocating a wait table with vmalloc() or
- * alloc_bootmem_node_nopanic()/memblock_virt_alloc_node_nopanic() */
-static int __ref ensure_zone_is_initialized(struct zone *zone,
+static void __ref ensure_zone_is_initialized(struct zone *zone,
unsigned long start_pfn, unsigned long num_pages)
{
if (!zone_is_initialized(zone))
- return init_currently_empty_zone(zone, start_pfn, num_pages);
-
- return 0;
+ init_currently_empty_zone(zone, start_pfn, num_pages);
}
static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
unsigned long start_pfn, unsigned long end_pfn)
{
- int ret;
unsigned long flags;
unsigned long z1_start_pfn;
- ret = ensure_zone_is_initialized(z1, start_pfn, end_pfn - start_pfn);
- if (ret)
- return ret;
+ ensure_zone_is_initialized(z1, start_pfn, end_pfn - start_pfn);
pgdat_resize_lock(z1->zone_pgdat, &flags);
@@ -403,13 +396,10 @@ static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
unsigned long start_pfn, unsigned long end_pfn)
{
- int ret;
unsigned long flags;
unsigned long z2_end_pfn;
- ret = ensure_zone_is_initialized(z2, start_pfn, end_pfn - start_pfn);
- if (ret)
- return ret;
+ ensure_zone_is_initialized(z2, start_pfn, end_pfn - start_pfn);
pgdat_resize_lock(z1->zone_pgdat, &flags);
@@ -480,12 +470,9 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
int nid = pgdat->node_id;
int zone_type;
unsigned long flags, pfn;
- int ret;
zone_type = zone - pgdat->node_zones;
- ret = ensure_zone_is_initialized(zone, phys_start_pfn, nr_pages);
- if (ret)
- return ret;
+ ensure_zone_is_initialized(zone, phys_start_pfn, nr_pages);
pgdat_resize_lock(zone->zone_pgdat, &flags);
grow_zone_span(zone, phys_start_pfn, phys_start_pfn + nr_pages);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9c587000d408..0cacba69ab04 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5517,7 +5517,7 @@ static __meminit void zone_pcp_init(struct zone *zone)
zone_batchsize(zone));
}
-int __meminit init_currently_empty_zone(struct zone *zone,
+void __meminit init_currently_empty_zone(struct zone *zone,
unsigned long zone_start_pfn,
unsigned long size)
{
@@ -5535,8 +5535,6 @@ int __meminit init_currently_empty_zone(struct zone *zone,
zone_init_free_lists(zone);
zone->initialized = 1;
-
- return 0;
}
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
@@ -5999,7 +5997,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
{
enum zone_type j;
int nid = pgdat->node_id;
- int ret;
pgdat_resize_init(pgdat);
#ifdef CONFIG_NUMA_BALANCING
@@ -6081,8 +6078,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
set_pageblock_order();
setup_usemap(pgdat, zone, zone_start_pfn, size);
- ret = init_currently_empty_zone(zone, zone_start_pfn, size);
- BUG_ON(ret);
+ init_currently_empty_zone(zone, zone_start_pfn, size);
memmap_init(size, nid, j, zone_start_pfn);
}
}
--
2.11.0
On Mon, 10 Apr 2017 13:03:42 +0200
Michal Hocko <[email protected]> wrote:
> Hi,
> The last version of this series has been posted here [1]. It has seen
> some more serious testing (thanks to Reza Arbab) and fixes for the found
> issues. I have also decided to drop patch 1 [2] because it turned out to
> be more complicated than I initially thought [3]. Few more patches were
> added to deal with expectation on zone/node initialization.
>
> I have rebased on top of the current mmotm-2017-04-07-15-53. It
> conflicts with HMM because it touches memory hotplug as
> well. We have discussed [4] with Jérôme and he agreed to
> rebase on top of this rework [5] so I have reverted his series
> before applyig mine. I will help him to resolve the resulting
> conflicts. You can find the whole series including the HMM revers in
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git branch
> attempts/rewrite-mem_hotplug
>
> Motivation:
> Movable onlining is a real hack with many downsides - mainly
> reintroduction of lowmem/highmem issues we used to have on 32b systems -
> but it is the only way to make the memory hotremove more reliable which
> is something that people are asking for.
>
> The current semantic of memory movable onlinening is really cumbersome,
> however. The main reason for this is that the udev driven approach is
> basically unusable because udev races with the memory probing while only
> the last memory block or the one adjacent to the existing zone_movable
> are allowed to be onlined movable. In short the criterion for the
> successful online_movable changes under udev's feet. A reliable udev
> approach would require a 2 phase approach where the first successful
> movable online would have to check all the previous blocks and online
> them in descending order. This is hard to be considered sane.
>
> This patchset aims at making the onlining semantic more usable. First of
> all it allows to online memory movable as long as it doesn't clash with
> the existing ZONE_NORMAL. That means that ZONE_NORMAL and ZONE_MOVABLE
> cannot overlap. Currently I preserve the original ordering semantic so
> the zone always precedes the movable zone but I have plans to remove this
> restriction in future because it is not really necessary.
>
> First 3 patches are cleanups which should be ready to be merged right
> away (unless I have missed something subtle of course).
>
> Patch 4 deals with ZONE_DEVICE dependencies down the __add_pages path.
>
> Patch 5 deals with implicit assumptions of register_one_node on pgdat
> initialization.
>
> Patch 6 is the core of the change. In order to make it easier to review
> I have tried it to be as minimalistic as possible and the large code
> removal is moved to patch 9.
>
> Patch 7 is a trivial follow up cleanup. Patch 8 fixes sparse warnings
> and finally patch 9 removes the unused code.
>
> I have tested the patches in kvm:
> # qemu-system-x86_64 -enable-kvm -monitor pty -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G ...
>
> and then probed the additional memory by
> (qemu) object_add memory-backend-ram,id=mem1,size=1G
> (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
Hi Michal,
I've given series some dumb testing, see below for unexpected changes I've noticed.
Using the same CLI as above plus hotpluggable dimms present at startup
(it still uses hotplug path as dimms aren't reported in e820)
-object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
-device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
so dimm1 => memory3[23] and dimm0 => memory3[45]
#issue1:
unable to online memblock as NORMAL adjacent to onlined MOVABLE
1: after boot
memory32:offline removable: 0 zones: Normal Movable
memory33:offline removable: 0 zones: Normal Movable
memory34:offline removable: 0 zones: Normal Movable
memory35:offline removable: 0 zones: Normal Movable
2: online as movable 1st dimm
#echo online_movable > memory32/state
#echo online_movable > memory33/state
everything is as expected:
memory32:online removable: 1 zones: Movable
memory33:online removable: 1 zones: Movable
memory34:offline removable: 0 zones: Movable
memory35:offline removable: 0 zones: Movable
3: try to offline memory32 and online as NORMAL
#echo offline > memory32/state
memory32:offline removable: 1 zones: Normal Movable
memory33:online removable: 1 zones: Movable
memory34:offline removable: 0 zones: Movable
memory35:offline removable: 0 zones: Movable
#echo online_kernel > memory32/state
write error: Invalid argument
// that's not what's expected
memory32:offline removable: 1 zones: Normal Movable
memory33:online removable: 1 zones: Movable
memory34:offline removable: 0 zones: Movable
memory35:offline removable: 0 zones: Movable
======
#issue2: dimm1 assigned to node 1 on qemu CLI
memblock is onlined as movable by default
// after boot
memory32:offline removable: 1 zones: Normal
memory33:offline removable: 1 zones: Normal Movable
memory34:offline removable: 1 zones: Normal
memory35:offline removable: 1 zones: Normal Movable
// not related to this issue but notice not all blocks are
// "Normal Movable" when compared when both dimms on node 0 /#issue1/
#echo online_movable > memory33/state
#echo online > memory32/state
memory32:online removable: 1 zones: Movable
memory33:online removable: 1 zones: Movable
before series memory32 goes to zone NORMAL as expected
memory32:online removable: 0 zones: Normal Movable
memory33:online removable: 1 zones: Movable Normal
======
#issue3:
removable flag flipped to non-removable state
// before series at commit ef0b577b6:
memory32:offline removable: 0 zones: Normal Movable
memory33:offline removable: 0 zones: Normal Movable
memory34:offline removable: 0 zones: Normal Movable
memory35:offline removable: 0 zones: Normal Movable
// after series at commit 6a010434
memory32:offline removable: 1 zones: Normal
memory33:offline removable: 1 zones: Normal
memory34:offline removable: 1 zones: Normal
memory35:offline removable: 1 zones: Normal Movable
also looking at #issue1 removable flag state doesn't
seem to be consistent between state changes but maybe that's
been broken before
On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
[...]
> Hi Michal,
>
> I've given series some dumb testing, see below for unexpected changes I've noticed.
>
> Using the same CLI as above plus hotpluggable dimms present at startup
> (it still uses hotplug path as dimms aren't reported in e820)
>
> -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
>
> so dimm1 => memory3[23] and dimm0 => memory3[45]
>
> #issue1:
> unable to online memblock as NORMAL adjacent to onlined MOVABLE
>
> 1: after boot
> memory32:offline removable: 0 zones: Normal Movable
> memory33:offline removable: 0 zones: Normal Movable
> memory34:offline removable: 0 zones: Normal Movable
> memory35:offline removable: 0 zones: Normal Movable
>
> 2: online as movable 1st dimm
>
> #echo online_movable > memory32/state
> #echo online_movable > memory33/state
>
> everything is as expected:
> memory32:online removable: 1 zones: Movable
> memory33:online removable: 1 zones: Movable
> memory34:offline removable: 0 zones: Movable
> memory35:offline removable: 0 zones: Movable
>
> 3: try to offline memory32 and online as NORMAL
>
> #echo offline > memory32/state
> memory32:offline removable: 1 zones: Normal Movable
> memory33:online removable: 1 zones: Movable
> memory34:offline removable: 0 zones: Movable
> memory35:offline removable: 0 zones: Movable
OK, this is not expected. We are not shifting zones anymore so the range
which was online_movable will not become available to the zone Normal.
So this must be something broken down the show_valid_zones path. I will
investigate.
>
> #echo online_kernel > memory32/state
> write error: Invalid argument
> // that's not what's expected
this is the proper behavior with the current implementation. Does anything
depend on the zone being reused?
> memory32:offline removable: 1 zones: Normal Movable
> memory33:online removable: 1 zones: Movable
> memory34:offline removable: 0 zones: Movable
> memory35:offline removable: 0 zones: Movable
>
>
> ======
> #issue2: dimm1 assigned to node 1 on qemu CLI
> memblock is onlined as movable by default
>
> // after boot
> memory32:offline removable: 1 zones: Normal
> memory33:offline removable: 1 zones: Normal Movable
> memory34:offline removable: 1 zones: Normal
> memory35:offline removable: 1 zones: Normal Movable
> // not related to this issue but notice not all blocks are
> // "Normal Movable" when compared when both dimms on node 0 /#issue1/
yes they should be
> #echo online_movable > memory33/state
> #echo online > memory32/state
>
> memory32:online removable: 1 zones: Movable
> memory33:online removable: 1 zones: Movable
>
> before series memory32 goes to zone NORMAL as expected
> memory32:online removable: 0 zones: Normal Movable
> memory33:online removable: 1 zones: Movable Normal
OK, I will double check.
> ======
> #issue3:
> removable flag flipped to non-removable state
>
> // before series at commit ef0b577b6:
> memory32:offline removable: 0 zones: Normal Movable
> memory33:offline removable: 0 zones: Normal Movable
> memory34:offline removable: 0 zones: Normal Movable
> memory35:offline removable: 0 zones: Normal Movable
>
> // after series at commit 6a010434
> memory32:offline removable: 1 zones: Normal
> memory33:offline removable: 1 zones: Normal
> memory34:offline removable: 1 zones: Normal
> memory35:offline removable: 1 zones: Normal Movable
>
> also looking at #issue1 removable flag state doesn't
> seem to be consistent between state changes but maybe that's
> been broken before
OK, will have a look.
Thanks for your testing!
--
Michal Hocko
SUSE Labs
[dropping Lai Jiangshan whose email bounces]
On Mon 10-04-17 16:56:39, Michal Hocko wrote:
> On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
> [...]
> > Hi Michal,
> >
> > I've given series some dumb testing, see below for unexpected changes I've noticed.
> >
> > Using the same CLI as above plus hotpluggable dimms present at startup
> > (it still uses hotplug path as dimms aren't reported in e820)
> >
> > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
> >
> > so dimm1 => memory3[23] and dimm0 => memory3[45]
> >
> > #issue1:
> > unable to online memblock as NORMAL adjacent to onlined MOVABLE
> >
> > 1: after boot
> > memory32:offline removable: 0 zones: Normal Movable
> > memory33:offline removable: 0 zones: Normal Movable
> > memory34:offline removable: 0 zones: Normal Movable
> > memory35:offline removable: 0 zones: Normal Movable
> >
> > 2: online as movable 1st dimm
> >
> > #echo online_movable > memory32/state
> > #echo online_movable > memory33/state
> >
> > everything is as expected:
> > memory32:online removable: 1 zones: Movable
> > memory33:online removable: 1 zones: Movable
> > memory34:offline removable: 0 zones: Movable
> > memory35:offline removable: 0 zones: Movable
> >
> > 3: try to offline memory32 and online as NORMAL
> >
> > #echo offline > memory32/state
> > memory32:offline removable: 1 zones: Normal Movable
> > memory33:online removable: 1 zones: Movable
> > memory34:offline removable: 0 zones: Movable
> > memory35:offline removable: 0 zones: Movable
>
> OK, this is not expected. We are not shifting zones anymore so the range
> which was online_movable will not become available to the zone Normal.
> So this must be something broken down the show_valid_zones path. I will
> investigate.
Heh, this one is embarrassing
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 1c6fdacbccd3..9677b6b711b0 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -402,7 +402,7 @@ static ssize_t show_valid_zones(struct device *dev,
return sprintf(buf, "none\n");
start_pfn = valid_start_pfn;
- nr_pages = valid_end_pfn - valid_end_pfn;
+ nr_pages = valid_end_pfn - start_pfn;
/*
* Check the existing zone. Make sure that we do that only on the
--
Michal Hocko
SUSE Labs
On Mon 10-04-17 17:22:28, Michal Hocko wrote:
[...]
> Heh, this one is embarrassing
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 1c6fdacbccd3..9677b6b711b0 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -402,7 +402,7 @@ static ssize_t show_valid_zones(struct device *dev,
> return sprintf(buf, "none\n");
>
> start_pfn = valid_start_pfn;
> - nr_pages = valid_end_pfn - valid_end_pfn;
> + nr_pages = valid_end_pfn - start_pfn;
>
> /*
> * Check the existing zone. Make sure that we do that only on the
Btw. while staring into the code I think that allow_online_pfn_range is
also wrong and we need the following:
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 94e96ca790f6..035165ceefef 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -858,7 +858,7 @@ bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
* TODO make sure we do not overlap with ZONE_DEVICE
*/
if (online_type == MMOP_ONLINE_KERNEL) {
- if (!populated_zone(movable_zone))
+ if (!movable_zone->spanned_pages)
return true;
return movable_zone->zone_start_pfn >= pfn + nr_pages;
} else if (online_type == MMOP_ONLINE_MOVABLE) {
because we would allow ZONE_NORMAL after the full movable zone has been
offlined.
--
Michal Hocko
SUSE Labs
On Mon, Apr 10, 2017 at 01:03:42PM +0200, Michal Hocko wrote:
>This patchset aims at making the onlining semantic more usable. First
>of all it allows to online memory movable as long as it doesn't clash
>with the existing ZONE_NORMAL. That means that ZONE_NORMAL and
>ZONE_MOVABLE cannot overlap. Currently I preserve the original ordering
>semantic so the zone always precedes the movable zone but I have plans
>to remove this restriction in future because it is not really
>necessary.
Thanks for addressing my issues. I see Igor found a few other things to
square away, but FWIW,
Tested-by: Reza Arbab <[email protected]>
--
Reza Arbab
On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
[...]
> #issue3:
> removable flag flipped to non-removable state
>
> // before series at commit ef0b577b6:
> memory32:offline removable: 0 zones: Normal Movable
> memory33:offline removable: 0 zones: Normal Movable
> memory34:offline removable: 0 zones: Normal Movable
> memory35:offline removable: 0 zones: Normal Movable
did you mean _after_ the series? Because the below looks like
the original behavior (at least valid_zones).
> // after series at commit 6a010434
> memory32:offline removable: 1 zones: Normal
> memory33:offline removable: 1 zones: Normal
> memory34:offline removable: 1 zones: Normal
> memory35:offline removable: 1 zones: Normal Movable
>
> also looking at #issue1 removable flag state doesn't
> seem to be consistent between state changes but maybe that's
> been broken before
Well, the file has a very questionable semantic. It doesn't provide
stable information. Anyway, put that aside.
is_pageblock_removable_nolock relies on having a zone association,
which we do not have yet if the memblock is offline. So we need
the following. I will queue this as a preparatory patch.
---
>From 4f3ebc02f4d552d3fe114787ca8a38cc68702208 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Mon, 10 Apr 2017 17:59:03 +0200
Subject: [PATCH] mm, memory_hotplug: consider offline memblocks removable
is_pageblock_removable_nolock relies on having zone association to
examine all the page blocks to check whether they are movable or free.
This is just a waste of cycles when the memblock is offline. A later patch
in the series will also change the time when the page is associated with
a zone, so let's bail out early if the memblock is offline.
Reported-by: Igor Mammedov <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
drivers/base/memory.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 9677b6b711b0..0c29ec5598ea 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -128,6 +128,9 @@ static ssize_t show_mem_removable(struct device *dev,
int ret = 1;
struct memory_block *mem = to_memory_block(dev);
+ if (mem->state != MEM_ONLINE)
+ goto out;
+
for (i = 0; i < sections_per_block; i++) {
if (!present_section_nr(mem->start_section_nr + i))
continue;
@@ -135,6 +138,7 @@ static ssize_t show_mem_removable(struct device *dev,
ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
}
+out:
return sprintf(buf, "%d\n", ret);
}
--
2.11.0
--
Michal Hocko
SUSE Labs
On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
[...]
> -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
are you sure both of them should be node=0?
What is the full command line you use?
--
Michal Hocko
SUSE Labs
On Mon, Apr 10, 2017 at 01:03:46PM +0200, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> device memory hotplug hooks into regular memory hotplug only half way.
> It needs memory sections to track struct pages but there is no
> need/desire to associate those sections with memory blocks and export
> them to the userspace via sysfs because they cannot be onlined anyway.
>
> This is currently expressed by for_device argument to arch_add_memory
> which then makes sure to associate the given memory range with
> ZONE_DEVICE. register_new_memory then relies on is_zone_device_section
> to distinguish special memory hotplug from the regular one. While this
> works now, later patches in this series want to move __add_zone outside
> of arch_add_memory path so we have to come up with something else.
>
> Add want_memblock down the __add_pages path and use it to control
> whether the section->memblock association should be done. arch_add_memory
> then just trivially want memblock for everything but for_device hotplug.
>
> remove_memory_section doesn't need is_zone_device_section either. We can
> simply skip all the memblock specific cleanup if there is no memblock
> for the given section.
>
> This shouldn't introduce any functional change.
>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
[...]
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 342332f29364..1570b3eea493 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -493,7 +493,7 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
> }
>
> static int __meminit __add_section(int nid, struct zone *zone,
> - unsigned long phys_start_pfn)
> + unsigned long phys_start_pfn, bool want_memblock)
> {
> int ret;
>
> @@ -510,7 +510,10 @@ static int __meminit __add_section(int nid, struct zone *zone,
> if (ret < 0)
> return ret;
>
> - return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
> + if (want_memblock)
> + ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
> +
> + return ret;
> }
The above is wrong for ZONE_DEVICE: sparse_add_one_section() will return a
positive value (on success), thus ret > 0, and other functions in the hotplug
path will interpret a positive value as an error.
I suggest something like:
if (!want_memblock)
return 0;
return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
}
instead (this also avoids a > 80 columns warning message).
Cheers,
Jérôme
This contains two minor fixes spotted based on testing by Igor Mammedov.
---
>From d829579cc7061255f818f9aeaa3aa2cd82fec75a Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Wed, 29 Mar 2017 16:07:00 +0200
Subject: [PATCH] mm, memory_hotplug: do not associate hotadded memory to zones
until online
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The current memory hotplug implementation relies on having all the
struct pages associate with a zone/node during the physical hotplug phase
(arch_add_memory->__add_pages->__add_section->__add_zone). In the vast
majority of cases this means that they are added to ZONE_NORMAL. This
has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory hotadd
without sparsemem") and it wasn't a big deal back then because movable
onlining didn't exist yet.
Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
memory and portion memory") and then things got more complicated. Rather
than reconsidering the zone association which was no longer needed
(because the memory hotplug already depended on SPARSEMEM) a convoluted
semantic of zone shifting has been developed. Only the currently last
memblock or the one adjacent to the zone_movable can be onlined movable.
This essentially means that the online type changes as the new memblocks
are added.
Let's simulate memory hot online manually
Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal
This is an awkward semantic because an udev event is sent as soon as the
block is onlined and an udev handler might want to online it based on
some policy (e.g. association with a node) but it will inherently race
with new blocks showing up.
This patch changes the physical online phase to not associate pages
with any zone at all. All the pages are just marked reserved and wait
for the onlining phase to be associated with the zone as per the online
request. There are only two requirements
- existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
- ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses
the latter is not an inherent requirement and can be changed in the
future. It preserves the current behavior and makes the code slightly
simpler. This is subject to change in future.
This means that the same physical online steps as above will lead to the
following state:
Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable
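For illustration (block numbers as in the listing above), each of the
remaining blocks can still be onlined to either zone as long as the
no-overlap requirement holds, e.g.:
#echo online_kernel > memory32/state
#echo online_movable > memory33/state
// both succeed: ZONE_NORMAL (memory32) still ends below ZONE_MOVABLE
// (memory33, memory34), so the two zones do not overlap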
Implementation:
The current move_pfn_range is reimplemented to check the above
requirements (allow_online_pfn_range) and then updates the respective
zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
pfn range with the zone/node. __add_pages is updated to not require the
zone and only initializes sections in the range. This allowed
simplifying the arch_add_memory code (s390 could get rid of quite some
code).
devm_memremap_pages is the only user of arch_add_memory which relies
on the zone association because it hooks into the memory hotplug
only half way. It uses it to associate the new memory with ZONE_DEVICE
but doesn't allow it to be {on,off}lined via sysfs. This means that this
particular code path has to call move_pfn_range_to_zone explicitly.
The original zone shifting code is kept in place and will be removed in
the follow up patch for an easier review.
Changes since v1
- we have to associate the page with the node early (in __add_section),
because pfn_to_node depends on struct page containing this
information - based on testing by Reza Arbab
- resize_{zone,pgdat}_range has to check whether they are populated -
Reza Arbab
- fix devm_memremap_pages to use pfn rather than physical address -
Jérôme Glisse
- move_pfn_range has to check for intersection with zone_movable rather
than to rely on allow_online_pfn_range(MMOP_ONLINE_MOVABLE) for
MMOP_ONLINE_KEEP
Changes since v2
- fix show_valid_zones nr_pages calculation
- allow_online_pfn_range has to check managed pages rather than present
Cc: Dan Williams <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: [email protected]
Acked-by: Heiko Carstens <[email protected]> # For s390 bits
Signed-off-by: Michal Hocko <[email protected]>
---
arch/ia64/mm/init.c | 9 +-
arch/powerpc/mm/mem.c | 10 +-
arch/s390/mm/init.c | 30 +-----
arch/sh/mm/init.c | 8 +-
arch/x86/mm/init_32.c | 5 +-
arch/x86/mm/init_64.c | 9 +-
drivers/base/memory.c | 52 ++++++-----
include/linux/memory_hotplug.h | 13 +--
include/linux/mmzone.h | 14 +++
kernel/memremap.c | 4 +
mm/memory_hotplug.c | 201 +++++++++++++++++++++++++----------------
mm/sparse.c | 3 +-
12 files changed, 186 insertions(+), 172 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 62085fd902e6..efe46742905a 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -647,18 +647,11 @@ mem_init (void)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- pg_data_t *pgdat;
- struct zone *zone;
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
- pgdat = NODE_DATA(nid);
-
- zone = pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
- ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
-
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
__func__, ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index ea3e09a62f38..d3decea056a0 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -128,14 +128,10 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdata;
- struct zone *zone;
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int rc;
- pgdata = NODE_DATA(nid);
-
start = (unsigned long)__va(start);
rc = create_section_mapping(start, start + size);
if (rc) {
@@ -145,11 +141,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
return -EFAULT;
}
- /* this should work for most non-highmem platforms */
- zone = pgdata->node_zones +
- zone_for_memory(nid, start, size, 0, for_device);
-
- return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 5c84346e5211..2d9f3f91b08d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -155,41 +155,15 @@ void __init free_initrd_mem(unsigned long start, unsigned long end)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- unsigned long zone_start_pfn, zone_end_pfn, nr_pages;
unsigned long start_pfn = PFN_DOWN(start);
unsigned long size_pages = PFN_DOWN(size);
- pg_data_t *pgdat = NODE_DATA(nid);
- struct zone *zone;
- int rc, i;
+ int rc;
rc = vmem_add_mapping(start, size);
if (rc)
return rc;
- for (i = 0; i < MAX_NR_ZONES; i++) {
- zone = pgdat->node_zones + i;
- if (zone_idx(zone) != ZONE_MOVABLE) {
- /* Add range within existing zone limits, if possible */
- zone_start_pfn = zone->zone_start_pfn;
- zone_end_pfn = zone->zone_start_pfn +
- zone->spanned_pages;
- } else {
- /* Add remaining range to ZONE_MOVABLE */
- zone_start_pfn = start_pfn;
- zone_end_pfn = start_pfn + size_pages;
- }
- if (start_pfn < zone_start_pfn || start_pfn >= zone_end_pfn)
- continue;
- nr_pages = (start_pfn + size_pages > zone_end_pfn) ?
- zone_end_pfn - start_pfn : size_pages;
- rc = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
- if (rc)
- break;
- start_pfn += nr_pages;
- size_pages -= nr_pages;
- if (!size_pages)
- break;
- }
+ rc = __add_pages(nid, start_pfn, size_pages, !for_device);
if (rc)
vmem_remove_mapping(start, size);
return rc;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index a9d57f75ae8c..3813a610a2bb 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -487,18 +487,12 @@ void free_initrd_mem(unsigned long start, unsigned long end)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- pg_data_t *pgdat;
unsigned long start_pfn = PFN_DOWN(start);
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
- pgdat = NODE_DATA(nid);
-
/* We only have ZONE_NORMAL, so this is easy.. */
- ret = __add_pages(nid, pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL,
- for_device),
- start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 4b0f05328af0..3c66da076053 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -828,13 +828,10 @@ void __init mem_init(void)
#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdata = NODE_DATA(nid);
- struct zone *zone = pgdata->node_zones +
- zone_for_memory(nid, start, size, ZONE_HIGHMEM, for_device);
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
- return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ return __add_pages(nid, start_pfn, nr_pages, !for_device);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 39cfaee93975..07dbd32f6583 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -637,22 +637,15 @@ static void update_end_of_memory_vars(u64 start, u64 size)
}
}
-/*
- * Memory is added always to NORMAL zone. This means you will never get
- * additional DMA/DMA32 memory.
- */
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
- struct pglist_data *pgdat = NODE_DATA(nid);
- struct zone *zone = pgdat->node_zones +
- zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
init_memory_mapping(start, start + size);
- ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
+ ret = __add_pages(nid, start_pfn, nr_pages, !for_device);
WARN_ON_ONCE(ret);
/* update max_pfn, max_low_pfn and high_memory */
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index cf413e25cfdd..0c29ec5598ea 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -392,39 +392,43 @@ static ssize_t show_valid_zones(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct memory_block *mem = to_memory_block(dev);
- unsigned long start_pfn, end_pfn;
- unsigned long valid_start, valid_end, valid_pages;
+ unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
- struct zone *zone;
- int zone_shift = 0;
+ unsigned long valid_start_pfn, valid_end_pfn;
+ bool append = false;
+ int nid;
- start_pfn = section_nr_to_pfn(mem->start_section_nr);
- end_pfn = start_pfn + nr_pages;
-
- /* The block contains more than one zone can not be offlined. */
- if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
+ /*
+ * The block contains more than one zone can not be offlined.
+ * This can happen e.g. for ZONE_DMA and ZONE_DMA32
+ */
+ if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages, &valid_start_pfn, &valid_end_pfn))
return sprintf(buf, "none\n");
- zone = page_zone(pfn_to_page(valid_start));
- valid_pages = valid_end - valid_start;
-
- /* MMOP_ONLINE_KEEP */
- sprintf(buf, "%s", zone->name);
+ start_pfn = valid_start_pfn;
+ nr_pages = valid_end_pfn - start_pfn;
- /* MMOP_ONLINE_KERNEL */
- zone_can_shift(valid_start, valid_pages, ZONE_NORMAL, &zone_shift);
- if (zone_shift) {
- strcat(buf, " ");
- strcat(buf, (zone + zone_shift)->name);
+ /*
+ * Check the existing zone. Make sure that we do that only on the
+ * online nodes otherwise the page_zone is not reliable
+ */
+ if (mem->state == MEM_ONLINE) {
+ strcat(buf, page_zone(pfn_to_page(start_pfn))->name);
+ goto out;
}
- /* MMOP_ONLINE_MOVABLE */
- zone_can_shift(valid_start, valid_pages, ZONE_MOVABLE, &zone_shift);
- if (zone_shift) {
- strcat(buf, " ");
- strcat(buf, (zone + zone_shift)->name);
+ nid = pfn_to_nid(start_pfn);
+ if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL)) {
+ strcat(buf, NODE_DATA(nid)->node_zones[ZONE_NORMAL].name);
+ append = true;
}
+ if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE)) {
+ if (append)
+ strcat(buf, " ");
+ strcat(buf, NODE_DATA(nid)->node_zones[ZONE_MOVABLE].name);
+ }
+out:
strcat(buf, "\n");
return strlen(buf);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3c8cf86201c3..98470ea5536b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -109,8 +109,8 @@ extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
unsigned long nr_pages);
#endif /* CONFIG_MEMORY_HOTREMOVE */
-/* reasonably generic interface to expand the physical pages in a zone */
-extern int __add_pages(int nid, struct zone *zone, unsigned long start_pfn,
+/* reasonably generic interface to expand the physical pages */
+extern int __add_pages(int nid, unsigned long start_pfn,
unsigned long nr_pages, bool want_memblock);
#ifdef CONFIG_NUMA
@@ -277,15 +277,16 @@ extern int add_memory_resource(int nid, struct resource *resource, bool online);
extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
bool for_device);
extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
+extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
+ unsigned long nr_pages);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern bool is_memblock_offlined(struct memory_block *mem);
extern void remove_memory(int nid, u64 start, u64 size);
-extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn);
+extern int sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn);
extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
unsigned long map_offset);
extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
unsigned long pnum);
-extern bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
- enum zone_type target, int *zone_shift);
-
+extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
+ int online_type);
#endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0fc121bbf4ff..ec2f987ec549 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -533,6 +533,20 @@ static inline bool zone_is_empty(struct zone *zone)
}
/*
+ * Return true if [start_pfn, start_pfn + nr_pages) range has a non-empty
+ * intersection with the given zone
+ */
+static inline bool zone_intersects(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
+ return true;
+ if (start_pfn + nr_pages > start_pfn && !zone_is_empty(zone))
+ return true;
+ return false;
+}
+
+/*
* The "priority" of VM scanning is how much of the queues we will scan in one
* go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
* queues ("queue_length >> 12") during an aging round.
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 07e85e5229da..61aaa41f4e18 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -364,6 +364,10 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
mem_hotplug_begin();
error = arch_add_memory(nid, align_start, align_size, true);
+ if (!error)
+ move_pfn_range_to_zone(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
+ align_start >> PAGE_SHIFT,
+ align_size >> PAGE_SHIFT);
mem_hotplug_done();
if (error)
goto err_add_memory;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f5df0fe15ddf..a4fc8a04f33c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -432,25 +432,6 @@ static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
return -1;
}
-static struct zone * __meminit move_pfn_range(int zone_shift,
- unsigned long start_pfn, unsigned long end_pfn)
-{
- struct zone *zone = page_zone(pfn_to_page(start_pfn));
- int ret = 0;
-
- if (zone_shift < 0)
- ret = move_pfn_range_left(zone + zone_shift, zone,
- start_pfn, end_pfn);
- else if (zone_shift)
- ret = move_pfn_range_right(zone, zone + zone_shift,
- start_pfn, end_pfn);
-
- if (ret)
- return NULL;
-
- return zone + zone_shift;
-}
-
static void __meminit grow_pgdat_span(struct pglist_data *pgdat, unsigned long start_pfn,
unsigned long end_pfn)
{
@@ -492,23 +473,34 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
return 0;
}
-static int __meminit __add_section(int nid, struct zone *zone,
- unsigned long phys_start_pfn, bool want_memblock)
+static int __meminit __add_section(int nid, unsigned long phys_start_pfn, bool want_memblock)
{
int ret;
+ int i;
if (pfn_valid(phys_start_pfn))
return -EEXIST;
- ret = sparse_add_one_section(zone, phys_start_pfn);
-
+ ret = sparse_add_one_section(NODE_DATA(nid), phys_start_pfn);
if (ret < 0)
return ret;
- ret = __add_zone(zone, phys_start_pfn);
+ /*
+ * Make all the pages reserved so that nobody will stumble over half
+ * initialized state.
+ * FIXME: We also have to associate it with a node because pfn_to_node
+ * relies on having page with the proper node.
+ */
+ for (i = 0; i < PAGES_PER_SECTION; i++) {
+ unsigned long pfn = phys_start_pfn + i;
+ struct page *page;
+ if (!pfn_valid(pfn))
+ continue;
- if (ret < 0)
- return ret;
+ page = pfn_to_page(pfn);
+ set_page_node(page, nid);
+ SetPageReserved(page);
+ }
if (want_memblock)
ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
@@ -522,7 +514,7 @@ static int __meminit __add_section(int nid, struct zone *zone,
* call this function after deciding the zone to which to
* add the new pages.
*/
-int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
+int __ref __add_pages(int nid, unsigned long phys_start_pfn,
unsigned long nr_pages, bool want_memblock)
{
unsigned long i;
@@ -530,8 +522,6 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
int start_sec, end_sec;
struct vmem_altmap *altmap;
- clear_zone_contiguous(zone);
-
/* during initialize mem_map, align hot-added range to section */
start_sec = pfn_to_section_nr(phys_start_pfn);
end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
@@ -551,7 +541,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
}
for (i = start_sec; i <= end_sec; i++) {
- err = __add_section(nid, zone, section_nr_to_pfn(i), want_memblock);
+ err = __add_section(nid, section_nr_to_pfn(i), want_memblock);
/*
* EEXIST is finally dealt with by ioresource collision
@@ -564,7 +554,6 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
}
vmemmap_populate_print_last();
out:
- set_zone_contiguous(zone);
return err;
}
EXPORT_SYMBOL_GPL(__add_pages);
@@ -1029,39 +1018,114 @@ static void node_states_set_node(int node, struct memory_notify *arg)
node_set_state(node, N_MEMORY);
}
-bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
- enum zone_type target, int *zone_shift)
+bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
{
- struct zone *zone = page_zone(pfn_to_page(pfn));
- enum zone_type idx = zone_idx(zone);
- int i;
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
+ struct zone *normal_zone = &pgdat->node_zones[ZONE_NORMAL];
- *zone_shift = 0;
+ /*
+ * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
+ * physically before ZONE_MOVABLE. All we need is they do not
+ * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
+ * though so let's stick with it for simplicity for now.
+ * TODO make sure we do not overlap with ZONE_DEVICE
+ */
+ if (online_type == MMOP_ONLINE_KERNEL) {
+ if (!movable_zone->spanned_pages)
+ return true;
+ return movable_zone->zone_start_pfn >= pfn + nr_pages;
+ } else if (online_type == MMOP_ONLINE_MOVABLE) {
+ return zone_end_pfn(normal_zone) <= pfn;
+ }
- if (idx < target) {
- /* pages must be at end of current zone */
- if (pfn + nr_pages != zone_end_pfn(zone))
- return false;
+ /* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
+ return online_type == MMOP_ONLINE_KEEP;
+}
+
+static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
+ unsigned long nr_pages)
+{
+ unsigned long old_end_pfn = zone_end_pfn(zone);
+
+ if (zone_is_empty(zone) || start_pfn < zone->zone_start_pfn)
+ zone->zone_start_pfn = start_pfn;
+
+ zone->spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - zone->zone_start_pfn;
+}
+
+static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned long start_pfn,
+ unsigned long nr_pages)
+{
+ unsigned long old_end_pfn = pgdat_end_pfn(pgdat);
- /* no zones in use between current zone and target */
- for (i = idx + 1; i < target; i++)
- if (zone_is_initialized(zone - idx + i))
- return false;
+ if (!pgdat->node_spanned_pages || start_pfn < pgdat->node_start_pfn)
+ pgdat->node_start_pfn = start_pfn;
+
+ pgdat->node_spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - pgdat->node_start_pfn;
+}
+
+void move_pfn_range_to_zone(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ struct pglist_data *pgdat = zone->zone_pgdat;
+ int nid = pgdat->node_id;
+ unsigned long flags;
+ unsigned long i;
+
+ if (zone_is_empty(zone))
+ init_currently_empty_zone(zone, start_pfn, nr_pages);
+
+ clear_zone_contiguous(zone);
+
+ /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
+ pgdat_resize_lock(pgdat, &flags);
+ zone_span_writelock(zone);
+ resize_zone_range(zone, start_pfn, nr_pages);
+ zone_span_writeunlock(zone);
+ resize_pgdat_range(pgdat, start_pfn, nr_pages);
+ pgdat_resize_unlock(pgdat, &flags);
+
+ /*
+ * TODO now we have a visible range of pages which are not associated
+ * with their zone properly. Not nice but set_pfnblock_flags_mask
+ * expects the zone spans the pfn range. All the pages in the range
+ * are reserved so nobody should be touching them so we should be safe
+ */
+ memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn, MEMMAP_HOTPLUG);
+ for (i = 0; i < nr_pages; i++) {
+ unsigned long pfn = start_pfn + i;
+ set_page_links(pfn_to_page(pfn), zone_idx(zone), nid, pfn);
}
- if (target < idx) {
- /* pages must be at beginning of current zone */
- if (pfn != zone->zone_start_pfn)
- return false;
+ set_zone_contiguous(zone);
+}
+
+/*
+ * Associates the given pfn range with the given node and the zone appropriate
+ * for the given online type.
+ */
+static struct zone * __meminit move_pfn_range(int online_type, int nid,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct zone *zone = &pgdat->node_zones[ZONE_NORMAL];
- /* no zones in use between current zone and target */
- for (i = target + 1; i < idx; i++)
- if (zone_is_initialized(zone - idx + i))
- return false;
+ if (online_type == MMOP_ONLINE_KEEP) {
+ struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
+ /*
+ * MMOP_ONLINE_KEEP inherits the current zone which is
+ * ZONE_NORMAL by default but we might be within ZONE_MOVABLE
+ * already.
+ */
+ if (zone_intersects(movable_zone, start_pfn, nr_pages))
+ zone = movable_zone;
+ } else if (online_type == MMOP_ONLINE_MOVABLE) {
+ zone = &pgdat->node_zones[ZONE_MOVABLE];
}
- *zone_shift = target - idx;
- return true;
+ move_pfn_range_to_zone(zone, start_pfn, nr_pages);
+ return zone;
}
/* Must be protected by mem_hotplug_begin() */
@@ -1074,29 +1138,16 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
int nid;
int ret;
struct memory_notify arg;
- int zone_shift = 0;
- /*
- * This doesn't need a lock to do pfn_to_page().
- * The section can't be removed here because of the
- * memory_block->state_mutex.
- */
- zone = page_zone(pfn_to_page(pfn));
-
- if ((zone_idx(zone) > ZONE_NORMAL ||
- online_type == MMOP_ONLINE_MOVABLE) &&
- !can_online_high_movable(pfn_to_nid(pfn)))
+ nid = pfn_to_nid(pfn);
+ if (!allow_online_pfn_range(nid, pfn, nr_pages, online_type))
return -EINVAL;
- if (online_type == MMOP_ONLINE_KERNEL) {
- if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift))
- return -EINVAL;
- } else if (online_type == MMOP_ONLINE_MOVABLE) {
- if (!zone_can_shift(pfn, nr_pages, ZONE_MOVABLE, &zone_shift))
- return -EINVAL;
- }
+ if (online_type == MMOP_ONLINE_MOVABLE && !can_online_high_movable(nid))
+ return -EINVAL;
- zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
+ /* associate pfn range with the zone */
+ zone = move_pfn_range(online_type, nid, pfn, nr_pages);
if (!zone)
return -EINVAL;
@@ -1104,8 +1155,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
arg.nr_pages = nr_pages;
node_states_check_changes_online(nr_pages, zone, &arg);
- nid = zone_to_nid(zone);
-
ret = memory_notify(MEM_GOING_ONLINE, &arg);
ret = notifier_to_errno(ret);
if (ret)
diff --git a/mm/sparse.c b/mm/sparse.c
index 6903c8fc3085..d75407882598 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -686,10 +686,9 @@ static void free_map_bootmem(struct page *memmap)
* set. If this is <=0, then that means that the passed-in
* map was not consumed and must be freed.
*/
-int __meminit sparse_add_one_section(struct zone *zone, unsigned long start_pfn)
+int __meminit sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn)
{
unsigned long section_nr = pfn_to_section_nr(start_pfn);
- struct pglist_data *pgdat = zone->zone_pgdat;
struct mem_section *ms;
struct page *memmap;
unsigned long *usemap;
--
2.11.0
--
Michal Hocko
SUSE Labs
On Mon 10-04-17 12:20:02, Jerome Glisse wrote:
> On Mon, Apr 10, 2017 at 01:03:46PM +0200, Michal Hocko wrote:
[...]
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 342332f29364..1570b3eea493 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -493,7 +493,7 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
> > }
> >
> > static int __meminit __add_section(int nid, struct zone *zone,
> > - unsigned long phys_start_pfn)
> > + unsigned long phys_start_pfn, bool want_memblock)
> > {
> > int ret;
> >
> > @@ -510,7 +510,10 @@ static int __meminit __add_section(int nid, struct zone *zone,
> > if (ret < 0)
> > return ret;
> >
> > - return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
> > + if (want_memblock)
> > + ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
> > +
> > + return ret;
> > }
>
> The above is wrong for ZONE_DEVICE: sparse_add_one_section() will return a
> positive value (on success), thus ret > 0, and other functions in the hotplug
> path will interpret a positive value as an error.
>
> I suggest something like:
> if (!want_memblock)
> return 0;
>
> return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
> }
You are right! I will fold the following. Thanks!
---
>From cc44b4a465b889910e74b3ccc2d12f4dd1c79065 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Mon, 10 Apr 2017 18:29:11 +0200
Subject: [PATCH] fold me "mm, memory_hotplug: get rid of
is_zone_device_section"
- return 0 when want_memblock == 0 from __add_section as per Jerome Glisse
---
mm/memory_hotplug.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 035165ceefef..9942d8937d0a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -299,7 +299,8 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat)
}
#endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */
-static int __meminit __add_section(int nid, unsigned long phys_start_pfn, bool want_memblock)
+static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
+ bool want_memblock)
{
int ret;
int i;
@@ -328,10 +329,10 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn, bool w
SetPageReserved(page);
}
- if (want_memblock)
- ret = register_new_memory(nid, __pfn_to_section(phys_start_pfn));
+ if (!want_memblock)
+ return 0;
- return ret;
+ return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
}
/*
--
2.11.0
--
Michal Hocko
SUSE Labs
On Mon, Apr 10, 2017 at 01:03:42PM +0200, Michal Hocko wrote:
> Hi,
> The last version of this series has been posted here [1]. It has seen
> some more serious testing (thanks to Reza Arbab) and fixes for the found
> issues. I have also decided to drop patch 1 [2] because it turned out to
> be more complicated than I initially thought [3]. Few more patches were
> added to deal with expectation on zone/node initialization.
>
> I have rebased on top of the current mmotm-2017-04-07-15-53. It
> conflicts with HMM because it touches memory hotplug as
> well. We have discussed [4] with Jérôme and he agreed to
> rebase on top of this rework [5] so I have reverted his series
> before applyig mine. I will help him to resolve the resulting
> conflicts. You can find the whole series including the HMM revers in
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git branch
> attempts/rewrite-mem_hotplug
>
So updated HMM patchset :
https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v20
I am not posting yet as it seems there are a couple of things you need to
fix in your patchset first. However, if you could review:
https://cgit.freedesktop.org/~glisse/linux/commit/?h=hmm-v20&id=84fc68534e781cf6125d02b3bfdba4a51e82d9c9
As it was your idea, I just want to make sure I didn't denature
it :)
Also, as a side note, v20 fixes a build issue by restricting HMM to x86-64,
which is safer than pretending it can be used on any random arch,
as the build failures I am getting clearly show that things I assumed to
be true on all arches aren't.
Cheers,
Jérôme
On Mon 10-04-17 12:35:53, Jerome Glisse wrote:
> On Mon, Apr 10, 2017 at 01:03:42PM +0200, Michal Hocko wrote:
> > Hi,
> > The last version of this series has been posted here [1]. It has seen
> > some more serious testing (thanks to Reza Arbab) and fixes for the found
> > issues. I have also decided to drop patch 1 [2] because it turned out to
> > be more complicated than I initially thought [3]. Few more patches were
> > added to deal with expectation on zone/node initialization.
> >
> > I have rebased on top of the current mmotm-2017-04-07-15-53. It
> > conflicts with HMM because it touches memory hotplug as
> > well. We have discussed [4] with Jérôme and he agreed to
> > rebase on top of this rework [5] so I have reverted his series
> > before applyig mine. I will help him to resolve the resulting
> > conflicts. You can find the whole series including the HMM revers in
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git branch
> > attempts/rewrite-mem_hotplug
> >
>
> So updated HMM patchset :
> https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v20
>
> I am not posting yet as it seems there is couple thing you need to
> fix in your patchset first. However if you could review :
I assume I will resubmit v3 after all the feedback is addressed here.
> https://cgit.freedesktop.org/~glisse/linux/commit/?h=hmm-v20&id=84fc68534e781cf6125d02b3bfdba4a51e82d9c9
>
> As it was your idea, i just want to make sure i didn't denatured
> it :)
OK, looks good to me. I would be more specific in the changelog though.
"
mm, memory_hotplug: introduce add_pages
There are new users of memory hotplug emerging. Some of them require
a different subset of arch_add_memory. There are some which only require
allocation of struct pages without mapping those pages to the kernel
address space. We currently have __add_pages for that purpose. But this
is rather low level and not very suitable for code outside of the
memory hotplug code. E.g. x86_64 wants to update max_pfn which should be
done by the caller. Introduce add_pages() which should care about those
details if they are needed. Each architecture should define its
implementation and select CONFIG_ARCH_HAS_ADD_PAGES. All others use
the currently existing __add_pages.
"
--
Michal Hocko
SUSE Labs
On Mon, 2017-04-10 at 12:35 -0400, Jerome Glisse wrote:
> On Mon, Apr 10, 2017 at 01:03:42PM +0200, Michal Hocko wrote:
> > Hi,
> > The last version of this series has been posted here [1]. It has seen
> > some more serious testing (thanks to Reza Arbab) and fixes for the found
> > issues. I have also decided to drop patch 1 [2] because it turned out to
> > be more complicated than I initially thought [3]. Few more patches were
> > added to deal with expectation on zone/node initialization.
> >
> > I have rebased on top of the current mmotm-2017-04-07-15-53. It
> > conflicts with HMM because it touches memory hotplug as
> > well. We have discussed [4] with Jérôme and he agreed to
> > rebase on top of this rework [5] so I have reverted his series
> > before applyig mine. I will help him to resolve the resulting
> > conflicts. You can find the whole series including the HMM revers in
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git branch
> > attempts/rewrite-mem_hotplug
> >
>
> So updated HMM patchset :
> https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v20
>
> I am not posting yet as it seems there is couple thing you need to
> fix in your patchset first. However if you could review :
>
> https://cgit.freedesktop.org/~glisse/linux/commit/?h=hmm-v20&id=84fc68534e781cf6125d02b3bfdba4a51e82d9c9
>
> As it was your idea, i just want to make sure i didn't denatured
> it :)
>
> Also as side note, v20 fix build issue by restricting HMM to x86-64
> which is safer than pretending this can be use on any random arch
> as build failures i am getting clearly shows that thing i assumed to
> be true on all arch aren't.
In that case could you please document what an arch needs to do to enable
HMM? What are the dependencies and requirements?
Balbir Singh.
On Mon, 10 Apr 2017 18:09:41 +0200
Michal Hocko <[email protected]> wrote:
> On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
> [...]
> > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
>
> are you sure both of them should be node=0?
>
> What is the full comman line you use?
CLI for issue 1, 3:
-enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
-drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
-object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
-device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
for issue2:
-enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
-drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
-object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
-device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
On Mon, 10 Apr 2017 16:56:39 +0200
Michal Hocko <[email protected]> wrote:
> On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
> [...]
> > Hi Michal,
> >
> > I've given series some dumb testing, see below for unexpected changes I've noticed.
> >
> > Using the same CLI as above plus hotpluggable dimms present at startup
> > (it still uses hotplug path as dimms aren't reported in e820)
> >
> > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=0
> >
> > so dimm1 => memory3[23] and dimm0 => memory3[45]
> >
> > #issue1:
> > unable to online memblock as NORMAL adjacent to onlined MOVABLE
> >
> > 1: after boot
> > memory32:offline removable: 0 zones: Normal Movable
> > memory33:offline removable: 0 zones: Normal Movable
> > memory34:offline removable: 0 zones: Normal Movable
> > memory35:offline removable: 0 zones: Normal Movable
> >
> > 2: online as movable 1st dimm
> >
> > #echo online_movable > memory32/state
> > #echo online_movable > memory33/state
> >
> > everything is as expected:
> > memory32:online removable: 1 zones: Movable
> > memory33:online removable: 1 zones: Movable
> > memory34:offline removable: 0 zones: Movable
> > memory35:offline removable: 0 zones: Movable
> >
> > 3: try to offline memory32 and online as NORMAL
> >
> > #echo offline > memory32/state
> > memory32:offline removable: 1 zones: Normal Movable
> > memory33:online removable: 1 zones: Movable
> > memory34:offline removable: 0 zones: Movable
> > memory35:offline removable: 0 zones: Movable
>
> OK, this is not expected. We are not shifting zones anymore so the range
> which was online_movable will not become available to the zone Normal.
> So this must be something broken down the show_valid_zones path. I will
> investigate.
>
> >
> > #echo online_kernel > memory32/state
> > write error: Invalid argument
> > // that's not what's expected
>
> this is proper behavior with the current implementation. Does anything
> depend on the zone reusing?
If we didn't have the zone imbalance issue in the design,
then it wouldn't matter, but as it stands it's not a
minor issue.
Consider the following:
one hotplugs some memory and onlines it as movable,
then one needs to hotplug some more, but to do so
one needs more memory from zone NORMAL, and to keep the
zone balance some memory in MOVABLE should be re-onlined
as NORMAL.
On Mon, 2017-04-10 at 13:03 +0200, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> init_currently_empty_zone doesn't have any error to return yet it is
> still an int and callers try to be defensive and try to handle potential
> error. Remove this nonsense and simplify all callers.
>
> This patch shouldn't have any visible effect
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
This makes sense
Acked-by: Balbir Singh <[email protected]>
On Tue 11-04-17 10:01:52, Igor Mammedov wrote:
> On Mon, 10 Apr 2017 16:56:39 +0200
> Michal Hocko <[email protected]> wrote:
[...]
> > > #echo online_kernel > memory32/state
> > > write error: Invalid argument
> > > // that's not what's expected
> >
> > this is proper behavior with the current implementation. Does anything
> > depend on the zone reusing?
> if we didn't have zone imbalance issue in design,
> the it wouldn't matter but as it stands it's not
> minore issue.
>
> Consider following,
> one hotplugs some memory and onlines it as movable,
> then one needs to hotplug some more but to do so
> one one needs more memory from zone NORMAL and to keep
> zone balance some memory in MOVABLE should be reonlined
> as NORMAL
Is this something that we absolutely have to have right _now_? Or are you
OK if I address this in a follow-up series? Because it will make the
current code slightly more complex and to be honest I would rather like
to see this "core" merged first and build more on top.
--
Michal Hocko
SUSE Labs
On Mon 10-04-17 10:43:04, Reza Arbab wrote:
> On Mon, Apr 10, 2017 at 01:03:42PM +0200, Michal Hocko wrote:
> >This patchset aims at making the onlining semantic more usable. First of
> >all it allows to online memory movable as long as it doesn't clash with
> >the existing ZONE_NORMAL. That means that ZONE_NORMAL and ZONE_MOVABLE
> >cannot overlap. Currently I preserve the original ordering semantic so the
> >zone always precedes the movable zone but I have plans to remove this
> >restriction in future because it is not really necessary.
>
> Thanks for addressing my issues. I see Igor found a few other things to
> square away, but FWIW,
>
> Tested-by: Reza Arbab <[email protected]>
OK, I have put this to "[PATCH 6/9] mm, memory_hotplug: do not associate
hotadded memory to zones until online" because that is the core of the
change that you have been testing. Let me know if you want the tag to
other patches as well.
--
Michal Hocko
SUSE Labs
On Tue 11-04-17 08:38:34, Igor Mammedov wrote:
> for issue2:
> -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
> -drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
> -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
I must be doing something wrong here...
qemu-system-x86_64 -enable-kvm -monitor telnet:127.0.0.1:9999,server,nowait -net nic -net user,hostfwd=tcp:127.0.0.1:5555-:22 -serial file:test.qcow_serial.log -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1 -drive file=test.qcow,if=ide,index=0
for i in $(seq 0 3)
do
sh probe_memblock.sh $i
done
# ls -l /sys/devices/system/memory/memory3?/node*
lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory32/node0 -> ../../node/node0
lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory33/node0 -> ../../node/node0
lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory34/node0 -> ../../node/node0
lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory35/node0 -> ../../node/node0
all of them end in the same node0
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory35/valid_zones:Normal Movable
--
Michal Hocko
SUSE Labs
On Tue, 11 Apr 2017 10:41:42 +0200
Michal Hocko <[email protected]> wrote:
> On Tue 11-04-17 10:01:52, Igor Mammedov wrote:
> > On Mon, 10 Apr 2017 16:56:39 +0200
> > Michal Hocko <[email protected]> wrote:
> [...]
> > > > #echo online_kernel > memory32/state
> > > > write error: Invalid argument
> > > > // that's not what's expected
> > >
> > > this is proper behavior with the current implementation. Does anything
> > > depend on the zone reusing?
> > if we didn't have zone imbalance issue in design,
> > the it wouldn't matter but as it stands it's not
> > minore issue.
> >
> > Consider following,
> > one hotplugs some memory and onlines it as movable,
> > then one needs to hotplug some more but to do so
> > one one needs more memory from zone NORMAL and to keep
> > zone balance some memory in MOVABLE should be reonlined
> > as NORMAL
>
> Is this something that we absolutely have to have right _now_? Or are you
> OK if I address this in follow up series? Because it will make the
> current code slightly more complex and to be honest I would rather like
> to see this "core" merge and build more on top.
It's fine by me to do it on top.
On Tue, 11 Apr 2017 11:23:07 +0200
Michal Hocko <[email protected]> wrote:
> On Tue 11-04-17 08:38:34, Igor Mammedov wrote:
> > for issue2:
> > -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
> > -drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
> > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
>
> I must be doing something wrong here...
> qemu-system-x86_64 -enable-kvm -monitor telnet:127.0.0.1:9999,server,nowait -net nic -net user,hostfwd=tcp:127.0.0.1:5555-:22 -serial file:test.qcow_serial.log -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1 -drive file=test.qcow,if=ide,index=0
>
> for i in $(seq 0 3)
> do
> sh probe_memblock.sh $i
> done
dimm to node mapping comes from ACPI subsystem (_PXM object in memory device),
which adds memory blocks automatically on hotplug.
you probably don't have ACPI_HOTPLUG_MEMORY config option enabled.
>
> # ls -l /sys/devices/system/memory/memory3?/node*
> lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory32/node0 -> ../../node/node0
> lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory33/node0 -> ../../node/node0
> lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory34/node0 -> ../../node/node0
> lrwxrwxrwx 1 root root 0 Apr 11 11:21 /sys/devices/system/memory/memory35/node0 -> ../../node/node0
>
> all of them end in the same node0
>
> # grep . /sys/devices/system/memory/memory3?/valid_zones
> /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> /sys/devices/system/memory/memory35/valid_zones:Normal Movable
>
On Tue 11-04-17 11:53:22, Igor Mammedov wrote:
> On Tue, 11 Apr 2017 10:41:42 +0200
> Michal Hocko <[email protected]> wrote:
>
> > On Tue 11-04-17 10:01:52, Igor Mammedov wrote:
> > > On Mon, 10 Apr 2017 16:56:39 +0200
> > > Michal Hocko <[email protected]> wrote:
> > [...]
> > > > > #echo online_kernel > memory32/state
> > > > > write error: Invalid argument
> > > > > // that's not what's expected
> > > >
> > > > this is proper behavior with the current implementation. Does anything
> > > > depend on the zone reusing?
> > > if we didn't have zone imbalance issue in design,
> > > the it wouldn't matter but as it stands it's not
> > > minore issue.
> > >
> > > Consider following,
> > > one hotplugs some memory and onlines it as movable,
> > > then one needs to hotplug some more but to do so
> > > one one needs more memory from zone NORMAL and to keep
> > > zone balance some memory in MOVABLE should be reonlined
> > > as NORMAL
> >
> > Is this something that we absolutely have to have right _now_? Or are you
> > OK if I address this in follow up series? Because it will make the
> > current code slightly more complex and to be honest I would rather like
> > to see this "core" merge and build more on top.
>
> It's fine by me to do it on top.
OK, I will document this in the changelog of patch 6.
"
Please note that this patch also changes the original behavior: offlining
a memory block adjacent to another zone (Normal vs. Movable) used to allow
changing its movable type afterwards. This will be handled later.
"
--
Michal Hocko
SUSE Labs
On Tue 11-04-17 11:59:31, Igor Mammedov wrote:
> On Tue, 11 Apr 2017 11:23:07 +0200
> Michal Hocko <[email protected]> wrote:
>
> > On Tue 11-04-17 08:38:34, Igor Mammedov wrote:
> > > for issue2:
> > > -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
> > > -drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
> > > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
> >
> > I must be doing something wrong here...
> > qemu-system-x86_64 -enable-kvm -monitor telnet:127.0.0.1:9999,server,nowait -net nic -net user,hostfwd=tcp:127.0.0.1:5555-:22 -serial file:test.qcow_serial.log -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1 -drive file=test.qcow,if=ide,index=0
> >
> > for i in $(seq 0 3)
> > do
> > sh probe_memblock.sh $i
> > done
>
> dimm to node mapping comes from ACPI subsystem (_PXM object in memory device),
> which adds memory blocks automatically on hotplug.
Hmm, memory_probe_store relies on memory_add_physaddr_to_nid which in
turn relies on numa_meminfo. I am not familiar with the initialization
and got lost in the code rather quickly, but I assumed this should get
the proper information from the ACPI subsystem. I will have to double
check.
> you probably don't have ACPI_HOTPLUG_MEMORY config option enabled.
Yes that is the case and enabling it made all 4 memblocks available
and associated with the proper node
# ls -l /sys/devices/system/memory/memory3?/node*
lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory32/node0 -> ../../node/node0
lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory33/node0 -> ../../node/node0
lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory34/node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory35/node1 -> ../../node/node1
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory35/valid_zones:Normal Movable
I can even reproduce your problem
# echo online_movable > /sys/devices/system/memory/memory33/state
# echo online > /sys/devices/system/memory/memory32/state
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Movable
/sys/devices/system/memory/memory33/valid_zones:Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
/sys/devices/system/memory/memory35/valid_zones:Normal Movable
I will investigate this
--
Michal Hocko
SUSE Labs
On Tue 11-04-17 13:01:43, Michal Hocko wrote:
> On Tue 11-04-17 11:59:31, Igor Mammedov wrote:
> > On Tue, 11 Apr 2017 11:23:07 +0200
> > Michal Hocko <[email protected]> wrote:
> >
> > > On Tue 11-04-17 08:38:34, Igor Mammedov wrote:
> > > > for issue2:
> > > > -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
> > > > -drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
> > > > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > > > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
> > >
> > > I must be doing something wrong here...
> > > qemu-system-x86_64 -enable-kvm -monitor telnet:127.0.0.1:9999,server,nowait -net nic -net user,hostfwd=tcp:127.0.0.1:5555-:22 -serial file:test.qcow_serial.log -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1 -drive file=test.qcow,if=ide,index=0
> > >
> > > for i in $(seq 0 3)
> > > do
> > > sh probe_memblock.sh $i
> > > done
> >
> > dimm to node mapping comes from ACPI subsystem (_PXM object in memory device),
> > which adds memory blocks automatically on hotplug.
>
> Hmm, memory_probe_store relies on memory_add_physaddr_to_nid which in
> turn relies on numa_meminfo. I am not familiar with the intialization
> and got lost in in the code rather quickly but I assumed this should get
> the proper information from the ACPI subsystem. I will have to double
> check.
>
> > you probably don't have ACPI_HOTPLUG_MEMORY config option enabled.
>
> Yes that is the case and enabling it made all 4 memblocks available
> and associated with the proper node
> # ls -l /sys/devices/system/memory/memory3?/node*
> lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory32/node0 -> ../../node/node0
> lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory33/node0 -> ../../node/node0
> lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory34/node1 -> ../../node/node1
> lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory35/node1 -> ../../node/node1
>
> # grep . /sys/devices/system/memory/memory3?/valid_zones
> /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> /sys/devices/system/memory/memory35/valid_zones:Normal Movable
>
> I can even reproduce your problem
> # echo online_movable > /sys/devices/system/memory/memory33/state
> # echo online > /sys/devices/system/memory/memory32/state
> # grep . /sys/devices/system/memory/memory3?/valid_zones
> /sys/devices/system/memory/memory32/valid_zones:Movable
> /sys/devices/system/memory/memory33/valid_zones:Movable
> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> /sys/devices/system/memory/memory35/valid_zones:Normal Movable
>
> I will investigate this
Dang, guess what. It is a similar type of bug to the one I've fixed in
show_valid_zones [1] already.
[1] http://lkml.kernel.org/r/[email protected]
---
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ec2f987ec549..410c7ccb74fb 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -541,7 +541,7 @@ static inline bool zone_intersects(struct zone *zone,
{
if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
return true;
- if (start_pfn + nr_pages > start_pfn && !zone_is_empty(zone))
+ if (start_pfn + nr_pages > zone->zone_start_pfn && !zone_is_empty(zone))
return true;
return false;
}
I have decided to make it more readable and do the zone_is_empty check
first. Everything is in my git tree, attempts/rewrite-mem_hotplug branch.
I still have to test it but I believe this is the culprit here.
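For illustration, the reordered check could read roughly like this (a
sketch of the intended shape only; the actual version is in the git branch
above):

static inline bool zone_intersects(struct zone *zone,
                unsigned long start_pfn, unsigned long nr_pages)
{
        if (zone_is_empty(zone))
                return false;
        if (start_pfn >= zone_end_pfn(zone) ||
                        start_pfn + nr_pages <= zone->zone_start_pfn)
                return false;

        return true;
}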
--
Michal Hocko
SUSE Labs
On Tue 11-04-17 13:38:16, Michal Hocko wrote:
> On Tue 11-04-17 13:01:43, Michal Hocko wrote:
> > On Tue 11-04-17 11:59:31, Igor Mammedov wrote:
> > > On Tue, 11 Apr 2017 11:23:07 +0200
> > > Michal Hocko <[email protected]> wrote:
> > >
> > > > On Tue 11-04-17 08:38:34, Igor Mammedov wrote:
> > > > > for issue2:
> > > > > -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node \
> > > > > -drive if=virtio,file=disk.img -kernel bzImage -append 'root=/dev/vda1' \
> > > > > -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M \
> > > > > -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1
> > > >
> > > > I must be doing something wrong here...
> > > > qemu-system-x86_64 -enable-kvm -monitor telnet:127.0.0.1:9999,server,nowait -net nic -net user,hostfwd=tcp:127.0.0.1:5555-:22 -serial file:test.qcow_serial.log -enable-kvm -m 2G,slots=4,maxmem=4G -smp 4 -numa node -numa node -object memory-backend-ram,id=mem1,size=256M -object memory-backend-ram,id=mem0,size=256M -device pc-dimm,id=dimm1,memdev=mem1,slot=1,node=0 -device pc-dimm,id=dimm0,memdev=mem0,slot=0,node=1 -drive file=test.qcow,if=ide,index=0
> > > >
> > > > for i in $(seq 0 3)
> > > > do
> > > > sh probe_memblock.sh $i
> > > > done
> > >
> > > dimm to node mapping comes from ACPI subsystem (_PXM object in memory device),
> > > which adds memory blocks automatically on hotplug.
> >
> > Hmm, memory_probe_store relies on memory_add_physaddr_to_nid which in
> > turn relies on numa_meminfo. I am not familiar with the intialization
> > and got lost in in the code rather quickly but I assumed this should get
> > the proper information from the ACPI subsystem. I will have to double
> > check.
> >
> > > you probably don't have ACPI_HOTPLUG_MEMORY config option enabled.
> >
> > Yes that is the case and enabling it made all 4 memblocks available
> > and associated with the proper node
> > # ls -l /sys/devices/system/memory/memory3?/node*
> > lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory32/node0 -> ../../node/node0
> > lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory33/node0 -> ../../node/node0
> > lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory34/node1 -> ../../node/node1
> > lrwxrwxrwx 1 root root 0 Apr 11 12:56 /sys/devices/system/memory/memory35/node1 -> ../../node/node1
> >
> > # grep . /sys/devices/system/memory/memory3?/valid_zones
> > /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory35/valid_zones:Normal Movable
> >
> > I can even reproduce your problem
> > # echo online_movable > /sys/devices/system/memory/memory33/state
> > # echo online > /sys/devices/system/memory/memory32/state
> > # grep . /sys/devices/system/memory/memory3?/valid_zones
> > /sys/devices/system/memory/memory32/valid_zones:Movable
> > /sys/devices/system/memory/memory33/valid_zones:Movable
> > /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory35/valid_zones:Normal Movable
> >
> > I will investigate this
>
> Dang, guess what. It is a similar type bug I've fixed in
> show_valid_zones [1] already.
>
> [1] http://lkml.kernel.org/r/[email protected]
> ---
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index ec2f987ec549..410c7ccb74fb 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -541,7 +541,7 @@ static inline bool zone_intersects(struct zone *zone,
> {
> if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
> return true;
> - if (start_pfn + nr_pages > start_pfn && !zone_is_empty(zone))
> + if (start_pfn + nr_pages > zone->zone_start_pfn && !zone_is_empty(zone))
> return true;
> return false;
> }
>
> I have decided to make it more readable and did zone_is_empty check
> first. Everything is in my git tree attempts/rewrite-mem_hotplug branch.
> I have to test it but I believe this is the culprit here.
OK, tested and it seems to be fixed. Thanks again for your testing and
the kvm configuration which made my testing much easier (probing and
adding areas from the qemu monitor was just a PITA)!
--
Michal Hocko
SUSE Labs
All the reported issues seem to be fixed and pushed to my git tree,
attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
feedback and then repost for inclusion. I would really appreciate
more testing/review!
--
Michal Hocko
SUSE Labs
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> init_currently_empty_zone doesn't have any error to return yet it is
> still an int and callers try to be defensive and try to handle potential
> error. Remove this nonsense and simplify all callers.
>
> This patch shouldn't have any visible effect
>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> the primary purpose of this helper is to query the node state so use
> the node id directly. This is a preparatory patch for later changes.
>
> This shouldn't introduce any functional change
>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> c04fc586c1a4 ("mm: show node to memory section relationship with
> symlinks in sysfs") has added means to export memblock<->node
> association into the sysfs. It has also introduced get_nid_for_pfn
> which is a rather confusing counterpart of pfn_to_nid which checks also
> whether the pfn page is already initialized (page_initialized). This
> is done by checking page::lru != NULL which doesn't make any sense at
> all. Nothing in this path really relies on the lru list being used or
> initialized. Just remove it because this will become a problem with
> later patches.
>
> Thanks to Reza Arbab for testing which revealed this to be a problem
> (http://lkml.kernel.org/r/[email protected])
>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> device memory hotplug hooks into regular memory hotplug only half way.
> It needs memory sections to track struct pages but there is no
> need/desire to associate those sections with memory blocks and export
> them to the userspace via sysfs because they cannot be onlined anyway.
>
> This is currently expressed by for_device argument to arch_add_memory
> which then makes sure to associate the given memory range with
> ZONE_DEVICE. register_new_memory then relies on is_zone_device_section
> to distinguish special memory hotplug from the regular one. While this
> works now, later patches in this series want to move __add_zone outside
> of arch_add_memory path so we have to come up with something else.
>
> Add want_memblock down the __add_pages path and use it to control
> whether the section->memblock association should be done. arch_add_memory
> then just trivially want memblock for everything but for_device hotplug.
>
> remove_memory_section doesn't need is_zone_device_section either. We can
> simply skip all the memblock specific cleanup if there is no memblock
> for the given section.
>
> This shouldn't introduce any functional change.
>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
For the fixed version:
Acked-by: Vlastimil Babka <[email protected]>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> Memory hotplug (add_memory_resource) has to reinitialize node
> infrastructure if the node is offline (one which went through the
> complete add_memory(); remove_memory() cycle). That involves node
> registration to the kobj infrastructure (register_node), the proper
> association with cpus (register_cpu_under_node) and finally creation of
> node<->memblock symlinks (link_mem_sections).
>
> The last part requires to know node_start_pfn and node_spanned_pages
> which we currently have but a leter patch will postpone this
> initialization to the onlining phase which happens later. In fact we do
> not need to rely on the early pgdat initialization even now because the
> currently hot added pfn range is currently known.
>
> Split register_one_node into core which does all the common work for
> the boot time NUMA initialization and the hotplug (__register_one_node).
> register_one_node keeps the full initialization while hotplug calls
> __register_one_node and manually calls link_mem_sections for the proper
> range.
>
> This shouldn't introduce any functional change.
>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
nit:
> @@ -1387,7 +1387,22 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> node_set_online(nid);
>
> if (new_node) {
> - ret = register_one_node(nid);
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> +
> + ret = __register_one_node(nid);
> + if (ret)
> + goto register_fail;
> +
> + /*
> + * link memory sections under this node. This is already
> + * done when creatig memory section in register_new_memory
> + * but that depends to have the node registered so offline
> + * nodes have to go through register_node.
> + * TODO clean up this mess.
Is this a work-in-progress or final TODO? :)
> + */
> + ret = link_mem_sections(nid, start_pfn, nr_pages);
> +register_fail:
> /*
> * If sysfs file of new node can't create, cpu on the node
> * can't be hot-added. There is no rollback way now.
>
On Thu 13-04-17 16:05:17, Vlastimil Babka wrote:
> On 04/10/2017 01:03 PM, Michal Hocko wrote:
> > From: Michal Hocko <[email protected]>
> >
> > Memory hotplug (add_memory_resource) has to reinitialize node
> > infrastructure if the node is offline (one which went through the
> > complete add_memory(); remove_memory() cycle). That involves node
> > registration to the kobj infrastructure (register_node), the proper
> > association with cpus (register_cpu_under_node) and finally creation of
> > node<->memblock symlinks (link_mem_sections).
> >
> > The last part requires to know node_start_pfn and node_spanned_pages
> > which we currently have but a leter patch will postpone this
> > initialization to the onlining phase which happens later. In fact we do
> > not need to rely on the early pgdat initialization even now because the
> > currently hot added pfn range is currently known.
> >
> > Split register_one_node into core which does all the common work for
> > the boot time NUMA initialization and the hotplug (__register_one_node).
> > register_one_node keeps the full initialization while hotplug calls
> > __register_one_node and manually calls link_mem_sections for the proper
> > range.
> >
> > This shouldn't introduce any functional change.
> >
> > Signed-off-by: Michal Hocko <[email protected]>
>
> Acked-by: Vlastimil Babka <[email protected]>
Thanks!
> nit:
> > @@ -1387,7 +1387,22 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> > node_set_online(nid);
> >
> > if (new_node) {
> > - ret = register_one_node(nid);
> > + unsigned long start_pfn = start >> PAGE_SHIFT;
> > + unsigned long nr_pages = size >> PAGE_SHIFT;
> > +
> > + ret = __register_one_node(nid);
> > + if (ret)
> > + goto register_fail;
> > +
> > + /*
> > + * link memory sections under this node. This is already
> > + * done when creatig memory section in register_new_memory
> > + * but that depends to have the node registered so offline
> > + * nodes have to go through register_node.
> > + * TODO clean up this mess.
>
> Is this a work-in-progress or final TODO? :)
I do not plan to address it in this series, but I will revisit it later.
There are more like this in other patches.
--
Michal Hocko
SUSE Labs
On 04/10/2017 07:03 AM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> init_currently_empty_zone doesn't have any error to return yet it is
> still an int and callers try to be defensive and try to handle potential
> error. Remove this nonsense and simplify all callers.
>
> This patch shouldn't have any visible effect
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
Reviewed-by: Yasuaki Ishimatsu <[email protected]>
Thanks,
Yasuaki Ishimatsu
> include/linux/mmzone.h | 2 +-
> mm/memory_hotplug.c | 23 +++++------------------
> mm/page_alloc.c | 8 ++------
> 3 files changed, 8 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index ebaccd4e7d8c..0fc121bbf4ff 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -771,7 +771,7 @@ enum memmap_context {
> MEMMAP_EARLY,
> MEMMAP_HOTPLUG,
> };
> -extern int init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
> +extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
> unsigned long size);
>
> extern void lruvec_init(struct lruvec *lruvec);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 257166ebdff0..9ed251811ec3 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -347,27 +347,20 @@ static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
> set_page_links(pfn_to_page(pfn), zid, nid, pfn);
> }
>
> -/* Can fail with -ENOMEM from allocating a wait table with vmalloc() or
> - * alloc_bootmem_node_nopanic()/memblock_virt_alloc_node_nopanic() */
> -static int __ref ensure_zone_is_initialized(struct zone *zone,
> +static void __ref ensure_zone_is_initialized(struct zone *zone,
> unsigned long start_pfn, unsigned long num_pages)
> {
> if (!zone_is_initialized(zone))
> - return init_currently_empty_zone(zone, start_pfn, num_pages);
> -
> - return 0;
> + init_currently_empty_zone(zone, start_pfn, num_pages);
> }
>
> static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
> unsigned long start_pfn, unsigned long end_pfn)
> {
> - int ret;
> unsigned long flags;
> unsigned long z1_start_pfn;
>
> - ret = ensure_zone_is_initialized(z1, start_pfn, end_pfn - start_pfn);
> - if (ret)
> - return ret;
> + ensure_zone_is_initialized(z1, start_pfn, end_pfn - start_pfn);
>
> pgdat_resize_lock(z1->zone_pgdat, &flags);
>
> @@ -403,13 +396,10 @@ static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
> static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
> unsigned long start_pfn, unsigned long end_pfn)
> {
> - int ret;
> unsigned long flags;
> unsigned long z2_end_pfn;
>
> - ret = ensure_zone_is_initialized(z2, start_pfn, end_pfn - start_pfn);
> - if (ret)
> - return ret;
> + ensure_zone_is_initialized(z2, start_pfn, end_pfn - start_pfn);
>
> pgdat_resize_lock(z1->zone_pgdat, &flags);
>
> @@ -480,12 +470,9 @@ static int __meminit __add_zone(struct zone *zone, unsigned long phys_start_pfn)
> int nid = pgdat->node_id;
> int zone_type;
> unsigned long flags, pfn;
> - int ret;
>
> zone_type = zone - pgdat->node_zones;
> - ret = ensure_zone_is_initialized(zone, phys_start_pfn, nr_pages);
> - if (ret)
> - return ret;
> + ensure_zone_is_initialized(zone, phys_start_pfn, nr_pages);
>
> pgdat_resize_lock(zone->zone_pgdat, &flags);
> grow_zone_span(zone, phys_start_pfn, phys_start_pfn + nr_pages);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9c587000d408..0cacba69ab04 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5517,7 +5517,7 @@ static __meminit void zone_pcp_init(struct zone *zone)
> zone_batchsize(zone));
> }
>
> -int __meminit init_currently_empty_zone(struct zone *zone,
> +void __meminit init_currently_empty_zone(struct zone *zone,
> unsigned long zone_start_pfn,
> unsigned long size)
> {
> @@ -5535,8 +5535,6 @@ int __meminit init_currently_empty_zone(struct zone *zone,
>
> zone_init_free_lists(zone);
> zone->initialized = 1;
> -
> - return 0;
> }
>
> #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> @@ -5999,7 +5997,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
> {
> enum zone_type j;
> int nid = pgdat->node_id;
> - int ret;
>
> pgdat_resize_init(pgdat);
> #ifdef CONFIG_NUMA_BALANCING
> @@ -6081,8 +6078,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>
> set_pageblock_order();
> setup_usemap(pgdat, zone, zone_start_pfn, size);
> - ret = init_currently_empty_zone(zone, zone_start_pfn, size);
> - BUG_ON(ret);
> + init_currently_empty_zone(zone, zone_start_pfn, size);
> memmap_init(size, nid, j, zone_start_pfn);
> }
> }
>
On 04/10/2017 07:03 AM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> the primary purpose of this helper is to query the node state so use
> the node id directly. This is a preparatory patch for later changes.
>
> This shouldn't introduce any functional change
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
Reviewed-by: Yasuaki Ishimatsu <[email protected]>
Thanks,
Yasuaki Ishimatsu
> mm/memory_hotplug.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9ed251811ec3..342332f29364 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -940,15 +940,15 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have
> * normal memory.
> */
> -static bool can_online_high_movable(struct zone *zone)
> +static bool can_online_high_movable(int nid)
> {
> return true;
> }
> #else /* CONFIG_MOVABLE_NODE */
> /* ensure every online node has NORMAL memory */
> -static bool can_online_high_movable(struct zone *zone)
> +static bool can_online_high_movable(int nid)
> {
> - return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
> + return node_state(nid, N_NORMAL_MEMORY);
> }
> #endif /* CONFIG_MOVABLE_NODE */
>
> @@ -1082,7 +1082,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>
> if ((zone_idx(zone) > ZONE_NORMAL ||
> online_type == MMOP_ONLINE_MOVABLE) &&
> - !can_online_high_movable(zone))
> + !can_online_high_movable(pfn_to_nid(pfn)))
> return -EINVAL;
>
> if (online_type == MMOP_ONLINE_KERNEL) {
>
Hi,
here are 3 more preparatory patches which I meant to send on Thursday but
forgot... After more thinking about pfn walkers I have realized that
the current code doesn't check offline holes in zones. From a quick
review that doesn't seem to be a problem currently. Pfn walkers can race
with memory offlining, and with the original hotplug implementation those
offline pages can change the zone, but I wasn't able to find any serious
problem other than small confusion. The new hotplug code will not have
any valid zone, though, so those code paths should check PageReserved
to rule out offline holes. I hope I have addressed all of them in these 3
patches. I would appreciate it if Vlastimil and Jonsoo could double check
after me.
From: Michal Hocko <[email protected]>
__reset_isolation_suitable walks the whole zone pfn range and it tries
to jump over holes by checking the zone for each page. It might still
stumble over offline pages, though. Skip those by checking PageReserved.
Signed-off-by: Michal Hocko <[email protected]>
---
mm/compaction.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/compaction.c b/mm/compaction.c
index de64dedefe0e..df4156d8b037 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -239,6 +239,8 @@ static void __reset_isolation_suitable(struct zone *zone)
continue;
page = pfn_to_page(pfn);
+ if (PageReserved(page))
+ continue;
if (zone != page_zone(page))
continue;
--
2.11.0
From: Michal Hocko <[email protected]>
__first_valid_page skips over invalid pfns in the range but it might
still stumble over offline pages. At least start_isolate_page_range
will mark those via set_migratetype_isolate. This doesn't represent
any immediate problem AFAICS because alloc_contig_range will fail to isolate
those pages, but it relies on a not fully initialized page which will
become a problem later when we stop associating offline pages with zones.
So this is more a preparatory patch than a fix.
Signed-off-by: Michal Hocko <[email protected]>
---
mm/page_isolation.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 5092e4ef00c8..2b958f33a1eb 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -138,12 +138,18 @@ static inline struct page *
__first_valid_page(unsigned long pfn, unsigned long nr_pages)
{
int i;
- for (i = 0; i < nr_pages; i++)
- if (pfn_valid_within(pfn + i))
- break;
- if (unlikely(i == nr_pages))
- return NULL;
- return pfn_to_page(pfn + i);
+
+ for (i = 0; i < nr_pages; i++) {
+ struct page *page;
+
+ if (!pfn_valid_within(pfn + i))
+ continue;
+ page = pfn_to_page(pfn + i);
+ if (PageReserved(page))
+ continue;
+ return page;
+ }
+ return NULL;
}
/*
@@ -184,8 +190,12 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
undo:
for (pfn = start_pfn;
pfn < undo_pfn;
- pfn += pageblock_nr_pages)
- unset_migratetype_isolate(pfn_to_page(pfn), migratetype);
+ pfn += pageblock_nr_pages) {
+ struct page *page = pfn_to_page(pfn);
+ if (PageReserved(page))
+ continue;
+ unset_migratetype_isolate(page, migratetype);
+ }
return -EBUSY;
}
--
2.11.0
From: Michal Hocko <[email protected]>
__pageblock_pfn_to_page has two users currently, set_zone_contiguous
which checks whether the given zone contains holes and
pageblock_pfn_to_page which then carefully returns a first valid
page from the given pfn range for the given zone. This doesn't handle
zones which are not fully populated though. Memory pageblocks can be
offlined or might not have been onlined yet. In such a case the zone
should be considered to have holes, otherwise pfn walkers can touch
and play with offline pages.
Current callers of pageblock_pfn_to_page in compaction seem to work
properly right now because they only isolate PageBuddy
(isolate_freepages_block) or PageLRU resp. __PageMovable
(isolate_migratepages_block), which will always be false for these pages.
It would be safer to skip these pages altogether, though. In order
to do that let's check PageReserved in __pageblock_pfn_to_page because
offline pages are reserved.
Signed-off-by: Michal Hocko <[email protected]>
---
mm/page_alloc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0cacba69ab04..dcbbcfdda60e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1351,6 +1351,8 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
return NULL;
start_page = pfn_to_page(start_pfn);
+ if (PageReserved(start_page))
+ return NULL;
if (page_zone(start_page) != zone)
return NULL;
--
2.11.0
On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> Hi,
> here I 3 more preparatory patches which I meant to send on Thursday but
> forgot... After more thinking about pfn walkers I have realized that
> the current code doesn't check offline holes in zones. From a quick
> review that doesn't seem to be a problem currently. Pfn walkers can race
> with memory offlining and with the original hotplug impementation those
> offline pages can change the zone but I wasn't able to find any serious
> problem other than small confusion. The new hotplug code, will not have
> any valid zone, though so those code paths should check PageReserved
> to rule offline holes. I hope I have addressed all of them in these 3
> patches. I would appreciate if Vlastimil and Jonsoo double check after
> me.
Hello, Michal.
s/Jonsoo/Joonsoo. :)
I'm not sure that it's a good idea to add a PageReserved() check in pfn
walkers. First, this makes the struct page validity check a two step
process, pfn_valid() and then PageReserved(). If we should not use the
struct page in this case, it's better to have pfn_valid() return false
rather than adding a separate check. Anyway, we need to fix more places
(all pfn walkers?) if we want to check validity in two steps.
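Roughly, the two shapes I'm comparing look like this (the helper names
are made up, just for illustration):

static struct page *walker_two_step(unsigned long pfn)
{
	if (!pfn_valid(pfn))
		return NULL;
	/* the second, easy to forget step every walker would need */
	if (PageReserved(pfn_to_page(pfn)))
		return NULL;
	return pfn_to_page(pfn);
}

static struct page *walker_one_step(unsigned long pfn)
{
	/* pfn_valid() itself would already say false for offline holes */
	if (!pfn_valid(pfn))
		return NULL;
	return pfn_to_page(pfn);
}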
The other problem I found is that your change will make some
contiguous zones be considered non-contiguous. Memory allocated
by the memblock API is also marked as PageReserved. If we consider this
a hole, we will set such a zone as non-contiguous.
And, I guess that it's not enough to check PageReserved() in
pageblock_pfn_to_page() in order to skip these pages in compaction. If
holes are in the middle of the pageblock, pageblock_pfn_to_page()
cannot catch them and compaction will use the struct page for this hole.
Therefore, I think that making pfn_valid() return false for not yet
onlined memory is a better solution for this problem. I don't know the
implementation details for hotplug and I haven't seen your recent changes
but we may defer memmap initialization until the zone is determined.
That would make pfn_valid() return false for the un-initialized range.
Thanks.
On Mon 17-04-17 14:47:20, Joonsoo Kim wrote:
> On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> > Hi,
> > here I 3 more preparatory patches which I meant to send on Thursday but
> > forgot... After more thinking about pfn walkers I have realized that
> > the current code doesn't check offline holes in zones. From a quick
> > review that doesn't seem to be a problem currently. Pfn walkers can race
> > with memory offlining and with the original hotplug impementation those
> > offline pages can change the zone but I wasn't able to find any serious
> > problem other than small confusion. The new hotplug code, will not have
> > any valid zone, though so those code paths should check PageReserved
> > to rule offline holes. I hope I have addressed all of them in these 3
> > patches. I would appreciate if Vlastimil and Jonsoo double check after
> > me.
>
> Hello, Michal.
>
> s/Jonsoo/Joonsoo. :)
ups, sorry about that.
> I'm not sure that it's a good idea to add PageResereved() check in pfn
> walkers. First, this makes struct page validity check as two steps,
> pfn_valid() and then PageResereved().
Yes, those are two separate checks because semantically they are
different. Not all pfn walkers care about the online status.
> If we should not use struct page
> in this case, it's better to pfn_valid() returns false rather than
> adding a separate check. Anyway, we need to fix more places (all pfn
> walker?) if we want to check validity by two steps.
Which pfn walkers you have in mind?
> The other problem I found is that your change will makes some
> contiguous zones to be considered as non-contiguous. Memory allocated
> by memblock API is also marked as PageResereved. If we consider this as
> a hole, we will set such a zone as non-contiguous.
Why would that be a problem? We shouldn't touch those pages anyway?
> And, I guess that it's not enough to check PageResereved() in
> pageblock_pfn_to_page() in order to skip these pages in compaction. If
> holes are in the middle of the pageblock, pageblock_pfn_to_page()
> cannot catch it and compaction will use struct page for this hole.
Yes, pageblock_pfn_to_page cannot catch it and it wouldn't with the
current implementation anyway. So the implementation won't be any worse
than the current code. On the other hand offline holes will always
fill whole pageblocks (assuming they do not span multiple
memblocks).
> Therefore, I think that making pfn_valid() return false for not
> onlined memory is a better solution for this problem. I don't know the
> implementation detail for hotplug and I don't see your recent change
> but we may defer memmap initialization until the zone is determined.
> It will make pfn_valid() return false for un-initialized range.
I am not really sure. pfn_valid is used in many contexts and its only
purpose is to tell whether pfn_to_page will return a valid struct page
AFAIU.
I agree that having more checks is more error prone and we can add a
helper pfn_to_valid_page or something similar but I believe we can do
that on top of the current hotplug rework. This would require a non
trivial amount of changes and I believe that a missing check for the
offline holes is not critical - we would (ab)use the lowest zone which
is similar to (ab)using ZONE_NORMAL/MOVABLE with the original code.
Thanks!
--
Michal Hocko
SUSE Labs
On Mon, Apr 10, 2017 at 01:03:46PM +0200, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> device memory hotplug hooks into regular memory hotplug only half way.
> It needs memory sections to track struct pages but there is no
> need/desire to associate those sections with memory blocks and export
> them to the userspace via sysfs because they cannot be onlined anyway.
>
> This is currently expressed by for_device argument to arch_add_memory
> which then makes sure to associate the given memory range with
> ZONE_DEVICE. register_new_memory then relies on is_zone_device_section
> to distinguish special memory hotplug from the regular one. While this
> works now, later patches in this series want to move __add_zone outside
> of arch_add_memory path so we have to come up with something else.
>
> Add want_memblock down the __add_pages path and use it to control
> whether the section->memblock association should be done. arch_add_memory
> then just trivially wants memblock for everything but for_device hotplug.
>
> remove_memory_section doesn't need is_zone_device_section either. We can
> simply skip all the memblock specific cleanup if there is no memblock
> for the given section.
>
> This shouldn't introduce any functional change.
>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> arch/ia64/mm/init.c | 2 +-
> arch/powerpc/mm/mem.c | 2 +-
> arch/s390/mm/init.c | 2 +-
> arch/sh/mm/init.c | 2 +-
> arch/x86/mm/init_32.c | 2 +-
> arch/x86/mm/init_64.c | 2 +-
> drivers/base/memory.c | 22 ++++++++--------------
> include/linux/memory_hotplug.h | 2 +-
> mm/memory_hotplug.c | 11 +++++++----
> 9 files changed, 22 insertions(+), 25 deletions(-)
>
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index 06cdaef54b2e..62085fd902e6 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -657,7 +657,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
>
> zone = pgdat->node_zones +
> zone_for_memory(nid, start, size, ZONE_NORMAL, for_device);
> - ret = __add_pages(nid, zone, start_pfn, nr_pages);
> + ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
>
> if (ret)
> printk("%s: Problem encountered in __add_pages() as ret=%d\n",
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 5f844337de21..ea3e09a62f38 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -149,7 +149,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> zone = pgdata->node_zones +
> zone_for_memory(nid, start, size, 0, for_device);
>
> - return __add_pages(nid, zone, start_pfn, nr_pages);
> + return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
> }
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index bf5b8a0c4ff7..5c84346e5211 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -182,7 +182,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> continue;
> nr_pages = (start_pfn + size_pages > zone_end_pfn) ?
> zone_end_pfn - start_pfn : size_pages;
> - rc = __add_pages(nid, zone, start_pfn, nr_pages);
> + rc = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
> if (rc)
> break;
> start_pfn += nr_pages;
> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
> index 75491862d900..a9d57f75ae8c 100644
> --- a/arch/sh/mm/init.c
> +++ b/arch/sh/mm/init.c
> @@ -498,7 +498,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> ret = __add_pages(nid, pgdat->node_zones +
> zone_for_memory(nid, start, size, ZONE_NORMAL,
> for_device),
> - start_pfn, nr_pages);
> + start_pfn, nr_pages, !for_device);
> if (unlikely(ret))
> printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
>
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index c68078fd06fd..4b0f05328af0 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -834,7 +834,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> unsigned long start_pfn = start >> PAGE_SHIFT;
> unsigned long nr_pages = size >> PAGE_SHIFT;
>
> - return __add_pages(nid, zone, start_pfn, nr_pages);
> + return __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
> }
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 7eef17239378..39cfaee93975 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -652,7 +652,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
>
> init_memory_mapping(start, start + size);
>
> - ret = __add_pages(nid, zone, start_pfn, nr_pages);
> + ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device);
> WARN_ON_ONCE(ret);
>
> /* update max_pfn, max_low_pfn and high_memory */
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index cc4f1d0cbffe..89c15e942852 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -685,14 +685,6 @@ static int add_memory_block(int base_section_nr)
> return 0;
> }
>
> -static bool is_zone_device_section(struct mem_section *ms)
> -{
> - struct page *page;
> -
> - page = sparse_decode_mem_map(ms->section_mem_map, __section_nr(ms));
> - return is_zone_device_page(page);
> -}
> -
> /*
> * need an interface for the VM to add new memory regions,
> * but without onlining it.
> @@ -702,9 +694,6 @@ int register_new_memory(int nid, struct mem_section *section)
> int ret = 0;
> struct memory_block *mem;
>
> - if (is_zone_device_section(section))
> - return 0;
> -
> mutex_lock(&mem_sysfs_mutex);
>
> mem = find_memory_block(section);
> @@ -741,11 +730,16 @@ static int remove_memory_section(unsigned long node_id,
> {
> struct memory_block *mem;
>
> - if (is_zone_device_section(section))
> - return 0;
> -
> mutex_lock(&mem_sysfs_mutex);
> +
> + /*
> + * Some users of the memory hotplug do not want/need memblock to
> + * track all sections. Skip over those.
> + */
> mem = find_memory_block(section);
> + if (!mem)
> + return 0;
> +
Another bug above spotted by Evgeny Baskakov from NVidia, a mutex unlock
is missing, i.e. something like:
if (!mem) {
mutex_unlock(&mem_sysfs_mutex);
return 0;
}
Btw when are you planning on reposting? I was hoping sometime soon
so I can repost HMM on top. I know with the springtime celebration everyone
is out collecting chocolate eggs :)
Cheers,
Jérôme
On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko <[email protected]> wrote:
> All the reported issue seem to be fixed and pushed to my git tree
> attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
> feedback and then repost for the inclusion. I would really appreaciate
> more testing/review!
This still seems to be based on 4.10? It's missing some block-layer
fixes and other things that trigger failures in the nvdimm unit tests.
Can you rebase to a more recent 4.11-rc?
On Mon 17-04-17 14:51:12, Dan Williams wrote:
> On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko <[email protected]> wrote:
> > All the reported issue seem to be fixed and pushed to my git tree
> > attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
> > feedback and then repost for the inclusion. I would really appreaciate
> > more testing/review!
>
> This still seems to be based on 4.10? It's missing some block-layer
> fixes and other things that trigger failures in the nvdimm unit tests.
> Can you rebase to a more recent 4.11-rc?
OK, I will rebase on top of linux-next. This has been based on mmotm
tree so far. Btw. is there anything that would change the current
implementation other than small context tweaks? In other words, do you
see any issues with the current implementation regarding nvdimm's
ZONE_DEVICE usage?
Thanks!
--
Michal Hocko
SUSE Labs
On Mon 17-04-17 16:12:35, Jerome Glisse wrote:
[...]
> > @@ -741,11 +730,16 @@ static int remove_memory_section(unsigned long node_id,
> > {
> > struct memory_block *mem;
> >
> > - if (is_zone_device_section(section))
> > - return 0;
> > -
> > mutex_lock(&mem_sysfs_mutex);
> > +
> > + /*
> > + * Some users of the memory hotplug do not want/need memblock to
> > + * track all sections. Skip over those.
> > + */
> > mem = find_memory_block(section);
> > + if (!mem)
> > + return 0;
> > +
>
> Another bug above spotted by Evgeny Baskakov from NVidia, a mutex unlock
> is missing, i.e. something like:
>
> if (!mem) {
> mutex_unlock(&mem_sysfs_mutex);
> return 0;
> }
Thanks for spotting this. I went with the following fixup
---
From 38efdaf68b5c79df953385e9385581f75d46e651 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Tue, 18 Apr 2017 09:17:31 +0200
Subject: [PATCH] fold me "mm, memory_hotplug: get rid of
is_zone_device_section"
- fix remove_memory_section unlock on find_memory_block failure
as per Jerome - spotted by Evgeny Baskakov
---
drivers/base/memory.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 45c25e2e3da4..5ae81617f11d 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -746,7 +746,7 @@ static int remove_memory_section(unsigned long node_id,
*/
mem = find_memory_block(section);
if (!mem)
- return 0;
+ goto out_unlock;
unregister_mem_sect_under_nodes(mem, __section_nr(section));
@@ -756,6 +756,7 @@ static int remove_memory_section(unsigned long node_id,
else
put_device(&mem->dev);
+out_unlock:
mutex_unlock(&mem_sysfs_mutex);
return 0;
}
--
2.11.0
> Btw when are you planning on reposting?
This week, the sooner the better.
Thanks!
--
Michal Hocko
SUSE Labs
On 04/10/2017 06:02 PM, Michal Hocko wrote:
> On Mon 10-04-17 16:27:49, Igor Mammedov wrote:
> [...]
>> #issue3:
>> removable flag flipped to non-removable state
>>
>> // before series at commit ef0b577b6:
>> memory32:offline removable: 0 zones: Normal Movable
>> memory33:offline removable: 0 zones: Normal Movable
>> memory34:offline removable: 0 zones: Normal Movable
>> memory35:offline removable: 0 zones: Normal Movable
>
> did you mean _after_ the series because the below looks like
> the original behavior (at least valid_zones).
>
>> // after series at commit 6a010434
>> memory32:offline removable: 1 zones: Normal
>> memory33:offline removable: 1 zones: Normal
>> memory34:offline removable: 1 zones: Normal
>> memory35:offline removable: 1 zones: Normal Movable
>>
>> also looking at #issue1 removable flag state doesn't
>> seem to be consistent between state changes but maybe that's
>> been broken before
>
> Well, the file has a very questionable semantic. It doesn't provide
> a stable information. Anyway put that aside.
> is_pageblock_removable_nolock relies on having zone association
> which we do not have yet if the memblock is offline. So we need
> the following. I will queue this as a preparatory patch.
> ---
> From 4f3ebc02f4d552d3fe114787ca8a38cc68702208 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Mon, 10 Apr 2017 17:59:03 +0200
> Subject: [PATCH] mm, memory_hotplug: consider offline memblocks removable
>
> is_pageblock_removable_nolock relies on having zone association to
> examine all the page blocks to check whether they are movable or free.
> This is just a waste of cycles when the memblock is offline. A later patch
> in the series will also change the time when the page is associated with
> a zone so let's bail out early if the memblock is offline.
>
> Reported-by: Igor Mammedov <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
> ---
> drivers/base/memory.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 9677b6b711b0..0c29ec5598ea 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -128,6 +128,9 @@ static ssize_t show_mem_removable(struct device *dev,
> int ret = 1;
> struct memory_block *mem = to_memory_block(dev);
>
> +	if (mem->state != MEM_ONLINE)
> + goto out;
> +
> for (i = 0; i < sections_per_block; i++) {
> if (!present_section_nr(mem->start_section_nr + i))
> continue;
> @@ -135,6 +138,7 @@ static ssize_t show_mem_removable(struct device *dev,
> ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
> }
>
> +out:
> return sprintf(buf, "%d\n", ret);
> }
>
>
On 04/15/2017 02:17 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> which checks whether the given zone contains holes and
> pageblock_pfn_to_page which then carefully returns a first valid
> page from the given pfn range for the given zone. This doesn't handle
> zones which are not fully populated though. Memory pageblocks can be
> offlined or might not have been onlined yet. In such a case the zone
> should be considered to have holes otherwise pfn walkers can touch
> and play with offline pages.
>
> Current callers of pageblock_pfn_to_page in compaction seem to work
> properly right now because they only isolate PageBuddy
> (isolate_freepages_block) or PageLRU resp. __PageMovable
> (isolate_migratepages_block) which will be always false for these pages.
> It would be safer to skip these pages altogether, though. In order
> to do that let's check PageReserved in __pageblock_pfn_to_page because
> offline pages are reserved.
My issue with this is that PageReserved can also be set for other
reasons than an offlined block, e.g. by a random driver. So there are two
suboptimal scenarios:
- PageReserved is set on some page in the middle of a pageblock. It won't
be detected by this patch. This violates the "it would be safer" argument.
- PageReserved is set on just the first (few) page(s) and because of
this patch, we skip the pageblock completely and won't compact the rest of it.
So if we decide we really need to check PageReserved to ensure safety,
then we have to check it on each page. But I hope the existing criteria
in the compaction scanners are sufficient. Unless the semantic is that if
somebody sets PageReserved, he's free to repurpose the rest of the flags at
will (IMHO that's not the case).
The pageblock-level check then becomes a performance optimization so that
when there's an "offline hole", compaction won't iterate it page by
page. But the downside is a false positive resulting in skipping a whole
pageblock due to a single page.
I guess it's uncommon for long-lived offline holes to exist, so we
could simply just drop this?
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/page_alloc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0cacba69ab04..dcbbcfdda60e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1351,6 +1351,8 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> return NULL;
>
> start_page = pfn_to_page(start_pfn);
> + if (PageReserved(start_page))
> + return NULL;
>
> if (page_zone(start_page) != zone)
> return NULL;
>
On Tue 18-04-17 10:45:23, Vlastimil Babka wrote:
> On 04/15/2017 02:17 PM, Michal Hocko wrote:
> > From: Michal Hocko <[email protected]>
> >
> > __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> > which checks whether the given zone contains holes and
> > pageblock_pfn_to_page which then carefully returns a first valid
> > page from the given pfn range for the given zone. This doesn't handle
> > zones which are not fully populated though. Memory pageblocks can be
> > offlined or might not have been onlined yet. In such a case the zone
> > should be considered to have holes otherwise pfn walkers can touch
> > and play with offline pages.
> >
> > Current callers of pageblock_pfn_to_page in compaction seem to work
> > properly right now because they only isolate PageBuddy
> > (isolate_freepages_block) or PageLRU resp. __PageMovable
> > (isolate_migratepages_block) which will be always false for these pages.
> > It would be safer to skip these pages altogether, though. In order
> > to do that let's check PageReserved in __pageblock_pfn_to_page because
> > offline pages are reserved.
>
> My issue with this is that PageReserved can be also set for other
> reasons than offlined block, e.g. by a random driver. So there are two
> suboptimal scenarios:
>
> - PageReserved is set on some page in the middle of pageblock. It won't
> be detected by this patch. This violates the "it would be safer" argument.
> - PageReserved is set on just the first (few) page(s) and because of
> this patch, we skip it completely and won't compact the rest of it.
Why would that be a big problem? PageReserved is used only very seldom
and a few skipped page blocks would seem like a minor issue to me.
> So if we decide we really need to check PageReserved to ensure safety,
> then we have to check it on each page. But I hope the existing criteria
> in compaction scanners are sufficient. Unless the semantic is that if
> somebody sets PageReserved, he's free to repurpose the rest of flags at
> his will (IMHO that's not the case).
I am not aware of any such user. PageReserved has always been about "the
core mm should not touch these pages and modify their state" AFAIR.
But I believe that touching those holes just asks for problems so I
would rather have them covered.
> The pageblock-level check them becomes a performance optimization so
> when there's an "offline hole", compaction won't iterate it page by
> page. But the downside is the false positive resulting in skipping whole
> pageblock due to single page.
> I guess it's uncommon for a longlived offline holes to exist, so we
> could simply just drop this?
This is hard to tell but I can imagine that some memory hotplug
ballooning drivers might want to offline holes into existing zones.
> > Signed-off-by: Michal Hocko <[email protected]>
> > ---
> > mm/page_alloc.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 0cacba69ab04..dcbbcfdda60e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1351,6 +1351,8 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> > return NULL;
> >
> > start_page = pfn_to_page(start_pfn);
> > + if (PageReserved(start_page))
> > + return NULL;
> >
> > if (page_zone(start_page) != zone)
> > return NULL;
> >
--
Michal Hocko
SUSE Labs
On Tue, Apr 18, 2017 at 12:14 AM, Michal Hocko <[email protected]> wrote:
> On Mon 17-04-17 14:51:12, Dan Williams wrote:
>> On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko <[email protected]> wrote:
>> > All the reported issue seem to be fixed and pushed to my git tree
>> > attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
>> > feedback and then repost for the inclusion. I would really appreaciate
>> > more testing/review!
>>
>> This still seems to be based on 4.10? It's missing some block-layer
>> fixes and other things that trigger failures in the nvdimm unit tests.
>> Can you rebase to a more recent 4.11-rc?
>
> OK, I will rebase on top of linux-next. This has been based on mmotm
> tree so far. Btw. is there anything that would change the current
> implementation other than small context tweaks? In other words, do you
> see any issues with the current implementation regarding nvdimm's
> ZONE_DEVICE usage?
I don't foresee any issues, but I wanted to be able to run the latest
test suite to be sure.
On Tue 18-04-17 09:42:57, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 12:14 AM, Michal Hocko <[email protected]> wrote:
> > On Mon 17-04-17 14:51:12, Dan Williams wrote:
> >> On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko <[email protected]> wrote:
> >> > All the reported issue seem to be fixed and pushed to my git tree
> >> > attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
> >> > feedback and then repost for the inclusion. I would really appreaciate
> >> > more testing/review!
> >>
> >> This still seems to be based on 4.10? It's missing some block-layer
> >> fixes and other things that trigger failures in the nvdimm unit tests.
> >> Can you rebase to a more recent 4.11-rc?
> >
> > OK, I will rebase on top of linux-next. This has been based on mmotm
> > tree so far. Btw. is there anything that would change the current
> > implementation other than small context tweaks? In other words, do you
> > see any issues with the current implementation regarding nvdimm's
> > ZONE_DEVICE usage?
>
> I don't foresee any issues, but I wanted to be able to run the latest
> test suite to be sure.
OK, the rebase on top of the current linux-next is in my git tree [1]
attempts/rewrite-mem_hotplug branch. I will post the full series
tomorrow hopefully.
[1] git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
--
Michal Hocko
SUSE Labs
On 04/18/2017 11:27 AM, Michal Hocko wrote:
> On Tue 18-04-17 10:45:23, Vlastimil Babka wrote:
>> On 04/15/2017 02:17 PM, Michal Hocko wrote:
>>> From: Michal Hocko <[email protected]>
>>>
>>
>> My issue with this is that PageReserved can be also set for other
>> reasons than offlined block, e.g. by a random driver. So there are two
>> suboptimal scenarios:
>>
>> - PageReserved is set on some page in the middle of pageblock. It won't
>> be detected by this patch. This violates the "it would be safer" argument.
>> - PageReserved is set on just the first (few) page(s) and because of
>> this patch, we skip it completely and won't compact the rest of it.
>
> Why would that be a big problem? PageReserved is used only very seldom
> and few page blocks skipped would seem like a minor issue to me.
Yes it's not critical, just suboptimal. Can be improved later.
>> So if we decide we really need to check PageReserved to ensure safety,
>> then we have to check it on each page. But I hope the existing criteria
>> in compaction scanners are sufficient. Unless the semantic is that if
>> somebody sets PageReserved, he's free to repurpose the rest of flags at
>> his will (IMHO that's not the case).
>
> I am not aware of any such user. PageReserved has always been about "the
> core mm should not touch these pages and modify their state" AFAIR.
> But I believe that touching those holes just asks for problems so I
> would rather have them covered.
OK. I guess it's OK to use PageReserved of the first pageblock page to
determine if we can trust page_zone(), because the memory offline
scenario should have sufficient granularity and not make holes inside
a pageblock?
>> The pageblock-level check them becomes a performance optimization so
>> when there's an "offline hole", compaction won't iterate it page by
>> page. But the downside is the false positive resulting in skipping whole
>> pageblock due to single page.
>> I guess it's uncommon for a longlived offline holes to exist, so we
>> could simply just drop this?
>
> This is hard to tell but I can imagine that some memory hotplug
> balloning drivers might want to offline hole into existing zones.
OK.
On Wed 19-04-17 13:59:40, Vlastimil Babka wrote:
> On 04/18/2017 11:27 AM, Michal Hocko wrote:
[...]
> > I am not aware of any such user. PageReserved has always been about "the
> > core mm should not touch these pages and modify their state" AFAIR.
> > But I believe that touching those holes just asks for problems so I
> > would rather have them covered.
>
> OK. I guess it's OK to use PageReserved of first pageblock page to
> determine if we can trust page_zone(), because the memory offline
> scenario should have sufficient granularity and not make holes inside
> pageblock?
Yes memblocks should be section size aligned and that is 128M resp. 2GB
on large machines. So we are talking about much larger than page block
granularity here.
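For scale, assuming x86_64 defaults (4kB pages, 2MB pageblocks, 128MB
sections), an offline hole would cover at least 128M/2M = 64 whole
pageblocks.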
Anyway, Joonsoo didn't like the explicit PageReserved checks so I
have come up with pfn_to_online_page which hides this implementation
detail. How do you like the following instead?
---
From 0f5544b5d01f4bc1572e43cc2a0156ae33a2922c Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Thu, 13 Apr 2017 10:28:45 +0200
Subject: [PATCH] mm: consider zone which is not fully populated to have holes
__pageblock_pfn_to_page has two users currently, set_zone_contiguous
which checks whether the given zone contains holes and
pageblock_pfn_to_page which then carefully returns a first valid
page from the given pfn range for the given zone. This doesn't handle
zones which are not fully populated though. Memory pageblocks can be
offlined or might not have been onlined yet. In such a case the zone
should be considered to have holes otherwise pfn walkers can touch
and play with offline pages.
Current callers of pageblock_pfn_to_page in compaction seem to work
properly right now because they only isolate PageBuddy
(isolate_freepages_block) or PageLRU resp. __PageMovable
(isolate_migratepages_block) which will be always false for these pages.
It would be safer to skip these pages altogether, though. In order
to do that let's add pfn_to_online_page helper which checks PageReserved
because offline pages are reserved until they are onlined. There might
be other users of the PageReserved flag but they are rare and even if we
hit into those pages we should skip them in pfn walkers anyway. So this
is not harmful.
Use the new helper in __pageblock_pfn_to_page and skip the whole page
block in such a case. Vlastimil has noted that we might skip over
the page block even when there is a single reserved page but that
shouldn't lead to major issues because reserved pages are used very
seldom.
Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/memory_hotplug.h | 28 ++++++++++++++++++++++++++++
mm/page_alloc.c | 4 +++-
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3c8cf86201c3..736fe73e65af 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -14,6 +14,26 @@ struct memory_block;
struct resource;
#ifdef CONFIG_MEMORY_HOTPLUG
+/*
+ * Return page for the valid pfn only if the page is online.
+ * Offline pages are marked reserved. There are other users of PageReserved
+ * but pfn walkers should avoid them in general so such a false positive
+ * is not harmful.
+ *
+ * It would be great if this was a static inline but dependency hell doesn't
+ * allow that for now.
+ */
+#define pfn_to_online_page(pfn) \
+({ \
+ struct page *___page = NULL; \
+ \
+ if (pfn_valid(pfn)) { \
+ ___page = pfn_to_page(pfn); \
+ if (unlikely(PageReserved(___page))) \
+ ___page = NULL; \
+ } \
+ ___page; \
+})
/*
* Types for free bootmem stored in page->lru.next. These have to be in
@@ -203,6 +223,14 @@ extern void set_zone_contiguous(struct zone *zone);
extern void clear_zone_contiguous(struct zone *zone);
#else /* ! CONFIG_MEMORY_HOTPLUG */
+#define pfn_to_online_page(pfn) \
+({ \
+ struct page *___page = NULL; \
+ if (pfn_valid(pfn)) \
+ ___page = pfn_to_page(pfn); \
+ ___page; \
+ })
+
/*
* Stub functions for when hotplug is off
*/
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5d72d29a6ece..9dd814f4e7f5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
return NULL;
- start_page = pfn_to_page(start_pfn);
+ start_page = pfn_to_online_page(start_pfn);
+ if (!start_page)
+ return NULL;
if (page_zone(start_page) != zone)
return NULL;
--
2.11.0
--
Michal Hocko
SUSE Labs
On 04/19/2017 02:16 PM, Michal Hocko wrote:
> On Wed 19-04-17 13:59:40, Vlastimil Babka wrote:
>> On 04/18/2017 11:27 AM, Michal Hocko wrote:
> [...]
>>> I am not aware of any such user. PageReserved has always been about "the
>>> core mm should not touch these pages and modify their state" AFAIR.
>>> But I believe that touching those holes just asks for problems so I
>>> would rather have them covered.
>>
>> OK. I guess it's OK to use PageReserved of first pageblock page to
>> determine if we can trust page_zone(), because the memory offline
>> scenario should have sufficient granularity and not make holes inside
>> pageblock?
>
> Yes memblocks should be section size aligned and that is 128M resp. 2GB
> on large machines. So we are talking about much larger than page block
> granularity here.
>
> Anyway, Joonsoo didn't like the the explicit PageReserved checks so I
> have come with pfn_to_online_page which hides this implementation
> detail. How do you like the following instead?
Yeah that's OK. The other two patches will be updated as well?
Ideally we would later convert this helper to use some special values
for zone/node id (such as -1) instead of PageReserved to indicate an
offline node, as we discussed.
> ---
> From 0f5544b5d01f4bc1572e43cc2a0156ae33a2922c Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Thu, 13 Apr 2017 10:28:45 +0200
> Subject: [PATCH] mm: consider zone which is not fully populated to have holes
>
> __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> which checks whether the given zone contains holes and
> pageblock_pfn_to_page which then carefully returns a first valid
> page from the given pfn range for the given zone. This doesn't handle
> zones which are not fully populated though. Memory pageblocks can be
> offlined or might not have been onlined yet. In such a case the zone
> should be considered to have holes otherwise pfn walkers can touch
> and play with offline pages.
>
> Current callers of pageblock_pfn_to_page in compaction seem to work
> properly right now because they only isolate PageBuddy
> (isolate_freepages_block) or PageLRU resp. __PageMovable
> (isolate_migratepages_block) which will be always false for these pages.
> It would be safer to skip these pages altogether, though. In order
> to do that let's add pfn_to_online_page helper which checks PageReserved
> because offline pages are reserved until they are onlined. There might
> be other users of the PageReserved flag but they are rare and even if we
> hit into those pages we should skip them in pfn walkers anyway. So this
> is not harmful.
>
> Use the new helper in __pageblock_pfn_to_page and skip the whole page
> block in such a case. Vlastimil has noted that we might skip over
> the page block even when there is a single reserved page but that
> shouldn't lead to major issues because reserved pages are used very
> seldom.
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> include/linux/memory_hotplug.h | 28 ++++++++++++++++++++++++++++
> mm/page_alloc.c | 4 +++-
> 2 files changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3c8cf86201c3..736fe73e65af 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -14,6 +14,26 @@ struct memory_block;
> struct resource;
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> +/*
> + * Return page for the valid pfn only if the page is online.
> + * Offline pages are marked reserved. There are other users of PageReserved
> + * but pfn walkers should avoid them in general so such a false positive
> + * is not harmful.
> + *
> + * It would be great if this was a static inline but dependency hell doesn't
> + * allow that for now.
> + */
> +#define pfn_to_online_page(pfn) \
> +({ \
> + struct page *___page = NULL; \
> + \
> + if (pfn_valid(pfn)) { \
> + ___page = pfn_to_page(pfn); \
> + if (unlikely(PageReserved(___page))) \
> + ___page = NULL; \
> + } \
> + ___page; \
> +})
>
> /*
> * Types for free bootmem stored in page->lru.next. These have to be in
> @@ -203,6 +223,14 @@ extern void set_zone_contiguous(struct zone *zone);
> extern void clear_zone_contiguous(struct zone *zone);
>
> #else /* ! CONFIG_MEMORY_HOTPLUG */
> +#define pfn_to_online_page(pfn) \
> +({ \
> + struct page *___page = NULL; \
> + if (pfn_valid(pfn)) \
> + ___page = pfn_to_page(pfn); \
> + ___page; \
> + })
> +
> /*
> * Stub functions for when hotplug is off
> */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5d72d29a6ece..9dd814f4e7f5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
> return NULL;
>
> - start_page = pfn_to_page(start_pfn);
> + start_page = pfn_to_online_page(start_pfn);
> + if (!start_page)
> + return NULL;
>
> if (page_zone(start_page) != zone)
> return NULL;
>
On Wed 19-04-17 14:34:54, Vlastimil Babka wrote:
> On 04/19/2017 02:16 PM, Michal Hocko wrote:
> > On Wed 19-04-17 13:59:40, Vlastimil Babka wrote:
> >> On 04/18/2017 11:27 AM, Michal Hocko wrote:
> > [...]
> >>> I am not aware of any such user. PageReserved has always been about "the
> >>> core mm should not touch these pages and modify their state" AFAIR.
> >>> But I believe that touching those holes just asks for problems so I
> >>> would rather have them covered.
> >>
> >> OK. I guess it's OK to use PageReserved of first pageblock page to
> >> determine if we can trust page_zone(), because the memory offline
> >> scenario should have sufficient granularity and not make holes inside
> >> pageblock?
> >
> > Yes memblocks should be section size aligned and that is 128M resp. 2GB
> > on large machines. So we are talking about much larger than page block
> > granularity here.
> >
> > Anyway, Joonsoo didn't like the the explicit PageReserved checks so I
> > have come with pfn_to_online_page which hides this implementation
> > detail. How do you like the following instead?
>
> Yeah that's OK. The other two patches will be updated as well?
yes
> Ideally we would later convert this helper to use some special values
> for zone/node id (such as -1) instead of PageReserved to indicate an
> offline node, as we discussed.
I have considered making zone_id -1 but there is just too much code which
uses the id to translate it to a struct zone * directly and that would
lead to subtle bugs. On the other hand zone_id == 0 is not optimal but
much safer from that POV. I will think about the safest way forward long
term but my intention was to have something reasonably good for a start.
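Just to illustrate the idiom I mean (zone_of is a made up name, the body
mirrors what page_zone() and many open coded users effectively do):

static struct zone *zone_of(struct page *page)
{
	/* with a -1 sentinel this would index out of node_zones[] */
	return &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)];
}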
--
Michal Hocko
SUSE Labs
On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> On Mon 17-04-17 14:47:20, Joonsoo Kim wrote:
> > On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> > > Hi,
> > > here I 3 more preparatory patches which I meant to send on Thursday but
> > > forgot... After more thinking about pfn walkers I have realized that
> > > the current code doesn't check offline holes in zones. From a quick
> > > review that doesn't seem to be a problem currently. Pfn walkers can race
> > > with memory offlining and with the original hotplug impementation those
> > > offline pages can change the zone but I wasn't able to find any serious
> > > problem other than small confusion. The new hotplug code, will not have
> > > any valid zone, though so those code paths should check PageReserved
> > > to rule offline holes. I hope I have addressed all of them in these 3
> > > patches. I would appreciate if Vlastimil and Jonsoo double check after
> > > me.
> >
> > Hello, Michal.
> >
> > s/Jonsoo/Joonsoo. :)
>
> ups, sorry about that.
>
> > I'm not sure that it's a good idea to add PageResereved() check in pfn
> > walkers. First, this makes struct page validity check as two steps,
> > pfn_valid() and then PageResereved().
>
> Yes, those are two separate checkes because semantically they are
> different. Not all pfn walkers do care about the online status.
If an offlined page has no valid information, reading information
about offlined pages is just wrong. So, all pfn walkers that read
information about the page should care about it.
I guess that many callers of pfn_valid() are in this category.
>
> > If we should not use struct page
> > in this case, it's better to pfn_valid() returns false rather than
> > adding a separate check. Anyway, we need to fix more places (all pfn
> > walker?) if we want to check validity by two steps.
>
> Which pfn walkers you have in mind?
For example, kpagecount_read() in fs/proc/page.c. I found it by
searching for pfn_valid().
> > The other problem I found is that your change will makes some
> > contiguous zones to be considered as non-contiguous. Memory allocated
> > by memblock API is also marked as PageResereved. If we consider this as
> > a hole, we will set such a zone as non-contiguous.
>
> Why would that be a problem? We shouldn't touch those pages anyway?
Skipping those pages in compaction is valid so no problem in this
case.
The problem I mentioned above is that adding a PageReserved() check in
__pageblock_pfn_to_page() invalidates the optimization done by
set_zone_contiguous(). In compaction, we need to get a valid struct
page and it requires a lot of work. There was a performance problem
report due to this so the set_zone_contiguous() optimization was added. It
checks at boot time whether the zone is contiguous or not. If the zone is
determined to be contiguous, we can easily get a valid struct page at
runtime without expensive checks.
Your patch tries to add PageReserved() to __pageblock_pfn_to_page(). It
would make zone->contiguous usually end up false since memory
used by the memblock API is marked as PageReserved() and your patch regards
it as a hole. That invalidates the set_zone_contiguous() optimization and I
worry about it.
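Roughly, the fast path that would be lost looks like this (paraphrased
from memory):

static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
		unsigned long end_pfn, struct zone *zone)
{
	/* zone proven contiguous at boot: no per-pageblock validation needed */
	if (zone->contiguous)
		return pfn_to_page(start_pfn);

	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
}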
>
> > And, I guess that it's not enough to check PageResereved() in
> > pageblock_pfn_to_page() in order to skip these pages in compaction. If
> > holes are in the middle of the pageblock, pageblock_pfn_to_page()
> > cannot catch it and compaction will use struct page for this hole.
>
> Yes pageblock_pfn_to_page cannot catch it and it wouldn't with the
> current implementation anyway. So the implementation won't be any worse
> than with the current code. On the other hand offline holes will always
> fill the whole pageblock (assuming those are not spanning multiple
> memblocks).
>
> > Therefore, I think that making pfn_valid() return false for not
> > onlined memory is a better solution for this problem. I don't know the
> > implementation detail for hotplug and I don't see your recent change
> > but we may defer memmap initialization until the zone is determined.
> > It will make pfn_valid() return false for un-initialized range.
>
> I am not really sure. pfn_valid is used in many context and its only
> purpose is to tell whether pfn_to_page will return a valid struct page
> AFAIU.
>
> I agree that having more checks is more error prone and we can add a
> helper pfn_to_valid_page or something similar but I believe we can do
> that on top of the current hotplug rework. This would require a non
> trivial amount of changes and I believe that a lacking check for the
> offline holes is not critical - we would (ab)use the lowest zone which
> is similar to (ab)using ZONE_NORMAL/MOVABLE with the original code.
I'm not objecting to your hotplug rework. In fact, I don't know the
relationship between this work and the hotplug rework. I agree
with checking offline holes but I don't like the design and
implementation of it.
Let me clarify my desire(?) for this issue.
1. If pfn_valid() returns true, the struct page has valid information, at
least in its flags (zone id, node id, flags, etc...). So, we can use them
without checking PageReserved().
2. pfn_valid() for offlined holes returns false. This can be easily
(?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
guess that there is no reason that pfn_valid() returns true for
offlined holes. If there is, please let me know.
3. We don't need to check PageReserved() in most pfn walkers in
order to check offline holes.
Thanks.
On Tue, Apr 18, 2017 at 12:54 PM, Michal Hocko <[email protected]> wrote:
> On Tue 18-04-17 09:42:57, Dan Williams wrote:
>> On Tue, Apr 18, 2017 at 12:14 AM, Michal Hocko <[email protected]> wrote:
>> > On Mon 17-04-17 14:51:12, Dan Williams wrote:
>> >> On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko <[email protected]> wrote:
>> >> > All the reported issue seem to be fixed and pushed to my git tree
>> >> > attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
>> >> > feedback and then repost for the inclusion. I would really appreaciate
>> >> > more testing/review!
>> >>
>> >> This still seems to be based on 4.10? It's missing some block-layer
>> >> fixes and other things that trigger failures in the nvdimm unit tests.
>> >> Can you rebase to a more recent 4.11-rc?
>> >
>> > OK, I will rebase on top of linux-next. This has been based on mmotm
>> > tree so far. Btw. is there anything that would change the current
>> > implementation other than small context tweaks? In other words, do you
>> > see any issues with the current implementation regarding nvdimm's
>> > ZONE_DEVICE usage?
>>
>> I don't foresee any issues, but I wanted to be able to run the latest
>> test suite to be sure.
>
> OK, the rebase on top of the current linux-next is in my git tree [1]
> attempts/rewrite-mem_hotplug branch. I will post the full series
> tomorrow hopefully.
>
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
I'm hitting the following with the "device-dax" unit test [1]. Does
not look like your changes, but I'm kicking off a bisect between
v4.11-rc7 and this branch tip.
[1]: https://github.com/pmem/ndctl/blob/master/test/device-dax.c
---
[ 547.047430] BUG: unable to handle kernel paging request at ffff880001000000
[ 547.048954] IP: native_set_pte_at+0x1/0x10
[ 547.049967] PGD 3197067
[ 547.049968] P4D 3197067
[ 547.050779] PUD 3198067
[ 547.051589] PMD 33ff00067
[ 547.052401] PTE 8000000001000161
[ 547.053237]
[ 547.054819] Oops: 0003 [#1] SMP DEBUG_PAGEALLOC
[ 547.055907] Dumping ftrace buffer:
[ 547.056864] (ftrace buffer empty)
[ 547.057815] Modules linked in: nd_blk(O) ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_n
at_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw iptable_security ebtable_filter
ebtables ip6table_filter ip6_tables crct10dif_pclmul crc32_pclmul
crc32c_intel ghash_clmulni_intel dax_pmem(O) nd_pmem(O) dax(O)
nd_btt(O) nfit(O) nd_e820(O) tpm_tis libnvdimm(O) serio_raw
tpm_tis_core tpm nfit_test_iomap(O) nfsd nfs_acl [last unloaded: nfit_test]
[ 547.069034] CPU: 17 PID: 9526 Comm: lt-ndctl Tainted: G O
4.11.0-rc7-next-20170418+ #34
[ 547.071122] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.9.3-1.fc25 04/01/2014
[ 547.073163] task: ffff880322f518c0 task.stack: ffffc90002f08000
[ 547.074433] RIP: 0010:native_set_pte_at+0x1/0x10
[ 547.075523] RSP: 0018:ffffc90002f0bb90 EFLAGS: 00010246
[ 547.076703] RAX: 0000000000000000 RBX: ffff880200000000 RCX: 0000000000000000
[ 547.078129] RDX: ffff880001000000 RSI: ffff880200000000 RDI: ffffffff820892e0
[ 547.079549] RBP: ffffc90002f0bba8 R08: 0000000000000000 R09: ffffffff81ec7264
[ 547.080969] R10: ffffc90002f0bb28 R11: ffff880322f518c0 R12: ffff880200200000
[ 547.082389] R13: ffff88033ff00000 R14: ffff880200001000 R15: ffff880200000000
[ 547.083816] FS: 00007fc08d585380(0000) GS:ffff880336040000(0000)
knlGS:0000000000000000
[ 547.085777] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.087013] CR2: ffff880001000000 CR3: 000000019fe1f000 CR4: 00000000000406e0
[ 547.091667] Call Trace:
[ 547.092466] ? pte_clear.constprop.18+0x26/0x2b
[ 547.093563] remove_pagetable+0x4af/0x783
[ 547.094582] arch_remove_memory+0xa2/0xc0
[ 547.095598] devm_memremap_pages_release+0xde/0x330
[ 547.096726] release_nodes+0x16d/0x2b0
[ 547.097702] devres_release_all+0x3c/0x50
[ 547.098726] device_release_driver_internal+0x16d/0x210
[ 547.099900] device_release_driver+0x12/0x20
[ 547.100949] unbind_store+0x10f/0x160
[
On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
[...]
> > Which pfn walkers you have in mind?
>
> For example, kpagecount_read() in fs/proc/page.c. I searched it by
> using pfn_valid().
Yeah, I've checked that one and in fact this is a good example of the
case where you do not really care about holes. It just checks the page
count which is valid information under any circumstances.
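Condensed into a made up helper, what that walker reads per pfn is
basically:

static u64 kpagecount_of(unsigned long pfn)
{
	struct page *page = pfn_valid(pfn) ? pfn_to_page(pfn) : NULL;

	/* the map count is meaningful (zero) even for an offline page */
	return page ? page_mapcount(page) : 0;
}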
> > > The other problem I found is that your change will makes some
> > > contiguous zones to be considered as non-contiguous. Memory allocated
> > > by memblock API is also marked as PageResereved. If we consider this as
> > > a hole, we will set such a zone as non-contiguous.
> >
> > Why would that be a problem? We shouldn't touch those pages anyway?
>
> Skipping those pages in compaction are valid so no problem in this
> case.
>
> The problem I mentioned above is that adding PageReserved() check in
> __pageblock_pfn_to_page() invalidates optimization by
> set_zone_contiguous(). In compaction, we need to get a valid struct
> page and it requires a lot of work. There is performance problem
> report due to this so set_zone_contiguous() optimization is added. It
> checks if the zone is contiguous or not in boot time. If zone is
> determined as contiguous, we can easily get a valid struct page in
> runtime without expensive checks.
OK, I see. I've had some vague understanding and the clarification helps.
> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> woule make that zone->contiguous usually returns false since memory
> used by memblock API is marked as PageReserved() and your patch regard
> it as a hole. It invalidates set_zone_contiguous() optimization and I
> worry about it.
OK, fair enough. I didn't consider memblock allocations. I will rethink
this patch but there are essentially 3 options
- use a different criterion for the offline holes detection. I
have just realized we might do it by storing the online
information into the mem sections
- drop this patch
- move the PageReserved check down the chain into
isolate_freepages_block resp. isolate_migratepages_block
I would prefer 3 over 2 over 1. I definitely want to make this more
robust so 1 is preferable long term but I do not want this to be a
roadblock to the rest of the rework. Does that sound acceptable to you?
[..]
> Let me clarify my desire(?) for this issue.
>
> 1. If pfn_valid() returns true, struct page has valid information, at
> least, in flags (zone id, node id, flags, etc...). So, we can use them
> without checking PageResereved().
This is no longer true after my rework. Pages are associated with the
zone during _onlining_ rather than when they are physically hotplugged.
Basically only the nid is set properly. Strictly speaking this is the
case also without my rework because the zone might change during the
online phase so you cannot assume it is correct even now. It just happens
that it more or less works just fine.
> 2. pfn_valid() for offlined holes returns false. This can be easily
> (?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
> guess that there is no reason that pfn_valid() returns true for
> offlined holes. If there is, please let me know.
There is some code which really expects that pfn_valid returns true iff
there is a struct page and it doesn't care about the online status.
E.g. the hotplug code itself, so no, we cannot change pfn_valid. What we
can do though is add pfn_to_online_page which would do the proper check.
I have already sent [1]. As noted above we can (ab)use the remaining bit
in SECTION_MAP_MASK to detect offline pages more robustly.
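Something along these lines as a rough sketch (the flag name and helper
below are illustrative only, nothing like this is in the series yet):

#define SECTION_IS_ONLINE	(1UL<<2)	/* spare low bit in section_mem_map */

static inline int online_section_nr(unsigned long nr)
{
	/* pfn_to_online_page() would test this instead of PageReserved */
	return __nr_to_section(nr)->section_mem_map & SECTION_IS_ONLINE;
}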
> 3. We don't need to check PageReserved() in most of pfn walkers in
> order to check offline holes.
We still have to distinguish those who care about offline pages from
those who do not care about it.
Thanks!
--
Michal Hocko
SUSE Labs
On 04/10/2017 06:25 PM, Michal Hocko wrote:
> This contains two minor fixes spotted based on testing by Igor Mammedov.
> ---
> From d829579cc7061255f818f9aeaa3aa2cd82fec75a Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Wed, 29 Mar 2017 16:07:00 +0200
> Subject: [PATCH] mm, memory_hotplug: do not associate hotadded memory to zones
> until online
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> The current memory hotplug implementation relies on having all the
> struct pages associated with a zone/node during the physical hotplug phase
> (arch_add_memory->__add_pages->__add_section->__add_zone). In the vast
> majority of cases this means that they are added to ZONE_NORMAL. This
> has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory hotadd
> without sparsemem") and it wasn't a big deal back then because movable
> onlining didn't exist yet.
>
> Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
> onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
> memory and portion memory") and then things got more complicated. Rather
> than reconsidering the zone association which was no longer needed
> (because the memory hotplug already depended on SPARSEMEM) a convoluted
> semantic of zone shifting has been developed. Only the currently last
> memblock or the one adjacent to the zone_movable can be onlined movable.
> This essentially means that the online type changes as the new memblocks
> are added.
>
> Let's simulate memory hot online manually
> Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal
> /sys/devices/system/memory/memory33/valid_zones:Normal
> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> /sys/devices/system/memory/memory34/valid_zones:Movable Normal
Commands seem to be missing above?
> This is an awkward semantic because an udev event is sent as soon as the
> block is onlined and an udev handler might want to online it based on
> some policy (e.g. association with a node) but it will inherently race
> with new blocks showing up.
>
> This patch changes the physical online phase to not associate pages
> with any zone at all. All the pages are just marked reserved and wait
> for the onlining phase to be associated with the zone as per the online
> request. There are only two requirements
> - existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
> - ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses
> the latter one is not an inherent requirement and can be changed in the
> future. It preserves the current behavior and makes the code slightly
> simpler. This is subject to change in the future.
>
> This means that the same physical online steps as above will lead to the
> following state:
> Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
>
> /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> /sys/devices/system/memory/memory34/valid_zones:Movable
Ditto.
> Implementation:
> The current move_pfn_range is reimplemented to check the above
> requirements (allow_online_pfn_range) and then updates the respective
> zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
> pfn range with the zone/node. __add_pages is updated to not require the
> zone and only initializes sections in the range. This allowed us to
> simplify the arch_add_memory code (s390 could get rid of quite some
> code).
>
> devm_memremap_pages is the only user of arch_add_memory which relies
> on the zone association because it hooks into the memory hotplug
> only half way. It uses it to associate the new memory with ZONE_DEVICE
> but doesn't allow it to be {on,off}lined via sysfs. This means that this
> particular code path has to call move_pfn_range_to_zone explicitly.
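In practice that explicit call looks roughly like the following fragment (a
sketch of the intent only; the kernel/memremap.c hunk is not quoted here and
the exact arguments may differ - the pfn-based arguments follow the v2
changelog note below):

	mem_hotplug_begin();
	ret = arch_add_memory(nid, align_start, align_size, true);	/* for_device */
	if (!ret)
		/* hook the new range into ZONE_DEVICE by hand */
		move_pfn_range_to_zone(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
				       align_start >> PAGE_SHIFT,
				       align_size >> PAGE_SHIFT);
	mem_hotplug_done();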
>
> The original zone shifting code is kept in place and will be removed in
> the follow up patch for an easier review.
>
> Changes since v1
> - we have to associate the page with the node early (in __add_section),
> because pfn_to_node depends on struct page containing this
> information - based on testing by Reza Arbab
> - resize_{zone,pgdat}_range has to check whether they are populated -
> Reza Arbab
> - fix devm_memremap_pages to use pfn rather than physical address -
> Jérôme Glisse
> - move_pfn_range has to check for intersection with zone_movable rather
> than to rely on allow_online_pfn_range(MMOP_ONLINE_MOVABLE) for
> MMOP_ONLINE_KEEP
>
> Changes since v2
> - fix show_valid_zones nr_pages calculation
> - allow_online_pfn_range has to check managed pages rather than present
>
> Cc: Dan Williams <[email protected]>
> Cc: Martin Schwidefsky <[email protected]>
> Cc: [email protected]
> Acked-by: Heiko Carstens <[email protected]> # For s390 bits
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> arch/ia64/mm/init.c | 9 +-
> arch/powerpc/mm/mem.c | 10 +-
> arch/s390/mm/init.c | 30 +-----
> arch/sh/mm/init.c | 8 +-
> arch/x86/mm/init_32.c | 5 +-
> arch/x86/mm/init_64.c | 9 +-
> drivers/base/memory.c | 52 ++++++-----
> include/linux/memory_hotplug.h | 13 +--
> include/linux/mmzone.h | 14 +++
> kernel/memremap.c | 4 +
> mm/memory_hotplug.c | 201 +++++++++++++++++++++++++----------------
> mm/sparse.c | 3 +-
> 12 files changed, 186 insertions(+), 172 deletions(-)
...
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -533,6 +533,20 @@ static inline bool zone_is_empty(struct zone *zone)
> }
>
> /*
> + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty
non-empty
> + * intersection with the given zone
> + */
> +static inline bool zone_intersects(struct zone *zone,
> + unsigned long start_pfn, unsigned long nr_pages)
> +{
I'm looking at your current mmotm tree branch, which looks like this:
+ * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty
+ * intersection with the given zone
+ */
+static inline bool zone_intersects(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ if (zone_is_empty(zone))
+ return false;
+ if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
+ return true;
+ if (start_pfn + nr_pages > zone->zone_start_pfn)
+ return true;
A false positive is possible here, when start_pfn >= zone_end_pfn(zone)?
+ return false;
+}
+
+/*
...
> @@ -1029,39 +1018,114 @@ static void node_states_set_node(int node, struct memory_notify *arg)
> node_set_state(node, N_MEMORY);
> }
>
> -bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
> - enum zone_type target, int *zone_shift)
> +bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
> {
> - struct zone *zone = page_zone(pfn_to_page(pfn));
> - enum zone_type idx = zone_idx(zone);
> - int i;
> + struct pglist_data *pgdat = NODE_DATA(nid);
> + struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
> + struct zone *normal_zone = &pgdat->node_zones[ZONE_NORMAL];
>
> - *zone_shift = 0;
> + /*
> + * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
> + * physically before ZONE_MOVABLE. All we need is they do not
> + * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
> + * though so let's stick with it for simplicity for now.
> + * TODO make sure we do not overlap with ZONE_DEVICE
Is this last TODO a blocker, unlike the others?
...
> @@ -1074,29 +1138,16 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
> int nid;
> int ret;
> struct memory_notify arg;
> - int zone_shift = 0;
>
> - /*
> - * This doesn't need a lock to do pfn_to_page().
> - * The section can't be removed here because of the
> - * memory_block->state_mutex.
> - */
> - zone = page_zone(pfn_to_page(pfn));
> -
> - if ((zone_idx(zone) > ZONE_NORMAL ||
> - online_type == MMOP_ONLINE_MOVABLE) &&
> - !can_online_high_movable(pfn_to_nid(pfn)))
> + nid = pfn_to_nid(pfn);
> + if (!allow_online_pfn_range(nid, pfn, nr_pages, online_type))
> return -EINVAL;
>
> - if (online_type == MMOP_ONLINE_KERNEL) {
> - if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift))
> - return -EINVAL;
> - } else if (online_type == MMOP_ONLINE_MOVABLE) {
> - if (!zone_can_shift(pfn, nr_pages, ZONE_MOVABLE, &zone_shift))
> - return -EINVAL;
> - }
> + if (online_type == MMOP_ONLINE_MOVABLE && !can_online_high_movable(nid))
> + return -EINVAL;
>
> - zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
> + /* associate pfn range with the zone */
> + zone = move_pfn_range(online_type, nid, pfn, nr_pages);
> if (!zone)
> return -EINVAL;
Nit: This !zone currently cannot happen.
>
> @@ -1104,8 +1155,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
> arg.nr_pages = nr_pages;
> node_states_check_changes_online(nr_pages, zone, &arg);
>
> - nid = zone_to_nid(zone);
> -
> ret = memory_notify(MEM_GOING_ONLINE, &arg);
> ret = notifier_to_errno(ret);
> if (ret)
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 6903c8fc3085..d75407882598 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -686,10 +686,9 @@ static void free_map_bootmem(struct page *memmap)
> * set. If this is <=0, then that means that the passed-in
> * map was not consumed and must be freed.
> */
> -int __meminit sparse_add_one_section(struct zone *zone, unsigned long start_pfn)
> +int __meminit sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn)
> {
> unsigned long section_nr = pfn_to_section_nr(start_pfn);
> - struct pglist_data *pgdat = zone->zone_pgdat;
> struct mem_section *ms;
> struct page *memmap;
> unsigned long *usemap;
>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> arch_add_memory gets a for_device argument which then controls whether we
> want to create memblocks for the created memory sections. Simplify the logic
> by saying whether we want memblocks directly rather than going through a
> pointless negation. This also makes the API easier to understand because
> it is clear what we want rather than the non-descriptive for_device which
> can mean anything.
>
> This shouldn't introduce any functional change.
>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
On 04/10/2017 01:03 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> zone_for_memory doesn't have any users anymore, and neither does the whole
> zone shifting infrastructure, so drop them all.
>
> This shouldn't introduce any functional changes.
>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
On Thu 20-04-17 09:28:20, Michal Hocko wrote:
> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
[...]
> > Your patch tries to add PageReserved() to __pageblock_pfn_to_page(). It
> > would make zone->contiguous usually return false since memory
> > used by the memblock API is marked as PageReserved() and your patch regards
> > it as a hole. It invalidates the set_zone_contiguous() optimization and I
> > worry about it.
>
> OK, fair enough. I didn't consider memblock allocations. I will rethink
> this patch but there are essentially 3 options
> - use a different criterion for the offline holes detection. I
> have just realized we might do it by storing the online
> information into the mem sections
> - drop this patch
> - move the PageReferenced check down the chain into
> isolate_freepages_block resp. isolate_migratepages_block
>
> I would prefer 3 over 2 over 1. I definitely want to make this more
> robust so 1 is preferable long term but I do not want this to be a
> roadblock to the rest of the rework. Does that sound acceptable to you?
So I've played with all three options just to see what the outcome would
look like and it turned out that going with 1 will be easiest in the
end. What do you think about the following? It should be free of any
false positives. I have only compile tested it so far.
---
From 747794c13c0e82b55b793a31cdbe1a84ee1c6920 Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Thu, 13 Apr 2017 10:28:45 +0200
Subject: [PATCH] mm: consider zone which is not fully populated to have holes
__pageblock_pfn_to_page has two users currently, set_zone_contiguous
which checks whether the given zone contains holes and
pageblock_pfn_to_page which then carefully returns a first valid
page from the given pfn range for the given zone. This doesn't handle
zones which are not fully populated though. Memory pageblocks can be
offlined or might not have been onlined yet. In such a case the zone
should be considered to have holes otherwise pfn walkers can touch
and play with offline pages.
Current callers of pageblock_pfn_to_page in compaction seem to work
properly right now because they only isolate PageBuddy
(isolate_freepages_block) or PageLRU resp. __PageMovable
(isolate_migratepages_block) which will be always false for these pages.
It would be safer to skip these pages altogether, though.
In order to do this patch adds a new memory section state
(SECTION_IS_ONLINE) which is set in memory_present (during boot
time) or in online_pages_range during the memory hotplug. Similarly
offline_mem_sections clears the bit and it is called when the memory
range is offlined.
pfn_to_online_page helper is then added which checks the mem section and
only returns a page if it is onlined already.
Use the new helper in __pageblock_pfn_to_page and skip the whole page
block in such a case.
Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/memory_hotplug.h | 21 ++++++++++++++++++++
include/linux/mmzone.h | 20 ++++++++++++++++++-
mm/memory_hotplug.c | 3 +++
mm/page_alloc.c | 5 ++++-
mm/sparse.c | 45 +++++++++++++++++++++++++++++++++++++++++-
5 files changed, 91 insertions(+), 3 deletions(-)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3c8cf86201c3..fc1c873504eb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -14,6 +14,19 @@ struct memory_block;
struct resource;
#ifdef CONFIG_MEMORY_HOTPLUG
+/*
+ * Return page for the valid pfn only if the page is online. All pfn
+ * walkers which rely on the fully initialized page->flags and others
+ * should use this rather than pfn_valid && pfn_to_page
+ */
+#define pfn_to_online_page(pfn) \
+({ \
+ struct page *___page = NULL; \
+ \
+ if (online_section_nr(pfn_to_section_nr(pfn))) \
+ ___page = pfn_to_page(pfn); \
+ ___page; \
+})
/*
* Types for free bootmem stored in page->lru.next. These have to be in
@@ -203,6 +216,14 @@ extern void set_zone_contiguous(struct zone *zone);
extern void clear_zone_contiguous(struct zone *zone);
#else /* ! CONFIG_MEMORY_HOTPLUG */
+#define pfn_to_online_page(pfn) \
+({ \
+ struct page *___page = NULL; \
+ if (pfn_valid(pfn)) \
+ ___page = pfn_to_page(pfn); \
+ ___page; \
+ })
+
/*
* Stub functions for when hotplug is off
*/
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0fc121bbf4ff..cad16ac080f5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1143,7 +1143,8 @@ extern unsigned long usemap_size(void);
*/
#define SECTION_MARKED_PRESENT (1UL<<0)
#define SECTION_HAS_MEM_MAP (1UL<<1)
-#define SECTION_MAP_LAST_BIT (1UL<<2)
+#define SECTION_IS_ONLINE (1UL<<2)
+#define SECTION_MAP_LAST_BIT (1UL<<3)
#define SECTION_MAP_MASK (~(SECTION_MAP_LAST_BIT-1))
#define SECTION_NID_SHIFT 2
@@ -1174,6 +1175,23 @@ static inline int valid_section_nr(unsigned long nr)
return valid_section(__nr_to_section(nr));
}
+static inline int online_section(struct mem_section *section)
+{
+ return (section && (section->section_mem_map & SECTION_IS_ONLINE));
+}
+
+static inline int online_section_nr(unsigned long nr)
+{
+ return online_section(__nr_to_section(nr));
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
+#ifdef CONFIG_MEMORY_HOTREMOVE
+void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
+#endif
+#endif
+
static inline struct mem_section *__pfn_to_section(unsigned long pfn)
{
return __nr_to_section(pfn_to_section_nr(pfn));
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index caa58338d121..98f565c279bf 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -929,6 +929,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
unsigned long i;
unsigned long onlined_pages = *(unsigned long *)arg;
struct page *page;
+
+ online_mem_sections(start_pfn, start_pfn + nr_pages);
+
if (PageReserved(pfn_to_page(start_pfn)))
for (i = 0; i < nr_pages; i++) {
page = pfn_to_page(start_pfn + i);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5d72d29a6ece..fa752de84eef 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
return NULL;
- start_page = pfn_to_page(start_pfn);
+ start_page = pfn_to_online_page(start_pfn);
+ if (!start_page)
+ return NULL;
if (page_zone(start_page) != zone)
return NULL;
@@ -7686,6 +7688,7 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
break;
if (pfn == end_pfn)
return;
+ offline_mem_sections(pfn, end_pfn);
zone = page_zone(pfn_to_page(pfn));
spin_lock_irqsave(&zone->lock, flags);
pfn = start_pfn;
diff --git a/mm/sparse.c b/mm/sparse.c
index 6903c8fc3085..79017f90d8fc 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -185,7 +185,8 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
ms = __nr_to_section(section);
if (!ms->section_mem_map)
ms->section_mem_map = sparse_encode_early_nid(nid) |
- SECTION_MARKED_PRESENT;
+ SECTION_MARKED_PRESENT |
+ SECTION_IS_ONLINE;
}
}
@@ -590,6 +591,48 @@ void __init sparse_init(void)
}
#ifdef CONFIG_MEMORY_HOTPLUG
+
+/* Mark all memory sections within the pfn range as online */
+void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
+{
+ unsigned long pfn;
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ unsigned long section_nr = pfn_to_section_nr(pfn);
+ struct mem_section *ms;
+
+ /* onlining code should never touch invalid ranges */
+ if (WARN_ON(!valid_section_nr(section_nr)))
+ continue;
+
+ ms = __nr_to_section(section_nr);
+ ms->section_mem_map |= SECTION_IS_ONLINE;
+ }
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/* Mark all memory sections within the pfn range as offline */
+void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
+{
+ unsigned long pfn;
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ unsigned long section_nr = pfn_to_section_nr(pfn);
+ struct mem_section *ms;
+
+ /*
+ * TODO this needs some double checking. Offlining code makes
+ * sure to check pfn_valid but those checks might be just bogus
+ */
+ if (WARN_ON(!valid_section_nr(section_nr)))
+ continue;
+
+ ms = __nr_to_section(section_nr);
+ ms->section_mem_map &= ~SECTION_IS_ONLINE;
+ }
+}
+#endif
+
#ifdef CONFIG_SPARSEMEM_VMEMMAP
static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid)
{
--
2.11.0
--
Michal Hocko
SUSE Labs
On Thu 20-04-17 10:25:27, Vlastimil Babka wrote:
> On 04/10/2017 06:25 PM, Michal Hocko wrote:
[...]
> > Let's simulate memory hot online manually
> > Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal
> > /sys/devices/system/memory/memory33/valid_zones:Normal
> > /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory34/valid_zones:Movable Normal
>
> Commands seem to be missing above?
Yes. git commit just dropped everything starting with # which happened
to be the bash prompt for my commands. I have changed that to $ and it
looks as follows
$ echo 0x100000000 > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory32/valid_zones
Normal Movable
$ echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
$ echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
$ echo online_movable > /sys/devices/system/memory/memory34/state
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal
[...]
> > This means that the same physical online steps as above will lead to the
> > following state:
> > Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory34/valid_zones:Normal Movable
> >
> > /sys/devices/system/memory/memory32/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory33/valid_zones:Normal Movable
> > /sys/devices/system/memory/memory34/valid_zones:Movable
>
> Ditto.
This just copies the above so I didn't add those commands. I can if that
is preferable.
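For reference, stitching the same commands together with the outputs quoted
above gives roughly:
$ echo 0x100000000 > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory32/valid_zones
Normal Movable
$ echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
$ echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable
$ echo online_movable > /sys/devices/system/memory/memory34/state
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable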
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -533,6 +533,20 @@ static inline bool zone_is_empty(struct zone *zone)
> > }
> >
> > /*
> > + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty
>
>
> non-empty
fixed
> > + * intersection with the given zone
> > + */
> > +static inline bool zone_intersects(struct zone *zone,
> > + unsigned long start_pfn, unsigned long nr_pages)
> > +{
>
> I'm looking at your current mmotm tree branch, which looks like this:
>
> + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty
> + * intersection with the given zone
> + */
> +static inline bool zone_intersects(struct zone *zone,
> + unsigned long start_pfn, unsigned long nr_pages)
> +{
> + if (zone_is_empty(zone))
> + return false;
> + if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
> + return true;
> + if (start_pfn + nr_pages > zone->zone_start_pfn)
> + return true;
>
> A false positive is possible here, when start_pfn >= zone_end_pfn(zone)?
Ohh, right. Looks better?
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index eae6da28646e..611ff869fa4d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -541,10 +541,14 @@ static inline bool zone_intersects(struct zone *zone,
{
if (zone_is_empty(zone))
return false;
- if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
+ if (start_pfn >= zone_end_pfn(zone))
+ return false;
+
+ if (zone->zone_start_pfn <= start_pfn)
return true;
if (start_pfn + nr_pages > zone->zone_start_pfn)
return true;
+
return false;
}
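To spell out the corner cases this covers, here is the same interval logic as
a standalone check (illustration only, outside the kernel):

#include <assert.h>
#include <stdbool.h>

/* stand-alone copy of the fixed logic above, operating on plain numbers */
static bool intersects(unsigned long zone_start, unsigned long zone_end,
		       unsigned long start_pfn, unsigned long nr_pages)
{
	if (zone_start == zone_end)		/* empty zone */
		return false;
	if (start_pfn >= zone_end)		/* starts past the zone */
		return false;
	if (zone_start <= start_pfn)		/* starts inside the zone */
		return true;
	return start_pfn + nr_pages > zone_start;	/* reaches up into the zone */
}

int main(void)
{
	assert(intersects(10, 20, 15, 2));	/* starts inside */
	assert(intersects(10, 20, 5, 10));	/* reaches in from below */
	assert(!intersects(10, 20, 5, 3));	/* ends below the zone */
	assert(!intersects(10, 20, 25, 5));	/* the old false positive */
	assert(!intersects(10, 10, 12, 4));	/* empty zone */
	return 0;
}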
> > @@ -1029,39 +1018,114 @@ static void node_states_set_node(int node, struct memory_notify *arg)
> > node_set_state(node, N_MEMORY);
> > }
> >
> > -bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
> > - enum zone_type target, int *zone_shift)
> > +bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
> > {
> > - struct zone *zone = page_zone(pfn_to_page(pfn));
> > - enum zone_type idx = zone_idx(zone);
> > - int i;
> > + struct pglist_data *pgdat = NODE_DATA(nid);
> > + struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
> > + struct zone *normal_zone = &pgdat->node_zones[ZONE_NORMAL];
> >
> > - *zone_shift = 0;
> > + /*
> > + * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
> > + * physically before ZONE_MOVABLE. All we need is they do not
> > + * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
> > + * though so let's stick with it for simplicity for now.
> > + * TODO make sure we do not overlap with ZONE_DEVICE
>
> Is this last TODO a blocker, unlike the others?
I think it is not but my knowledge of the zone device is very limited. I
was hoping for Dan's feedback here. From what I understand Zone device
occupies the high end of the address space so we shouldn't overlap here.
Is this correct Dan?
[...]
> > + if (online_type == MMOP_ONLINE_MOVABLE && !can_online_high_movable(nid))
> > + return -EINVAL;
> >
> > - zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
> > + /* associate pfn range with the zone */
> > + zone = move_pfn_range(online_type, nid, pfn, nr_pages);
> > if (!zone)
> > return -EINVAL;
>
> Nit: This !zone currently cannot happen.
fixed
Thanks!
--
Michal Hocko
SUSE Labs
On 04/20/2017 11:06 AM, Michal Hocko wrote:
> On Thu 20-04-17 10:25:27, Vlastimil Babka wrote:
>>> + * intersection with the given zone
>>> + */
>>> +static inline bool zone_intersects(struct zone *zone,
>>> + unsigned long start_pfn, unsigned long nr_pages)
>>> +{
>>
>> I'm looking at your current mmotm tree branch, which looks like this:
>>
>> + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty
>> + * intersection with the given zone
>> + */
>> +static inline bool zone_intersects(struct zone *zone,
>> + unsigned long start_pfn, unsigned long nr_pages)
>> +{
>> + if (zone_is_empty(zone))
>> + return false;
>> + if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone))
>> + return true;
>> + if (start_pfn + nr_pages > zone->zone_start_pfn)
>> + return true;
>>
>> A false positive is possible here, when start_pfn >= zone_end_pfn(zone)?
>
> Ohh, right. Looks better?
Yeah.
You can add for the whole patch
Acked-by: Vlastimil Babka <[email protected]>
But I can't guarantee some corner case won't surface. The hotplug code
is far from straightforward :(
On 04/20/2017 10:49 AM, Michal Hocko wrote:
> On Thu 20-04-17 09:28:20, Michal Hocko wrote:
>> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> [...]
>>> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
>>> woule make that zone->contiguous usually returns false since memory
>>> used by memblock API is marked as PageReserved() and your patch regard
>>> it as a hole. It invalidates set_zone_contiguous() optimization and I
>>> worry about it.
>>
>> OK, fair enough. I did't consider memblock allocations. I will rethink
>> this patch but there are essentially 3 options
>> - use a different criterion for the offline holes dection. I
>> have just realized we might do it by storing the online
>> information into the mem sections
>> - drop this patch
>> - move the PageReferenced check down the chain into
>> isolate_freepages_block resp. isolate_migratepages_block
>>
>> I would prefer 3 over 2 over 1. I definitely want to make this more
>> robust so 1 is preferable long term but I do not want this to be a
>> roadblock to the rest of the rework. Does that sound acceptable to you?
>
> So I've played with all three options just to see how the outcome would
> look like and it turned out that going with 1 will be easiest in the
> end. What do you think about the following? It should be free of any
> false positives. I have only compile tested it yet.
That looks fine, can't say immediately if fully correct. I think you'll
need to bump SECTION_NID_SHIFT as well and make sure things still fit?
Otherwise looks like nobody needed a new section bit since 2005, so we
should be fine.
> ---
> From 747794c13c0e82b55b793a31cdbe1a84ee1c6920 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Thu, 13 Apr 2017 10:28:45 +0200
> Subject: [PATCH] mm: consider zone which is not fully populated to have holes
>
> __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> which checks whether the given zone contains holes and
> pageblock_pfn_to_page which then carefully returns a first valid
> page from the given pfn range for the given zone. This doesn't handle
> zones which are not fully populated though. Memory pageblocks can be
> offlined or might not have been onlined yet. In such a case the zone
> should be considered to have holes otherwise pfn walkers can touch
> and play with offline pages.
>
> Current callers of pageblock_pfn_to_page in compaction seem to work
> properly right now because they only isolate PageBuddy
> (isolate_freepages_block) or PageLRU resp. __PageMovable
> (isolate_migratepages_block) which will be always false for these pages.
> It would be safer to skip these pages altogether, though.
>
> In order to do this patch adds a new memory section state
> (SECTION_IS_ONLINE) which is set in memory_present (during boot
> time) or in online_pages_range during the memory hotplug. Similarly
> offline_mem_sections clears the bit and it is called when the memory
> range is offlined.
>
> pfn_to_online_page helper is then added which check the mem section and
> only returns a page if it is onlined already.
>
> Use the new helper in __pageblock_pfn_to_page and skip the whole page
> block in such a case.
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> include/linux/memory_hotplug.h | 21 ++++++++++++++++++++
> include/linux/mmzone.h | 20 ++++++++++++++++++-
> mm/memory_hotplug.c | 3 +++
> mm/page_alloc.c | 5 ++++-
> mm/sparse.c | 45 +++++++++++++++++++++++++++++++++++++++++-
> 5 files changed, 91 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3c8cf86201c3..fc1c873504eb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -14,6 +14,19 @@ struct memory_block;
> struct resource;
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> +/*
> + * Return page for the valid pfn only if the page is online. All pfn
> + * walkers which rely on the fully initialized page->flags and others
> + * should use this rather than pfn_valid && pfn_to_page
> + */
> +#define pfn_to_online_page(pfn) \
> +({ \
> + struct page *___page = NULL; \
> + \
> + if (online_section_nr(pfn_to_section_nr(pfn))) \
> + ___page = pfn_to_page(pfn); \
> + ___page; \
> +})
>
> /*
> * Types for free bootmem stored in page->lru.next. These have to be in
> @@ -203,6 +216,14 @@ extern void set_zone_contiguous(struct zone *zone);
> extern void clear_zone_contiguous(struct zone *zone);
>
> #else /* ! CONFIG_MEMORY_HOTPLUG */
> +#define pfn_to_online_page(pfn) \
> +({ \
> + struct page *___page = NULL; \
> + if (pfn_valid(pfn)) \
> + ___page = pfn_to_page(pfn); \
> + ___page; \
> + })
> +
> /*
> * Stub functions for when hotplug is off
> */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0fc121bbf4ff..cad16ac080f5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1143,7 +1143,8 @@ extern unsigned long usemap_size(void);
> */
> #define SECTION_MARKED_PRESENT (1UL<<0)
> #define SECTION_HAS_MEM_MAP (1UL<<1)
> -#define SECTION_MAP_LAST_BIT (1UL<<2)
> +#define SECTION_IS_ONLINE (1UL<<2)
> +#define SECTION_MAP_LAST_BIT (1UL<<3)
> #define SECTION_MAP_MASK (~(SECTION_MAP_LAST_BIT-1))
> #define SECTION_NID_SHIFT 2
>
> @@ -1174,6 +1175,23 @@ static inline int valid_section_nr(unsigned long nr)
> return valid_section(__nr_to_section(nr));
> }
>
> +static inline int online_section(struct mem_section *section)
> +{
> + return (section && (section->section_mem_map & SECTION_IS_ONLINE));
> +}
> +
> +static inline int online_section_nr(unsigned long nr)
> +{
> + return online_section(__nr_to_section(nr));
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
> +#endif
> +#endif
> +
> static inline struct mem_section *__pfn_to_section(unsigned long pfn)
> {
> return __nr_to_section(pfn_to_section_nr(pfn));
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index caa58338d121..98f565c279bf 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -929,6 +929,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> unsigned long i;
> unsigned long onlined_pages = *(unsigned long *)arg;
> struct page *page;
> +
> + online_mem_sections(start_pfn, start_pfn + nr_pages);
> +
> if (PageReserved(pfn_to_page(start_pfn)))
> for (i = 0; i < nr_pages; i++) {
> page = pfn_to_page(start_pfn + i);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5d72d29a6ece..fa752de84eef 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
> return NULL;
>
> - start_page = pfn_to_page(start_pfn);
> + start_page = pfn_to_online_page(start_pfn);
> + if (!start_page)
> + return NULL;
>
> if (page_zone(start_page) != zone)
> return NULL;
> @@ -7686,6 +7688,7 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
> break;
> if (pfn == end_pfn)
> return;
> + offline_mem_sections(pfn, end_pfn);
> zone = page_zone(pfn_to_page(pfn));
> spin_lock_irqsave(&zone->lock, flags);
> pfn = start_pfn;
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 6903c8fc3085..79017f90d8fc 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -185,7 +185,8 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
> ms = __nr_to_section(section);
> if (!ms->section_mem_map)
> ms->section_mem_map = sparse_encode_early_nid(nid) |
> - SECTION_MARKED_PRESENT;
> + SECTION_MARKED_PRESENT |
> + SECTION_IS_ONLINE;
> }
> }
>
> @@ -590,6 +591,48 @@ void __init sparse_init(void)
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> +
> +/* Mark all memory sections within the pfn range as online */
> +void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
> +{
> + unsigned long pfn;
> +
> + for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> + unsigned long section_nr = pfn_to_section_nr(start_pfn);
> + struct mem_section *ms;
> +
> + /* onlining code should never touch invalid ranges */
> + if (WARN_ON(!valid_section_nr(section_nr)))
> + continue;
> +
> + ms = __nr_to_section(section_nr);
> + ms->section_mem_map |= SECTION_IS_ONLINE;
> + }
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/* Mark all memory sections within the pfn range as online */
> +void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
> +{
> + unsigned long pfn;
> +
> + for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> + unsigned long section_nr = pfn_to_section_nr(start_pfn);
> + struct mem_section *ms;
> +
> + /*
> + * TODO this needs some double checking. Offlining code makes
> + * sure to check pfn_valid but those checks might be just bogus
> + */
> + if (WARN_ON(!valid_section_nr(section_nr)))
> + continue;
> +
> + ms = __nr_to_section(section_nr);
> + ms->section_mem_map &= ~SECTION_IS_ONLINE;
> + }
> +}
> +#endif
> +
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
> static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid)
> {
>
On Thu 20-04-17 13:56:34, Vlastimil Babka wrote:
> On 04/20/2017 10:49 AM, Michal Hocko wrote:
> > On Thu 20-04-17 09:28:20, Michal Hocko wrote:
> >> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > [...]
> >>> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> >>> woule make that zone->contiguous usually returns false since memory
> >>> used by memblock API is marked as PageReserved() and your patch regard
> >>> it as a hole. It invalidates set_zone_contiguous() optimization and I
> >>> worry about it.
> >>
> >> OK, fair enough. I did't consider memblock allocations. I will rethink
> >> this patch but there are essentially 3 options
> >> - use a different criterion for the offline holes dection. I
> >> have just realized we might do it by storing the online
> >> information into the mem sections
> >> - drop this patch
> >> - move the PageReferenced check down the chain into
> >> isolate_freepages_block resp. isolate_migratepages_block
> >>
> >> I would prefer 3 over 2 over 1. I definitely want to make this more
> >> robust so 1 is preferable long term but I do not want this to be a
> >> roadblock to the rest of the rework. Does that sound acceptable to you?
> >
> > So I've played with all three options just to see how the outcome would
> > look like and it turned out that going with 1 will be easiest in the
> > end. What do you think about the following? It should be free of any
> > false positives. I have only compile tested it yet.
>
> That looks fine, can't say immediately if fully correct. I think you'll
> need to bump SECTION_NID_SHIFT as well and make sure things still fit?
> Otherwise looks like nobody needed a new section bit since 2005, so we
> should be fine.
You are absolutely right. Thanks for spotting this! I have folded this
in
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 611ff869fa4d..c412e6a3a1e9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1166,7 +1166,7 @@ extern unsigned long usemap_size(void);
#define SECTION_IS_ONLINE (1UL<<2)
#define SECTION_MAP_LAST_BIT (1UL<<3)
#define SECTION_MAP_MASK (~(SECTION_MAP_LAST_BIT-1))
-#define SECTION_NID_SHIFT 2
+#define SECTION_NID_SHIFT 3
static inline struct page *__section_mem_map_addr(struct mem_section *section)
{
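For the record, the nid and the flags share section_mem_map, which is why the
shift has to move; a simplified standalone sketch of the encoding (not the
exact kernel helpers):

#define SECTION_MARKED_PRESENT	(1UL << 0)
#define SECTION_HAS_MEM_MAP	(1UL << 1)
#define SECTION_IS_ONLINE	(1UL << 2)	/* the newly added flag */
#define SECTION_NID_SHIFT	3	/* with 2, nid bit 0 would alias SECTION_IS_ONLINE */

/* simplified stand-ins for sparse_encode_early_nid()/sparse_early_nid() */
static unsigned long encode_early_nid(int nid)
{
	return (unsigned long)nid << SECTION_NID_SHIFT;
}

static int early_nid(unsigned long section_mem_map)
{
	return (int)(section_mem_map >> SECTION_NID_SHIFT);
}

With the shift left at 2, memory_present() setting SECTION_IS_ONLINE would make
early_nid() report nid | 1, i.e. MAX_NUMNODES on a config without
CONFIG_NODES_SHIFT, which is what the 0day report below ends up hitting.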
--
Michal Hocko
SUSE Labs
FYI, we noticed the following commit:
commit: 73821bb516920b2b38732ce992d11c08c5d8bd7d ("your mail")
url: https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-consider-zone-which-is-not-fully-populated-to-have-holes/20170420-173046
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+--------------------------------------------------------+----------+------------+
| | v4.9-rc8 | 73821bb516 |
+--------------------------------------------------------+----------+------------+
| boot_successes | 5 | 0 |
| boot_failures | 4 | 19 |
| BUG:workqueue_lockup-pool | 4 | 12 |
| WARNING:at_mm/memblock.c:#memblock_virt_alloc_internal | 0 | 19 |
+--------------------------------------------------------+----------+------------+
[ 0.000000] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0xa2/0x3bb
[ 0.000000] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc8-00001-g73821bb #1
[ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 0.000000] c100d60b c1994dde c1a4fe84 c1a4fe58 c122041d c1a4fe70 c1044973 c1cf3fb9
[ 0.000000] b95b9258 00000001 00000001 c1a4fe8c c10449b7 00000009 00000000 c1a4fe84
[ 0.000000] c1993532 c1a4fea0 c1a4fec4 c1cf3fb9 c19934b3 000004ed c1993532 00000000
[ 0.000000] Call Trace:
[ 0.000000] [<c100d60b>] ? show_stack+0x59/0x5f
[ 0.000000] [<c122041d>] dump_stack+0x16/0x18
[ 0.000000] [<c1044973>] __warn+0x104/0x11b
[ 0.000000] [<c1cf3fb9>] ? memblock_virt_alloc_internal+0xa2/0x3bb
[ 0.000000] [<c10449b7>] warn_slowpath_fmt+0x2d/0x32
[ 0.000000] [<c1cf3fb9>] memblock_virt_alloc_internal+0xa2/0x3bb
[ 0.000000] [<c1034010>] ? pte_offset_kernel+0x10/0x1e
[ 0.000000] [<c1cf4681>] memblock_virt_alloc_try_nid+0x94/0xf1
[ 0.000000] [<c1cf5dcf>] sparse_mem_map_populate+0x35/0x50
[ 0.000000] [<c1cf61d2>] sparse_init+0x234/0x35f
[ 0.000000] [<c1cdcbe8>] paging_init+0x89/0xa5
[ 0.000000] [<c1cdccf3>] native_pagetable_init+0xef/0x200
[ 0.000000] [<c1cc2ed3>] setup_arch+0xda2/0xe34
[ 0.000000] [<c1cbd00c>] start_kernel+0x62/0x57b
[ 0.000000] [<c1cbc2ed>] i386_start_kernel+0xd4/0xec
[ 0.000000] ---[ end trace 0000000000000000 ]---
To reproduce:
git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> [...]
> > > Which pfn walkers you have in mind?
> >
> > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > using pfn_valid().
>
> Yeah, I've checked that one and in fact this is a good example of the
> case where you do not really care about holes. It just checks the page
> count which is a valid information under any circumstances.
I don't think so. First, it checks the page *map* count. Is it still valid
even if PageReserved() is set? What I'd like to ask in this example is
what information is valid if PageReserved() is set. Is there any
design document on this? I think that we need to define/document it first.
And, I hope that all the information in the flags field is valid in all
cases if pfn_valid() returns true. By design.
This makes all the existing pfn walkers happy since we don't need an
additional check for PageReserved().
>
> > > > The other problem I found is that your change will makes some
> > > > contiguous zones to be considered as non-contiguous. Memory allocated
> > > > by memblock API is also marked as PageResereved. If we consider this as
> > > > a hole, we will set such a zone as non-contiguous.
> > >
> > > Why would that be a problem? We shouldn't touch those pages anyway?
> >
> > Skipping those pages in compaction are valid so no problem in this
> > case.
> >
> > The problem I mentioned above is that adding PageReserved() check in
> > __pageblock_pfn_to_page() invalidates optimization by
> > set_zone_contiguous(). In compaction, we need to get a valid struct
> > page and it requires a lot of work. There is performance problem
> > report due to this so set_zone_contiguous() optimization is added. It
> > checks if the zone is contiguous or not in boot time. If zone is
> > determined as contiguous, we can easily get a valid struct page in
> > runtime without expensive checks.
>
> OK, I see. I've had some vague understading and the clarification helps.
>
> > Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> > woule make that zone->contiguous usually returns false since memory
> > used by memblock API is marked as PageReserved() and your patch regard
> > it as a hole. It invalidates set_zone_contiguous() optimization and I
> > worry about it.
>
> OK, fair enough. I did't consider memblock allocations. I will rethink
> this patch but there are essentially 3 options
> - use a different criterion for the offline holes dection. I
> have just realized we might do it by storing the online
> information into the mem sections
> - drop this patch
> - move the PageReferenced check down the chain into
> isolate_freepages_block resp. isolate_migratepages_block
>
> I would prefer 3 over 2 over 1. I definitely want to make this more
> robust so 1 is preferable long term but I do not want this to be a
> roadblock to the rest of the rework. Does that sound acceptable to you?
I like #1 among the above options and I have already seen your patch for #1.
It's much better than your first attempt but I'm still not happy due
to the semantics of pfn_valid().
> [..]
> > Let me clarify my desire(?) for this issue.
> >
> > 1. If pfn_valid() returns true, struct page has valid information, at
> > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > without checking PageResereved().
>
> This is no longer true after my rework. Pages are associated with the
> zone during _onlining_ rather than when they are physically hotpluged.
If your rework makes information valid during _onlining_, my
suggestion is to make pfn_valid() return false until onlining.
Callers of pfn_valid() expect that they can get valid information from
the struct page. There is no reason to access the struct page if they
can't get valid information from it. So, passing pfn_valid() should
guarantee that, at least, some kind of information is valid.
If pfn_valid() doesn't guarantee it, most of the pfn walkers should
check PageReserved() to make sure of the validity of the information from
the struct page.
> Basically only the nid is set properly. Strictly speaking this is the
> case also without my rework because the zone might change during online
> phase so you cannot assume it is correct even now. It just happens that
> it more or less works just fine.
>
> > 2. pfn_valid() for offlined holes returns false. This can be easily
> > (?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
> > guess that there is no reason that pfn_valid() returns true for
> > offlined holes. If there is, please let me know.
>
> There is some code which really expects that pfn_valid returns true iff
> there is a struct page and it doesn't care about the online status.
> E.g. hotplug code itself so no, we cannot change pfn_valid. What we can
> do though is to add pfn_to_online_page which would do the proper check.
> I have already sent [1]. As noted above we can (ab)use the remaining bit
> in SECTION_MAP_MASK to detect offline pages more robustly.
Some pfn_valid() callers in hotplug code look wrong. They want to check
the section's validity rather than the pfn's validity. Others want to access
the struct page so they fit my assumption (?) for pfn_valid().
Therefore, we can change pfn_valid() to return false until online.
> > 3. We don't need to check PageReserved() in most of pfn walkers in
> > order to check offline holes.
>
> We still have to distinguish those who care about offline pages from
> those who do not care about it.
Hotplug code can distinguish those in another way, by using the new section
mask as you did in the new patch. If anyone outside the hotplug code does
care about offline pages, it would be just for optimization rather
than correctness. I think that it's okay.
Thanks.
On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > [...]
> > > > Which pfn walkers you have in mind?
> > >
> > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > using pfn_valid().
> >
> > Yeah, I've checked that one and in fact this is a good example of the
> > case where you do not really care about holes. It just checks the page
> > count which is a valid information under any circumstances.
>
> I don't think so. First, it checks the page *map* count. Is it still valid
> even if PageReserved() is set?
I do not know about any user which would manipulate page map count for
referenced pages. The core MM code doesn't.
> What I'd like to ask in this example is
> that what information is valid if PageReserved() is set. Is there any
> design document on this? I think that we need to define/document it first.
NO, it is not AFAIK.
[...]
> > OK, fair enough. I did't consider memblock allocations. I will rethink
> > this patch but there are essentially 3 options
> > - use a different criterion for the offline holes dection. I
> > have just realized we might do it by storing the online
> > information into the mem sections
> > - drop this patch
> > - move the PageReferenced check down the chain into
> > isolate_freepages_block resp. isolate_migratepages_block
> >
> > I would prefer 3 over 2 over 1. I definitely want to make this more
> > robust so 1 is preferable long term but I do not want this to be a
> > roadblock to the rest of the rework. Does that sound acceptable to you?
>
> I like #1 among of above options and I already see your patch for #1.
> It's much better than your first attempt but I'm still not happy due
> to the semantic of pfn_valid().
You are trying to change a semantic of something that has a well defined
meaning. I disagree that we should change it. It might sound like a
simpler thing to do because pfn walkers will have to be checked but what
you are proposing is conflating two different things together.
> > [..]
> > > Let me clarify my desire(?) for this issue.
> > >
> > > 1. If pfn_valid() returns true, struct page has valid information, at
> > > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > > without checking PageResereved().
> >
> > This is no longer true after my rework. Pages are associated with the
> > zone during _onlining_ rather than when they are physically hotpluged.
>
> If your rework make information valid during _onlining_, my
> suggestion is making pfn_valid() return false until onlining.
>
> Caller of pfn_valid() expects that they can get valid information from
> the struct page. There is no reason to access the struct page if they
> can't get valid information from it. So, passing pfn_valid() should
> guarantee that, at least, some kind of information is valid.
>
> If pfn_valid() doesn't guarantee it, most of the pfn walker should
> check PageResereved() to make sure that validity of information from
> the struct page.
This is true only for those walkers which really depend on the full
initialization. This is not the case for all of them. I do not see any
reason to introduce another _pfn_valid to just check whether there is a
struct page...
So please do not conflate those two different concepts together. I
believe that the most prominent pfn walkers should be covered now and
others can be evaluated later.
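For walkers that do care, the intended pattern would look roughly like this
(a sketch, not any particular in-tree walker):

static unsigned long count_reserved(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn, nr = 0;

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		struct page *page = pfn_to_online_page(pfn);

		/* offline section or hole - nothing to look at */
		if (!page)
			continue;
		/* page->flags (zone/node links) can be trusted from here on */
		if (PageReserved(page))
			nr++;
	}
	return nr;
}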
--
Michal Hocko
SUSE Labs
On Fri 21-04-17 10:46:22, kernel test robot wrote:
[...]
> [ 0.000000] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0xa2/0x3bb
> [ 0.000000] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
Your config doesn't have CONFIG_NODES_SHIFT so MAX_NUMNODES is 1 and,
due to the bug spotted by Vlastimil, the ONLINE flag will overlap the node
number and so sparse_early_nid will report MAX_NUMNODES. This should be
fixed by the followup fix
http://lkml.kernel.org/r/[email protected]
Thanks for the report anyway.
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc8-00001-g73821bb #1
> [ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
> [ 0.000000] c100d60b c1994dde c1a4fe84 c1a4fe58 c122041d c1a4fe70 c1044973 c1cf3fb9
> [ 0.000000] b95b9258 00000001 00000001 c1a4fe8c c10449b7 00000009 00000000 c1a4fe84
> [ 0.000000] c1993532 c1a4fea0 c1a4fec4 c1cf3fb9 c19934b3 000004ed c1993532 00000000
> [ 0.000000] Call Trace:
> [ 0.000000] [<c100d60b>] ? show_stack+0x59/0x5f
> [ 0.000000] [<c122041d>] dump_stack+0x16/0x18
> [ 0.000000] [<c1044973>] __warn+0x104/0x11b
> [ 0.000000] [<c1cf3fb9>] ? memblock_virt_alloc_internal+0xa2/0x3bb
> [ 0.000000] [<c10449b7>] warn_slowpath_fmt+0x2d/0x32
> [ 0.000000] [<c1cf3fb9>] memblock_virt_alloc_internal+0xa2/0x3bb
> [ 0.000000] [<c1034010>] ? pte_offset_kernel+0x10/0x1e
> [ 0.000000] [<c1cf4681>] memblock_virt_alloc_try_nid+0x94/0xf1
> [ 0.000000] [<c1cf5dcf>] sparse_mem_map_populate+0x35/0x50
> [ 0.000000] [<c1cf61d2>] sparse_init+0x234/0x35f
> [ 0.000000] [<c1cdcbe8>] paging_init+0x89/0xa5
> [ 0.000000] [<c1cdccf3>] native_pagetable_init+0xef/0x200
> [ 0.000000] [<c1cc2ed3>] setup_arch+0xda2/0xe34
> [ 0.000000] [<c1cbd00c>] start_kernel+0x62/0x57b
> [ 0.000000] [<c1cbc2ed>] i386_start_kernel+0xd4/0xec
> [ 0.000000] ---[ end trace 0000000000000000 ]---
>
>
> To reproduce:
>
> git clone https://github.com/01org/lkp-tests.git
> cd lkp-tests
> bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
>
>
>
> Thanks,
> Xiaolong
> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/i386 4.9.0-rc8 Kernel Configuration
> #
> # CONFIG_64BIT is not set
> CONFIG_X86_32=y
> CONFIG_X86=y
> CONFIG_INSTRUCTION_DECODER=y
> CONFIG_OUTPUT_FORMAT="elf32-i386"
> CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_MMU=y
> CONFIG_ARCH_MMAP_RND_BITS_MIN=8
> CONFIG_ARCH_MMAP_RND_BITS_MAX=16
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
> CONFIG_NEED_DMA_MAP_STATE=y
> CONFIG_NEED_SG_DMA_LENGTH=y
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_GENERIC_BUG=y
> CONFIG_GENERIC_HWEIGHT=y
> CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_GENERIC_CALIBRATE_DELAY=y
> CONFIG_ARCH_HAS_CPU_RELAX=y
> CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
> CONFIG_HAVE_SETUP_PER_CPU_AREA=y
> CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> CONFIG_ARCH_HIBERNATION_POSSIBLE=y
> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
> CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
> CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_X86_32_LAZY_GS=y
> CONFIG_ARCH_SUPPORTS_UPROBES=y
> CONFIG_FIX_EARLYCON_MEM=y
> CONFIG_DEBUG_RODATA=y
> CONFIG_PGTABLE_LEVELS=3
> CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
> CONFIG_CONSTRUCTORS=y
> CONFIG_IRQ_WORK=y
> CONFIG_BUILDTIME_EXTABLE_SORT=y
> CONFIG_THREAD_INFO_IN_TASK=y
>
> #
> # General setup
> #
> CONFIG_BROKEN_ON_SMP=y
> CONFIG_INIT_ENV_ARG_LIMIT=32
> CONFIG_CROSS_COMPILE=""
> # CONFIG_COMPILE_TEST is not set
> CONFIG_LOCALVERSION=""
> CONFIG_LOCALVERSION_AUTO=y
> CONFIG_HAVE_KERNEL_GZIP=y
> CONFIG_HAVE_KERNEL_BZIP2=y
> CONFIG_HAVE_KERNEL_LZMA=y
> CONFIG_HAVE_KERNEL_XZ=y
> CONFIG_HAVE_KERNEL_LZO=y
> CONFIG_HAVE_KERNEL_LZ4=y
> CONFIG_KERNEL_GZIP=y
> # CONFIG_KERNEL_BZIP2 is not set
> # CONFIG_KERNEL_LZMA is not set
> # CONFIG_KERNEL_XZ is not set
> # CONFIG_KERNEL_LZO is not set
> # CONFIG_KERNEL_LZ4 is not set
> CONFIG_DEFAULT_HOSTNAME="(none)"
> # CONFIG_SYSVIPC is not set
> # CONFIG_POSIX_MQUEUE is not set
> # CONFIG_CROSS_MEMORY_ATTACH is not set
> CONFIG_FHANDLE=y
> # CONFIG_USELIB is not set
> # CONFIG_AUDIT is not set
> CONFIG_HAVE_ARCH_AUDITSYSCALL=y
>
> #
> # IRQ subsystem
> #
> CONFIG_GENERIC_IRQ_PROBE=y
> CONFIG_GENERIC_IRQ_SHOW=y
> CONFIG_GENERIC_IRQ_CHIP=y
> CONFIG_IRQ_DOMAIN=y
> CONFIG_IRQ_DOMAIN_HIERARCHY=y
> CONFIG_GENERIC_MSI_IRQ=y
> CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
> # CONFIG_IRQ_DOMAIN_DEBUG is not set
> CONFIG_IRQ_FORCED_THREADING=y
> CONFIG_SPARSE_IRQ=y
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_ARCH_CLOCKSOURCE_DATA=y
> CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
> CONFIG_GENERIC_CMOS_UPDATE=y
>
> #
> # Timers subsystem
> #
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> CONFIG_NO_HZ_IDLE=y
> CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
>
> #
> # CPU/Task time and stats accounting
> #
> CONFIG_TICK_CPU_ACCOUNTING=y
> CONFIG_IRQ_TIME_ACCOUNTING=y
> # CONFIG_BSD_PROCESS_ACCT is not set
> # CONFIG_TASKSTATS is not set
>
> #
> # RCU Subsystem
> #
> CONFIG_PREEMPT_RCU=y
> # CONFIG_RCU_EXPERT is not set
> CONFIG_SRCU=y
> # CONFIG_TASKS_RCU is not set
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_TREE_RCU_TRACE=y
> # CONFIG_RCU_EXPEDITE_BOOT is not set
> CONFIG_BUILD_BIN2C=y
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> CONFIG_LOG_BUF_SHIFT=17
> CONFIG_NMI_LOG_BUF_SHIFT=13
> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
> CONFIG_CGROUPS=y
> # CONFIG_MEMCG is not set
> CONFIG_CGROUP_SCHED=y
> CONFIG_FAIR_GROUP_SCHED=y
> # CONFIG_CFS_BANDWIDTH is not set
> CONFIG_RT_GROUP_SCHED=y
> CONFIG_CGROUP_PIDS=y
> # CONFIG_CGROUP_FREEZER is not set
> # CONFIG_CGROUP_HUGETLB is not set
> CONFIG_CPUSETS=y
> CONFIG_PROC_PID_CPUSET=y
> # CONFIG_CGROUP_DEVICE is not set
> # CONFIG_CGROUP_CPUACCT is not set
> # CONFIG_CGROUP_PERF is not set
> # CONFIG_CGROUP_DEBUG is not set
> CONFIG_CHECKPOINT_RESTORE=y
> # CONFIG_NAMESPACES is not set
> CONFIG_SCHED_AUTOGROUP=y
> # CONFIG_SYSFS_DEPRECATED is not set
> # CONFIG_RELAY is not set
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_INITRAMFS_SOURCE=""
> CONFIG_RD_GZIP=y
> # CONFIG_RD_BZIP2 is not set
> # CONFIG_RD_LZMA is not set
> CONFIG_RD_XZ=y
> # CONFIG_RD_LZO is not set
> # CONFIG_RD_LZ4 is not set
> # CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_SYSCTL=y
> CONFIG_ANON_INODES=y
> CONFIG_HAVE_UID16=y
> CONFIG_SYSCTL_EXCEPTION_TRACE=y
> CONFIG_HAVE_PCSPKR_PLATFORM=y
> CONFIG_BPF=y
> CONFIG_EXPERT=y
> CONFIG_UID16=y
> CONFIG_MULTIUSER=y
> # CONFIG_SGETMASK_SYSCALL is not set
> CONFIG_SYSFS_SYSCALL=y
> # CONFIG_SYSCTL_SYSCALL is not set
> CONFIG_KALLSYMS=y
> CONFIG_KALLSYMS_ALL=y
> # CONFIG_KALLSYMS_ABSOLUTE_PERCPU is not set
> CONFIG_KALLSYMS_BASE_RELATIVE=y
> CONFIG_PRINTK=y
> CONFIG_PRINTK_NMI=y
> CONFIG_BUG=y
> CONFIG_PCSPKR_PLATFORM=y
> CONFIG_BASE_FULL=y
> CONFIG_FUTEX=y
> CONFIG_EPOLL=y
> CONFIG_SIGNALFD=y
> CONFIG_TIMERFD=y
> CONFIG_EVENTFD=y
> CONFIG_BPF_SYSCALL=y
> # CONFIG_SHMEM is not set
> # CONFIG_AIO is not set
> CONFIG_ADVISE_SYSCALLS=y
> # CONFIG_USERFAULTFD is not set
> CONFIG_PCI_QUIRKS=y
> CONFIG_MEMBARRIER=y
> # CONFIG_EMBEDDED is not set
> CONFIG_HAVE_PERF_EVENTS=y
> CONFIG_PERF_USE_VMALLOC=y
>
> #
> # Kernel Performance Events And Counters
> #
> CONFIG_PERF_EVENTS=y
> CONFIG_DEBUG_PERF_USE_VMALLOC=y
> CONFIG_VM_EVENT_COUNTERS=y
> # CONFIG_SLUB_DEBUG is not set
> # CONFIG_COMPAT_BRK is not set
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y
> # CONFIG_SLOB is not set
> CONFIG_SLAB_FREELIST_RANDOM=y
> # CONFIG_SYSTEM_DATA_VERIFICATION is not set
> CONFIG_PROFILING=y
> CONFIG_TRACEPOINTS=y
> # CONFIG_OPROFILE is not set
> CONFIG_HAVE_OPROFILE=y
> CONFIG_OPROFILE_NMI_TIMER=y
> # CONFIG_KPROBES is not set
> CONFIG_JUMP_LABEL=y
> CONFIG_STATIC_KEYS_SELFTEST=y
> # CONFIG_UPROBES is not set
> # CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
> CONFIG_ARCH_USE_BUILTIN_BSWAP=y
> CONFIG_HAVE_IOREMAP_PROT=y
> CONFIG_HAVE_KPROBES=y
> CONFIG_HAVE_KRETPROBES=y
> CONFIG_HAVE_OPTPROBES=y
> CONFIG_HAVE_KPROBES_ON_FTRACE=y
> CONFIG_HAVE_NMI=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> CONFIG_HAVE_DMA_CONTIGUOUS=y
> CONFIG_GENERIC_SMP_IDLE_THREAD=y
> CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
> CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
> CONFIG_HAVE_CLK=y
> CONFIG_HAVE_DMA_API_DEBUG=y
> CONFIG_HAVE_HW_BREAKPOINT=y
> CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
> CONFIG_HAVE_USER_RETURN_NOTIFIER=y
> CONFIG_HAVE_PERF_EVENTS_NMI=y
> CONFIG_HAVE_PERF_REGS=y
> CONFIG_HAVE_PERF_USER_STACK_DUMP=y
> CONFIG_HAVE_ARCH_JUMP_LABEL=y
> CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
> CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
> CONFIG_HAVE_CMPXCHG_LOCAL=y
> CONFIG_HAVE_CMPXCHG_DOUBLE=y
> CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y
> CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
> CONFIG_SECCOMP_FILTER=y
> CONFIG_HAVE_GCC_PLUGINS=y
> CONFIG_GCC_PLUGINS=y
> CONFIG_GCC_PLUGIN_CYC_COMPLEXITY=y
> CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y
> CONFIG_HAVE_CC_STACKPROTECTOR=y
> # CONFIG_CC_STACKPROTECTOR is not set
> CONFIG_CC_STACKPROTECTOR_NONE=y
> # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
> # CONFIG_CC_STACKPROTECTOR_STRONG is not set
> CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
> CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> CONFIG_HAVE_ARCH_HUGE_VMAP=y
> CONFIG_MODULES_USE_ELF_REL=y
> CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
> CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
> CONFIG_HAVE_EXIT_THREAD=y
> CONFIG_ARCH_MMAP_RND_BITS=8
> CONFIG_HAVE_COPY_THREAD_TLS=y
> # CONFIG_HAVE_ARCH_HASH is not set
> CONFIG_ISA_BUS_API=y
> CONFIG_CLONE_BACKWARDS=y
> CONFIG_OLD_SIGSUSPEND3=y
> CONFIG_OLD_SIGACTION=y
> # CONFIG_CPU_NO_EFFICIENT_FFS is not set
> # CONFIG_HAVE_ARCH_VMAP_STACK is not set
>
> #
> # GCOV-based kernel profiling
> #
> CONFIG_GCOV_KERNEL=y
> CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
> # CONFIG_GCOV_PROFILE_ALL is not set
> # CONFIG_GCOV_FORMAT_AUTODETECT is not set
> # CONFIG_GCOV_FORMAT_3_4 is not set
> CONFIG_GCOV_FORMAT_4_7=y
> CONFIG_HAVE_GENERIC_DMA_COHERENT=y
> CONFIG_RT_MUTEXES=y
> CONFIG_BASE_SMALL=0
> CONFIG_MODULES=y
> # CONFIG_MODULE_FORCE_LOAD is not set
> # CONFIG_MODULE_UNLOAD is not set
> # CONFIG_MODVERSIONS is not set
> # CONFIG_MODULE_SRCVERSION_ALL is not set
> # CONFIG_MODULE_SIG is not set
> CONFIG_MODULE_COMPRESS=y
> # CONFIG_MODULE_COMPRESS_GZIP is not set
> CONFIG_MODULE_COMPRESS_XZ=y
> CONFIG_MODULES_TREE_LOOKUP=y
> # CONFIG_BLOCK is not set
> CONFIG_ASN1=y
> CONFIG_UNINLINE_SPIN_UNLOCK=y
> CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
> CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
> CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
> CONFIG_FREEZER=y
>
> #
> # Processor type and features
> #
> # CONFIG_ZONE_DMA is not set
> # CONFIG_SMP is not set
> CONFIG_X86_FEATURE_NAMES=y
> CONFIG_X86_FAST_FEATURE_TESTS=y
> CONFIG_X86_MPPARSE=y
> # CONFIG_GOLDFISH is not set
> # CONFIG_X86_EXTENDED_PLATFORM is not set
> # CONFIG_X86_INTEL_LPSS is not set
> CONFIG_X86_AMD_PLATFORM_DEVICE=y
> CONFIG_IOSF_MBI=y
> # CONFIG_IOSF_MBI_DEBUG is not set
> CONFIG_X86_32_IRIS=m
> # CONFIG_SCHED_OMIT_FRAME_POINTER is not set
> CONFIG_HYPERVISOR_GUEST=y
> CONFIG_PARAVIRT=y
> # CONFIG_PARAVIRT_DEBUG is not set
> # CONFIG_XEN is not set
> CONFIG_KVM_GUEST=y
> # CONFIG_KVM_DEBUG_FS is not set
> # CONFIG_LGUEST_GUEST is not set
> CONFIG_PARAVIRT_TIME_ACCOUNTING=y
> CONFIG_PARAVIRT_CLOCK=y
> CONFIG_NO_BOOTMEM=y
> # CONFIG_M486 is not set
> # CONFIG_M586 is not set
> # CONFIG_M586TSC is not set
> # CONFIG_M586MMX is not set
> # CONFIG_M686 is not set
> # CONFIG_MPENTIUMII is not set
> # CONFIG_MPENTIUMIII is not set
> # CONFIG_MPENTIUMM is not set
> # CONFIG_MPENTIUM4 is not set
> # CONFIG_MK6 is not set
> # CONFIG_MK7 is not set
> # CONFIG_MK8 is not set
> # CONFIG_MCRUSOE is not set
> # CONFIG_MEFFICEON is not set
> # CONFIG_MWINCHIPC6 is not set
> CONFIG_MWINCHIP3D=y
> # CONFIG_MELAN is not set
> # CONFIG_MGEODEGX1 is not set
> # CONFIG_MGEODE_LX is not set
> # CONFIG_MCYRIXIII is not set
> # CONFIG_MVIAC3_2 is not set
> # CONFIG_MVIAC7 is not set
> # CONFIG_MCORE2 is not set
> # CONFIG_MATOM is not set
> # CONFIG_X86_GENERIC is not set
> CONFIG_X86_INTERNODE_CACHE_SHIFT=5
> CONFIG_X86_L1_CACHE_SHIFT=5
> CONFIG_X86_ALIGNMENT_16=y
> CONFIG_X86_USE_PPRO_CHECKSUM=y
> CONFIG_X86_TSC=y
> CONFIG_X86_CMPXCHG64=y
> CONFIG_X86_MINIMUM_CPU_FAMILY=5
> CONFIG_PROCESSOR_SELECT=y
> # CONFIG_CPU_SUP_INTEL is not set
> # CONFIG_CPU_SUP_CYRIX_32 is not set
> CONFIG_CPU_SUP_AMD=y
> # CONFIG_CPU_SUP_CENTAUR is not set
> CONFIG_CPU_SUP_TRANSMETA_32=y
> CONFIG_CPU_SUP_UMC_32=y
> CONFIG_HPET_TIMER=y
> CONFIG_DMI=y
> CONFIG_SWIOTLB=y
> CONFIG_IOMMU_HELPER=y
> CONFIG_NR_CPUS=1
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_COUNT=y
> CONFIG_UP_LATE_INIT=y
> CONFIG_X86_UP_APIC=y
> CONFIG_X86_UP_IOAPIC=y
> CONFIG_X86_LOCAL_APIC=y
> CONFIG_X86_IO_APIC=y
> # CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
> # CONFIG_X86_MCE is not set
>
> #
> # Performance monitoring
> #
> # CONFIG_PERF_EVENTS_AMD_POWER is not set
> CONFIG_X86_LEGACY_VM86=y
> CONFIG_VM86=y
> CONFIG_X86_16BIT=y
> CONFIG_X86_ESPFIX32=y
> # CONFIG_TOSHIBA is not set
> # CONFIG_I8K is not set
> CONFIG_X86_REBOOTFIXUPS=y
> # CONFIG_MICROCODE is not set
> CONFIG_X86_MSR=y
> CONFIG_X86_CPUID=m
> # CONFIG_NOHIGHMEM is not set
> # CONFIG_HIGHMEM4G is not set
> CONFIG_HIGHMEM64G=y
> CONFIG_VMSPLIT_3G=y
> # CONFIG_VMSPLIT_2G is not set
> # CONFIG_VMSPLIT_1G is not set
> CONFIG_PAGE_OFFSET=0xC0000000
> CONFIG_HIGHMEM=y
> CONFIG_X86_PAE=y
> CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
> CONFIG_ARCH_DMA_ADDR_T_64BIT=y
> CONFIG_NEED_NODE_MEMMAP_SIZE=y
> CONFIG_ARCH_FLATMEM_ENABLE=y
> CONFIG_ARCH_SPARSEMEM_ENABLE=y
> CONFIG_ARCH_SELECT_MEMORY_MODEL=y
> CONFIG_ILLEGAL_POINTER_VALUE=0
> CONFIG_SELECT_MEMORY_MODEL=y
> # CONFIG_FLATMEM_MANUAL is not set
> CONFIG_SPARSEMEM_MANUAL=y
> CONFIG_SPARSEMEM=y
> CONFIG_HAVE_MEMORY_PRESENT=y
> CONFIG_SPARSEMEM_STATIC=y
> CONFIG_HAVE_MEMBLOCK=y
> CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
> CONFIG_ARCH_DISCARD_MEMBLOCK=y
> # CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
> # CONFIG_MEMORY_HOTPLUG is not set
> CONFIG_SPLIT_PTLOCK_CPUS=4
> CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
> CONFIG_COMPACTION=y
> CONFIG_MIGRATION=y
> CONFIG_PHYS_ADDR_T_64BIT=y
> CONFIG_VIRT_TO_BUS=y
> CONFIG_KSM=y
> CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
> # CONFIG_TRANSPARENT_HUGEPAGE is not set
> CONFIG_NEED_PER_CPU_KM=y
> CONFIG_CLEANCACHE=y
> # CONFIG_CMA is not set
> CONFIG_ZPOOL=m
> CONFIG_ZBUD=y
> # CONFIG_Z3FOLD is not set
> CONFIG_ZSMALLOC=y
> # CONFIG_PGTABLE_MAPPING is not set
> CONFIG_ZSMALLOC_STAT=y
> CONFIG_GENERIC_EARLY_IOREMAP=y
> CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
> CONFIG_IDLE_PAGE_TRACKING=y
> CONFIG_FRAME_VECTOR=y
> CONFIG_HIGHPTE=y
> # CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
> CONFIG_X86_RESERVE_LOW=64
> CONFIG_MATH_EMULATION=y
> CONFIG_MTRR=y
> # CONFIG_MTRR_SANITIZER is not set
> CONFIG_X86_PAT=y
> CONFIG_ARCH_USES_PG_UNCACHED=y
> CONFIG_ARCH_RANDOM=y
> # CONFIG_X86_SMAP is not set
> # CONFIG_EFI is not set
> CONFIG_SECCOMP=y
> # CONFIG_HZ_100 is not set
> # CONFIG_HZ_250 is not set
> # CONFIG_HZ_300 is not set
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
> CONFIG_SCHED_HRTICK=y
> # CONFIG_KEXEC is not set
> # CONFIG_CRASH_DUMP is not set
> CONFIG_PHYSICAL_START=0x1000000
> # CONFIG_RELOCATABLE is not set
> CONFIG_PHYSICAL_ALIGN=0x200000
> # CONFIG_COMPAT_VDSO is not set
> # CONFIG_CMDLINE_BOOL is not set
> CONFIG_MODIFY_LDT_SYSCALL=y
> CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
>
> #
> # Power management and ACPI options
> #
> CONFIG_SUSPEND=y
> CONFIG_SUSPEND_FREEZER=y
> CONFIG_SUSPEND_SKIP_SYNC=y
> CONFIG_PM_SLEEP=y
> # CONFIG_PM_AUTOSLEEP is not set
> # CONFIG_PM_WAKELOCKS is not set
> CONFIG_PM=y
> CONFIG_PM_DEBUG=y
> CONFIG_PM_ADVANCED_DEBUG=y
> CONFIG_PM_SLEEP_DEBUG=y
> # CONFIG_PM_TRACE_RTC is not set
> CONFIG_PM_CLK=y
> # CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
> CONFIG_ACPI=y
> CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
> CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
> CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
> CONFIG_ACPI_DEBUGGER=y
> CONFIG_ACPI_DEBUGGER_USER=m
> CONFIG_ACPI_SLEEP=y
> # CONFIG_ACPI_PROCFS_POWER is not set
> # CONFIG_ACPI_REV_OVERRIDE_POSSIBLE is not set
> CONFIG_ACPI_EC_DEBUGFS=m
> CONFIG_ACPI_AC=m
> # CONFIG_ACPI_BATTERY is not set
> CONFIG_ACPI_BUTTON=m
> CONFIG_ACPI_VIDEO=m
> CONFIG_ACPI_FAN=y
> CONFIG_ACPI_DOCK=y
> CONFIG_ACPI_CPU_FREQ_PSS=y
> CONFIG_ACPI_PROCESSOR_CSTATE=y
> CONFIG_ACPI_PROCESSOR_IDLE=y
> CONFIG_ACPI_PROCESSOR=y
> CONFIG_ACPI_PROCESSOR_AGGREGATOR=y
> CONFIG_ACPI_THERMAL=m
> # CONFIG_ACPI_CUSTOM_DSDT is not set
> CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
> # CONFIG_ACPI_TABLE_UPGRADE is not set
> CONFIG_ACPI_DEBUG=y
> # CONFIG_ACPI_PCI_SLOT is not set
> CONFIG_X86_PM_TIMER=y
> CONFIG_ACPI_CONTAINER=y
> CONFIG_ACPI_HOTPLUG_IOAPIC=y
> CONFIG_ACPI_SBS=m
> CONFIG_ACPI_HED=m
> # CONFIG_ACPI_CUSTOM_METHOD is not set
> # CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
> CONFIG_HAVE_ACPI_APEI=y
> CONFIG_HAVE_ACPI_APEI_NMI=y
> # CONFIG_ACPI_APEI is not set
> # CONFIG_DPTF_POWER is not set
> # CONFIG_PMIC_OPREGION is not set
> CONFIG_ACPI_CONFIGFS=y
> # CONFIG_SFI is not set
> # CONFIG_APM is not set
>
> #
> # CPU Frequency scaling
> #
> # CONFIG_CPU_FREQ is not set
>
> #
> # CPU Idle
> #
> CONFIG_CPU_IDLE=y
> # CONFIG_CPU_IDLE_GOV_LADDER is not set
> CONFIG_CPU_IDLE_GOV_MENU=y
> # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
>
> #
> # Bus options (PCI etc.)
> #
> CONFIG_PCI=y
> CONFIG_PCI_GOBIOS=y
> # CONFIG_PCI_GOMMCONFIG is not set
> # CONFIG_PCI_GODIRECT is not set
> # CONFIG_PCI_GOANY is not set
> CONFIG_PCI_BIOS=y
> CONFIG_PCI_DOMAINS=y
> CONFIG_PCI_CNB20LE_QUIRK=y
> CONFIG_PCIEPORTBUS=y
> CONFIG_PCIEAER=y
> CONFIG_PCIE_ECRC=y
> # CONFIG_PCIEAER_INJECT is not set
> CONFIG_PCIEASPM=y
> # CONFIG_PCIEASPM_DEBUG is not set
> # CONFIG_PCIEASPM_DEFAULT is not set
> CONFIG_PCIEASPM_POWERSAVE=y
> # CONFIG_PCIEASPM_PERFORMANCE is not set
> CONFIG_PCIE_PME=y
> # CONFIG_PCIE_DPC is not set
> # CONFIG_PCIE_PTM is not set
> CONFIG_PCI_BUS_ADDR_T_64BIT=y
> CONFIG_PCI_MSI=y
> CONFIG_PCI_MSI_IRQ_DOMAIN=y
> # CONFIG_PCI_DEBUG is not set
> CONFIG_PCI_REALLOC_ENABLE_AUTO=y
> # CONFIG_PCI_STUB is not set
> # CONFIG_HT_IRQ is not set
> CONFIG_PCI_ATS=y
> CONFIG_PCI_IOV=y
> # CONFIG_PCI_PRI is not set
> # CONFIG_PCI_PASID is not set
> CONFIG_PCI_LABEL=y
> # CONFIG_HOTPLUG_PCI is not set
>
> #
> # PCI host controller drivers
> #
> # CONFIG_PCIE_DW_PLAT is not set
> CONFIG_ISA_BUS=y
> CONFIG_ISA_DMA_API=y
> # CONFIG_ISA is not set
> CONFIG_SCx200=m
> CONFIG_SCx200HR_TIMER=m
> # CONFIG_ALIX is not set
> # CONFIG_NET5501 is not set
> # CONFIG_GEOS is not set
> CONFIG_AMD_NB=y
> CONFIG_PCCARD=y
> CONFIG_PCMCIA=m
> # CONFIG_PCMCIA_LOAD_CIS is not set
> # CONFIG_CARDBUS is not set
>
> #
> # PC-card bridges
> #
> CONFIG_YENTA=y
> CONFIG_YENTA_O2=y
> CONFIG_YENTA_RICOH=y
> CONFIG_YENTA_TI=y
> # CONFIG_YENTA_TOSHIBA is not set
> # CONFIG_PD6729 is not set
> # CONFIG_I82092 is not set
> CONFIG_PCCARD_NONSTATIC=y
> CONFIG_RAPIDIO=m
> CONFIG_RAPIDIO_TSI721=m
> CONFIG_RAPIDIO_DISC_TIMEOUT=30
> # CONFIG_RAPIDIO_ENABLE_RX_TX_PORTS is not set
> CONFIG_RAPIDIO_DMA_ENGINE=y
> CONFIG_RAPIDIO_DEBUG=y
> # CONFIG_RAPIDIO_ENUM_BASIC is not set
> # CONFIG_RAPIDIO_CHMAN is not set
> CONFIG_RAPIDIO_MPORT_CDEV=m
>
> #
> # RapidIO Switch drivers
> #
> # CONFIG_RAPIDIO_TSI57X is not set
> CONFIG_RAPIDIO_CPS_XX=m
> CONFIG_RAPIDIO_TSI568=m
> CONFIG_RAPIDIO_CPS_GEN2=m
> CONFIG_RAPIDIO_RXS_GEN3=m
> CONFIG_X86_SYSFB=y
>
> #
> # Executable file formats / Emulations
> #
> CONFIG_BINFMT_ELF=y
> CONFIG_ELFCORE=y
> CONFIG_BINFMT_SCRIPT=y
> CONFIG_HAVE_AOUT=y
> # CONFIG_BINFMT_AOUT is not set
> CONFIG_BINFMT_MISC=y
> # CONFIG_COREDUMP is not set
> CONFIG_HAVE_ATOMIC_IOMAP=y
> CONFIG_PMC_ATOM=y
> CONFIG_NET=y
>
> #
> # Networking options
> #
> # CONFIG_PACKET is not set
> CONFIG_UNIX=y
> # CONFIG_UNIX_DIAG is not set
> # CONFIG_NET_KEY is not set
> # CONFIG_INET is not set
> # CONFIG_NETWORK_SECMARK is not set
> # CONFIG_NET_PTP_CLASSIFY is not set
> # CONFIG_NETWORK_PHY_TIMESTAMPING is not set
> # CONFIG_NETFILTER is not set
> # CONFIG_ATM is not set
> # CONFIG_BRIDGE is not set
> # CONFIG_VLAN_8021Q is not set
> # CONFIG_DECNET is not set
> # CONFIG_LLC2 is not set
> # CONFIG_IPX is not set
> # CONFIG_ATALK is not set
> # CONFIG_X25 is not set
> # CONFIG_LAPB is not set
> # CONFIG_PHONET is not set
> # CONFIG_IEEE802154 is not set
> # CONFIG_NET_SCHED is not set
> # CONFIG_DCB is not set
> # CONFIG_DNS_RESOLVER is not set
> # CONFIG_BATMAN_ADV is not set
> # CONFIG_VSOCKETS is not set
> # CONFIG_NETLINK_DIAG is not set
> # CONFIG_MPLS is not set
> # CONFIG_HSR is not set
> # CONFIG_SOCK_CGROUP_DATA is not set
> # CONFIG_CGROUP_NET_PRIO is not set
> # CONFIG_CGROUP_NET_CLASSID is not set
> CONFIG_NET_RX_BUSY_POLL=y
> CONFIG_BQL=y
>
> #
> # Network testing
> #
> # CONFIG_HAMRADIO is not set
> # CONFIG_CAN is not set
> # CONFIG_IRDA is not set
> # CONFIG_BT is not set
> # CONFIG_STREAM_PARSER is not set
> CONFIG_WIRELESS=y
> # CONFIG_CFG80211 is not set
> # CONFIG_LIB80211 is not set
>
> #
> # CFG80211 needs to be enabled for MAC80211
> #
> CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
> # CONFIG_WIMAX is not set
> # CONFIG_RFKILL is not set
> # CONFIG_RFKILL_REGULATOR is not set
> # CONFIG_NET_9P is not set
> # CONFIG_CAIF is not set
> # CONFIG_NFC is not set
> # CONFIG_LWTUNNEL is not set
> # CONFIG_DST_CACHE is not set
> # CONFIG_NET_DEVLINK is not set
> CONFIG_MAY_USE_DEVLINK=y
>
> #
> # Device Drivers
> #
>
> #
> # Generic Driver Options
> #
> # CONFIG_UEVENT_HELPER is not set
> CONFIG_DEVTMPFS=y
> CONFIG_DEVTMPFS_MOUNT=y
> CONFIG_STANDALONE=y
> CONFIG_PREVENT_FIRMWARE_BUILD=y
> CONFIG_FW_LOADER=y
> # CONFIG_FIRMWARE_IN_KERNEL is not set
> CONFIG_EXTRA_FIRMWARE=""
> CONFIG_FW_LOADER_USER_HELPER=y
> # CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
> # CONFIG_ALLOW_DEV_COREDUMP is not set
> # CONFIG_DEBUG_DRIVER is not set
> # CONFIG_DEBUG_DEVRES is not set
> CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
> # CONFIG_SYS_HYPERVISOR is not set
> # CONFIG_GENERIC_CPU_DEVICES is not set
> CONFIG_GENERIC_CPU_AUTOPROBE=y
> CONFIG_REGMAP=y
> CONFIG_REGMAP_I2C=y
> CONFIG_REGMAP_MMIO=y
> CONFIG_REGMAP_IRQ=y
> CONFIG_DMA_SHARED_BUFFER=y
> # CONFIG_FENCE_TRACE is not set
>
> #
> # Bus devices
> #
> # CONFIG_CONNECTOR is not set
> CONFIG_MTD=m
> # CONFIG_MTD_TESTS is not set
> CONFIG_MTD_REDBOOT_PARTS=m
> CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
> CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED=y
> CONFIG_MTD_REDBOOT_PARTS_READONLY=y
> CONFIG_MTD_CMDLINE_PARTS=m
> CONFIG_MTD_OF_PARTS=m
> CONFIG_MTD_AR7_PARTS=m
>
> #
> # User Modules And Translation Layers
> #
> # CONFIG_MTD_OOPS is not set
> # CONFIG_MTD_PARTITIONED_MASTER is not set
>
> #
> # RAM/ROM/Flash chip drivers
> #
> CONFIG_MTD_CFI=m
> CONFIG_MTD_JEDECPROBE=m
> CONFIG_MTD_GEN_PROBE=m
> CONFIG_MTD_CFI_ADV_OPTIONS=y
> CONFIG_MTD_CFI_NOSWAP=y
> # CONFIG_MTD_CFI_BE_BYTE_SWAP is not set
> # CONFIG_MTD_CFI_LE_BYTE_SWAP is not set
> CONFIG_MTD_CFI_GEOMETRY=y
> # CONFIG_MTD_MAP_BANK_WIDTH_1 is not set
> CONFIG_MTD_MAP_BANK_WIDTH_2=y
> # CONFIG_MTD_MAP_BANK_WIDTH_4 is not set
> CONFIG_MTD_MAP_BANK_WIDTH_8=y
> CONFIG_MTD_MAP_BANK_WIDTH_16=y
> CONFIG_MTD_MAP_BANK_WIDTH_32=y
> CONFIG_MTD_CFI_I1=y
> CONFIG_MTD_CFI_I2=y
> # CONFIG_MTD_CFI_I4 is not set
> CONFIG_MTD_CFI_I8=y
> CONFIG_MTD_OTP=y
> # CONFIG_MTD_CFI_INTELEXT is not set
> CONFIG_MTD_CFI_AMDSTD=m
> # CONFIG_MTD_CFI_STAA is not set
> CONFIG_MTD_CFI_UTIL=m
> CONFIG_MTD_RAM=m
> # CONFIG_MTD_ROM is not set
> CONFIG_MTD_ABSENT=m
>
> #
> # Mapping drivers for chip access
> #
> CONFIG_MTD_COMPLEX_MAPPINGS=y
> CONFIG_MTD_PHYSMAP=m
> CONFIG_MTD_PHYSMAP_COMPAT=y
> CONFIG_MTD_PHYSMAP_START=0x8000000
> CONFIG_MTD_PHYSMAP_LEN=0
> CONFIG_MTD_PHYSMAP_BANKWIDTH=2
> # CONFIG_MTD_PHYSMAP_OF is not set
> CONFIG_MTD_SCx200_DOCFLASH=m
> # CONFIG_MTD_AMD76XROM is not set
> # CONFIG_MTD_ICHXROM is not set
> # CONFIG_MTD_ESB2ROM is not set
> CONFIG_MTD_CK804XROM=m
> # CONFIG_MTD_SCB2_FLASH is not set
> CONFIG_MTD_NETtel=m
> CONFIG_MTD_L440GX=m
> # CONFIG_MTD_PCI is not set
> # CONFIG_MTD_PCMCIA is not set
> CONFIG_MTD_GPIO_ADDR=m
> CONFIG_MTD_INTEL_VR_NOR=m
> CONFIG_MTD_PLATRAM=m
> CONFIG_MTD_LATCH_ADDR=m
>
> #
> # Self-contained MTD device drivers
> #
> CONFIG_MTD_PMC551=m
> # CONFIG_MTD_PMC551_BUGFIX is not set
> CONFIG_MTD_PMC551_DEBUG=y
> # CONFIG_MTD_SLRAM is not set
> CONFIG_MTD_PHRAM=m
> # CONFIG_MTD_MTDRAM is not set
>
> #
> # Disk-On-Chip Device Drivers
> #
> CONFIG_MTD_DOCG3=m
> CONFIG_BCH_CONST_M=14
> CONFIG_BCH_CONST_T=4
> CONFIG_MTD_NAND_ECC=m
> # CONFIG_MTD_NAND_ECC_SMC is not set
> CONFIG_MTD_NAND=m
> CONFIG_MTD_NAND_BCH=m
> CONFIG_MTD_NAND_ECC_BCH=y
> CONFIG_MTD_SM_COMMON=m
> CONFIG_MTD_NAND_DENALI=m
> CONFIG_MTD_NAND_DENALI_PCI=m
> CONFIG_MTD_NAND_DENALI_DT=m
> CONFIG_MTD_NAND_DENALI_SCRATCH_REG_ADDR=0xFF108018
> CONFIG_MTD_NAND_GPIO=m
> # CONFIG_MTD_NAND_OMAP_BCH_BUILD is not set
> CONFIG_MTD_NAND_IDS=m
> CONFIG_MTD_NAND_RICOH=m
> # CONFIG_MTD_NAND_DISKONCHIP is not set
> CONFIG_MTD_NAND_DOCG4=m
> CONFIG_MTD_NAND_CAFE=m
> CONFIG_MTD_NAND_CS553X=m
> # CONFIG_MTD_NAND_NANDSIM is not set
> # CONFIG_MTD_NAND_PLATFORM is not set
> CONFIG_MTD_NAND_HISI504=m
> CONFIG_MTD_NAND_MTK=m
> # CONFIG_MTD_ONENAND is not set
>
> #
> # LPDDR & LPDDR2 PCM memory drivers
> #
> CONFIG_MTD_LPDDR=m
> CONFIG_MTD_QINFO_PROBE=m
> # CONFIG_MTD_SPI_NOR is not set
> CONFIG_MTD_UBI=m
> CONFIG_MTD_UBI_WL_THRESHOLD=4096
> CONFIG_MTD_UBI_BEB_LIMIT=20
> # CONFIG_MTD_UBI_FASTMAP is not set
> # CONFIG_MTD_UBI_GLUEBI is not set
> CONFIG_DTC=y
> CONFIG_OF=y
> CONFIG_OF_UNITTEST=y
> CONFIG_OF_FLATTREE=y
> CONFIG_OF_EARLY_FLATTREE=y
> CONFIG_OF_DYNAMIC=y
> CONFIG_OF_ADDRESS=y
> CONFIG_OF_ADDRESS_PCI=y
> CONFIG_OF_IRQ=y
> CONFIG_OF_PCI=y
> CONFIG_OF_PCI_IRQ=y
> CONFIG_OF_RESOLVE=y
> CONFIG_OF_OVERLAY=y
> CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
> CONFIG_PARPORT=y
> CONFIG_PARPORT_PC=m
> # CONFIG_PARPORT_SERIAL is not set
> # CONFIG_PARPORT_PC_FIFO is not set
> CONFIG_PARPORT_PC_SUPERIO=y
> # CONFIG_PARPORT_PC_PCMCIA is not set
> # CONFIG_PARPORT_GSC is not set
> CONFIG_PARPORT_AX88796=y
> # CONFIG_PARPORT_1284 is not set
> CONFIG_PARPORT_NOT_PC=y
> CONFIG_PNP=y
> CONFIG_PNP_DEBUG_MESSAGES=y
>
> #
> # Protocols
> #
> CONFIG_PNPACPI=y
>
> #
> # Misc devices
> #
> CONFIG_SENSORS_LIS3LV02D=m
> CONFIG_AD525X_DPOT=m
> # CONFIG_AD525X_DPOT_I2C is not set
> CONFIG_DUMMY_IRQ=y
> # CONFIG_IBM_ASM is not set
> CONFIG_PHANTOM=y
> # CONFIG_SGI_IOC4 is not set
> CONFIG_TIFM_CORE=y
> CONFIG_TIFM_7XX1=y
> CONFIG_ICS932S401=m
> CONFIG_ENCLOSURE_SERVICES=y
> # CONFIG_CS5535_MFGPT is not set
> CONFIG_HP_ILO=m
> # CONFIG_APDS9802ALS is not set
> CONFIG_ISL29003=y
> CONFIG_ISL29020=y
> # CONFIG_SENSORS_TSL2550 is not set
> # CONFIG_SENSORS_BH1770 is not set
> CONFIG_SENSORS_APDS990X=y
> CONFIG_HMC6352=m
> # CONFIG_DS1682 is not set
> # CONFIG_VMWARE_BALLOON is not set
> # CONFIG_PCH_PHUB is not set
> CONFIG_USB_SWITCH_FSA9480=y
> # CONFIG_SRAM is not set
> CONFIG_PANEL=y
> CONFIG_PANEL_PARPORT=0
> CONFIG_PANEL_PROFILE=5
> # CONFIG_PANEL_CHANGE_MESSAGE is not set
> # CONFIG_C2PORT is not set
>
> #
> # EEPROM support
> #
> CONFIG_EEPROM_AT24=m
> CONFIG_EEPROM_LEGACY=m
> CONFIG_EEPROM_MAX6875=m
> CONFIG_EEPROM_93CX6=m
> # CONFIG_CB710_CORE is not set
>
> #
> # Texas Instruments shared transport line discipline
> #
> # CONFIG_TI_ST is not set
> CONFIG_SENSORS_LIS3_I2C=m
>
> #
> # Altera FPGA firmware download module
> #
> CONFIG_ALTERA_STAPL=m
> CONFIG_INTEL_MEI=m
> # CONFIG_INTEL_MEI_ME is not set
> CONFIG_INTEL_MEI_TXE=m
> CONFIG_VMWARE_VMCI=y
>
> #
> # Intel MIC Bus Driver
> #
>
> #
> # SCIF Bus Driver
> #
>
> #
> # VOP Bus Driver
> #
>
> #
> # Intel MIC Host Driver
> #
>
> #
> # Intel MIC Card Driver
> #
>
> #
> # SCIF Driver
> #
>
> #
> # Intel MIC Coprocessor State Management (COSM) Drivers
> #
>
> #
> # VOP Driver
> #
> CONFIG_ECHO=y
> # CONFIG_CXL_BASE is not set
> # CONFIG_CXL_AFU_DRIVER_OPS is not set
> CONFIG_HAVE_IDE=y
>
> #
> # SCSI device support
> #
> CONFIG_SCSI_MOD=y
> # CONFIG_SCSI_DMA is not set
> # CONFIG_SCSI_NETLINK is not set
> # CONFIG_FUSION is not set
>
> #
> # IEEE 1394 (FireWire) support
> #
> CONFIG_FIREWIRE=y
> CONFIG_FIREWIRE_OHCI=y
> CONFIG_FIREWIRE_NOSY=m
> CONFIG_MACINTOSH_DRIVERS=y
> # CONFIG_MAC_EMUMOUSEBTN is not set
> # CONFIG_NETDEVICES is not set
>
> #
> # Input device support
> #
> CONFIG_INPUT=y
> CONFIG_INPUT_LEDS=m
> CONFIG_INPUT_FF_MEMLESS=m
> CONFIG_INPUT_POLLDEV=m
> CONFIG_INPUT_SPARSEKMAP=m
> CONFIG_INPUT_MATRIXKMAP=m
>
> #
> # Userland interfaces
> #
> CONFIG_INPUT_MOUSEDEV=m
> # CONFIG_INPUT_MOUSEDEV_PSAUX is not set
> CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
> CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
> CONFIG_INPUT_JOYDEV=m
> # CONFIG_INPUT_EVDEV is not set
> CONFIG_INPUT_EVBUG=m
>
> #
> # Input Device Drivers
> #
> CONFIG_INPUT_KEYBOARD=y
> CONFIG_KEYBOARD_ADC=m
> # CONFIG_KEYBOARD_ADP5520 is not set
> CONFIG_KEYBOARD_ADP5588=m
> CONFIG_KEYBOARD_ADP5589=m
> CONFIG_KEYBOARD_ATKBD=y
> CONFIG_KEYBOARD_QT1070=m
> CONFIG_KEYBOARD_QT2160=m
> CONFIG_KEYBOARD_LKKBD=m
> CONFIG_KEYBOARD_GPIO=m
> CONFIG_KEYBOARD_GPIO_POLLED=m
> CONFIG_KEYBOARD_TCA6416=m
> # CONFIG_KEYBOARD_TCA8418 is not set
> CONFIG_KEYBOARD_MATRIX=m
> CONFIG_KEYBOARD_LM8323=m
> CONFIG_KEYBOARD_LM8333=m
> CONFIG_KEYBOARD_MAX7359=m
> CONFIG_KEYBOARD_MCS=m
> CONFIG_KEYBOARD_MPR121=m
> CONFIG_KEYBOARD_NEWTON=m
> CONFIG_KEYBOARD_OPENCORES=m
> CONFIG_KEYBOARD_SAMSUNG=m
> # CONFIG_KEYBOARD_STOWAWAY is not set
> CONFIG_KEYBOARD_SUNKBD=m
> CONFIG_KEYBOARD_STMPE=m
> CONFIG_KEYBOARD_OMAP4=m
> CONFIG_KEYBOARD_XTKBD=m
> CONFIG_KEYBOARD_CROS_EC=m
> CONFIG_KEYBOARD_CAP11XX=m
> CONFIG_KEYBOARD_BCM=m
> CONFIG_INPUT_MOUSE=y
> CONFIG_MOUSE_PS2=m
> # CONFIG_MOUSE_PS2_ALPS is not set
> CONFIG_MOUSE_PS2_BYD=y
> # CONFIG_MOUSE_PS2_LOGIPS2PP is not set
> CONFIG_MOUSE_PS2_SYNAPTICS=y
> # CONFIG_MOUSE_PS2_CYPRESS is not set
> # CONFIG_MOUSE_PS2_LIFEBOOK is not set
> CONFIG_MOUSE_PS2_TRACKPOINT=y
> # CONFIG_MOUSE_PS2_ELANTECH is not set
> CONFIG_MOUSE_PS2_SENTELIC=y
> CONFIG_MOUSE_PS2_TOUCHKIT=y
> CONFIG_MOUSE_PS2_FOCALTECH=y
> CONFIG_MOUSE_PS2_VMMOUSE=y
> CONFIG_MOUSE_SERIAL=m
> # CONFIG_MOUSE_APPLETOUCH is not set
> CONFIG_MOUSE_BCM5974=m
> CONFIG_MOUSE_CYAPA=m
> CONFIG_MOUSE_ELAN_I2C=m
> CONFIG_MOUSE_ELAN_I2C_I2C=y
> CONFIG_MOUSE_ELAN_I2C_SMBUS=y
> # CONFIG_MOUSE_VSXXXAA is not set
> CONFIG_MOUSE_GPIO=m
> CONFIG_MOUSE_SYNAPTICS_I2C=m
> # CONFIG_MOUSE_SYNAPTICS_USB is not set
> # CONFIG_INPUT_JOYSTICK is not set
> # CONFIG_INPUT_TABLET is not set
> CONFIG_INPUT_TOUCHSCREEN=y
> CONFIG_TOUCHSCREEN_PROPERTIES=y
> CONFIG_TOUCHSCREEN_AD7879=m
> CONFIG_TOUCHSCREEN_AD7879_I2C=m
> # CONFIG_TOUCHSCREEN_AR1021_I2C is not set
> # CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
> CONFIG_TOUCHSCREEN_AUO_PIXCIR=m
> CONFIG_TOUCHSCREEN_BU21013=m
> CONFIG_TOUCHSCREEN_CHIPONE_ICN8318=m
> CONFIG_TOUCHSCREEN_CY8CTMG110=m
> CONFIG_TOUCHSCREEN_CYTTSP_CORE=m
> # CONFIG_TOUCHSCREEN_CYTTSP_I2C is not set
> CONFIG_TOUCHSCREEN_CYTTSP4_CORE=m
> # CONFIG_TOUCHSCREEN_CYTTSP4_I2C is not set
> # CONFIG_TOUCHSCREEN_DA9034 is not set
> CONFIG_TOUCHSCREEN_DA9052=m
> CONFIG_TOUCHSCREEN_DYNAPRO=m
> CONFIG_TOUCHSCREEN_HAMPSHIRE=m
> CONFIG_TOUCHSCREEN_EETI=m
> # CONFIG_TOUCHSCREEN_EGALAX is not set
> CONFIG_TOUCHSCREEN_EGALAX_SERIAL=m
> CONFIG_TOUCHSCREEN_FUJITSU=m
> CONFIG_TOUCHSCREEN_GOODIX=m
> CONFIG_TOUCHSCREEN_ILI210X=m
> # CONFIG_TOUCHSCREEN_GUNZE is not set
> CONFIG_TOUCHSCREEN_EKTF2127=m
> CONFIG_TOUCHSCREEN_ELAN=m
> # CONFIG_TOUCHSCREEN_ELO is not set
> CONFIG_TOUCHSCREEN_WACOM_W8001=m
> # CONFIG_TOUCHSCREEN_WACOM_I2C is not set
> CONFIG_TOUCHSCREEN_MAX11801=m
> # CONFIG_TOUCHSCREEN_MCS5000 is not set
> CONFIG_TOUCHSCREEN_MMS114=m
> CONFIG_TOUCHSCREEN_MELFAS_MIP4=m
> CONFIG_TOUCHSCREEN_MTOUCH=m
> CONFIG_TOUCHSCREEN_IMX6UL_TSC=m
> CONFIG_TOUCHSCREEN_INEXIO=m
> # CONFIG_TOUCHSCREEN_MK712 is not set
> CONFIG_TOUCHSCREEN_PENMOUNT=m
> CONFIG_TOUCHSCREEN_EDT_FT5X06=m
> CONFIG_TOUCHSCREEN_TOUCHRIGHT=m
> CONFIG_TOUCHSCREEN_TOUCHWIN=m
> CONFIG_TOUCHSCREEN_TI_AM335X_TSC=m
> CONFIG_TOUCHSCREEN_PIXCIR=m
> # CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set
> # CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
> CONFIG_TOUCHSCREEN_TOUCHIT213=m
> CONFIG_TOUCHSCREEN_TSC_SERIO=m
> CONFIG_TOUCHSCREEN_TSC200X_CORE=m
> CONFIG_TOUCHSCREEN_TSC2004=m
> CONFIG_TOUCHSCREEN_TSC2007=m
> CONFIG_TOUCHSCREEN_RM_TS=m
> CONFIG_TOUCHSCREEN_SILEAD=m
> CONFIG_TOUCHSCREEN_SIS_I2C=m
> CONFIG_TOUCHSCREEN_ST1232=m
> CONFIG_TOUCHSCREEN_STMPE=m
> CONFIG_TOUCHSCREEN_SUR40=m
> # CONFIG_TOUCHSCREEN_SX8654 is not set
> CONFIG_TOUCHSCREEN_TPS6507X=m
> # CONFIG_TOUCHSCREEN_ZFORCE is not set
> CONFIG_TOUCHSCREEN_COLIBRI_VF50=m
> CONFIG_TOUCHSCREEN_ROHM_BU21023=m
> # CONFIG_INPUT_MISC is not set
> CONFIG_RMI4_CORE=m
> CONFIG_RMI4_I2C=m
> # CONFIG_RMI4_F11 is not set
> # CONFIG_RMI4_F12 is not set
> # CONFIG_RMI4_F30 is not set
> # CONFIG_RMI4_F54 is not set
>
> #
> # Hardware I/O ports
> #
> CONFIG_SERIO=y
> CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
> CONFIG_SERIO_I8042=y
> CONFIG_SERIO_SERPORT=y
> CONFIG_SERIO_CT82C710=m
> # CONFIG_SERIO_PARKBD is not set
> # CONFIG_SERIO_PCIPS2 is not set
> CONFIG_SERIO_LIBPS2=y
> # CONFIG_SERIO_RAW is not set
> # CONFIG_SERIO_ALTERA_PS2 is not set
> CONFIG_SERIO_PS2MULT=y
> CONFIG_SERIO_ARC_PS2=m
> CONFIG_SERIO_APBPS2=m
> CONFIG_HYPERV_KEYBOARD=m
> # CONFIG_USERIO is not set
> CONFIG_GAMEPORT=m
> CONFIG_GAMEPORT_NS558=m
> CONFIG_GAMEPORT_L4=m
> CONFIG_GAMEPORT_EMU10K1=m
> CONFIG_GAMEPORT_FM801=m
>
> #
> # Character devices
> #
> CONFIG_TTY=y
> # CONFIG_VT is not set
> CONFIG_UNIX98_PTYS=y
> CONFIG_LEGACY_PTYS=y
> CONFIG_LEGACY_PTY_COUNT=256
> # CONFIG_SERIAL_NONSTANDARD is not set
> # CONFIG_NOZOMI is not set
> # CONFIG_N_GSM is not set
> # CONFIG_TRACE_SINK is not set
> # CONFIG_DEVMEM is not set
> CONFIG_DEVKMEM=y
>
> #
> # Serial drivers
> #
> CONFIG_SERIAL_EARLYCON=y
> CONFIG_SERIAL_8250=y
> CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
> CONFIG_SERIAL_8250_PNP=y
> # CONFIG_SERIAL_8250_FINTEK is not set
> CONFIG_SERIAL_8250_CONSOLE=y
> CONFIG_SERIAL_8250_DMA=y
> CONFIG_SERIAL_8250_PCI=y
> # CONFIG_SERIAL_8250_CS is not set
> CONFIG_SERIAL_8250_NR_UARTS=4
> CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> # CONFIG_SERIAL_8250_EXTENDED is not set
> # CONFIG_SERIAL_8250_FSL is not set
> # CONFIG_SERIAL_8250_DW is not set
> # CONFIG_SERIAL_8250_RT288X is not set
> CONFIG_SERIAL_8250_LPSS=y
> CONFIG_SERIAL_8250_MID=y
> # CONFIG_SERIAL_8250_MOXA is not set
> # CONFIG_SERIAL_OF_PLATFORM is not set
>
> #
> # Non-8250 serial port support
> #
> # CONFIG_SERIAL_UARTLITE is not set
> CONFIG_SERIAL_CORE=y
> CONFIG_SERIAL_CORE_CONSOLE=y
> # CONFIG_SERIAL_JSM is not set
> # CONFIG_SERIAL_SCCNXP is not set
> # CONFIG_SERIAL_SC16IS7XX is not set
> # CONFIG_SERIAL_TIMBERDALE is not set
> # CONFIG_SERIAL_ALTERA_JTAGUART is not set
> # CONFIG_SERIAL_ALTERA_UART is not set
> # CONFIG_SERIAL_PCH_UART is not set
> # CONFIG_SERIAL_XILINX_PS_UART is not set
> # CONFIG_SERIAL_ARC is not set
> # CONFIG_SERIAL_RP2 is not set
> # CONFIG_SERIAL_FSL_LPUART is not set
> # CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set
> # CONFIG_SERIAL_MEN_Z135 is not set
> # CONFIG_TTY_PRINTK is not set
> # CONFIG_PRINTER is not set
> CONFIG_PPDEV=m
> # CONFIG_VIRTIO_CONSOLE is not set
> # CONFIG_IPMI_HANDLER is not set
> CONFIG_HW_RANDOM=m
> # CONFIG_HW_RANDOM_TIMERIOMEM is not set
> # CONFIG_HW_RANDOM_INTEL is not set
> # CONFIG_HW_RANDOM_AMD is not set
> CONFIG_HW_RANDOM_GEODE=m
> CONFIG_HW_RANDOM_VIA=m
> CONFIG_HW_RANDOM_VIRTIO=m
> CONFIG_HW_RANDOM_TPM=m
> CONFIG_NVRAM=m
> # CONFIG_R3964 is not set
> CONFIG_APPLICOM=m
> # CONFIG_SONYPI is not set
>
> #
> # PCMCIA character devices
> #
> # CONFIG_SYNCLINK_CS is not set
> CONFIG_CARDMAN_4000=m
> # CONFIG_CARDMAN_4040 is not set
> # CONFIG_MWAVE is not set
> CONFIG_SCx200_GPIO=m
> CONFIG_PC8736x_GPIO=y
> CONFIG_NSC_GPIO=y
> # CONFIG_HPET is not set
> CONFIG_HANGCHECK_TIMER=y
> CONFIG_TCG_TPM=y
> # CONFIG_TCG_TIS is not set
> CONFIG_TCG_TIS_I2C_ATMEL=y
> # CONFIG_TCG_TIS_I2C_INFINEON is not set
> # CONFIG_TCG_TIS_I2C_NUVOTON is not set
> CONFIG_TCG_NSC=m
> CONFIG_TCG_ATMEL=m
> # CONFIG_TCG_INFINEON is not set
> CONFIG_TCG_CRB=y
> CONFIG_TCG_VTPM_PROXY=y
> # CONFIG_TCG_TIS_ST33ZP24_I2C is not set
> # CONFIG_TELCLOCK is not set
> CONFIG_DEVPORT=y
> # CONFIG_XILLYBUS is not set
>
> #
> # I2C support
> #
> CONFIG_I2C=y
> CONFIG_ACPI_I2C_OPREGION=y
> CONFIG_I2C_BOARDINFO=y
> # CONFIG_I2C_COMPAT is not set
> CONFIG_I2C_CHARDEV=m
> CONFIG_I2C_MUX=y
>
> #
> # Multiplexer I2C Chip support
> #
> CONFIG_I2C_ARB_GPIO_CHALLENGE=y
> CONFIG_I2C_MUX_GPIO=m
> CONFIG_I2C_MUX_PCA9541=m
> # CONFIG_I2C_MUX_PCA954x is not set
> # CONFIG_I2C_MUX_PINCTRL is not set
> CONFIG_I2C_MUX_REG=m
> # CONFIG_I2C_DEMUX_PINCTRL is not set
> CONFIG_I2C_HELPER_AUTO=y
> CONFIG_I2C_SMBUS=y
> CONFIG_I2C_ALGOBIT=y
> CONFIG_I2C_ALGOPCA=y
>
> #
> # I2C Hardware Bus support
> #
>
> #
> # PC SMBus host controller drivers
> #
> # CONFIG_I2C_ALI1535 is not set
> CONFIG_I2C_ALI1563=y
> CONFIG_I2C_ALI15X3=m
> # CONFIG_I2C_AMD756 is not set
> CONFIG_I2C_AMD8111=m
> CONFIG_I2C_I801=y
> CONFIG_I2C_ISCH=y
> CONFIG_I2C_ISMT=m
> # CONFIG_I2C_PIIX4 is not set
> # CONFIG_I2C_NFORCE2 is not set
> CONFIG_I2C_SIS5595=m
> CONFIG_I2C_SIS630=m
> CONFIG_I2C_SIS96X=m
> # CONFIG_I2C_VIA is not set
> CONFIG_I2C_VIAPRO=y
>
> #
> # ACPI drivers
> #
> CONFIG_I2C_SCMI=y
>
> #
> # I2C system bus drivers (mostly embedded / system-on-chip)
> #
> # CONFIG_I2C_CBUS_GPIO is not set
> CONFIG_I2C_DESIGNWARE_CORE=m
> CONFIG_I2C_DESIGNWARE_PLATFORM=m
> # CONFIG_I2C_DESIGNWARE_PCI is not set
> # CONFIG_I2C_DESIGNWARE_BAYTRAIL is not set
> # CONFIG_I2C_EG20T is not set
> CONFIG_I2C_EMEV2=m
> # CONFIG_I2C_GPIO is not set
> CONFIG_I2C_KEMPLD=m
> # CONFIG_I2C_OCORES is not set
> CONFIG_I2C_PCA_PLATFORM=y
> # CONFIG_I2C_PXA is not set
> # CONFIG_I2C_PXA_PCI is not set
> # CONFIG_I2C_RK3X is not set
> CONFIG_I2C_SIMTEC=y
> CONFIG_I2C_XILINX=y
>
> #
> # External I2C/SMBus adapter drivers
> #
> CONFIG_I2C_DIOLAN_U2C=y
> # CONFIG_I2C_DLN2 is not set
> # CONFIG_I2C_PARPORT is not set
> CONFIG_I2C_PARPORT_LIGHT=y
> CONFIG_I2C_ROBOTFUZZ_OSIF=m
> # CONFIG_I2C_TAOS_EVM is not set
> CONFIG_I2C_TINY_USB=m
> # CONFIG_I2C_VIPERBOARD is not set
>
> #
> # Other I2C/SMBus bus drivers
> #
> CONFIG_I2C_CROS_EC_TUNNEL=m
> CONFIG_SCx200_ACB=m
> # CONFIG_I2C_STUB is not set
> CONFIG_I2C_SLAVE=y
> # CONFIG_I2C_SLAVE_EEPROM is not set
> # CONFIG_I2C_DEBUG_CORE is not set
> # CONFIG_I2C_DEBUG_ALGO is not set
> # CONFIG_I2C_DEBUG_BUS is not set
> # CONFIG_SPI is not set
> # CONFIG_SPMI is not set
> CONFIG_HSI=y
> CONFIG_HSI_BOARDINFO=y
>
> #
> # HSI controllers
> #
>
> #
> # HSI clients
> #
> # CONFIG_HSI_CHAR is not set
>
> #
> # PPS support
> #
> CONFIG_PPS=m
> # CONFIG_PPS_DEBUG is not set
>
> #
> # PPS clients support
> #
> # CONFIG_PPS_CLIENT_KTIMER is not set
> # CONFIG_PPS_CLIENT_LDISC is not set
> CONFIG_PPS_CLIENT_PARPORT=m
> CONFIG_PPS_CLIENT_GPIO=m
>
> #
> # PPS generators support
> #
>
> #
> # PTP clock support
> #
> # CONFIG_PTP_1588_CLOCK is not set
>
> #
> # Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
> #
> # CONFIG_PTP_1588_CLOCK_PCH is not set
> CONFIG_PINCTRL=y
>
> #
> # Pin controllers
> #
> CONFIG_PINMUX=y
> CONFIG_PINCONF=y
> CONFIG_GENERIC_PINCONF=y
> CONFIG_DEBUG_PINCTRL=y
> # CONFIG_PINCTRL_AS3722 is not set
> CONFIG_PINCTRL_AMD=y
> # CONFIG_PINCTRL_SINGLE is not set
> # CONFIG_PINCTRL_PALMAS is not set
> # CONFIG_PINCTRL_BAYTRAIL is not set
> CONFIG_PINCTRL_CHERRYVIEW=y
> CONFIG_PINCTRL_INTEL=m
> CONFIG_PINCTRL_BROXTON=m
> # CONFIG_PINCTRL_SUNRISEPOINT is not set
> CONFIG_GPIOLIB=y
> CONFIG_OF_GPIO=y
> CONFIG_GPIO_ACPI=y
> CONFIG_GPIOLIB_IRQCHIP=y
> # CONFIG_DEBUG_GPIO is not set
> CONFIG_GPIO_SYSFS=y
> CONFIG_GPIO_GENERIC=y
> CONFIG_GPIO_MAX730X=y
>
> #
> # Memory mapped GPIO drivers
> #
> # CONFIG_GPIO_74XX_MMIO is not set
> CONFIG_GPIO_ALTERA=m
> # CONFIG_GPIO_AMDPT is not set
> # CONFIG_GPIO_AXP209 is not set
> CONFIG_GPIO_DWAPB=m
> # CONFIG_GPIO_GENERIC_PLATFORM is not set
> CONFIG_GPIO_GRGPIO=y
> CONFIG_GPIO_ICH=m
> # CONFIG_GPIO_LYNXPOINT is not set
> CONFIG_GPIO_MENZ127=m
> CONFIG_GPIO_MOCKUP=y
> # CONFIG_GPIO_SYSCON is not set
> # CONFIG_GPIO_VX855 is not set
> CONFIG_GPIO_XILINX=m
> # CONFIG_GPIO_ZX is not set
>
> #
> # Port-mapped I/O GPIO drivers
> #
> CONFIG_GPIO_104_DIO_48E=m
> CONFIG_GPIO_104_IDIO_16=y
> # CONFIG_GPIO_104_IDI_48 is not set
> CONFIG_GPIO_F7188X=m
> # CONFIG_GPIO_GPIO_MM is not set
> CONFIG_GPIO_IT87=m
> CONFIG_GPIO_SCH=m
> CONFIG_GPIO_SCH311X=m
> CONFIG_GPIO_WS16C48=m
>
> #
> # I2C GPIO expanders
> #
> CONFIG_GPIO_ADP5588=y
> # CONFIG_GPIO_ADP5588_IRQ is not set
> # CONFIG_GPIO_ADNP is not set
> CONFIG_GPIO_MAX7300=y
> # CONFIG_GPIO_MAX732X is not set
> # CONFIG_GPIO_PCA953X is not set
> CONFIG_GPIO_PCF857X=m
> # CONFIG_GPIO_SX150X is not set
> # CONFIG_GPIO_TPIC2810 is not set
> # CONFIG_GPIO_TS4900 is not set
>
> #
> # MFD GPIO expanders
> #
> CONFIG_GPIO_ADP5520=y
> CONFIG_GPIO_ARIZONA=y
> CONFIG_GPIO_CS5535=m
> CONFIG_GPIO_DA9052=y
> CONFIG_GPIO_DA9055=y
> # CONFIG_GPIO_DLN2 is not set
> CONFIG_GPIO_KEMPLD=m
> CONFIG_GPIO_LP3943=m
> CONFIG_GPIO_LP873X=m
> CONFIG_GPIO_PALMAS=y
> CONFIG_GPIO_RC5T583=y
> CONFIG_GPIO_STMPE=y
> CONFIG_GPIO_TPS65218=m
> # CONFIG_GPIO_TPS65910 is not set
> CONFIG_GPIO_TPS65912=m
> # CONFIG_GPIO_TWL6040 is not set
> CONFIG_GPIO_WM8350=y
> CONFIG_GPIO_WM8994=m
>
> #
> # PCI GPIO expanders
> #
> CONFIG_GPIO_AMD8111=y
> CONFIG_GPIO_BT8XX=m
> CONFIG_GPIO_ML_IOH=y
> CONFIG_GPIO_PCH=y
> CONFIG_GPIO_RDC321X=y
> # CONFIG_GPIO_SODAVILLE is not set
>
> #
> # SPI or I2C GPIO expanders
> #
> CONFIG_GPIO_MCP23S08=m
>
> #
> # USB GPIO expanders
> #
> CONFIG_GPIO_VIPERBOARD=m
> CONFIG_W1=y
>
> #
> # 1-wire Bus Masters
> #
> CONFIG_W1_MASTER_MATROX=m
> CONFIG_W1_MASTER_DS2490=y
> CONFIG_W1_MASTER_DS2482=m
> CONFIG_W1_MASTER_DS1WM=m
> CONFIG_W1_MASTER_GPIO=y
>
> #
> # 1-wire Slaves
> #
> CONFIG_W1_SLAVE_THERM=m
> CONFIG_W1_SLAVE_SMEM=y
> CONFIG_W1_SLAVE_DS2408=m
> # CONFIG_W1_SLAVE_DS2408_READBACK is not set
> # CONFIG_W1_SLAVE_DS2413 is not set
> CONFIG_W1_SLAVE_DS2406=m
> # CONFIG_W1_SLAVE_DS2423 is not set
> CONFIG_W1_SLAVE_DS2431=y
> CONFIG_W1_SLAVE_DS2433=y
> CONFIG_W1_SLAVE_DS2433_CRC=y
> # CONFIG_W1_SLAVE_DS2760 is not set
> # CONFIG_W1_SLAVE_DS2780 is not set
> CONFIG_W1_SLAVE_DS2781=m
> CONFIG_W1_SLAVE_DS28E04=m
> CONFIG_W1_SLAVE_BQ27000=y
> CONFIG_POWER_AVS=y
> CONFIG_POWER_RESET=y
> CONFIG_POWER_RESET_AS3722=y
> # CONFIG_POWER_RESET_GPIO is not set
> CONFIG_POWER_RESET_GPIO_RESTART=y
> # CONFIG_POWER_RESET_LTC2952 is not set
> CONFIG_POWER_RESET_RESTART=y
> CONFIG_POWER_RESET_SYSCON=y
> # CONFIG_POWER_RESET_SYSCON_POWEROFF is not set
> # CONFIG_SYSCON_REBOOT_MODE is not set
> CONFIG_POWER_SUPPLY=y
> # CONFIG_POWER_SUPPLY_DEBUG is not set
> CONFIG_PDA_POWER=y
> CONFIG_GENERIC_ADC_BATTERY=m
> CONFIG_MAX8925_POWER=y
> # CONFIG_WM8350_POWER is not set
> # CONFIG_TEST_POWER is not set
> # CONFIG_BATTERY_ACT8945A is not set
> # CONFIG_BATTERY_DS2780 is not set
> # CONFIG_BATTERY_DS2781 is not set
> # CONFIG_BATTERY_DS2782 is not set
> CONFIG_BATTERY_SBS=y
> # CONFIG_BATTERY_BQ27XXX is not set
> CONFIG_BATTERY_DA9030=y
> CONFIG_BATTERY_DA9052=m
> # CONFIG_AXP288_CHARGER is not set
> # CONFIG_AXP288_FUEL_GAUGE is not set
> CONFIG_BATTERY_MAX17040=y
> # CONFIG_BATTERY_MAX17042 is not set
> # CONFIG_CHARGER_ISP1704 is not set
> # CONFIG_CHARGER_MAX8903 is not set
> CONFIG_CHARGER_LP8727=m
> # CONFIG_CHARGER_GPIO is not set
> CONFIG_CHARGER_MANAGER=y
> CONFIG_CHARGER_MAX77693=y
> CONFIG_CHARGER_BQ2415X=y
> CONFIG_CHARGER_BQ24190=y
> # CONFIG_CHARGER_BQ24257 is not set
> # CONFIG_CHARGER_BQ24735 is not set
> CONFIG_CHARGER_BQ25890=y
> CONFIG_CHARGER_SMB347=m
> # CONFIG_CHARGER_TPS65090 is not set
> # CONFIG_CHARGER_TPS65217 is not set
> CONFIG_BATTERY_GAUGE_LTC2941=m
> CONFIG_CHARGER_RT9455=y
> CONFIG_AXP20X_POWER=m
> CONFIG_HWMON=y
> CONFIG_HWMON_VID=y
> CONFIG_HWMON_DEBUG_CHIP=y
>
> #
> # Native drivers
> #
> CONFIG_SENSORS_ABITUGURU=m
> CONFIG_SENSORS_ABITUGURU3=m
> CONFIG_SENSORS_AD7414=y
> # CONFIG_SENSORS_AD7418 is not set
> # CONFIG_SENSORS_ADM1021 is not set
> CONFIG_SENSORS_ADM1025=y
> CONFIG_SENSORS_ADM1026=m
> # CONFIG_SENSORS_ADM1029 is not set
> CONFIG_SENSORS_ADM1031=y
> # CONFIG_SENSORS_ADM9240 is not set
> # CONFIG_SENSORS_ADT7410 is not set
> # CONFIG_SENSORS_ADT7411 is not set
> CONFIG_SENSORS_ADT7462=m
> CONFIG_SENSORS_ADT7470=y
> CONFIG_SENSORS_ADT7475=y
> # CONFIG_SENSORS_ASC7621 is not set
> CONFIG_SENSORS_K8TEMP=y
> CONFIG_SENSORS_K10TEMP=y
> CONFIG_SENSORS_FAM15H_POWER=y
> # CONFIG_SENSORS_APPLESMC is not set
> # CONFIG_SENSORS_ASB100 is not set
> CONFIG_SENSORS_ATXP1=y
> # CONFIG_SENSORS_DS620 is not set
> CONFIG_SENSORS_DS1621=y
> CONFIG_SENSORS_DELL_SMM=y
> CONFIG_SENSORS_DA9052_ADC=m
> CONFIG_SENSORS_DA9055=y
> CONFIG_SENSORS_I5K_AMB=m
> CONFIG_SENSORS_F71805F=y
> # CONFIG_SENSORS_F71882FG is not set
> CONFIG_SENSORS_F75375S=m
> CONFIG_SENSORS_FSCHMD=y
> # CONFIG_SENSORS_GL518SM is not set
> # CONFIG_SENSORS_GL520SM is not set
> CONFIG_SENSORS_G760A=y
> # CONFIG_SENSORS_G762 is not set
> # CONFIG_SENSORS_GPIO_FAN is not set
> # CONFIG_SENSORS_HIH6130 is not set
> CONFIG_SENSORS_IIO_HWMON=m
> CONFIG_SENSORS_I5500=m
> # CONFIG_SENSORS_CORETEMP is not set
> CONFIG_SENSORS_IT87=y
> CONFIG_SENSORS_JC42=y
> CONFIG_SENSORS_POWR1220=y
> CONFIG_SENSORS_LINEAGE=y
> # CONFIG_SENSORS_LTC2945 is not set
> CONFIG_SENSORS_LTC2990=y
> CONFIG_SENSORS_LTC4151=y
> CONFIG_SENSORS_LTC4215=m
> CONFIG_SENSORS_LTC4222=m
> CONFIG_SENSORS_LTC4245=y
> CONFIG_SENSORS_LTC4260=m
> CONFIG_SENSORS_LTC4261=m
> CONFIG_SENSORS_MAX16065=y
> CONFIG_SENSORS_MAX1619=m
> CONFIG_SENSORS_MAX1668=m
> # CONFIG_SENSORS_MAX197 is not set
> CONFIG_SENSORS_MAX6639=y
> CONFIG_SENSORS_MAX6642=m
> CONFIG_SENSORS_MAX6650=y
> # CONFIG_SENSORS_MAX6697 is not set
> # CONFIG_SENSORS_MAX31790 is not set
> CONFIG_SENSORS_MCP3021=m
> # CONFIG_SENSORS_LM63 is not set
> CONFIG_SENSORS_LM73=m
> CONFIG_SENSORS_LM75=y
> # CONFIG_SENSORS_LM77 is not set
> CONFIG_SENSORS_LM78=m
> CONFIG_SENSORS_LM80=m
> # CONFIG_SENSORS_LM83 is not set
> # CONFIG_SENSORS_LM85 is not set
> # CONFIG_SENSORS_LM87 is not set
> CONFIG_SENSORS_LM90=y
> # CONFIG_SENSORS_LM92 is not set
> CONFIG_SENSORS_LM93=y
> CONFIG_SENSORS_LM95234=m
> # CONFIG_SENSORS_LM95241 is not set
> CONFIG_SENSORS_LM95245=m
> # CONFIG_SENSORS_PC87360 is not set
> CONFIG_SENSORS_PC87427=y
> CONFIG_SENSORS_NTC_THERMISTOR=m
> CONFIG_SENSORS_NCT6683=y
> CONFIG_SENSORS_NCT6775=m
> # CONFIG_SENSORS_NCT7802 is not set
> # CONFIG_SENSORS_NCT7904 is not set
> CONFIG_SENSORS_PCF8591=m
> CONFIG_PMBUS=y
> CONFIG_SENSORS_PMBUS=m
> CONFIG_SENSORS_ADM1275=y
> CONFIG_SENSORS_LM25066=y
> CONFIG_SENSORS_LTC2978=y
> # CONFIG_SENSORS_LTC2978_REGULATOR is not set
> CONFIG_SENSORS_LTC3815=y
> CONFIG_SENSORS_MAX16064=m
> # CONFIG_SENSORS_MAX20751 is not set
> CONFIG_SENSORS_MAX34440=m
> CONFIG_SENSORS_MAX8688=y
> CONFIG_SENSORS_TPS40422=y
> CONFIG_SENSORS_UCD9000=m
> # CONFIG_SENSORS_UCD9200 is not set
> CONFIG_SENSORS_ZL6100=y
> CONFIG_SENSORS_PWM_FAN=y
> CONFIG_SENSORS_SHT15=y
> # CONFIG_SENSORS_SHT21 is not set
> # CONFIG_SENSORS_SHT3x is not set
> # CONFIG_SENSORS_SHTC1 is not set
> CONFIG_SENSORS_SIS5595=m
> CONFIG_SENSORS_DME1737=y
> # CONFIG_SENSORS_EMC1403 is not set
> CONFIG_SENSORS_EMC2103=y
> # CONFIG_SENSORS_EMC6W201 is not set
> CONFIG_SENSORS_SMSC47M1=y
> CONFIG_SENSORS_SMSC47M192=y
> CONFIG_SENSORS_SMSC47B397=y
> # CONFIG_SENSORS_SCH56XX_COMMON is not set
> CONFIG_SENSORS_SMM665=y
> CONFIG_SENSORS_ADC128D818=y
> CONFIG_SENSORS_ADS1015=m
> # CONFIG_SENSORS_ADS7828 is not set
> CONFIG_SENSORS_AMC6821=y
> # CONFIG_SENSORS_INA209 is not set
> # CONFIG_SENSORS_INA2XX is not set
> CONFIG_SENSORS_INA3221=m
> CONFIG_SENSORS_TC74=m
> CONFIG_SENSORS_THMC50=y
> CONFIG_SENSORS_TMP102=m
> CONFIG_SENSORS_TMP103=y
> # CONFIG_SENSORS_TMP401 is not set
> # CONFIG_SENSORS_TMP421 is not set
> CONFIG_SENSORS_VIA_CPUTEMP=y
> CONFIG_SENSORS_VIA686A=m
> CONFIG_SENSORS_VT1211=m
> CONFIG_SENSORS_VT8231=m
> # CONFIG_SENSORS_W83781D is not set
> CONFIG_SENSORS_W83791D=m
> CONFIG_SENSORS_W83792D=m
> # CONFIG_SENSORS_W83793 is not set
> # CONFIG_SENSORS_W83795 is not set
> # CONFIG_SENSORS_W83L785TS is not set
> CONFIG_SENSORS_W83L786NG=m
> # CONFIG_SENSORS_W83627HF is not set
> CONFIG_SENSORS_W83627EHF=m
> CONFIG_SENSORS_WM8350=y
> CONFIG_SENSORS_XGENE=y
>
> #
> # ACPI drivers
> #
> CONFIG_SENSORS_ACPI_POWER=m
> # CONFIG_SENSORS_ATK0110 is not set
> CONFIG_THERMAL=y
> CONFIG_THERMAL_HWMON=y
> # CONFIG_THERMAL_OF is not set
> CONFIG_THERMAL_WRITABLE_TRIPS=y
> # CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE is not set
> CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE=y
> # CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
> # CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
> CONFIG_THERMAL_GOV_FAIR_SHARE=y
> # CONFIG_THERMAL_GOV_STEP_WISE is not set
> # CONFIG_THERMAL_GOV_BANG_BANG is not set
> CONFIG_THERMAL_GOV_USER_SPACE=y
> # CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
> # CONFIG_THERMAL_EMULATION is not set
> CONFIG_INTEL_SOC_DTS_IOSF_CORE=m
> CONFIG_INTEL_SOC_DTS_THERMAL=m
>
> #
> # ACPI INT340X thermal drivers
> #
> CONFIG_INT340X_THERMAL=m
> CONFIG_ACPI_THERMAL_REL=m
> CONFIG_INT3406_THERMAL=m
> CONFIG_INTEL_PCH_THERMAL=m
> CONFIG_GENERIC_ADC_THERMAL=m
> # CONFIG_WATCHDOG is not set
> CONFIG_SSB_POSSIBLE=y
>
> #
> # Sonics Silicon Backplane
> #
> CONFIG_SSB=y
> CONFIG_SSB_SPROM=y
> CONFIG_SSB_PCIHOST_POSSIBLE=y
> CONFIG_SSB_PCIHOST=y
> # CONFIG_SSB_B43_PCI_BRIDGE is not set
> # CONFIG_SSB_SILENT is not set
> # CONFIG_SSB_DEBUG is not set
> CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
> # CONFIG_SSB_DRIVER_PCICORE is not set
> # CONFIG_SSB_DRIVER_GPIO is not set
> CONFIG_BCMA_POSSIBLE=y
>
> #
> # Broadcom specific AMBA
> #
> CONFIG_BCMA=m
> CONFIG_BCMA_HOST_PCI_POSSIBLE=y
> CONFIG_BCMA_HOST_PCI=y
> # CONFIG_BCMA_HOST_SOC is not set
> CONFIG_BCMA_DRIVER_PCI=y
> CONFIG_BCMA_DRIVER_GMAC_CMN=y
> CONFIG_BCMA_DRIVER_GPIO=y
> # CONFIG_BCMA_DEBUG is not set
>
> #
> # Multifunction device drivers
> #
> CONFIG_MFD_CORE=y
> CONFIG_MFD_CS5535=m
> CONFIG_MFD_ACT8945A=m
> CONFIG_MFD_AS3711=y
> CONFIG_MFD_AS3722=y
> CONFIG_PMIC_ADP5520=y
> # CONFIG_MFD_AAT2870_CORE is not set
> # CONFIG_MFD_ATMEL_FLEXCOM is not set
> CONFIG_MFD_ATMEL_HLCDC=m
> CONFIG_MFD_BCM590XX=y
> CONFIG_MFD_AXP20X=y
> CONFIG_MFD_AXP20X_I2C=y
> CONFIG_MFD_CROS_EC=m
> # CONFIG_MFD_CROS_EC_I2C is not set
> CONFIG_PMIC_DA903X=y
> CONFIG_PMIC_DA9052=y
> CONFIG_MFD_DA9052_I2C=y
> CONFIG_MFD_DA9055=y
> # CONFIG_MFD_DA9062 is not set
> # CONFIG_MFD_DA9063 is not set
> # CONFIG_MFD_DA9150 is not set
> CONFIG_MFD_DLN2=m
> # CONFIG_MFD_EXYNOS_LPASS is not set
> # CONFIG_MFD_MC13XXX_I2C is not set
> CONFIG_MFD_HI6421_PMIC=m
> # CONFIG_HTC_PASIC3 is not set
> CONFIG_HTC_I2CPLD=y
> CONFIG_MFD_INTEL_QUARK_I2C_GPIO=y
> CONFIG_LPC_ICH=y
> CONFIG_LPC_SCH=y
> # CONFIG_INTEL_SOC_PMIC is not set
> CONFIG_MFD_INTEL_LPSS=y
> CONFIG_MFD_INTEL_LPSS_ACPI=y
> # CONFIG_MFD_INTEL_LPSS_PCI is not set
> # CONFIG_MFD_JANZ_CMODIO is not set
> CONFIG_MFD_KEMPLD=m
> CONFIG_MFD_88PM800=m
> CONFIG_MFD_88PM805=y
> # CONFIG_MFD_88PM860X is not set
> # CONFIG_MFD_MAX14577 is not set
> # CONFIG_MFD_MAX77620 is not set
> CONFIG_MFD_MAX77686=m
> CONFIG_MFD_MAX77693=y
> CONFIG_MFD_MAX77843=y
> CONFIG_MFD_MAX8907=m
> CONFIG_MFD_MAX8925=y
> # CONFIG_MFD_MAX8997 is not set
> CONFIG_MFD_MAX8998=y
> CONFIG_MFD_MT6397=y
> # CONFIG_MFD_MENF21BMC is not set
> CONFIG_MFD_VIPERBOARD=m
> # CONFIG_MFD_RETU is not set
> # CONFIG_MFD_PCF50633 is not set
> CONFIG_MFD_RDC321X=y
> CONFIG_MFD_RTSX_PCI=m
> # CONFIG_MFD_RT5033 is not set
> # CONFIG_MFD_RTSX_USB is not set
> CONFIG_MFD_RC5T583=y
> CONFIG_MFD_RK808=y
> # CONFIG_MFD_RN5T618 is not set
> # CONFIG_MFD_SEC_CORE is not set
> CONFIG_MFD_SI476X_CORE=y
> CONFIG_MFD_SM501=m
> # CONFIG_MFD_SM501_GPIO is not set
> CONFIG_MFD_SKY81452=m
> # CONFIG_MFD_SMSC is not set
> CONFIG_ABX500_CORE=y
> CONFIG_AB3100_CORE=y
> CONFIG_AB3100_OTP=m
> CONFIG_MFD_STMPE=y
>
> #
> # STMicroelectronics STMPE Interface Drivers
> #
> CONFIG_STMPE_I2C=y
> CONFIG_MFD_SYSCON=y
> CONFIG_MFD_TI_AM335X_TSCADC=y
> CONFIG_MFD_LP3943=m
> # CONFIG_MFD_LP8788 is not set
> CONFIG_MFD_PALMAS=y
> CONFIG_TPS6105X=y
> # CONFIG_TPS65010 is not set
> # CONFIG_TPS6507X is not set
> # CONFIG_MFD_TPS65086 is not set
> CONFIG_MFD_TPS65090=y
> CONFIG_MFD_TPS65217=m
> CONFIG_MFD_TI_LP873X=m
> CONFIG_MFD_TPS65218=y
> # CONFIG_MFD_TPS6586X is not set
> CONFIG_MFD_TPS65910=y
> CONFIG_MFD_TPS65912=m
> CONFIG_MFD_TPS65912_I2C=m
> # CONFIG_MFD_TPS80031 is not set
> # CONFIG_TWL4030_CORE is not set
> CONFIG_TWL6040_CORE=y
> CONFIG_MFD_WL1273_CORE=m
> CONFIG_MFD_LM3533=m
> # CONFIG_MFD_TIMBERDALE is not set
> # CONFIG_MFD_TC3589X is not set
> # CONFIG_MFD_TMIO is not set
> CONFIG_MFD_VX855=y
> CONFIG_MFD_ARIZONA=y
> CONFIG_MFD_ARIZONA_I2C=y
> CONFIG_MFD_CS47L24=y
> # CONFIG_MFD_WM5102 is not set
> # CONFIG_MFD_WM5110 is not set
> # CONFIG_MFD_WM8997 is not set
> CONFIG_MFD_WM8998=y
> # CONFIG_MFD_WM8400 is not set
> # CONFIG_MFD_WM831X_I2C is not set
> CONFIG_MFD_WM8350=y
> CONFIG_MFD_WM8350_I2C=y
> CONFIG_MFD_WM8994=y
> CONFIG_REGULATOR=y
> # CONFIG_REGULATOR_DEBUG is not set
> CONFIG_REGULATOR_FIXED_VOLTAGE=y
> # CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set
> # CONFIG_REGULATOR_USERSPACE_CONSUMER is not set
> CONFIG_REGULATOR_88PM800=m
> # CONFIG_REGULATOR_ACT8865 is not set
> # CONFIG_REGULATOR_ACT8945A is not set
> # CONFIG_REGULATOR_AD5398 is not set
> CONFIG_REGULATOR_ANATOP=y
> CONFIG_REGULATOR_AB3100=y
> CONFIG_REGULATOR_AS3711=m
> # CONFIG_REGULATOR_AS3722 is not set
> # CONFIG_REGULATOR_AXP20X is not set
> # CONFIG_REGULATOR_BCM590XX is not set
> CONFIG_REGULATOR_DA903X=y
> # CONFIG_REGULATOR_DA9052 is not set
> CONFIG_REGULATOR_DA9055=m
> CONFIG_REGULATOR_DA9210=m
> # CONFIG_REGULATOR_DA9211 is not set
> CONFIG_REGULATOR_FAN53555=m
> CONFIG_REGULATOR_GPIO=m
> CONFIG_REGULATOR_HI6421=m
> # CONFIG_REGULATOR_ISL9305 is not set
> CONFIG_REGULATOR_ISL6271A=m
> CONFIG_REGULATOR_LP3971=y
> CONFIG_REGULATOR_LP3972=m
> # CONFIG_REGULATOR_LP872X is not set
> CONFIG_REGULATOR_LP873X=m
> # CONFIG_REGULATOR_LP8755 is not set
> CONFIG_REGULATOR_LTC3589=m
> CONFIG_REGULATOR_LTC3676=y
> CONFIG_REGULATOR_MAX1586=m
> # CONFIG_REGULATOR_MAX8649 is not set
> # CONFIG_REGULATOR_MAX8660 is not set
> # CONFIG_REGULATOR_MAX8907 is not set
> # CONFIG_REGULATOR_MAX8925 is not set
> CONFIG_REGULATOR_MAX8952=m
> # CONFIG_REGULATOR_MAX8998 is not set
> # CONFIG_REGULATOR_MAX77686 is not set
> # CONFIG_REGULATOR_MAX77693 is not set
> CONFIG_REGULATOR_MAX77802=m
> # CONFIG_REGULATOR_MT6311 is not set
> # CONFIG_REGULATOR_MT6323 is not set
> # CONFIG_REGULATOR_MT6397 is not set
> CONFIG_REGULATOR_PALMAS=y
> CONFIG_REGULATOR_PFUZE100=m
> CONFIG_REGULATOR_PV88060=y
> # CONFIG_REGULATOR_PV88080 is not set
> CONFIG_REGULATOR_PV88090=y
> CONFIG_REGULATOR_PWM=m
> # CONFIG_REGULATOR_RC5T583 is not set
> CONFIG_REGULATOR_RK808=m
> CONFIG_REGULATOR_SKY81452=m
> CONFIG_REGULATOR_TPS51632=m
> # CONFIG_REGULATOR_TPS6105X is not set
> # CONFIG_REGULATOR_TPS62360 is not set
> # CONFIG_REGULATOR_TPS65023 is not set
> CONFIG_REGULATOR_TPS6507X=m
> CONFIG_REGULATOR_TPS65090=y
> CONFIG_REGULATOR_TPS65217=m
> CONFIG_REGULATOR_TPS65218=m
> # CONFIG_REGULATOR_TPS65910 is not set
> CONFIG_REGULATOR_TPS65912=m
> CONFIG_REGULATOR_WM8350=m
> # CONFIG_REGULATOR_WM8994 is not set
> CONFIG_MEDIA_SUPPORT=m
>
> #
> # Multimedia core support
> #
> # CONFIG_MEDIA_CAMERA_SUPPORT is not set
> # CONFIG_MEDIA_ANALOG_TV_SUPPORT is not set
> # CONFIG_MEDIA_DIGITAL_TV_SUPPORT is not set
> CONFIG_MEDIA_RADIO_SUPPORT=y
> CONFIG_MEDIA_SDR_SUPPORT=y
> # CONFIG_MEDIA_RC_SUPPORT is not set
> CONFIG_VIDEO_DEV=m
> CONFIG_VIDEO_V4L2=m
> # CONFIG_VIDEO_ADV_DEBUG is not set
> # CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
> CONFIG_VIDEOBUF2_CORE=m
> CONFIG_VIDEOBUF2_MEMOPS=m
> CONFIG_VIDEOBUF2_VMALLOC=m
> CONFIG_VIDEOBUF2_DMA_SG=m
> # CONFIG_TTPCI_EEPROM is not set
>
> #
> # Media drivers
> #
> CONFIG_MEDIA_USB_SUPPORT=y
>
> #
> # Software defined radio USB devices
> #
> CONFIG_USB_AIRSPY=m
> # CONFIG_USB_HACKRF is not set
> CONFIG_MEDIA_PCI_SUPPORT=y
>
> #
> # Supported MMC/SDIO adapters
> #
> CONFIG_RADIO_ADAPTERS=y
> CONFIG_RADIO_TEA575X=m
> CONFIG_RADIO_SI470X=y
> # CONFIG_USB_SI470X is not set
> # CONFIG_I2C_SI470X is not set
> # CONFIG_RADIO_SI4713 is not set
> CONFIG_USB_MR800=m
> # CONFIG_USB_DSBR is not set
> CONFIG_RADIO_MAXIRADIO=m
> # CONFIG_RADIO_SHARK is not set
> # CONFIG_RADIO_SHARK2 is not set
> CONFIG_USB_KEENE=m
> CONFIG_USB_RAREMONO=m
> CONFIG_USB_MA901=m
> CONFIG_RADIO_TEA5764=m
> # CONFIG_RADIO_SAA7706H is not set
> CONFIG_RADIO_TEF6862=m
> CONFIG_RADIO_WL1273=m
>
> #
> # Texas Instruments WL128x FM driver (ST based)
> #
> CONFIG_CYPRESS_FIRMWARE=m
>
> #
> # Media ancillary drivers (tuners, sensors, i2c, spi, frontends)
> #
> # CONFIG_MEDIA_SUBDRV_AUTOSELECT is not set
> CONFIG_MEDIA_ATTACH=y
>
> #
> # I2C Encoders, decoders, sensors and other helper chips
> #
>
> #
> # Audio decoders, processors and mixers
> #
> CONFIG_VIDEO_TVAUDIO=m
> CONFIG_VIDEO_TDA7432=m
> CONFIG_VIDEO_TDA9840=m
> # CONFIG_VIDEO_TEA6415C is not set
> CONFIG_VIDEO_TEA6420=m
> CONFIG_VIDEO_MSP3400=m
> CONFIG_VIDEO_CS3308=m
> CONFIG_VIDEO_CS5345=m
> CONFIG_VIDEO_CS53L32A=m
> CONFIG_VIDEO_TLV320AIC23B=m
> CONFIG_VIDEO_UDA1342=m
> CONFIG_VIDEO_WM8775=m
> CONFIG_VIDEO_WM8739=m
> # CONFIG_VIDEO_VP27SMPX is not set
> # CONFIG_VIDEO_SONY_BTF_MPX is not set
>
> #
> # RDS decoders
> #
> # CONFIG_VIDEO_SAA6588 is not set
>
> #
> # Video decoders
> #
> # CONFIG_VIDEO_ADV7183 is not set
> CONFIG_VIDEO_BT819=m
> CONFIG_VIDEO_BT856=m
> # CONFIG_VIDEO_BT866 is not set
> CONFIG_VIDEO_KS0127=m
> # CONFIG_VIDEO_ML86V7667 is not set
> CONFIG_VIDEO_SAA7110=m
> CONFIG_VIDEO_SAA711X=m
> CONFIG_VIDEO_TVP514X=m
> CONFIG_VIDEO_TVP5150=m
> CONFIG_VIDEO_TVP7002=m
> CONFIG_VIDEO_TW2804=m
> CONFIG_VIDEO_TW9903=m
> CONFIG_VIDEO_TW9906=m
> CONFIG_VIDEO_VPX3220=m
>
> #
> # Video and audio decoders
> #
> CONFIG_VIDEO_SAA717X=m
> CONFIG_VIDEO_CX25840=m
>
> #
> # Video encoders
> #
> CONFIG_VIDEO_SAA7127=m
> # CONFIG_VIDEO_SAA7185 is not set
> CONFIG_VIDEO_ADV7170=m
> CONFIG_VIDEO_ADV7175=m
> # CONFIG_VIDEO_ADV7343 is not set
> CONFIG_VIDEO_ADV7393=m
> # CONFIG_VIDEO_AK881X is not set
> # CONFIG_VIDEO_THS8200 is not set
>
> #
> # Camera sensor devices
> #
> CONFIG_VIDEO_MT9M111=m
>
> #
> # Flash devices
> #
>
> #
> # Video improvement chips
> #
> CONFIG_VIDEO_UPD64031A=m
> CONFIG_VIDEO_UPD64083=m
>
> #
> # Audio/Video compression chips
> #
> # CONFIG_VIDEO_SAA6752HS is not set
>
> #
> # Miscellaneous helper chips
> #
> CONFIG_VIDEO_THS7303=m
> # CONFIG_VIDEO_M52790 is not set
>
> #
> # Sensors used on soc_camera driver
> #
>
> #
> # SPI helper chips
> #
> CONFIG_MEDIA_TUNER=m
>
> #
> # Customize TV tuners
> #
> CONFIG_MEDIA_TUNER_SIMPLE=m
> CONFIG_MEDIA_TUNER_TDA8290=m
> CONFIG_MEDIA_TUNER_TDA827X=m
> CONFIG_MEDIA_TUNER_TDA18271=m
> CONFIG_MEDIA_TUNER_TDA9887=m
> CONFIG_MEDIA_TUNER_TEA5761=m
> CONFIG_MEDIA_TUNER_TEA5767=m
> CONFIG_MEDIA_TUNER_MT20XX=m
> CONFIG_MEDIA_TUNER_MT2060=m
> CONFIG_MEDIA_TUNER_MT2063=m
> CONFIG_MEDIA_TUNER_MT2266=m
> CONFIG_MEDIA_TUNER_MT2131=m
> # CONFIG_MEDIA_TUNER_QT1010 is not set
> CONFIG_MEDIA_TUNER_XC2028=m
> CONFIG_MEDIA_TUNER_XC5000=m
> CONFIG_MEDIA_TUNER_XC4000=m
> # CONFIG_MEDIA_TUNER_MXL5005S is not set
> CONFIG_MEDIA_TUNER_MXL5007T=m
> # CONFIG_MEDIA_TUNER_MC44S803 is not set
> CONFIG_MEDIA_TUNER_MAX2165=m
> # CONFIG_MEDIA_TUNER_TDA18218 is not set
> CONFIG_MEDIA_TUNER_FC0011=m
> CONFIG_MEDIA_TUNER_FC0012=m
> CONFIG_MEDIA_TUNER_FC0013=m
> # CONFIG_MEDIA_TUNER_TDA18212 is not set
> # CONFIG_MEDIA_TUNER_E4000 is not set
> # CONFIG_MEDIA_TUNER_FC2580 is not set
> # CONFIG_MEDIA_TUNER_M88RS6000T is not set
> CONFIG_MEDIA_TUNER_TUA9001=m
> CONFIG_MEDIA_TUNER_SI2157=m
> # CONFIG_MEDIA_TUNER_IT913X is not set
> CONFIG_MEDIA_TUNER_R820T=m
> # CONFIG_MEDIA_TUNER_MXL301RF is not set
> CONFIG_MEDIA_TUNER_QM1D1C0042=m
>
> #
> # Customise DVB Frontends
> #
> CONFIG_DVB_AU8522=m
> CONFIG_DVB_AU8522_V4L=m
> CONFIG_DVB_TUNER_DIB0070=m
> CONFIG_DVB_TUNER_DIB0090=m
>
> #
> # Tools to develop new frontends
> #
> # CONFIG_DVB_DUMMY_FE is not set
>
> #
> # Graphics support
> #
> CONFIG_AGP=m
> # CONFIG_AGP_ALI is not set
> # CONFIG_AGP_ATI is not set
> CONFIG_AGP_AMD=m
> CONFIG_AGP_AMD64=m
> CONFIG_AGP_INTEL=m
> CONFIG_AGP_NVIDIA=m
> # CONFIG_AGP_SIS is not set
> CONFIG_AGP_SWORKS=m
> CONFIG_AGP_VIA=m
> # CONFIG_AGP_EFFICEON is not set
> CONFIG_INTEL_GTT=m
> CONFIG_VGA_ARB=y
> CONFIG_VGA_ARB_MAX_GPUS=16
> # CONFIG_VGA_SWITCHEROO is not set
> CONFIG_DRM=m
> CONFIG_DRM_DP_AUX_CHARDEV=y
> CONFIG_DRM_KMS_HELPER=m
> # CONFIG_DRM_FBDEV_EMULATION is not set
> CONFIG_DRM_LOAD_EDID_FIRMWARE=y
> CONFIG_DRM_TTM=m
>
> #
> # I2C encoder or helper chips
> #
> CONFIG_DRM_I2C_CH7006=m
> CONFIG_DRM_I2C_SIL164=m
> CONFIG_DRM_I2C_NXP_TDA998X=m
> # CONFIG_DRM_RADEON is not set
> CONFIG_DRM_AMDGPU=m
> CONFIG_DRM_AMDGPU_SI=y
> CONFIG_DRM_AMDGPU_CIK=y
> # CONFIG_DRM_AMDGPU_USERPTR is not set
> CONFIG_DRM_AMDGPU_GART_DEBUGFS=y
>
> #
> # ACP (Audio CoProcessor) Configuration
> #
> # CONFIG_DRM_AMD_ACP is not set
> CONFIG_DRM_NOUVEAU=m
> CONFIG_NOUVEAU_DEBUG=5
> CONFIG_NOUVEAU_DEBUG_DEFAULT=3
> CONFIG_DRM_NOUVEAU_BACKLIGHT=y
> # CONFIG_DRM_I915 is not set
> CONFIG_DRM_VGEM=m
> # CONFIG_DRM_VMWGFX is not set
> CONFIG_DRM_GMA500=m
> CONFIG_DRM_GMA600=y
> CONFIG_DRM_GMA3600=y
> CONFIG_DRM_UDL=m
> CONFIG_DRM_AST=m
> # CONFIG_DRM_MGAG200 is not set
> # CONFIG_DRM_CIRRUS_QEMU is not set
> CONFIG_DRM_QXL=m
> # CONFIG_DRM_BOCHS is not set
> CONFIG_DRM_VIRTIO_GPU=m
> CONFIG_DRM_PANEL=y
>
> #
> # Display Panels
> #
> CONFIG_DRM_PANEL_SIMPLE=m
> # CONFIG_DRM_PANEL_SAMSUNG_S6E8AA0 is not set
> CONFIG_DRM_BRIDGE=y
>
> #
> # Display Interface Bridges
> #
> # CONFIG_DRM_ANALOGIX_ANX78XX is not set
> # CONFIG_DRM_DUMB_VGA_DAC is not set
> CONFIG_DRM_NXP_PTN3460=m
> # CONFIG_DRM_PARADE_PS8622 is not set
> # CONFIG_DRM_SII902X is not set
> CONFIG_DRM_TOSHIBA_TC358767=m
> CONFIG_DRM_I2C_ADV7511=m
> # CONFIG_DRM_I2C_ADV7533 is not set
> # CONFIG_DRM_ARCPGU is not set
> CONFIG_DRM_LEGACY=y
> # CONFIG_DRM_TDFX is not set
> CONFIG_DRM_R128=m
> # CONFIG_DRM_MGA is not set
> CONFIG_DRM_SIS=m
> CONFIG_DRM_VIA=m
> CONFIG_DRM_SAVAGE=m
>
> #
> # Frame buffer Devices
> #
> CONFIG_FB=y
> CONFIG_FIRMWARE_EDID=y
> CONFIG_FB_CMDLINE=y
> CONFIG_FB_NOTIFY=y
> CONFIG_FB_DDC=y
> CONFIG_FB_BOOT_VESA_SUPPORT=y
> CONFIG_FB_CFB_FILLRECT=y
> CONFIG_FB_CFB_COPYAREA=y
> CONFIG_FB_CFB_IMAGEBLIT=y
> # CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
> CONFIG_FB_SYS_FILLRECT=y
> CONFIG_FB_SYS_COPYAREA=y
> CONFIG_FB_SYS_IMAGEBLIT=y
> # CONFIG_FB_FOREIGN_ENDIAN is not set
> CONFIG_FB_SYS_FOPS=y
> CONFIG_FB_DEFERRED_IO=y
> CONFIG_FB_HECUBA=m
> CONFIG_FB_SVGALIB=y
> # CONFIG_FB_MACMODES is not set
> CONFIG_FB_BACKLIGHT=y
> CONFIG_FB_MODE_HELPERS=y
> CONFIG_FB_TILEBLITTING=y
>
> #
> # Frame buffer hardware drivers
> #
> CONFIG_FB_CIRRUS=y
> CONFIG_FB_PM2=y
> CONFIG_FB_PM2_FIFO_DISCONNECT=y
> CONFIG_FB_CYBER2000=y
> # CONFIG_FB_CYBER2000_DDC is not set
> # CONFIG_FB_ARC is not set
> # CONFIG_FB_ASILIANT is not set
> # CONFIG_FB_IMSTT is not set
> # CONFIG_FB_VGA16 is not set
> CONFIG_FB_VESA=y
> CONFIG_FB_N411=m
> CONFIG_FB_HGA=m
> CONFIG_FB_OPENCORES=y
> CONFIG_FB_S1D13XXX=m
> CONFIG_FB_NVIDIA=m
> CONFIG_FB_NVIDIA_I2C=y
> # CONFIG_FB_NVIDIA_DEBUG is not set
> CONFIG_FB_NVIDIA_BACKLIGHT=y
> CONFIG_FB_RIVA=y
> CONFIG_FB_RIVA_I2C=y
> # CONFIG_FB_RIVA_DEBUG is not set
> # CONFIG_FB_RIVA_BACKLIGHT is not set
> CONFIG_FB_I740=m
> # CONFIG_FB_I810 is not set
> CONFIG_FB_LE80578=m
> CONFIG_FB_CARILLO_RANCH=m
> CONFIG_FB_INTEL=m
> # CONFIG_FB_INTEL_DEBUG is not set
> CONFIG_FB_INTEL_I2C=y
> # CONFIG_FB_MATROX is not set
> CONFIG_FB_RADEON=y
> # CONFIG_FB_RADEON_I2C is not set
> # CONFIG_FB_RADEON_BACKLIGHT is not set
> CONFIG_FB_RADEON_DEBUG=y
> # CONFIG_FB_ATY128 is not set
> # CONFIG_FB_ATY is not set
> CONFIG_FB_S3=y
> # CONFIG_FB_S3_DDC is not set
> CONFIG_FB_SAVAGE=y
> # CONFIG_FB_SAVAGE_I2C is not set
> # CONFIG_FB_SAVAGE_ACCEL is not set
> CONFIG_FB_SIS=y
> CONFIG_FB_SIS_300=y
> # CONFIG_FB_SIS_315 is not set
> # CONFIG_FB_VIA is not set
> CONFIG_FB_NEOMAGIC=m
> CONFIG_FB_KYRO=m
> CONFIG_FB_3DFX=m
> CONFIG_FB_3DFX_ACCEL=y
> CONFIG_FB_3DFX_I2C=y
> CONFIG_FB_VOODOO1=y
> # CONFIG_FB_VT8623 is not set
> CONFIG_FB_TRIDENT=y
> CONFIG_FB_ARK=m
> # CONFIG_FB_PM3 is not set
> # CONFIG_FB_CARMINE is not set
> CONFIG_FB_GEODE=y
> CONFIG_FB_GEODE_LX=m
> CONFIG_FB_GEODE_GX=m
> # CONFIG_FB_GEODE_GX1 is not set
> # CONFIG_FB_SM501 is not set
> CONFIG_FB_SMSCUFX=y
> CONFIG_FB_UDL=y
> CONFIG_FB_IBM_GXT4500=m
> CONFIG_FB_VIRTUAL=m
> # CONFIG_FB_METRONOME is not set
> CONFIG_FB_MB862XX=y
> CONFIG_FB_MB862XX_PCI_GDC=y
> # CONFIG_FB_MB862XX_I2C is not set
> # CONFIG_FB_BROADSHEET is not set
> CONFIG_FB_AUO_K190X=y
> CONFIG_FB_AUO_K1900=m
> CONFIG_FB_AUO_K1901=y
> CONFIG_FB_HYPERV=y
> CONFIG_FB_SIMPLE=y
> CONFIG_FB_SSD1307=y
> CONFIG_FB_SM712=m
> CONFIG_BACKLIGHT_LCD_SUPPORT=y
> # CONFIG_LCD_CLASS_DEVICE is not set
> CONFIG_BACKLIGHT_CLASS_DEVICE=y
> # CONFIG_BACKLIGHT_GENERIC is not set
> CONFIG_BACKLIGHT_LM3533=m
> CONFIG_BACKLIGHT_PWM=m
> # CONFIG_BACKLIGHT_DA903X is not set
> CONFIG_BACKLIGHT_DA9052=m
> CONFIG_BACKLIGHT_MAX8925=m
> CONFIG_BACKLIGHT_APPLE=y
> CONFIG_BACKLIGHT_PM8941_WLED=m
> # CONFIG_BACKLIGHT_SAHARA is not set
> CONFIG_BACKLIGHT_ADP5520=y
> CONFIG_BACKLIGHT_ADP8860=m
> # CONFIG_BACKLIGHT_ADP8870 is not set
> CONFIG_BACKLIGHT_LM3630A=y
> # CONFIG_BACKLIGHT_LM3639 is not set
> # CONFIG_BACKLIGHT_LP855X is not set
> CONFIG_BACKLIGHT_SKY81452=m
> CONFIG_BACKLIGHT_TPS65217=m
> # CONFIG_BACKLIGHT_AS3711 is not set
> CONFIG_BACKLIGHT_GPIO=y
> # CONFIG_BACKLIGHT_LV5207LP is not set
> CONFIG_BACKLIGHT_BD6107=y
> CONFIG_VGASTATE=y
> CONFIG_VIDEOMODE_HELPERS=y
> CONFIG_HDMI=y
> CONFIG_LOGO=y
> # CONFIG_LOGO_LINUX_MONO is not set
> CONFIG_LOGO_LINUX_VGA16=y
> CONFIG_LOGO_LINUX_CLUT224=y
> # CONFIG_SOUND is not set
>
> #
> # HID support
> #
> CONFIG_HID=m
> # CONFIG_HID_BATTERY_STRENGTH is not set
> CONFIG_HIDRAW=y
> CONFIG_UHID=m
> CONFIG_HID_GENERIC=m
>
> #
> # Special HID drivers
> #
> CONFIG_HID_A4TECH=m
> CONFIG_HID_ACRUX=m
> CONFIG_HID_ACRUX_FF=y
> CONFIG_HID_APPLE=m
> CONFIG_HID_APPLEIR=m
> CONFIG_HID_ASUS=m
> # CONFIG_HID_AUREAL is not set
> CONFIG_HID_BELKIN=m
> CONFIG_HID_BETOP_FF=m
> CONFIG_HID_CHERRY=m
> CONFIG_HID_CHICONY=m
> # CONFIG_HID_CORSAIR is not set
> CONFIG_HID_CMEDIA=m
> CONFIG_HID_CP2112=m
> # CONFIG_HID_CYPRESS is not set
> CONFIG_HID_DRAGONRISE=m
> # CONFIG_DRAGONRISE_FF is not set
> CONFIG_HID_EMS_FF=m
> CONFIG_HID_ELECOM=m
> CONFIG_HID_ELO=m
> # CONFIG_HID_EZKEY is not set
> CONFIG_HID_GEMBIRD=m
> CONFIG_HID_GFRM=m
> # CONFIG_HID_HOLTEK is not set
> # CONFIG_HID_GT683R is not set
> CONFIG_HID_KEYTOUCH=m
> CONFIG_HID_KYE=m
> CONFIG_HID_UCLOGIC=m
> CONFIG_HID_WALTOP=m
> CONFIG_HID_GYRATION=m
> # CONFIG_HID_ICADE is not set
> CONFIG_HID_TWINHAN=m
> # CONFIG_HID_KENSINGTON is not set
> # CONFIG_HID_LCPOWER is not set
> CONFIG_HID_LED=m
> CONFIG_HID_LENOVO=m
> CONFIG_HID_LOGITECH=m
> CONFIG_HID_LOGITECH_DJ=m
> CONFIG_HID_LOGITECH_HIDPP=m
> CONFIG_LOGITECH_FF=y
> # CONFIG_LOGIRUMBLEPAD2_FF is not set
> # CONFIG_LOGIG940_FF is not set
> # CONFIG_LOGIWHEELS_FF is not set
> CONFIG_HID_MAGICMOUSE=m
> CONFIG_HID_MICROSOFT=m
> # CONFIG_HID_MONTEREY is not set
> # CONFIG_HID_MULTITOUCH is not set
> CONFIG_HID_NTRIG=m
> # CONFIG_HID_ORTEK is not set
> CONFIG_HID_PANTHERLORD=m
> CONFIG_PANTHERLORD_FF=y
> # CONFIG_HID_PENMOUNT is not set
> CONFIG_HID_PETALYNX=m
> # CONFIG_HID_PICOLCD is not set
> CONFIG_HID_PLANTRONICS=m
> # CONFIG_HID_PRIMAX is not set
> # CONFIG_HID_ROCCAT is not set
> CONFIG_HID_SAITEK=m
> CONFIG_HID_SAMSUNG=m
> # CONFIG_HID_SONY is not set
> CONFIG_HID_SPEEDLINK=m
> CONFIG_HID_STEELSERIES=m
> CONFIG_HID_SUNPLUS=m
> # CONFIG_HID_RMI is not set
> CONFIG_HID_GREENASIA=m
> CONFIG_GREENASIA_FF=y
> CONFIG_HID_HYPERV_MOUSE=m
> # CONFIG_HID_SMARTJOYPLUS is not set
> CONFIG_HID_TIVO=m
> CONFIG_HID_TOPSEED=m
> CONFIG_HID_THINGM=m
> CONFIG_HID_THRUSTMASTER=m
> # CONFIG_THRUSTMASTER_FF is not set
> CONFIG_HID_WACOM=m
> # CONFIG_HID_WIIMOTE is not set
> # CONFIG_HID_XINMO is not set
> CONFIG_HID_ZEROPLUS=m
> CONFIG_ZEROPLUS_FF=y
> CONFIG_HID_ZYDACRON=m
> CONFIG_HID_SENSOR_HUB=m
> # CONFIG_HID_SENSOR_CUSTOM_SENSOR is not set
> # CONFIG_HID_ALPS is not set
>
> #
> # USB HID support
> #
> CONFIG_USB_HID=m
> # CONFIG_HID_PID is not set
> CONFIG_USB_HIDDEV=y
>
> #
> # USB HID Boot Protocol drivers
> #
> # CONFIG_USB_KBD is not set
> # CONFIG_USB_MOUSE is not set
>
> #
> # I2C HID support
> #
> CONFIG_I2C_HID=m
> CONFIG_USB_OHCI_LITTLE_ENDIAN=y
> CONFIG_USB_SUPPORT=y
> CONFIG_USB_COMMON=y
> CONFIG_USB_ARCH_HAS_HCD=y
> CONFIG_USB=y
> # CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set
>
> #
> # Miscellaneous USB options
> #
> # CONFIG_USB_DEFAULT_PERSIST is not set
> CONFIG_USB_DYNAMIC_MINORS=y
> # CONFIG_USB_OTG is not set
> CONFIG_USB_OTG_WHITELIST=y
> CONFIG_USB_OTG_BLACKLIST_HUB=y
> # CONFIG_USB_LEDS_TRIGGER_USBPORT is not set
> # CONFIG_USB_MON is not set
> CONFIG_USB_WUSB=m
> CONFIG_USB_WUSB_CBAF=y
> CONFIG_USB_WUSB_CBAF_DEBUG=y
>
> #
> # USB Host Controller Drivers
> #
> # CONFIG_USB_C67X00_HCD is not set
> CONFIG_USB_XHCI_HCD=y
> CONFIG_USB_XHCI_PCI=y
> CONFIG_USB_XHCI_PLATFORM=y
> CONFIG_USB_EHCI_HCD=y
> CONFIG_USB_EHCI_ROOT_HUB_TT=y
> CONFIG_USB_EHCI_TT_NEWSCHED=y
> CONFIG_USB_EHCI_PCI=y
> CONFIG_USB_EHCI_HCD_PLATFORM=y
> CONFIG_USB_OXU210HP_HCD=y
> CONFIG_USB_ISP116X_HCD=m
> CONFIG_USB_ISP1362_HCD=m
> CONFIG_USB_FOTG210_HCD=y
> CONFIG_USB_OHCI_HCD=m
> CONFIG_USB_OHCI_HCD_PCI=m
> CONFIG_USB_OHCI_HCD_SSB=y
> CONFIG_USB_OHCI_HCD_PLATFORM=m
> CONFIG_USB_UHCI_HCD=y
> CONFIG_USB_SL811_HCD=y
> # CONFIG_USB_SL811_HCD_ISO is not set
> CONFIG_USB_SL811_CS=m
> CONFIG_USB_R8A66597_HCD=y
> # CONFIG_USB_WHCI_HCD is not set
> # CONFIG_USB_HWA_HCD is not set
> CONFIG_USB_HCD_BCMA=m
> CONFIG_USB_HCD_SSB=y
> # CONFIG_USB_HCD_TEST_MODE is not set
>
> #
> # USB Device Class drivers
> #
> # CONFIG_USB_ACM is not set
> CONFIG_USB_PRINTER=m
> CONFIG_USB_WDM=m
> CONFIG_USB_TMC=m
>
> #
> # NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
> #
>
> #
> # also be needed; see USB_STORAGE Help for more info
> #
>
> #
> # USB Imaging devices
> #
> CONFIG_USB_MDC800=y
> # CONFIG_USBIP_CORE is not set
> # CONFIG_USB_MUSB_HDRC is not set
> CONFIG_USB_DWC3=y
> CONFIG_USB_DWC3_HOST=y
>
> #
> # Platform Glue Driver Support
> #
> CONFIG_USB_DWC3_PCI=y
> # CONFIG_USB_DWC3_OF_SIMPLE is not set
> CONFIG_USB_DWC2=m
> # CONFIG_USB_DWC2_HOST is not set
>
> #
> # Gadget/Dual-role mode requires USB Gadget support to be enabled
> #
> CONFIG_USB_DWC2_PERIPHERAL=y
> # CONFIG_USB_DWC2_DUAL_ROLE is not set
> CONFIG_USB_DWC2_PCI=m
> CONFIG_USB_DWC2_DEBUG=y
> # CONFIG_USB_DWC2_VERBOSE is not set
> # CONFIG_USB_DWC2_TRACK_MISSED_SOFS is not set
> # CONFIG_USB_DWC2_DEBUG_PERIODIC is not set
> # CONFIG_USB_CHIPIDEA is not set
> # CONFIG_USB_ISP1760 is not set
>
> #
> # USB port drivers
> #
> CONFIG_USB_USS720=y
> # CONFIG_USB_SERIAL is not set
>
> #
> # USB Miscellaneous drivers
> #
> CONFIG_USB_EMI62=y
> CONFIG_USB_EMI26=m
> CONFIG_USB_ADUTUX=m
> CONFIG_USB_SEVSEG=m
> CONFIG_USB_RIO500=y
> CONFIG_USB_LEGOTOWER=m
> # CONFIG_USB_LCD is not set
> CONFIG_USB_CYPRESS_CY7C63=m
> # CONFIG_USB_CYTHERM is not set
> # CONFIG_USB_IDMOUSE is not set
> # CONFIG_USB_FTDI_ELAN is not set
> CONFIG_USB_APPLEDISPLAY=m
> # CONFIG_USB_SISUSBVGA is not set
> CONFIG_USB_LD=m
> CONFIG_USB_TRANCEVIBRATOR=m
> CONFIG_USB_IOWARRIOR=y
> CONFIG_USB_TEST=m
> # CONFIG_USB_EHSET_TEST_FIXTURE is not set
> # CONFIG_USB_ISIGHTFW is not set
> CONFIG_USB_YUREX=m
> CONFIG_USB_EZUSB_FX2=y
> # CONFIG_USB_HSIC_USB3503 is not set
> CONFIG_USB_HSIC_USB4604=m
> CONFIG_USB_LINK_LAYER_TEST=m
> CONFIG_USB_CHAOSKEY=m
> CONFIG_UCSI=y
>
> #
> # USB Physical Layer drivers
> #
> CONFIG_USB_PHY=y
> CONFIG_NOP_USB_XCEIV=m
> CONFIG_USB_GPIO_VBUS=m
> CONFIG_USB_ISP1301=y
> CONFIG_USB_GADGET=m
> CONFIG_USB_GADGET_DEBUG=y
> CONFIG_USB_GADGET_VERBOSE=y
> # CONFIG_USB_GADGET_DEBUG_FILES is not set
> CONFIG_USB_GADGET_DEBUG_FS=y
> CONFIG_USB_GADGET_VBUS_DRAW=2
> CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2
>
> #
> # USB Peripheral Controller
> #
> CONFIG_USB_FOTG210_UDC=m
> # CONFIG_USB_GR_UDC is not set
> CONFIG_USB_R8A66597=m
> CONFIG_USB_PXA27X=m
> # CONFIG_USB_MV_UDC is not set
> # CONFIG_USB_MV_U3D is not set
> # CONFIG_USB_M66592 is not set
> CONFIG_USB_BDC_UDC=m
>
> #
> # Platform Support
> #
> CONFIG_USB_BDC_PCI=m
> # CONFIG_USB_AMD5536UDC is not set
> # CONFIG_USB_NET2272 is not set
> CONFIG_USB_NET2280=m
> CONFIG_USB_GOKU=m
> CONFIG_USB_EG20T=m
> # CONFIG_USB_GADGET_XILINX is not set
> CONFIG_USB_DUMMY_HCD=m
> CONFIG_USB_LIBCOMPOSITE=m
> CONFIG_USB_F_FS=m
> CONFIG_USB_F_UVC=m
> CONFIG_USB_F_HID=m
> CONFIG_USB_F_PRINTER=m
> CONFIG_USB_CONFIGFS=m
> # CONFIG_USB_CONFIGFS_SERIAL is not set
> # CONFIG_USB_CONFIGFS_ACM is not set
> # CONFIG_USB_CONFIGFS_OBEX is not set
> # CONFIG_USB_CONFIGFS_NCM is not set
> # CONFIG_USB_CONFIGFS_ECM is not set
> # CONFIG_USB_CONFIGFS_ECM_SUBSET is not set
> # CONFIG_USB_CONFIGFS_RNDIS is not set
> # CONFIG_USB_CONFIGFS_EEM is not set
> # CONFIG_USB_CONFIGFS_F_LB_SS is not set
> # CONFIG_USB_CONFIGFS_F_FS is not set
> CONFIG_USB_CONFIGFS_F_HID=y
> CONFIG_USB_CONFIGFS_F_UVC=y
> # CONFIG_USB_CONFIGFS_F_PRINTER is not set
> # CONFIG_USB_ZERO is not set
> # CONFIG_USB_ETH is not set
> # CONFIG_USB_G_NCM is not set
> CONFIG_USB_GADGETFS=m
> CONFIG_USB_FUNCTIONFS=m
> # CONFIG_USB_FUNCTIONFS_ETH is not set
> # CONFIG_USB_FUNCTIONFS_RNDIS is not set
> CONFIG_USB_FUNCTIONFS_GENERIC=y
> # CONFIG_USB_G_SERIAL is not set
> CONFIG_USB_G_PRINTER=m
> # CONFIG_USB_CDC_COMPOSITE is not set
> # CONFIG_USB_G_HID is not set
> # CONFIG_USB_G_DBGP is not set
> # CONFIG_USB_G_WEBCAM is not set
> CONFIG_USB_LED_TRIG=y
> # CONFIG_USB_ULPI_BUS is not set
> CONFIG_UWB=y
> # CONFIG_UWB_HWA is not set
> CONFIG_UWB_WHCI=y
> # CONFIG_MMC is not set
> CONFIG_MEMSTICK=m
> CONFIG_MEMSTICK_DEBUG=y
>
> #
> # MemoryStick drivers
> #
> CONFIG_MEMSTICK_UNSAFE_RESUME=y
>
> #
> # MemoryStick Host Controller Drivers
> #
> CONFIG_MEMSTICK_TIFM_MS=m
> CONFIG_MEMSTICK_JMICRON_38X=m
> # CONFIG_MEMSTICK_R592 is not set
> # CONFIG_MEMSTICK_REALTEK_PCI is not set
> CONFIG_NEW_LEDS=y
> CONFIG_LEDS_CLASS=m
> CONFIG_LEDS_CLASS_FLASH=m
>
> #
> # LED drivers
> #
> CONFIG_LEDS_AAT1290=m
> CONFIG_LEDS_BCM6328=m
> CONFIG_LEDS_BCM6358=m
> # CONFIG_LEDS_LM3530 is not set
> CONFIG_LEDS_LM3533=m
> CONFIG_LEDS_LM3642=m
> # CONFIG_LEDS_NET48XX is not set
> # CONFIG_LEDS_WRAP is not set
> CONFIG_LEDS_PCA9532=m
> # CONFIG_LEDS_PCA9532_GPIO is not set
> CONFIG_LEDS_GPIO=m
> CONFIG_LEDS_LP3944=m
> # CONFIG_LEDS_LP3952 is not set
> CONFIG_LEDS_LP55XX_COMMON=m
> # CONFIG_LEDS_LP5521 is not set
> # CONFIG_LEDS_LP5523 is not set
> CONFIG_LEDS_LP5562=m
> # CONFIG_LEDS_LP8501 is not set
> # CONFIG_LEDS_LP8860 is not set
> # CONFIG_LEDS_CLEVO_MAIL is not set
> # CONFIG_LEDS_PCA955X is not set
> CONFIG_LEDS_PCA963X=m
> CONFIG_LEDS_WM8350=m
> # CONFIG_LEDS_DA903X is not set
> CONFIG_LEDS_DA9052=m
> CONFIG_LEDS_PWM=m
> CONFIG_LEDS_REGULATOR=m
> # CONFIG_LEDS_BD2802 is not set
> CONFIG_LEDS_INTEL_SS4200=m
> # CONFIG_LEDS_LT3593 is not set
> CONFIG_LEDS_ADP5520=m
> # CONFIG_LEDS_DELL_NETBOOKS is not set
> CONFIG_LEDS_TCA6507=m
> # CONFIG_LEDS_TLC591XX is not set
> CONFIG_LEDS_MAX77693=m
> CONFIG_LEDS_LM355x=m
> CONFIG_LEDS_OT200=m
> # CONFIG_LEDS_KTD2692 is not set
> CONFIG_LEDS_IS31FL319X=m
> # CONFIG_LEDS_IS31FL32XX is not set
>
> #
> # LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
> #
> CONFIG_LEDS_BLINKM=m
>
> #
> # LED Triggers
> #
> CONFIG_LEDS_TRIGGERS=y
> CONFIG_LEDS_TRIGGER_TIMER=m
> CONFIG_LEDS_TRIGGER_ONESHOT=m
> # CONFIG_LEDS_TRIGGER_MTD is not set
> # CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
> CONFIG_LEDS_TRIGGER_BACKLIGHT=m
> # CONFIG_LEDS_TRIGGER_CPU is not set
> CONFIG_LEDS_TRIGGER_GPIO=y
> CONFIG_LEDS_TRIGGER_DEFAULT_ON=m
>
> #
> # iptables trigger is under Netfilter config (LED target)
> #
> # CONFIG_LEDS_TRIGGER_TRANSIENT is not set
> CONFIG_LEDS_TRIGGER_CAMERA=m
> # CONFIG_LEDS_TRIGGER_PANIC is not set
> CONFIG_ACCESSIBILITY=y
> CONFIG_EDAC_ATOMIC_SCRUB=y
> CONFIG_EDAC_SUPPORT=y
> CONFIG_EDAC=y
> # CONFIG_EDAC_LEGACY_SYSFS is not set
> CONFIG_EDAC_DEBUG=y
> # CONFIG_EDAC_MM_EDAC is not set
> CONFIG_RTC_LIB=y
> CONFIG_RTC_MC146818_LIB=y
> # CONFIG_RTC_CLASS is not set
> CONFIG_DMADEVICES=y
> # CONFIG_DMADEVICES_DEBUG is not set
>
> #
> # DMA Devices
> #
> CONFIG_DMA_ENGINE=y
> CONFIG_DMA_VIRTUAL_CHANNELS=y
> CONFIG_DMA_ACPI=y
> CONFIG_DMA_OF=y
> CONFIG_FSL_EDMA=m
> CONFIG_INTEL_IDMA64=m
> # CONFIG_PCH_DMA is not set
> CONFIG_QCOM_HIDMA_MGMT=y
> # CONFIG_QCOM_HIDMA is not set
> CONFIG_DW_DMAC_CORE=y
> # CONFIG_DW_DMAC is not set
> CONFIG_DW_DMAC_PCI=y
> CONFIG_HSU_DMA=y
>
> #
> # DMA Clients
> #
> CONFIG_ASYNC_TX_DMA=y
> CONFIG_DMATEST=m
>
> #
> # DMABUF options
> #
> CONFIG_SYNC_FILE=y
> CONFIG_SW_SYNC=y
> # CONFIG_AUXDISPLAY is not set
> CONFIG_UIO=m
> # CONFIG_UIO_CIF is not set
> # CONFIG_UIO_PDRV_GENIRQ is not set
> CONFIG_UIO_DMEM_GENIRQ=m
> CONFIG_UIO_AEC=m
> CONFIG_UIO_SERCOS3=m
> CONFIG_UIO_PCI_GENERIC=m
> CONFIG_UIO_NETX=m
> CONFIG_UIO_PRUSS=m
> CONFIG_UIO_MF624=m
> CONFIG_VIRT_DRIVERS=y
> CONFIG_VIRTIO=m
>
> #
> # Virtio drivers
> #
> CONFIG_VIRTIO_PCI=m
> # CONFIG_VIRTIO_PCI_LEGACY is not set
> # CONFIG_VIRTIO_BALLOON is not set
> # CONFIG_VIRTIO_INPUT is not set
> CONFIG_VIRTIO_MMIO=m
> # CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set
>
> #
> # Microsoft Hyper-V guest support
> #
> CONFIG_HYPERV=y
> CONFIG_HYPERV_BALLOON=m
> # CONFIG_STAGING is not set
> CONFIG_X86_PLATFORM_DEVICES=y
> # CONFIG_ACER_WMI is not set
> # CONFIG_ACERHDF is not set
> CONFIG_ALIENWARE_WMI=m
> CONFIG_ASUS_LAPTOP=m
> CONFIG_DELL_SMBIOS=m
> # CONFIG_DELL_LAPTOP is not set
> CONFIG_DELL_WMI=m
> CONFIG_DELL_WMI_AIO=m
> # CONFIG_DELL_SMO8800 is not set
> CONFIG_FUJITSU_LAPTOP=m
> CONFIG_FUJITSU_LAPTOP_DEBUG=y
> CONFIG_FUJITSU_TABLET=m
> CONFIG_TC1100_WMI=y
> CONFIG_HP_ACCEL=m
> CONFIG_HP_WIRELESS=m
> CONFIG_HP_WMI=m
> CONFIG_PANASONIC_LAPTOP=m
> CONFIG_THINKPAD_ACPI=m
> CONFIG_THINKPAD_ACPI_DEBUGFACILITIES=y
> CONFIG_THINKPAD_ACPI_DEBUG=y
> CONFIG_THINKPAD_ACPI_UNSAFE_LEDS=y
> # CONFIG_THINKPAD_ACPI_VIDEO is not set
> # CONFIG_THINKPAD_ACPI_HOTKEY_POLL is not set
> CONFIG_SENSORS_HDAPS=m
> CONFIG_INTEL_MENLOW=m
> CONFIG_ASUS_WIRELESS=m
> CONFIG_ACPI_WMI=y
> # CONFIG_MSI_WMI is not set
> CONFIG_TOPSTAR_LAPTOP=m
> # CONFIG_ACPI_TOSHIBA is not set
> CONFIG_TOSHIBA_BT_RFKILL=m
> # CONFIG_TOSHIBA_HAPS is not set
> CONFIG_TOSHIBA_WMI=m
> CONFIG_ACPI_CMPC=m
> CONFIG_INTEL_HID_EVENT=m
> CONFIG_INTEL_VBTN=m
> # CONFIG_INTEL_IPS is not set
> CONFIG_INTEL_PMC_CORE=y
> CONFIG_IBM_RTL=y
> # CONFIG_SAMSUNG_LAPTOP is not set
> CONFIG_MXM_WMI=y
> CONFIG_SAMSUNG_Q10=y
> CONFIG_APPLE_GMUX=m
> CONFIG_INTEL_RST=y
> CONFIG_INTEL_SMARTCONNECT=m
> CONFIG_PVPANIC=y
> # CONFIG_INTEL_PMC_IPC is not set
> # CONFIG_SURFACE_PRO3_BUTTON is not set
> CONFIG_INTEL_PUNIT_IPC=y
> CONFIG_CHROME_PLATFORMS=y
> CONFIG_CHROMEOS_LAPTOP=m
> CONFIG_CHROMEOS_PSTORE=m
> CONFIG_CROS_EC_CHARDEV=m
> # CONFIG_CROS_EC_LPC is not set
> CONFIG_CROS_EC_PROTO=y
> CONFIG_CROS_KBD_LED_BACKLIGHT=m
> CONFIG_CLKDEV_LOOKUP=y
> CONFIG_HAVE_CLK_PREPARE=y
> CONFIG_COMMON_CLK=y
>
> #
> # Common Clock Framework
> #
> CONFIG_COMMON_CLK_MAX77686=m
> CONFIG_COMMON_CLK_RK808=y
> CONFIG_COMMON_CLK_SI5351=y
> # CONFIG_COMMON_CLK_SI514 is not set
> CONFIG_COMMON_CLK_SI570=m
> CONFIG_COMMON_CLK_CDCE706=m
> CONFIG_COMMON_CLK_CDCE925=m
> # CONFIG_COMMON_CLK_CS2000_CP is not set
> CONFIG_CLK_TWL6040=m
> # CONFIG_COMMON_CLK_NXP is not set
> CONFIG_COMMON_CLK_PALMAS=y
> CONFIG_COMMON_CLK_PWM=y
> # CONFIG_COMMON_CLK_PXA is not set
> # CONFIG_COMMON_CLK_PIC32 is not set
>
> #
> # Hardware Spinlock drivers
> #
>
> #
> # Clock Source drivers
> #
> CONFIG_CLKSRC_I8253=y
> CONFIG_CLKEVT_I8253=y
> CONFIG_I8253_LOCK=y
> CONFIG_CLKBLD_I8253=y
> # CONFIG_ATMEL_PIT is not set
> # CONFIG_SH_TIMER_CMT is not set
> # CONFIG_SH_TIMER_MTU2 is not set
> # CONFIG_SH_TIMER_TMU is not set
> # CONFIG_EM_TIMER_STI is not set
> CONFIG_MAILBOX=y
> CONFIG_PLATFORM_MHU=y
> CONFIG_PCC=y
> CONFIG_ALTERA_MBOX=y
> CONFIG_MAILBOX_TEST=y
> # CONFIG_IOMMU_SUPPORT is not set
>
> #
> # Remoteproc drivers
> #
> # CONFIG_STE_MODEM_RPROC is not set
>
> #
> # Rpmsg drivers
> #
>
> #
> # SOC (System On Chip) specific Drivers
> #
>
> #
> # Broadcom SoC drivers
> #
> # CONFIG_SUNXI_SRAM is not set
> CONFIG_SOC_TI=y
> CONFIG_PM_DEVFREQ=y
>
> #
> # DEVFREQ Governors
> #
> # CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND is not set
> # CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
> CONFIG_DEVFREQ_GOV_POWERSAVE=m
> CONFIG_DEVFREQ_GOV_USERSPACE=y
> CONFIG_DEVFREQ_GOV_PASSIVE=m
>
> #
> # DEVFREQ Drivers
> #
> CONFIG_PM_DEVFREQ_EVENT=y
> CONFIG_EXTCON=y
>
> #
> # Extcon Device Drivers
> #
> # CONFIG_EXTCON_ADC_JACK is not set
> CONFIG_EXTCON_AXP288=y
> CONFIG_EXTCON_GPIO=m
> # CONFIG_EXTCON_MAX3355 is not set
> # CONFIG_EXTCON_MAX77693 is not set
> # CONFIG_EXTCON_MAX77843 is not set
> # CONFIG_EXTCON_PALMAS is not set
> # CONFIG_EXTCON_QCOM_SPMI_MISC is not set
> CONFIG_EXTCON_RT8973A=y
> CONFIG_EXTCON_SM5502=m
> # CONFIG_EXTCON_USB_GPIO is not set
> # CONFIG_MEMORY is not set
> CONFIG_IIO=m
> CONFIG_IIO_BUFFER=y
> CONFIG_IIO_BUFFER_CB=m
> CONFIG_IIO_KFIFO_BUF=m
> CONFIG_IIO_TRIGGERED_BUFFER=m
> CONFIG_IIO_CONFIGFS=m
> CONFIG_IIO_TRIGGER=y
> CONFIG_IIO_CONSUMERS_PER_TRIGGER=2
> CONFIG_IIO_SW_DEVICE=m
> CONFIG_IIO_SW_TRIGGER=m
>
> #
> # Accelerometers
> #
> # CONFIG_BMA180 is not set
> # CONFIG_BMC150_ACCEL is not set
> # CONFIG_DMARD06 is not set
> CONFIG_DMARD09=m
> # CONFIG_HID_SENSOR_ACCEL_3D is not set
> CONFIG_IIO_ST_ACCEL_3AXIS=m
> CONFIG_IIO_ST_ACCEL_I2C_3AXIS=m
> CONFIG_KXSD9=m
> CONFIG_KXSD9_I2C=m
> # CONFIG_KXCJK1013 is not set
> # CONFIG_MC3230 is not set
> CONFIG_MMA7455=m
> CONFIG_MMA7455_I2C=m
> # CONFIG_MMA7660 is not set
> CONFIG_MMA8452=m
> CONFIG_MMA9551_CORE=m
> CONFIG_MMA9551=m
> # CONFIG_MMA9553 is not set
> CONFIG_MXC4005=m
> # CONFIG_MXC6255 is not set
> CONFIG_STK8312=m
> # CONFIG_STK8BA50 is not set
>
> #
> # Analog to digital converters
> #
> CONFIG_AD7291=m
> CONFIG_AD799X=m
> CONFIG_AXP288_ADC=m
> CONFIG_CC10001_ADC=m
> # CONFIG_INA2XX_ADC is not set
> CONFIG_LTC2485=m
> CONFIG_MAX1363=m
> CONFIG_MCP3422=m
> CONFIG_MEN_Z188_ADC=m
> CONFIG_NAU7802=m
> # CONFIG_PALMAS_GPADC is not set
> CONFIG_STX104=m
> # CONFIG_TI_ADC081C is not set
> CONFIG_TI_ADS1015=m
> # CONFIG_TI_AM335X_ADC is not set
> CONFIG_VF610_ADC=m
> # CONFIG_VIPERBOARD_ADC is not set
>
> #
> # Amplifiers
> #
>
> #
> # Chemical Sensors
> #
> CONFIG_ATLAS_PH_SENSOR=m
> CONFIG_IAQCORE=m
> CONFIG_VZ89X=m
>
> #
> # Hid Sensor IIO Common
> #
> CONFIG_HID_SENSOR_IIO_COMMON=m
> CONFIG_HID_SENSOR_IIO_TRIGGER=m
> CONFIG_IIO_MS_SENSORS_I2C=m
>
> #
> # SSP Sensor Common
> #
> CONFIG_IIO_ST_SENSORS_I2C=m
> CONFIG_IIO_ST_SENSORS_CORE=m
>
> #
> # Digital to analog converters
> #
> CONFIG_AD5064=m
> CONFIG_AD5380=m
> CONFIG_AD5446=m
> CONFIG_AD5592R_BASE=m
> CONFIG_AD5593R=m
> # CONFIG_CIO_DAC is not set
> # CONFIG_M62332 is not set
> CONFIG_MAX517=m
> # CONFIG_MAX5821 is not set
> # CONFIG_MCP4725 is not set
> # CONFIG_VF610_DAC is not set
>
> #
> # IIO dummy driver
> #
> CONFIG_IIO_SIMPLE_DUMMY=m
> # CONFIG_IIO_SIMPLE_DUMMY_EVENTS is not set
> CONFIG_IIO_SIMPLE_DUMMY_BUFFER=y
>
> #
> # Frequency Synthesizers DDS/PLL
> #
>
> #
> # Clock Generator/Distribution
> #
>
> #
> # Phase-Locked Loop (PLL) frequency synthesizers
> #
>
> #
> # Digital gyroscope sensors
> #
> # CONFIG_BMG160 is not set
> CONFIG_HID_SENSOR_GYRO_3D=m
> CONFIG_IIO_ST_GYRO_3AXIS=m
> CONFIG_IIO_ST_GYRO_I2C_3AXIS=m
> CONFIG_ITG3200=m
>
> #
> # Health Sensors
> #
>
> #
> # Heart Rate Monitors
> #
> CONFIG_AFE4404=m
> # CONFIG_MAX30100 is not set
>
> #
> # Humidity sensors
> #
> CONFIG_AM2315=m
> # CONFIG_DHT11 is not set
> # CONFIG_HDC100X is not set
> CONFIG_HTU21=m
> # CONFIG_SI7005 is not set
> CONFIG_SI7020=m
>
> #
> # Inertial measurement units
> #
> CONFIG_BMI160=m
> CONFIG_BMI160_I2C=m
> # CONFIG_KMX61 is not set
> # CONFIG_INV_MPU6050_I2C is not set
>
> #
> # Light sensors
> #
> # CONFIG_ACPI_ALS is not set
> CONFIG_ADJD_S311=m
> CONFIG_AL3320A=m
> # CONFIG_APDS9300 is not set
> CONFIG_APDS9960=m
> CONFIG_BH1750=m
> # CONFIG_BH1780 is not set
> CONFIG_CM32181=m
> CONFIG_CM3232=m
> CONFIG_CM3323=m
> # CONFIG_CM36651 is not set
> CONFIG_GP2AP020A00F=m
> # CONFIG_ISL29125 is not set
> # CONFIG_HID_SENSOR_ALS is not set
> CONFIG_HID_SENSOR_PROX=m
> # CONFIG_JSA1212 is not set
> CONFIG_RPR0521=m
> CONFIG_SENSORS_LM3533=m
> CONFIG_LTR501=m
> CONFIG_MAX44000=m
> CONFIG_OPT3001=m
> # CONFIG_PA12203001 is not set
> CONFIG_SI1145=m
> CONFIG_STK3310=m
> # CONFIG_TCS3414 is not set
> # CONFIG_TCS3472 is not set
> # CONFIG_SENSORS_TSL2563 is not set
> # CONFIG_TSL4531 is not set
> CONFIG_US5182D=m
> # CONFIG_VCNL4000 is not set
> # CONFIG_VEML6070 is not set
>
> #
> # Magnetometer sensors
> #
> CONFIG_AK8974=m
> CONFIG_AK8975=m
> # CONFIG_AK09911 is not set
> CONFIG_BMC150_MAGN=m
> CONFIG_BMC150_MAGN_I2C=m
> CONFIG_MAG3110=m
> CONFIG_HID_SENSOR_MAGNETOMETER_3D=m
> CONFIG_MMC35240=m
> CONFIG_IIO_ST_MAGN_3AXIS=m
> CONFIG_IIO_ST_MAGN_I2C_3AXIS=m
> # CONFIG_SENSORS_HMC5843_I2C is not set
>
> #
> # Inclinometer sensors
> #
> CONFIG_HID_SENSOR_INCLINOMETER_3D=m
> CONFIG_HID_SENSOR_DEVICE_ROTATION=m
>
> #
> # Triggers - standalone
> #
> # CONFIG_IIO_HRTIMER_TRIGGER is not set
> CONFIG_IIO_INTERRUPT_TRIGGER=m
> # CONFIG_IIO_TIGHTLOOP_TRIGGER is not set
> CONFIG_IIO_SYSFS_TRIGGER=m
>
> #
> # Digital potentiometers
> #
> CONFIG_DS1803=m
> CONFIG_MCP4531=m
> CONFIG_TPL0102=m
>
> #
> # Pressure sensors
> #
> CONFIG_BMP280=m
> CONFIG_BMP280_I2C=m
> # CONFIG_HID_SENSOR_PRESS is not set
> CONFIG_HP03=m
> # CONFIG_MPL115_I2C is not set
> # CONFIG_MPL3115 is not set
> # CONFIG_MS5611 is not set
> # CONFIG_MS5637 is not set
> CONFIG_IIO_ST_PRESS=m
> CONFIG_IIO_ST_PRESS_I2C=m
> # CONFIG_T5403 is not set
> # CONFIG_HP206C is not set
> CONFIG_ZPA2326=m
> CONFIG_ZPA2326_I2C=m
>
> #
> # Lightning sensors
> #
>
> #
> # Proximity sensors
> #
> # CONFIG_LIDAR_LITE_V2 is not set
> CONFIG_SX9500=m
>
> #
> # Temperature sensors
> #
> CONFIG_MLX90614=m
> # CONFIG_TMP006 is not set
> # CONFIG_TSYS01 is not set
> CONFIG_TSYS02D=m
> CONFIG_NTB=y
> # CONFIG_NTB_PINGPONG is not set
> # CONFIG_NTB_TOOL is not set
> CONFIG_NTB_PERF=m
> CONFIG_NTB_TRANSPORT=y
> # CONFIG_VME_BUS is not set
> CONFIG_PWM=y
> CONFIG_PWM_SYSFS=y
> # CONFIG_PWM_ATMEL_HLCDC_PWM is not set
> CONFIG_PWM_CROS_EC=m
> # CONFIG_PWM_FSL_FTM is not set
> CONFIG_PWM_LP3943=m
> CONFIG_PWM_LPSS=y
> CONFIG_PWM_LPSS_PCI=y
> CONFIG_PWM_LPSS_PLATFORM=m
> CONFIG_PWM_PCA9685=y
> # CONFIG_PWM_STMPE is not set
> CONFIG_IRQCHIP=y
> CONFIG_ARM_GIC_MAX_NR=1
> # CONFIG_IPACK_BUS is not set
> CONFIG_RESET_CONTROLLER=y
> # CONFIG_RESET_ATH79 is not set
> # CONFIG_RESET_BERLIN is not set
> # CONFIG_RESET_LPC18XX is not set
> # CONFIG_RESET_MESON is not set
> # CONFIG_RESET_PISTACHIO is not set
> # CONFIG_RESET_SOCFPGA is not set
> # CONFIG_RESET_STM32 is not set
> # CONFIG_RESET_SUNXI is not set
> # CONFIG_TI_SYSCON_RESET is not set
> # CONFIG_RESET_ZYNQ is not set
> # CONFIG_FMC is not set
>
> #
> # PHY Subsystem
> #
> CONFIG_GENERIC_PHY=y
> CONFIG_PHY_PXA_28NM_HSIC=y
> CONFIG_PHY_PXA_28NM_USB2=m
> CONFIG_BCM_KONA_USB2_PHY=y
> # CONFIG_PHY_SAMSUNG_USB2 is not set
> CONFIG_POWERCAP=y
> CONFIG_INTEL_RAPL=m
> CONFIG_MCB=m
> CONFIG_MCB_PCI=m
> # CONFIG_MCB_LPC is not set
>
> #
> # Performance monitor support
> #
> CONFIG_RAS=y
> CONFIG_THUNDERBOLT=y
>
> #
> # Android
> #
> # CONFIG_ANDROID is not set
> CONFIG_NVMEM=m
> CONFIG_STM=y
> # CONFIG_STM_DUMMY is not set
> # CONFIG_STM_SOURCE_CONSOLE is not set
> # CONFIG_STM_SOURCE_HEARTBEAT is not set
> CONFIG_INTEL_TH=y
> CONFIG_INTEL_TH_PCI=m
> # CONFIG_INTEL_TH_GTH is not set
> CONFIG_INTEL_TH_STH=m
> # CONFIG_INTEL_TH_MSU is not set
> CONFIG_INTEL_TH_PTI=m
> # CONFIG_INTEL_TH_DEBUG is not set
>
> #
> # FPGA Configuration Support
> #
> CONFIG_FPGA=m
>
> #
> # Firmware Drivers
> #
> # CONFIG_ARM_SCPI_PROTOCOL is not set
> CONFIG_EDD=m
> # CONFIG_EDD_OFF is not set
> # CONFIG_FIRMWARE_MEMMAP is not set
> # CONFIG_DELL_RBU is not set
> CONFIG_DCDBAS=y
> # CONFIG_DMIID is not set
> CONFIG_DMI_SYSFS=m
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
> # CONFIG_ISCSI_IBFT_FIND is not set
> # CONFIG_FW_CFG_SYSFS is not set
> # CONFIG_GOOGLE_FIRMWARE is not set
>
> #
> # File systems
> #
> CONFIG_DCACHE_WORD_ACCESS=y
> # CONFIG_FS_POSIX_ACL is not set
> CONFIG_EXPORTFS=y
> CONFIG_EXPORTFS_BLOCK_OPS=y
> CONFIG_FILE_LOCKING=y
> CONFIG_MANDATORY_FILE_LOCKING=y
> CONFIG_FSNOTIFY=y
> # CONFIG_DNOTIFY is not set
> CONFIG_INOTIFY_USER=y
> # CONFIG_FANOTIFY is not set
> # CONFIG_QUOTA is not set
> # CONFIG_QUOTACTL is not set
> CONFIG_AUTOFS4_FS=m
> # CONFIG_FUSE_FS is not set
> CONFIG_OVERLAY_FS=m
>
> #
> # Caches
> #
> # CONFIG_FSCACHE is not set
>
> #
> # Pseudo filesystems
> #
> CONFIG_PROC_FS=y
> # CONFIG_PROC_KCORE is not set
> CONFIG_PROC_SYSCTL=y
> CONFIG_PROC_PAGE_MONITOR=y
> CONFIG_PROC_CHILDREN=y
> CONFIG_KERNFS=y
> CONFIG_SYSFS=y
> CONFIG_HUGETLBFS=y
> CONFIG_HUGETLB_PAGE=y
> CONFIG_CONFIGFS_FS=y
> # CONFIG_MISC_FILESYSTEMS is not set
> CONFIG_NETWORK_FILESYSTEMS=y
> CONFIG_NLS=y
> CONFIG_NLS_DEFAULT="iso8859-1"
> # CONFIG_NLS_CODEPAGE_437 is not set
> # CONFIG_NLS_CODEPAGE_737 is not set
> CONFIG_NLS_CODEPAGE_775=y
> CONFIG_NLS_CODEPAGE_850=m
> # CONFIG_NLS_CODEPAGE_852 is not set
> CONFIG_NLS_CODEPAGE_855=y
> CONFIG_NLS_CODEPAGE_857=y
> CONFIG_NLS_CODEPAGE_860=y
> CONFIG_NLS_CODEPAGE_861=m
> # CONFIG_NLS_CODEPAGE_862 is not set
> CONFIG_NLS_CODEPAGE_863=m
> # CONFIG_NLS_CODEPAGE_864 is not set
> # CONFIG_NLS_CODEPAGE_865 is not set
> CONFIG_NLS_CODEPAGE_866=y
> CONFIG_NLS_CODEPAGE_869=y
> CONFIG_NLS_CODEPAGE_936=y
> CONFIG_NLS_CODEPAGE_950=m
> CONFIG_NLS_CODEPAGE_932=m
> CONFIG_NLS_CODEPAGE_949=y
> CONFIG_NLS_CODEPAGE_874=m
> # CONFIG_NLS_ISO8859_8 is not set
> CONFIG_NLS_CODEPAGE_1250=y
> CONFIG_NLS_CODEPAGE_1251=y
> CONFIG_NLS_ASCII=y
> CONFIG_NLS_ISO8859_1=y
> # CONFIG_NLS_ISO8859_2 is not set
> CONFIG_NLS_ISO8859_3=y
> # CONFIG_NLS_ISO8859_4 is not set
> CONFIG_NLS_ISO8859_5=m
> # CONFIG_NLS_ISO8859_6 is not set
> CONFIG_NLS_ISO8859_7=y
> CONFIG_NLS_ISO8859_9=m
> CONFIG_NLS_ISO8859_13=m
> CONFIG_NLS_ISO8859_14=y
> CONFIG_NLS_ISO8859_15=y
> CONFIG_NLS_KOI8_R=m
> CONFIG_NLS_KOI8_U=y
> CONFIG_NLS_MAC_ROMAN=y
> # CONFIG_NLS_MAC_CELTIC is not set
> CONFIG_NLS_MAC_CENTEURO=m
> CONFIG_NLS_MAC_CROATIAN=m
> CONFIG_NLS_MAC_CYRILLIC=y
> # CONFIG_NLS_MAC_GAELIC is not set
> CONFIG_NLS_MAC_GREEK=y
> CONFIG_NLS_MAC_ICELAND=m
> CONFIG_NLS_MAC_INUIT=m
> CONFIG_NLS_MAC_ROMANIAN=m
> # CONFIG_NLS_MAC_TURKISH is not set
> # CONFIG_NLS_UTF8 is not set
>
> #
> # Kernel hacking
> #
> CONFIG_TRACE_IRQFLAGS_SUPPORT=y
>
> #
> # printk and dmesg options
> #
> CONFIG_PRINTK_TIME=y
> CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
> CONFIG_BOOT_PRINTK_DELAY=y
> CONFIG_DYNAMIC_DEBUG=y
>
> #
> # Compile-time checks and compiler options
> #
> # CONFIG_DEBUG_INFO is not set
> CONFIG_ENABLE_WARN_DEPRECATED=y
> CONFIG_ENABLE_MUST_CHECK=y
> CONFIG_FRAME_WARN=2048
> # CONFIG_STRIP_ASM_SYMS is not set
> CONFIG_READABLE_ASM=y
> CONFIG_UNUSED_SYMBOLS=y
> CONFIG_PAGE_OWNER=y
> CONFIG_DEBUG_FS=y
> CONFIG_HEADERS_CHECK=y
> # CONFIG_DEBUG_SECTION_MISMATCH is not set
> CONFIG_SECTION_MISMATCH_WARN_ONLY=y
> CONFIG_ARCH_WANT_FRAME_POINTERS=y
> CONFIG_FRAME_POINTER=y
> # CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
> CONFIG_MAGIC_SYSRQ=y
> CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
> CONFIG_DEBUG_KERNEL=y
>
> #
> # Memory Debugging
> #
> CONFIG_PAGE_EXTENSION=y
> CONFIG_DEBUG_PAGEALLOC=y
> CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
> CONFIG_PAGE_POISONING=y
> CONFIG_PAGE_POISONING_NO_SANITY=y
> CONFIG_PAGE_POISONING_ZERO=y
> # CONFIG_DEBUG_PAGE_REF is not set
> CONFIG_DEBUG_OBJECTS=y
> # CONFIG_DEBUG_OBJECTS_SELFTEST is not set
> # CONFIG_DEBUG_OBJECTS_FREE is not set
> # CONFIG_DEBUG_OBJECTS_TIMERS is not set
> CONFIG_DEBUG_OBJECTS_WORK=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
> CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
> # CONFIG_SLUB_STATS is not set
> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> # CONFIG_DEBUG_KMEMLEAK is not set
> # CONFIG_DEBUG_STACK_USAGE is not set
> CONFIG_DEBUG_VM=y
> # CONFIG_DEBUG_VM_VMACACHE is not set
> CONFIG_DEBUG_VM_RB=y
> CONFIG_DEBUG_VM_PGFLAGS=y
> CONFIG_DEBUG_VIRTUAL=y
> CONFIG_DEBUG_MEMORY_INIT=y
> # CONFIG_DEBUG_HIGHMEM is not set
> CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
> CONFIG_DEBUG_STACKOVERFLOW=y
> CONFIG_HAVE_ARCH_KMEMCHECK=y
> # CONFIG_DEBUG_SHIRQ is not set
>
> #
> # Debug Lockups and Hangs
> #
> # CONFIG_LOCKUP_DETECTOR is not set
> CONFIG_DETECT_HUNG_TASK=y
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
> CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
> CONFIG_WQ_WATCHDOG=y
> CONFIG_PANIC_ON_OOPS=y
> CONFIG_PANIC_ON_OOPS_VALUE=1
> CONFIG_PANIC_TIMEOUT=0
> CONFIG_SCHED_DEBUG=y
> # CONFIG_SCHED_INFO is not set
> # CONFIG_SCHEDSTATS is not set
> CONFIG_SCHED_STACK_END_CHECK=y
> CONFIG_DEBUG_TIMEKEEPING=y
> # CONFIG_TIMER_STATS is not set
> # CONFIG_DEBUG_PREEMPT is not set
>
> #
> # Lock Debugging (spinlocks, mutexes, etc...)
> #
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_MUTEXES=y
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
> CONFIG_DEBUG_LOCK_ALLOC=y
> CONFIG_PROVE_LOCKING=y
> CONFIG_LOCKDEP=y
> # CONFIG_LOCK_STAT is not set
> # CONFIG_DEBUG_LOCKDEP is not set
> CONFIG_DEBUG_ATOMIC_SLEEP=y
> CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
> # CONFIG_LOCK_TORTURE_TEST is not set
> CONFIG_TRACE_IRQFLAGS=y
> CONFIG_STACKTRACE=y
> # CONFIG_DEBUG_KOBJECT is not set
> CONFIG_DEBUG_BUGVERBOSE=y
> CONFIG_DEBUG_LIST=y
> CONFIG_DEBUG_PI_LIST=y
> CONFIG_DEBUG_SG=y
> CONFIG_DEBUG_NOTIFIERS=y
> # CONFIG_DEBUG_CREDENTIALS is not set
>
> #
> # RCU Debugging
> #
> CONFIG_PROVE_RCU=y
> # CONFIG_PROVE_RCU_REPEATEDLY is not set
> # CONFIG_SPARSE_RCU_POINTER is not set
> # CONFIG_TORTURE_TEST is not set
> # CONFIG_RCU_PERF_TEST is not set
> # CONFIG_RCU_TORTURE_TEST is not set
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_TRACE=y
> CONFIG_RCU_EQS_DEBUG=y
> CONFIG_DEBUG_WQ_FORCE_RR_CPU=y
> # CONFIG_NOTIFIER_ERROR_INJECTION is not set
> CONFIG_FAULT_INJECTION=y
> # CONFIG_FAILSLAB is not set
> CONFIG_FAIL_PAGE_ALLOC=y
> # CONFIG_FAIL_FUTEX is not set
> CONFIG_FAULT_INJECTION_DEBUG_FS=y
> CONFIG_FAULT_INJECTION_STACKTRACE_FILTER=y
> # CONFIG_LATENCYTOP is not set
> CONFIG_USER_STACKTRACE_SUPPORT=y
> CONFIG_NOP_TRACER=y
> CONFIG_HAVE_FUNCTION_TRACER=y
> CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
> CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
> CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
> CONFIG_HAVE_C_RECORDMCOUNT=y
> CONFIG_TRACER_MAX_TRACE=y
> CONFIG_TRACE_CLOCK=y
> CONFIG_RING_BUFFER=y
> CONFIG_EVENT_TRACING=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_RING_BUFFER_ALLOW_SWAP=y
> CONFIG_TRACING=y
> CONFIG_GENERIC_TRACER=y
> CONFIG_TRACING_SUPPORT=y
> CONFIG_FTRACE=y
> CONFIG_FUNCTION_TRACER=y
> # CONFIG_IRQSOFF_TRACER is not set
> CONFIG_PREEMPT_TRACER=y
> # CONFIG_SCHED_TRACER is not set
> # CONFIG_HWLAT_TRACER is not set
> # CONFIG_FTRACE_SYSCALLS is not set
> CONFIG_TRACER_SNAPSHOT=y
> CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
> CONFIG_TRACE_BRANCH_PROFILING=y
> # CONFIG_BRANCH_PROFILE_NONE is not set
> # CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
> CONFIG_PROFILE_ALL_BRANCHES=y
> CONFIG_TRACING_BRANCHES=y
> CONFIG_BRANCH_TRACER=y
> CONFIG_STACK_TRACER=y
> # CONFIG_UPROBE_EVENT is not set
> # CONFIG_PROBE_EVENTS is not set
> # CONFIG_DYNAMIC_FTRACE is not set
> CONFIG_FUNCTION_PROFILER=y
> # CONFIG_FTRACE_STARTUP_TEST is not set
> # CONFIG_MMIOTRACE is not set
> # CONFIG_HIST_TRIGGERS is not set
> CONFIG_TRACEPOINT_BENCHMARK=y
> CONFIG_RING_BUFFER_BENCHMARK=y
> # CONFIG_RING_BUFFER_STARTUP_TEST is not set
> CONFIG_TRACE_ENUM_MAP_FILE=y
> CONFIG_TRACING_EVENTS_GPIO=y
>
> #
> # Runtime Testing
> #
> CONFIG_TEST_LIST_SORT=y
> # CONFIG_BACKTRACE_SELF_TEST is not set
> CONFIG_RBTREE_TEST=m
> CONFIG_INTERVAL_TREE_TEST=m
> CONFIG_PERCPU_TEST=m
> # CONFIG_ATOMIC64_SELFTEST is not set
> CONFIG_TEST_HEXDUMP=m
> CONFIG_TEST_STRING_HELPERS=y
> CONFIG_TEST_KSTRTOX=m
> CONFIG_TEST_PRINTF=m
> CONFIG_TEST_BITMAP=y
> # CONFIG_TEST_UUID is not set
> CONFIG_TEST_RHASHTABLE=y
> # CONFIG_TEST_HASH is not set
> # CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
> # CONFIG_DMA_API_DEBUG is not set
> CONFIG_TEST_LKM=m
> # CONFIG_TEST_USER_COPY is not set
> # CONFIG_TEST_BPF is not set
> # CONFIG_TEST_FIRMWARE is not set
> CONFIG_TEST_UDELAY=y
> # CONFIG_MEMTEST is not set
> CONFIG_TEST_STATIC_KEYS=m
> # CONFIG_SAMPLES is not set
> CONFIG_HAVE_ARCH_KGDB=y
> # CONFIG_KGDB is not set
> CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
> # CONFIG_ARCH_WANTS_UBSAN_NO_NULL is not set
> CONFIG_UBSAN=y
> # CONFIG_UBSAN_SANITIZE_ALL is not set
> # CONFIG_UBSAN_ALIGNMENT is not set
> CONFIG_UBSAN_NULL=y
> CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
> # CONFIG_STRICT_DEVMEM is not set
> CONFIG_X86_VERBOSE_BOOTUP=y
> CONFIG_EARLY_PRINTK=y
> # CONFIG_EARLY_PRINTK_DBGP is not set
> CONFIG_X86_PTDUMP_CORE=y
> CONFIG_X86_PTDUMP=y
> # CONFIG_DEBUG_RODATA_TEST is not set
> # CONFIG_DEBUG_WX is not set
> # CONFIG_DEBUG_SET_MODULE_RONX is not set
> # CONFIG_DEBUG_NX_TEST is not set
> # CONFIG_DOUBLEFAULT is not set
> # CONFIG_DEBUG_TLBFLUSH is not set
> # CONFIG_IOMMU_STRESS is not set
> CONFIG_HAVE_MMIOTRACE_SUPPORT=y
> CONFIG_IO_DELAY_TYPE_0X80=0
> CONFIG_IO_DELAY_TYPE_0XED=1
> CONFIG_IO_DELAY_TYPE_UDELAY=2
> CONFIG_IO_DELAY_TYPE_NONE=3
> # CONFIG_IO_DELAY_0X80 is not set
> CONFIG_IO_DELAY_0XED=y
> # CONFIG_IO_DELAY_UDELAY is not set
> # CONFIG_IO_DELAY_NONE is not set
> CONFIG_DEFAULT_IO_DELAY_TYPE=1
> CONFIG_DEBUG_BOOT_PARAMS=y
> # CONFIG_CPA_DEBUG is not set
> CONFIG_OPTIMIZE_INLINING=y
> # CONFIG_DEBUG_ENTRY is not set
> CONFIG_DEBUG_NMI_SELFTEST=y
> CONFIG_X86_DEBUG_FPU=y
> CONFIG_PUNIT_ATOM_DEBUG=y
>
> #
> # Security options
> #
> CONFIG_KEYS=y
> CONFIG_PERSISTENT_KEYRINGS=y
> # CONFIG_TRUSTED_KEYS is not set
> CONFIG_ENCRYPTED_KEYS=m
> CONFIG_KEY_DH_OPERATIONS=y
> CONFIG_SECURITY_DMESG_RESTRICT=y
> # CONFIG_SECURITY is not set
> CONFIG_SECURITYFS=y
> CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
> CONFIG_HAVE_ARCH_HARDENED_USERCOPY=y
> # CONFIG_HARDENED_USERCOPY is not set
> CONFIG_DEFAULT_SECURITY_DAC=y
> CONFIG_DEFAULT_SECURITY=""
> CONFIG_CRYPTO=y
>
> #
> # Crypto core or helper
> #
> CONFIG_CRYPTO_ALGAPI=y
> CONFIG_CRYPTO_ALGAPI2=y
> CONFIG_CRYPTO_AEAD=y
> CONFIG_CRYPTO_AEAD2=y
> CONFIG_CRYPTO_BLKCIPHER=y
> CONFIG_CRYPTO_BLKCIPHER2=y
> CONFIG_CRYPTO_HASH=y
> CONFIG_CRYPTO_HASH2=y
> CONFIG_CRYPTO_RNG=y
> CONFIG_CRYPTO_RNG2=y
> CONFIG_CRYPTO_RNG_DEFAULT=y
> CONFIG_CRYPTO_AKCIPHER2=y
> CONFIG_CRYPTO_AKCIPHER=y
> CONFIG_CRYPTO_KPP2=y
> CONFIG_CRYPTO_KPP=y
> CONFIG_CRYPTO_RSA=y
> CONFIG_CRYPTO_DH=y
> CONFIG_CRYPTO_ECDH=m
> CONFIG_CRYPTO_MANAGER=y
> CONFIG_CRYPTO_MANAGER2=y
> # CONFIG_CRYPTO_USER is not set
> CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
> CONFIG_CRYPTO_GF128MUL=y
> CONFIG_CRYPTO_NULL=y
> CONFIG_CRYPTO_NULL2=y
> CONFIG_CRYPTO_WORKQUEUE=y
> CONFIG_CRYPTO_CRYPTD=y
> CONFIG_CRYPTO_MCRYPTD=m
> CONFIG_CRYPTO_AUTHENC=y
> # CONFIG_CRYPTO_TEST is not set
> CONFIG_CRYPTO_ABLK_HELPER=m
> CONFIG_CRYPTO_GLUE_HELPER_X86=m
>
> #
> # Authenticated Encryption with Associated Data
> #
> CONFIG_CRYPTO_CCM=y
> CONFIG_CRYPTO_GCM=m
> CONFIG_CRYPTO_CHACHA20POLY1305=m
> CONFIG_CRYPTO_SEQIV=y
> CONFIG_CRYPTO_ECHAINIV=y
>
> #
> # Block modes
> #
> CONFIG_CRYPTO_CBC=m
> CONFIG_CRYPTO_CTR=y
> CONFIG_CRYPTO_CTS=y
> CONFIG_CRYPTO_ECB=y
> CONFIG_CRYPTO_LRW=y
> # CONFIG_CRYPTO_PCBC is not set
> CONFIG_CRYPTO_XTS=y
> CONFIG_CRYPTO_KEYWRAP=y
>
> #
> # Hash modes
> #
> CONFIG_CRYPTO_CMAC=y
> CONFIG_CRYPTO_HMAC=y
> # CONFIG_CRYPTO_XCBC is not set
> # CONFIG_CRYPTO_VMAC is not set
>
> #
> # Digest
> #
> CONFIG_CRYPTO_CRC32C=m
> # CONFIG_CRYPTO_CRC32C_INTEL is not set
> CONFIG_CRYPTO_CRC32=m
> # CONFIG_CRYPTO_CRC32_PCLMUL is not set
> CONFIG_CRYPTO_CRCT10DIF=m
> CONFIG_CRYPTO_GHASH=y
> CONFIG_CRYPTO_POLY1305=m
> CONFIG_CRYPTO_MD4=y
> # CONFIG_CRYPTO_MD5 is not set
> CONFIG_CRYPTO_MICHAEL_MIC=m
> # CONFIG_CRYPTO_RMD128 is not set
> CONFIG_CRYPTO_RMD160=y
> CONFIG_CRYPTO_RMD256=m
> CONFIG_CRYPTO_RMD320=y
> CONFIG_CRYPTO_SHA1=y
> CONFIG_CRYPTO_SHA256=y
> CONFIG_CRYPTO_SHA512=y
> CONFIG_CRYPTO_SHA3=m
> # CONFIG_CRYPTO_TGR192 is not set
> CONFIG_CRYPTO_WP512=m
>
> #
> # Ciphers
> #
> CONFIG_CRYPTO_AES=y
> CONFIG_CRYPTO_AES_586=m
> CONFIG_CRYPTO_AES_NI_INTEL=m
> CONFIG_CRYPTO_ANUBIS=m
> CONFIG_CRYPTO_ARC4=m
> # CONFIG_CRYPTO_BLOWFISH is not set
> CONFIG_CRYPTO_CAMELLIA=m
> CONFIG_CRYPTO_CAST_COMMON=y
> CONFIG_CRYPTO_CAST5=y
> CONFIG_CRYPTO_CAST6=m
> # CONFIG_CRYPTO_DES is not set
> CONFIG_CRYPTO_FCRYPT=m
> # CONFIG_CRYPTO_KHAZAD is not set
> CONFIG_CRYPTO_SALSA20=y
> # CONFIG_CRYPTO_SALSA20_586 is not set
> CONFIG_CRYPTO_CHACHA20=y
> CONFIG_CRYPTO_SEED=y
> CONFIG_CRYPTO_SERPENT=m
> CONFIG_CRYPTO_SERPENT_SSE2_586=m
> CONFIG_CRYPTO_TEA=m
> CONFIG_CRYPTO_TWOFISH=y
> CONFIG_CRYPTO_TWOFISH_COMMON=y
> CONFIG_CRYPTO_TWOFISH_586=y
>
> #
> # Compression
> #
> # CONFIG_CRYPTO_DEFLATE is not set
> # CONFIG_CRYPTO_LZO is not set
> # CONFIG_CRYPTO_842 is not set
> # CONFIG_CRYPTO_LZ4 is not set
> CONFIG_CRYPTO_LZ4HC=m
>
> #
> # Random Number Generation
> #
> # CONFIG_CRYPTO_ANSI_CPRNG is not set
> CONFIG_CRYPTO_DRBG_MENU=y
> CONFIG_CRYPTO_DRBG_HMAC=y
> # CONFIG_CRYPTO_DRBG_HASH is not set
> CONFIG_CRYPTO_DRBG_CTR=y
> CONFIG_CRYPTO_DRBG=y
> CONFIG_CRYPTO_JITTERENTROPY=y
> # CONFIG_CRYPTO_USER_API_HASH is not set
> # CONFIG_CRYPTO_USER_API_SKCIPHER is not set
> # CONFIG_CRYPTO_USER_API_RNG is not set
> # CONFIG_CRYPTO_USER_API_AEAD is not set
> CONFIG_CRYPTO_HASH_INFO=y
> CONFIG_CRYPTO_HW=y
> CONFIG_CRYPTO_DEV_PADLOCK=m
> CONFIG_CRYPTO_DEV_PADLOCK_AES=m
> CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
> # CONFIG_CRYPTO_DEV_GEODE is not set
> CONFIG_CRYPTO_DEV_CCP=y
> CONFIG_CRYPTO_DEV_CCP_DD=m
> CONFIG_CRYPTO_DEV_CCP_CRYPTO=m
> CONFIG_CRYPTO_DEV_QAT=y
> CONFIG_CRYPTO_DEV_QAT_DH895xCC=m
> CONFIG_CRYPTO_DEV_QAT_C3XXX=y
> CONFIG_CRYPTO_DEV_QAT_C62X=m
> CONFIG_CRYPTO_DEV_QAT_DH895xCCVF=m
> CONFIG_CRYPTO_DEV_QAT_C3XXXVF=y
> CONFIG_CRYPTO_DEV_QAT_C62XVF=m
> CONFIG_ASYMMETRIC_KEY_TYPE=y
> CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
> CONFIG_X509_CERTIFICATE_PARSER=y
> CONFIG_PKCS7_MESSAGE_PARSER=y
>
> #
> # Certificates for signature checking
> #
> CONFIG_SYSTEM_TRUSTED_KEYRING=y
> CONFIG_SYSTEM_TRUSTED_KEYS=""
> # CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
> # CONFIG_SECONDARY_TRUSTED_KEYRING is not set
> CONFIG_HAVE_KVM=y
> CONFIG_VIRTUALIZATION=y
> # CONFIG_KVM is not set
> # CONFIG_VHOST_NET is not set
> CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y
> CONFIG_BINARY_PRINTF=y
>
> #
> # Library routines
> #
> CONFIG_BITREVERSE=y
> # CONFIG_HAVE_ARCH_BITREVERSE is not set
> CONFIG_RATIONAL=y
> CONFIG_GENERIC_STRNCPY_FROM_USER=y
> CONFIG_GENERIC_STRNLEN_USER=y
> CONFIG_GENERIC_NET_UTILS=y
> CONFIG_GENERIC_FIND_FIRST_BIT=y
> CONFIG_GENERIC_PCI_IOMAP=y
> CONFIG_GENERIC_IOMAP=y
> CONFIG_GENERIC_IO=y
> CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
> CONFIG_CRC_CCITT=m
> CONFIG_CRC16=y
> CONFIG_CRC_T10DIF=m
> CONFIG_CRC_ITU_T=y
> CONFIG_CRC32=y
> # CONFIG_CRC32_SELFTEST is not set
> # CONFIG_CRC32_SLICEBY8 is not set
> # CONFIG_CRC32_SLICEBY4 is not set
> CONFIG_CRC32_SARWATE=y
> # CONFIG_CRC32_BIT is not set
> # CONFIG_CRC7 is not set
> CONFIG_LIBCRC32C=m
> CONFIG_CRC8=m
> # CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
> # CONFIG_RANDOM32_SELFTEST is not set
> CONFIG_ZLIB_INFLATE=y
> CONFIG_LZO_COMPRESS=y
> CONFIG_LZO_DECOMPRESS=y
> CONFIG_LZ4HC_COMPRESS=m
> CONFIG_LZ4_DECOMPRESS=m
> CONFIG_XZ_DEC=y
> CONFIG_XZ_DEC_X86=y
> # CONFIG_XZ_DEC_POWERPC is not set
> CONFIG_XZ_DEC_IA64=y
> # CONFIG_XZ_DEC_ARM is not set
> # CONFIG_XZ_DEC_ARMTHUMB is not set
> # CONFIG_XZ_DEC_SPARC is not set
> CONFIG_XZ_DEC_BCJ=y
> CONFIG_XZ_DEC_TEST=m
> CONFIG_DECOMPRESS_GZIP=y
> CONFIG_DECOMPRESS_XZ=y
> CONFIG_GENERIC_ALLOCATOR=y
> CONFIG_REED_SOLOMON=m
> CONFIG_REED_SOLOMON_DEC16=y
> CONFIG_BCH=m
> CONFIG_BCH_CONST_PARAMS=y
> CONFIG_INTERVAL_TREE=y
> CONFIG_ASSOCIATIVE_ARRAY=y
> CONFIG_HAS_IOMEM=y
> CONFIG_HAS_IOPORT_MAP=y
> CONFIG_HAS_DMA=y
> CONFIG_CHECK_SIGNATURE=y
> CONFIG_DQL=y
> CONFIG_NLATTR=y
> CONFIG_CLZ_TAB=y
> CONFIG_CORDIC=m
> CONFIG_DDR=y
> # CONFIG_IRQ_POLL is not set
> CONFIG_MPILIB=y
> CONFIG_LIBFDT=y
> CONFIG_OID_REGISTRY=y
> # CONFIG_SG_SPLIT is not set
> # CONFIG_SG_POOL is not set
> CONFIG_ARCH_HAS_SG_CHAIN=y
> CONFIG_ARCH_HAS_MMIO_FLUSH=y
> CONFIG_STACKDEPOT=y
> #!/bin/sh
>
> export_top_env()
> {
> export suite='trinity'
> export testcase='trinity'
> export runtime=300
> export job_origin='/lkp/lkp/src/allot/rand/vm-lkp-nex04-yocto-i386/trinity.yaml'
> export testbox='vm-lkp-nex04-yocto-i386-23'
> export tbox_group='vm-lkp-nex04-yocto-i386'
> export kconfig='i386-randconfig-b0-04201946'
> export compiler='gcc-5'
> export queue='bisect'
> export branch='linux-devel/devel-catchup-201704202148'
> export commit='73821bb516920b2b38732ce992d11c08c5d8bd7d'
> export submit_id='58f9191e0b9a934ff82fe4b8'
> export job_file='/lkp/scheduled/vm-lkp-nex04-yocto-i386-23/trinity-300s-yocto-tiny-i386-2016-04-22.cgz-73821bb516920b2b38732ce992d11c08c5d8bd7d-20170421-20472-n2i6sw-0.yaml'
> export id='f192123d8b39f1000d6a68a85801bd9ef030bed4'
> export model='qemu-system-i386 -enable-kvm'
> export nr_vm=32
> export nr_cpu=2
> export memory='320M'
> export rootfs='yocto-tiny-i386-2016-04-22.cgz'
> export swap_partitions='/dev/vda'
> export need_kconfig='CONFIG_KVM_GUEST=y'
> export enqueue_time='2017-04-21 04:25:02 +0800'
> export _id='58f9191e0b9a934ff82fe4b8'
> export _rt='/result/trinity/300s/vm-lkp-nex04-yocto-i386/yocto-tiny-i386-2016-04-22.cgz/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d'
> export user='lkp'
> export result_root='/result/trinity/300s/vm-lkp-nex04-yocto-i386/yocto-tiny-i386-2016-04-22.cgz/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d/0'
> export LKP_SERVER='inn'
> export max_uptime=1500
> export initrd='/osimage/yocto/yocto-tiny-i386-2016-04-22.cgz'
> export bootloader_append='root=/dev/ram0
> user=lkp
> job=/lkp/scheduled/vm-lkp-nex04-yocto-i386-23/trinity-300s-yocto-tiny-i386-2016-04-22.cgz-73821bb516920b2b38732ce992d11c08c5d8bd7d-20170421-20472-n2i6sw-0.yaml
> ARCH=i386
> kconfig=i386-randconfig-b0-04201946
> branch=linux-devel/devel-catchup-201704202148
> commit=73821bb516920b2b38732ce992d11c08c5d8bd7d
> BOOT_IMAGE=/pkg/linux/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d/vmlinuz-4.9.0-rc8-00001-g73821bb
> max_uptime=1500
> RESULT_ROOT=/result/trinity/300s/vm-lkp-nex04-yocto-i386/yocto-tiny-i386-2016-04-22.cgz/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d/0
> LKP_SERVER=inn
> debug
> apic=debug
> sysrq_always_enabled
> rcupdate.rcu_cpu_stall_timeout=100
> net.ifnames=0
> printk.devkmsg=on
> panic=-1
> softlockup_panic=1
> nmi_watchdog=panic
> oops=panic
> load_ramdisk=2
> prompt_ramdisk=0
> drbd.minor_count=8
> systemd.log_level=err
> ignore_loglevel
> earlyprintk=ttyS0,115200
> console=ttyS0,115200
> console=tty0
> vga=normal
> rw'
> export lkp_initrd='/lkp/lkp/lkp-i386.cgz'
> export modules_initrd='/pkg/linux/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d/modules.cgz'
> export bm_initrd='/osimage/deps/debian-x86_64-2016-08-31.cgz/run-ipconfig.i386_2016-09-03.cgz,/osimage/pkg/static/trinity-i386.cgz'
> export site='inn'
> export LKP_CGI_PORT=80
> export LKP_CIFS_PORT=139
> export kernel='/pkg/linux/i386-randconfig-b0-04201946/gcc-5/73821bb516920b2b38732ce992d11c08c5d8bd7d/vmlinuz-4.9.0-rc8-00001-g73821bb'
> export dequeue_time='2017-04-21 05:08:18 +0800'
> export job_initrd='/lkp/scheduled/vm-lkp-nex04-yocto-i386-23/trinity-300s-yocto-tiny-i386-2016-04-22.cgz-73821bb516920b2b38732ce992d11c08c5d8bd7d-20170421-20472-n2i6sw-0.cgz'
>
> [ -n "$LKP_SRC" ] ||
> export LKP_SRC=/lkp/${user:-lkp}/src
> }
>
> run_job()
> {
> echo $$ > $TMP/run-job.pid
>
> . $LKP_SRC/lib/http.sh
> . $LKP_SRC/lib/job.sh
> . $LKP_SRC/lib/env.sh
>
> export_top_env
>
> run_monitor $LKP_SRC/monitors/wrapper kmsg
> run_monitor $LKP_SRC/monitors/wrapper oom-killer
> run_monitor $LKP_SRC/monitors/plain/watchdog
> run_monitor $LKP_SRC/monitors/wrapper nfs-hang
>
> run_test $LKP_SRC/tests/wrapper trinity
> }
>
> extract_stats()
> {
> $LKP_SRC/stats/wrapper kmsg
>
> $LKP_SRC/stats/wrapper time trinity.time
> $LKP_SRC/stats/wrapper time
> $LKP_SRC/stats/wrapper dmesg
> $LKP_SRC/stats/wrapper kmsg
> $LKP_SRC/stats/wrapper stderr
> $LKP_SRC/stats/wrapper last_state
> }
>
> "$@"
--
Michal Hocko
SUSE Labs
On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > [...]
> > > > > > Which pfn walkers you have in mind?
> > > > >
> > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > using pfn_valid().
> > > >
> > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > case where you do not really care about holes. It just checks the page
> > > > count which is a valid information under any circumstances.
> > >
> > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > even if PageReserved() is set?
> >
> > I do not know about any user which would manipulate page map count for
> > referenced pages. The core MM code doesn't.
>
> That's weird that we can get *map* count without PageReserved() check,
> but we cannot get zone information.
> Zone information is more static information than map count.
As I've already pointed out, the rework of the hotplug code is mainly
about postponing the zone initialization from the physical hot add to
the logical onlining. The zone is really not clear until that moment.
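
To illustrate what that means for callers (a hand-written sketch only,
not code from the series): a pfn walker which needs the zone is expected
to use the pfn_to_online_page() helper added by this rework instead of a
bare pfn_valid()/pfn_to_page() pair:

	/*
	 * Illustrative walker, not a real kernel function; needs
	 * linux/memory_hotplug.h for pfn_to_online_page(). pfn_valid()
	 * only says a struct page exists; pfn_to_online_page() also
	 * guarantees the page has been onlined, so its zone is stable.
	 */
	static void walk_pfn_range(unsigned long start_pfn, unsigned long end_pfn)
	{
		unsigned long pfn;
		struct page *page;

		for (pfn = start_pfn; pfn < end_pfn; pfn++) {
			page = pfn_to_online_page(pfn);
			if (!page)
				continue;	/* hole or not onlined yet */
			/* page_zone(page) can be trusted from here on */
		}
	}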
> It should be defined/documented in this time that what information in
> the struct page is valid even if PageReserved() is set. And then, we
> need to fix all the things based on this design decision.
Where would you suggest documenting this? We do have
Documentation/memory-hotplug.txt but it is not really specific about
struct page.
[...]
> > You are trying to change a semantic of something that has a well defined
> > meaning. I disagree that we should change it. It might sound like a
> > simpler thing to do because pfn walkers will have to be checked but what
> > you are proposing is conflating two different things together.
>
> I don't think that *I* try to change the semantic of pfn_valid().
> It would be original semantic of pfn_valid().
>
> "If pfn_valid() returns true, we can get proper struct page and the
> zone information,"
I do not see any guarantee about the zone information anywhere. In fact
this is not true with the original implementation as I've tried to
explain already. We do have new pages associated with a zone but that
association might change during the online phase. So you cannot really
rely on that information until the page is online. There is no real
change in that regard after my rework.
[...]
> > So please do not conflate those two different concepts together. I
> > believe that the most prominent pfn walkers should be covered now and
> > others can be evaluated later.
>
> Even if original pfn_valid()'s semantic is not the one that I mentioned,
> I think that suggested semantic from me is better.
> Only hotplug code need to be changed and others doesn't need to be changed.
> There is no overhead for others. What's the problem about this approach?
That this would require checking _every_ single pfn_valid user in the
kernel. That is beyond my time capacity and not really necessary because
the current code already suffers from the same/similar class of
problems.
> And, I'm not sure that you covered the most prominent pfn walkers.
> Please see pagetypeinfo_showblockcount_print() in mm/vmstat.c.
I probably haven't (and will send a patch to fix this one - thanks for
pointing it out) but the point is that those are broken already and they
can be fixed in follow-up patches. If you change pfn_valid you might
break existing code in unexpected ways.
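
Just to show the general shape such a follow-up fix would take (this is
not the actual patch, only a sketch of the idea): the walker stops
deriving the zone from a bare pfn_valid() check and asks for an online
page instead:

	/* before: a valid pfn is assumed to imply usable zone information */
	if (!pfn_valid(pfn))
		continue;
	page = pfn_to_page(pfn);

	/* after (sketch): only touch pages which have been fully onlined */
	page = pfn_to_online_page(pfn);
	if (!page)
		continue;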
--
Michal Hocko
SUSE Labs
On Mon, Apr 24, 2017 at 09:53:12AM +0200, Michal Hocko wrote:
> On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> > On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > > [...]
> > > > > > > Which pfn walkers you have in mind?
> > > > > >
> > > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > > using pfn_valid().
> > > > >
> > > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > > case where you do not really care about holes. It just checks the page
> > > > > count which is a valid information under any circumstances.
> > > >
> > > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > > even if PageReserved() is set?
> > >
> > > I do not know about any user which would manipulate page map count for
> > > referenced pages. The core MM code doesn't.
> >
> > That's weird that we can get *map* count without PageReserved() check,
> > but we cannot get zone information.
> > Zone information is more static information than map count.
>
> As I've already pointed out the rework of the hotplug code is mainly
> about postponing the zone initialization from the physical hot add to
> the logical onlining. The zone is really not clear until that moment.
>
> > It should be defined/documented in this time that what information in
> > the struct page is valid even if PageReserved() is set. And then, we
> > need to fix all the things based on this design decision.
>
> Where would you suggest documenting this? We do have
> Documentation/memory-hotplug.txt but it is not really specific about
> struct page.
pfn_valid() in include/linux/mmzone.h looks like the proper place.
>
> [...]
>
> > > You are trying to change a semantic of something that has a well defined
> > > meaning. I disagree that we should change it. It might sound like a
> > > simpler thing to do because pfn walkers will have to be checked but what
> > > you are proposing is conflating two different things together.
> >
> > I don't think that *I* try to change the semantic of pfn_valid().
> > It would be original semantic of pfn_valid().
> >
> > "If pfn_valid() returns true, we can get proper struct page and the
> > zone information,"
>
> I do not see any guarantee about the zone information anywhere. In fact
> this is not true with the original implementation as I've tried to
> explain already. We do have new pages associated with a zone but that
> association might change during the online phase. So you cannot really
> rely on that information until the page is online. There is no real
> change in that regards after my rework.
I know that what you did doesn't change things much. What I am trying to
say is that the previous pfn_valid() handling in hotplug is wrong. Please
do not assume that the hotplug implementation is correct and the other
pfn_valid() users are incorrect. There is no design document, so I'm not
sure which one is correct, but the assumption that a pfn_valid() user can
access the whole struct page information makes much more sense to me.
So I hope that you fix the hotplug implementation rather than modifying
each pfn_valid() user.
>
> [...]
> > > So please do not conflate those two different concepts together. I
> > > believe that the most prominent pfn walkers should be covered now and
> > > others can be evaluated later.
> >
> > Even if original pfn_valid()'s semantic is not the one that I mentioned,
> > I think that suggested semantic from me is better.
> > Only hotplug code need to be changed and others doesn't need to be changed.
> > There is no overhead for others. What's the problem about this approach?
>
> That this would require to check _every_ single pfn_valid user in the
> kernel. That is beyond my time capacity and not really necessary because
> the current code already suffers from the same/similar class of
> problems.
I think that none of the pfn_valid() users consider the hole case.
Contrary to your expectation, if your way is taken, it still requires
checking _every_ pfn_valid() user.
Thanks.
On Tue 25-04-17 11:50:45, Joonsoo Kim wrote:
> On Mon, Apr 24, 2017 at 09:53:12AM +0200, Michal Hocko wrote:
> > On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> > > On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > > > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > > > [...]
> > > > > > > > Which pfn walkers you have in mind?
> > > > > > >
> > > > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > > > using pfn_valid().
> > > > > >
> > > > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > > > case where you do not really care about holes. It just checks the page
> > > > > > count which is a valid information under any circumstances.
> > > > >
> > > > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > > > even if PageReserved() is set?
> > > >
> > > > I do not know about any user which would manipulate page map count for
> > > > referenced pages. The core MM code doesn't.
> > >
> > > That's weird that we can get *map* count without PageReserved() check,
> > > but we cannot get zone information.
> > > Zone information is more static information than map count.
> >
> > As I've already pointed out the rework of the hotplug code is mainly
> > about postponing the zone initialization from the physical hot add to
> > the logical onlining. The zone is really not clear until that moment.
> >
> > > It should be defined/documented in this time that what information in
> > > the struct page is valid even if PageReserved() is set. And then, we
> > > need to fix all the things based on this design decision.
> >
> > Where would you suggest documenting this? We do have
> > Documentation/memory-hotplug.txt but it is not really specific about
> > struct page.
>
> pfn_valid() in include/linux/mmzone.h looks proper place.
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c412e6a3a1e9..443258fcac93 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1288,10 +1288,14 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
#ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
/*
* pfn_valid() is meant to be able to tell if a given PFN has valid memmap
- * associated with it or not. In FLATMEM, it is expected that holes always
- * have valid memmap as long as there is valid PFNs either side of the hole.
- * In SPARSEMEM, it is assumed that a valid section has a memmap for the
- * entire section.
+ * associated with it or not. This means that a struct page exists for this
+ * pfn. The caller cannot assume the page is fully initialized though.
+ * pfn_to_online_page() should be used to make sure the struct page is fully
+ * initialized.
+ *
+ * In FLATMEM, it is expected that holes always have valid memmap as long as
+ * there is valid PFNs either side of the hole. In SPARSEMEM, it is assumed
+ * that a valid section has a memmap for the entire section.
*
* However, an ARM, and maybe other embedded architectures in the future
* free memmap backing holes to save memory on the assumption the memmap is
> > [...]
> >
> > > > You are trying to change a semantic of something that has a well defined
> > > > meaning. I disagree that we should change it. It might sound like a
> > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > you are proposing is conflating two different things together.
> > >
> > > I don't think that *I* try to change the semantic of pfn_valid().
> > > It would be original semantic of pfn_valid().
> > >
> > > "If pfn_valid() returns true, we can get proper struct page and the
> > > zone information,"
> >
> > I do not see any guarantee about the zone information anywhere. In fact
> > this is not true with the original implementation as I've tried to
> > explain already. We do have new pages associated with a zone but that
> > association might change during the online phase. So you cannot really
> > rely on that information until the page is online. There is no real
> > change in that regards after my rework.
>
> I know that what you did doesn't change thing much. What I try to say
> is that previous implementation related to pfn_valid() in hotplug is
> wrong. Please do not assume that hotplug implementation is correct and
> other pfn_valid() users are incorrect. There is no design document so
> I'm not sure which one is correct but assumption that pfn_valid() user
> can access whole the struct page information makes much sense to me.
Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
still need pfn_valid to work for those pfns. Really, pfn_valid has a
different meaning than you would like it to have. Who knows how many
others like that are lurking there. I feel much more comfortable going
and hunting down already broken code and fixing it rather than breaking
something unexpectedly.
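
To make the distinction concrete (an illustration, not code from the
series): for a ZONE_DEVICE pfn the two predicates already answer two
different questions:

	pfn_valid(pfn);            /* true - a struct page exists for the pfn */
	pfn_to_online_page(pfn);   /* NULL - device memory is never onlined */

which is exactly why I do not want pfn_valid to imply "fully initialized
and online".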
--
Michal Hocko
SUSE Labs
On Wed, Apr 26, 2017 at 11:19:06AM +0200, Michal Hocko wrote:
> > > [...]
> > >
> > > > > You are trying to change a semantic of something that has a well defined
> > > > > meaning. I disagree that we should change it. It might sound like a
> > > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > > you are proposing is conflating two different things together.
> > > >
> > > > I don't think that *I* try to change the semantic of pfn_valid().
> > > > It would be original semantic of pfn_valid().
> > > >
> > > > "If pfn_valid() returns true, we can get proper struct page and the
> > > > zone information,"
> > >
> > > I do not see any guarantee about the zone information anywhere. In fact
> > > this is not true with the original implementation as I've tried to
> > > explain already. We do have new pages associated with a zone but that
> > > association might change during the online phase. So you cannot really
> > > rely on that information until the page is online. There is no real
> > > change in that regards after my rework.
> >
> > I know that what you did doesn't change thing much. What I try to say
> > is that previous implementation related to pfn_valid() in hotplug is
> > wrong. Please do not assume that hotplug implementation is correct and
> > other pfn_valid() users are incorrect. There is no design document so
> > I'm not sure which one is correct but assumption that pfn_valid() user
> > can access whole the struct page information makes much sense to me.
>
> Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
> still need pfn_valid to work for those pfns. Really, pfn_valid has a
It's really a counter-example to your argument. They require not only the
struct page but also other information, especially the zone index.
They check the zone index to know whether a page is for ZONE_DEVICE or not.
So pfn_valid() for ZONE_DEVICE pages assumes that the struct page has all
the valid information. That matches my suggestion perfectly.
Online isn't the important issue here. The important point is the condition
under which pfn_valid() returns true. pfn_valid() for ZONE_DEVICE returns
true after arch_add_memory() since all the struct page information is fixed
there. If the zone of hotplugged memory cannot be fixed at that moment, you
can defer it until all the information is fixed (onlining). That
seems like a better semantic of pfn_valid() to me.
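
For reference, the kind of check I mean (a sketch of the existing
helper, assuming the usual definition in include/linux/mm.h):

	/* the ZONE_DEVICE test is just a zone-index check on struct page */
	static inline bool is_zone_device_page(const struct page *page)
	{
		return page_zonenum(page) == ZONE_DEVICE;
	}

so those callers implicitly rely on the zone index in struct page being
valid as soon as pfn_valid() returns true.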
> different meaning than you would like it to have. Who knows how many
> others like that are lurking there. I feel much more comfortable to go
> and hunt already broken code and fix it rathert than break something
> unexpectedly.
I think that I did my best to explain my reasoning. It seems that we
cannot agree with each other so it's better for some others to express
their opinion on this problem. I will stop this discussion from now
on.
Thanks.
On Thu 27-04-17 11:08:38, Joonsoo Kim wrote:
> On Wed, Apr 26, 2017 at 11:19:06AM +0200, Michal Hocko wrote:
> > > > [...]
> > > >
> > > > > > You are trying to change a semantic of something that has a well defined
> > > > > > meaning. I disagree that we should change it. It might sound like a
> > > > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > > > you are proposing is conflating two different things together.
> > > > >
> > > > > I don't think that *I* try to change the semantic of pfn_valid().
> > > > > It would be the original semantic of pfn_valid().
> > > > >
> > > > > "If pfn_valid() returns true, we can get proper struct page and the
> > > > > zone information,"
> > > >
> > > > I do not see any guarantee about the zone information anywhere. In fact
> > > > this is not true with the original implementation as I've tried to
> > > > explain already. We do have new pages associated with a zone but that
> > > > association might change during the online phase. So you cannot really
> > > > rely on that information until the page is online. There is no real
> > > > change in that regard after my rework.
> > >
> > > I know that what you did doesn't change things much. What I try to say
> > > is that the previous implementation related to pfn_valid() in hotplug is
> > > wrong. Please do not assume that the hotplug implementation is correct
> > > and other pfn_valid() users are incorrect. There is no design document
> > > so I'm not sure which one is correct, but the assumption that a
> > > pfn_valid() user can access the whole struct page information makes much
> > > sense to me.
> >
> > Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
> > still need pfn_valid to work for those pfns. Really, pfn_valid has a
>
> That's actually a counterexample to your claim. They require not only
> the struct page but also other information, especially the zone index.
> They check the zone index to know whether a page belongs to ZONE_DEVICE
> or not.
Yes and they guarantee this association is true. Without memory onlining
though. This memory is never online for anybody who is asking.
[...]
> I think that I did my best to explain my reasoning. It seems that we
> cannot agree with each other so it's better for some others to express
> their opinion on this problem. I will stop this discussion from now
> on.
I _do_ appreciate your feedback and if the general consensus is to
modify pfn_valid I can go that direction, but my gut feeling tells me
that conflating the "existing struct page" test and the "fully online
and initialized" one is the wrong thing to do.
--
Michal Hocko
SUSE Labs
On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > [...]
> > > > > Which pfn walkers do you have in mind?
> > > >
> > > > For example, kpagecount_read() in fs/proc/page.c. I found it by
> > > > searching for pfn_valid().
> > >
> > > Yeah, I've checked that one and in fact this is a good example of the
> > > case where you do not really care about holes. It just checks the page
> > > count which is valid information under any circumstances.
> >
> > I don't think so. First, it checks the page *map* count. Is it still valid
> > even if PageReserved() is set?
>
> I do not know about any user which would manipulate page map count for
> referenced pages. The core MM code doesn't.
It's weird that we can get the *map* count without a PageReserved()
check, but we cannot get the zone information.
The zone information is more static than the map count.
It should be defined/documented now which information in the struct
page is valid even if PageReserved() is set. And then we need to fix
all the users based on this design decision.
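To be concrete, the per-pfn part of that walk is roughly the following
(heavily simplified from fs/proc/page.c, from memory):

	/*
	 * Simplified sketch of the kpagecount-style walk: pfn_valid() is
	 * the only gate before touching the struct page; there is no
	 * PageReserved() check and no zone information is needed, only
	 * the map count is read.
	 */
	struct page *page = NULL;
	u64 pcount = 0;

	if (pfn_valid(pfn))
		page = pfn_to_page(pfn);

	if (page)
		pcount = page_mapcount(page);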
>
> > What I'd like to ask in this example is
> > what information is valid if PageReserved() is set. Is there any
> > design document on this? I think that we need to define/document it first.
>
> NO, it is not AFAIK.
>
> [...]
> > > OK, fair enough. I didn't consider memblock allocations. I will rethink
> > > this patch but there are essentially 3 options
> > > 	- use a different criterion for the offline holes detection. I
> > > have just realized we might do it by storing the online
> > > information into the mem sections
> > > - drop this patch
> > > 	- move the PageReserved check down the chain into
> > > isolate_freepages_block resp. isolate_migratepages_block
> > >
> > > I would prefer 3 over 2 over 1. I definitely want to make this more
> > > robust so 1 is preferable long term but I do not want this to be a
> > > roadblock to the rest of the rework. Does that sound acceptable to you?
> >
> > I like #1 among the above options and I already see your patch for #1.
> > It's much better than your first attempt but I'm still not happy with
> > the semantic of pfn_valid().
>
> You are trying to change a semantic of something that has a well defined
> meaning. I disagree that we should change it. It might sound like a
> simpler thing to do because pfn walkers will have to be checked but what
> you are proposing is conflating two different things together.
I don't think that *I* try to change the semantic of pfn_valid().
It would be the original semantic of pfn_valid().
"If pfn_valid() returns true, we can get proper struct page and the
zone information,"
That situation is now being changed by your *hotplug rework* patches:
"Even if pfn_valid() returns true, we cannot get the zone information
without a PageReserved() check, since the *zone is determined during
onlining* and pfn_valid() returns true right after adding the memory."
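So, as far as I can see, every pfn walker that needs the zone ends up
with a pattern like this under the rework (my sketch of the rule being
described, not code from the series):

	if (!pfn_valid(pfn))
		continue;		/* no struct page */

	page = pfn_to_page(pfn);

	/*
	 * Added but not yet onlined memory is still PageReserved(), so
	 * the zone/node link cannot be trusted before this check.
	 */
	if (PageReserved(page))
		continue;

	zone = page_zone(page);		/* only now reliable */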
>
> > > [..]
> > > > Let me clarify my desire(?) for this issue.
> > > >
> > > > 1. If pfn_valid() returns true, struct page has valid information, at
> > > > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > > > without checking PageReserved().
> > >
> > > This is no longer true after my rework. Pages are associated with the
> > > zone during _onlining_ rather than when they are physically hotplugged.
> >
> > If your rework makes the information valid during _onlining_, my
> > suggestion is to make pfn_valid() return false until onlining.
> >
> > Callers of pfn_valid() expect that they can get valid information from
> > the struct page. There is no reason to access the struct page if they
> > can't get valid information from it. So, passing pfn_valid() should
> > guarantee that, at least, some kind of information is valid.
> >
> > If pfn_valid() doesn't guarantee it, most of the pfn walkers should
> > check PageReserved() to make sure the information from the struct
> > page is valid.
>
> This is true only for those walkers which really depend on the full
> initialization. This is not the case for all of them. I do not see any
> reason to introduce another _pfn_valid to just check whether there is a
> struct page...
It's a really confusing concept that only some information is valid for
a *not* fully initialized struct page. And there is no document saying
what information is valid for such a half-initialized struct page.
A better design would be to regard every piece of information as invalid
for a half-initialized struct page. In that case, it's natural to make
pfn_valid() return false for such a half-initialized struct page.
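To make the suggestion concrete, one possible shape of it would be to
key the validity check on an online bit in the memory section, close to
the option you mentioned of storing the online information in the mem
sections. This is a hypothetical sketch; online_section_nr() is an
invented name here:

	/*
	 * Hypothetical: a pfn is "valid" only once its section is present
	 * *and* online, so callers can trust every field of the struct page.
	 */
	static inline int pfn_valid_suggested(unsigned long pfn)
	{
		unsigned long nr = pfn_to_section_nr(pfn);

		if (nr >= NR_MEM_SECTIONS)
			return 0;
		return present_section_nr(nr) && online_section_nr(nr);
	}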
>
> So please do not conflate those two different concepts together. I
> believe that the most prominent pfn walkers should be covered now and
> others can be evaluated later.
Even if the original semantic of pfn_valid() is not the one that I
mentioned, I think that the semantic I suggest is better. Only the
hotplug code needs to be changed and the others don't; there is no
overhead for them. What's the problem with this approach?
And, I'm not sure that you have covered the most prominent pfn walkers.
Please see pagetypeinfo_showblockcount_print() in mm/vmstat.c. As you
admitted, the additional-check approach is really error-prone and this
example shows that.
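For reference, the relevant loop is roughly (simplified from
mm/vmstat.c, from memory):

	/*
	 * pagetypeinfo_showblockcount_print()-style walk: pfn_valid() is
	 * the only gate, then the zone and the migratetype are read
	 * straight from the (possibly not yet onlined) struct page.
	 */
	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
		struct page *page;

		if (!pfn_valid(pfn))
			continue;

		page = pfn_to_page(pfn);

		if (page_zone(page) != zone)
			continue;

		count[get_pageblock_migratetype(page)]++;
	}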
Thanks.