Distributions nowadays use udev rules ([1] [2]) to specify if and
how to online hotplugged memory. The rules seem to get more complex with
many special cases. Due to the various special cases,
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used. All memory hotplug
is handled via udev rules.
Everytime we hotplug memory, the udev rule will come to the same
conclusion. Especially Hyper-V (but also soon virtio-mem) add a lot of
memory in separate memory blocks and wait for memory to get onlined by user
space before continuing to add more memory blocks (to not add memory faster
than it is getting onlined). This of course slows down the whole memory
hotplug process.
To make the job of distributions easier and to avoid udev rules that get
more and more complicated, let's extend the mechanism provided by
- /sys/devices/system/memory/auto_online_blocks
- "memhp_default_state=" on the kernel cmdline
to be able to specify also "online_movable" as well as "online_kernel"
v2 -> v3:
- "hv_balloon: don't check for memhp_auto_online manually"
-- init_completion() before register_memory_notifier()
- Minor typo fix
v1 -> v2:
- Tweaked some patch descriptions
- Added
-- "powernv/memtrace: always online added memory blocks"
-- "hv_balloon: don't check for memhp_auto_online manually"
-- "mm/memory_hotplug: unexport memhp_auto_online"
- "mm/memory_hotplug: convert memhp_auto_online to store an online_type"
-- No longer touches hv/memtrace code
=== Example /usr/libexec/config-memhotplug ===
#!/bin/bash
VIRT=`systemd-detect-virt --vm`
ARCH=`uname -p`
sense_virtio_mem() {
if [ -d "/sys/bus/virtio/drivers/virtio_mem/" ]; then
DEVICES=`find /sys/bus/virtio/drivers/virtio_mem/ -maxdepth 1 -type l | wc -l`
if [ $DEVICES != "0" ]; then
return 0
fi
fi
return 1
}
if [ ! -e "/sys/devices/system/memory/auto_online_blocks" ]; then
echo "Memory hotplug configuration support missing in the kernel"
exit 1
fi
if grep "memhp_default_state=" /proc/cmdline > /dev/null; then
echo "Memory hotplug configuration overridden in kernel cmdline (memhp_default_state=)"
exit 1
fi
if [ $VIRT == "microsoft" ]; then
echo "Detected Hyper-V on $ARCH"
# Hyper-V wants all memory in ZONE_NORMAL
ONLINE_TYPE="online_kernel"
elif sense_virtio_mem; then
echo "Detected virtio-mem on $ARCH"
# virtio-mem wants all memory in ZONE_NORMAL
ONLINE_TYPE="online_kernel"
elif [ $ARCH == "s390x" ] || [ $ARCH == "s390" ]; then
echo "Detected $ARCH"
# standby memory should not be onlined automatically
ONLINE_TYPE="offline"
elif [ $ARCH == "ppc64" ] || [ $ARCH == "ppc64le" ]; then
echo "Detected" $ARCH
# PPC64 onlines all hotplugged memory right from the kernel
ONLINE_TYPE="offline"
elif [ $VIRT == "none" ]; then
echo "Detected bare-metal on $ARCH"
# Bare metal users expect hotplugged memory to be unpluggable. We assume
# that ZONE imbalances on such enterpise servers cannot happen and is
# properly documented
ONLINE_TYPE="online_movable"
else
# TODO: Hypervisors that want to unplug DIMMs and can guarantee that ZONE
# imbalances won't happen
echo "Detected $VIRT on $ARCH"
# Usually, ballooning is used in virtual environments, so memory should go to
# ZONE_NORMAL. However, sometimes "movable_node" is relevant.
ONLINE_TYPE="online"
fi
echo "Selected online_type:" $ONLINE_TYPE
# Configure what to do with memory that will be hotplugged in the future
echo $ONLINE_TYPE 2>/dev/null > /sys/devices/system/memory/auto_online_blocks
if [ $? != "0" ]; then
echo "Memory hotplug cannot be configured (e.g., old kernel or missing permissions)"
# A backup udev rule should handle old kernels if necessary
exit 1
fi
# Process all already pluggedd blocks (e.g., DIMMs, but also Hyper-V or virtio-mem)
if [ $ONLINE_TYPE != "offline" ]; then
for MEMORY in /sys/devices/system/memory/memory*; do
STATE=`cat $MEMORY/state`
if [ $STATE == "offline" ]; then
echo $ONLINE_TYPE > $MEMORY/state
fi
done
fi
=== Example /usr/lib/systemd/system/config-memhotplug.service ===
[Unit]
Description=Configure memory hotplug behavior
DefaultDependencies=no
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
After=systemd-modules-load.service
ConditionPathExists=|/sys/devices/system/memory/auto_online_blocks
[Service]
ExecStart=/usr/libexec/config-memhotplug
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes
[Install]
WantedBy=sysinit.target
=== Example modification to the 40-redhat.rules [2] ===
diff --git a/40-redhat.rules b/40-redhat.rules-new
index 2c690e5..168fd03 100644
--- a/40-redhat.rules
+++ b/40-redhat.rules-new
@@ -6,6 +6,9 @@ SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}
# Memory hotadd request
SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
ACTION!="add", GOTO="memory_hotplug_end"
+# memory hotplug behavior configured
+PROGRAM=="grep online /sys/devices/system/memory/auto_online_blocks", GOTO="memory_hotplug_end"
+
PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"
ENV{.state}="online"
===
[1] https://github.com/lnykryn/systemd-rhel/pull/281
[2] https://github.com/lnykryn/systemd-rhel/blob/staging/rules/40-redhat.rules
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Yumei Huang <[email protected]>
Cc: Igor Mammedov <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Eduardo Habkost <[email protected]>
Cc: Milan Zamazal <[email protected]>
David Hildenbrand (8):
drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
drivers/base/memory: map MMOP_OFFLINE to 0
drivers/base/memory: store mapping between MMOP_* and string in an
array
powernv/memtrace: always online added memory blocks
hv_balloon: don't check for memhp_auto_online manually
mm/memory_hotplug: unexport memhp_auto_online
mm/memory_hotplug: convert memhp_auto_online to store an online_type
mm/memory_hotplug: allow to specify a default online_type
arch/powerpc/platforms/powernv/memtrace.c | 14 ++---
drivers/base/memory.c | 71 ++++++++++++-----------
drivers/hv/hv_balloon.c | 25 ++++----
include/linux/memory_hotplug.h | 13 ++++-
mm/memory_hotplug.c | 16 ++---
5 files changed, 69 insertions(+), 70 deletions(-)
--
2.24.1
... and rename it to memhp_default_online_type. This is a preparation
for more detailed default online behavior.
Reviewed-by: Wei Yang <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
drivers/base/memory.c | 10 ++++------
include/linux/memory_hotplug.h | 3 ++-
mm/memory_hotplug.c | 11 ++++++-----
3 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8a7f29c0bf97..8d3e16dab69f 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -386,10 +386,8 @@ static DEVICE_ATTR_RO(block_size_bytes);
static ssize_t auto_online_blocks_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- if (memhp_auto_online)
- return sprintf(buf, "online\n");
- else
- return sprintf(buf, "offline\n");
+ return sprintf(buf, "%s\n",
+ online_type_to_str[memhp_default_online_type]);
}
static ssize_t auto_online_blocks_store(struct device *dev,
@@ -397,9 +395,9 @@ static ssize_t auto_online_blocks_store(struct device *dev,
const char *buf, size_t count)
{
if (sysfs_streq(buf, "online"))
- memhp_auto_online = true;
+ memhp_default_online_type = MMOP_ONLINE;
else if (sysfs_streq(buf, "offline"))
- memhp_auto_online = false;
+ memhp_default_online_type = MMOP_OFFLINE;
else
return -EINVAL;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 76f3c617a8ab..6d6f85bb66e9 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -118,7 +118,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
struct mhp_params *params);
extern u64 max_mem_size;
-extern bool memhp_auto_online;
+/* Default online_type (MMOP_*) when new memory blocks are added. */
+extern int memhp_default_online_type;
/* If movable_node boot option specified */
extern bool movable_node_enabled;
static inline bool movable_node_is_enabled(void)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e21a7d53ade5..4efcf8cb9ac5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -67,17 +67,17 @@ void put_online_mems(void)
bool movable_node_enabled = false;
#ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
-bool memhp_auto_online;
+int memhp_default_online_type = MMOP_OFFLINE;
#else
-bool memhp_auto_online = true;
+int memhp_default_online_type = MMOP_ONLINE;
#endif
static int __init setup_memhp_default_state(char *str)
{
if (!strcmp(str, "online"))
- memhp_auto_online = true;
+ memhp_default_online_type = MMOP_ONLINE;
else if (!strcmp(str, "offline"))
- memhp_auto_online = false;
+ memhp_default_online_type = MMOP_OFFLINE;
return 1;
}
@@ -993,6 +993,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
static int online_memory_block(struct memory_block *mem, void *arg)
{
+ mem->online_type = memhp_default_online_type;
return device_online(&mem->dev);
}
@@ -1065,7 +1066,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
mem_hotplug_done();
/* online pages if requested */
- if (memhp_auto_online)
+ if (memhp_default_online_type != MMOP_OFFLINE)
walk_memory_blocks(start, size, NULL, online_memory_block);
return ret;
--
2.24.1
All in-tree users except the mm-core are gone. Let's drop the export.
Reviewed-by: Wei Yang <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
mm/memory_hotplug.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index da6aab272c9b..e21a7d53ade5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -71,7 +71,6 @@ bool memhp_auto_online;
#else
bool memhp_auto_online = true;
#endif
-EXPORT_SYMBOL_GPL(memhp_auto_online);
static int __init setup_memhp_default_state(char *str)
{
--
2.24.1
We get the MEM_ONLINE notifier call if memory is added right from the
kernel via add_memory() or later from user space.
Let's get rid of the "ha_waiting" flag - the wait event has an inbuilt
mechanism (->done) for that. Initialize the wait event only once and
reinitialize before adding memory. Unconditionally call complete() and
wait_for_completion_timeout().
If there are no waiters, complete() will only increment ->done - which
will be reset by reinit_completion(). If complete() has already been
called, wait_for_completion_timeout() will not wait.
There is still the chance for a small race between concurrent
reinit_completion() and complete(). If complete() wins, we would not
wait - which is tolerable (and the race exists in current code as well).
Note: We only wait for "some" memory to get onlined, which seems to be
good enough for now.
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: [email protected]
Signed-off-by: David Hildenbrand <[email protected]>
---
drivers/hv/hv_balloon.c | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index a02ce43d778d..32e3bc0aa665 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -533,7 +533,6 @@ struct hv_dynmem_device {
* State to synchronize hot-add.
*/
struct completion ol_waitevent;
- bool ha_waiting;
/*
* This thread handles hot-add
* requests from the host as well as notifying
@@ -634,10 +633,7 @@ static int hv_memory_notifier(struct notifier_block *nb, unsigned long val,
switch (val) {
case MEM_ONLINE:
case MEM_CANCEL_ONLINE:
- if (dm_device.ha_waiting) {
- dm_device.ha_waiting = false;
- complete(&dm_device.ol_waitevent);
- }
+ complete(&dm_device.ol_waitevent);
break;
case MEM_OFFLINE:
@@ -726,8 +722,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
has->covered_end_pfn += processed_pfn;
spin_unlock_irqrestore(&dm_device.ha_lock, flags);
- init_completion(&dm_device.ol_waitevent);
- dm_device.ha_waiting = !memhp_auto_online;
+ reinit_completion(&dm_device.ol_waitevent);
nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
ret = add_memory(nid, PFN_PHYS((start_pfn)),
@@ -753,15 +748,14 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
}
/*
- * Wait for the memory block to be onlined when memory onlining
- * is done outside of kernel (memhp_auto_online). Since the hot
- * add has succeeded, it is ok to proceed even if the pages in
- * the hot added region have not been "onlined" within the
- * allowed time.
+ * Wait for memory to get onlined. If the kernel onlined the
+ * memory when adding it, this will return directly. Otherwise,
+ * it will wait for user space to online the memory. This helps
+ * to avoid adding memory faster than it is getting onlined. As
+ * adding succeeded, it is ok to proceed even if the memory was
+ * not onlined in time.
*/
- if (dm_device.ha_waiting)
- wait_for_completion_timeout(&dm_device.ol_waitevent,
- 5*HZ);
+ wait_for_completion_timeout(&dm_device.ol_waitevent, 5 * HZ);
post_status(&dm_device);
}
}
@@ -1706,6 +1700,7 @@ static int balloon_probe(struct hv_device *dev,
#ifdef CONFIG_MEMORY_HOTPLUG
set_online_page_callback(&hv_online_page);
+ init_completion(&dm_device.ol_waitevent);
register_memory_notifier(&hv_memory_nb);
#endif
--
2.24.1
The name is misleading and it's not really clear what is "kept". Let's just
name it like the online_type name we expose to user space ("online").
Add some documentation to the types.
Reviewed-by: Wei Yang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
drivers/base/memory.c | 9 +++++----
include/linux/memory_hotplug.h | 6 +++++-
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 6448c9ece2cb..8c5ce42c0fc3 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -216,7 +216,7 @@ static int memory_subsys_online(struct device *dev)
* attribute and need to set the online_type.
*/
if (mem->online_type < 0)
- mem->online_type = MMOP_ONLINE_KEEP;
+ mem->online_type = MMOP_ONLINE;
ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
@@ -251,7 +251,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
else if (sysfs_streq(buf, "online_movable"))
online_type = MMOP_ONLINE_MOVABLE;
else if (sysfs_streq(buf, "online"))
- online_type = MMOP_ONLINE_KEEP;
+ online_type = MMOP_ONLINE;
else if (sysfs_streq(buf, "offline"))
online_type = MMOP_OFFLINE;
else {
@@ -262,7 +262,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
switch (online_type) {
case MMOP_ONLINE_KERNEL:
case MMOP_ONLINE_MOVABLE:
- case MMOP_ONLINE_KEEP:
+ case MMOP_ONLINE:
/* mem->online_type is protected by device_hotplug_lock */
mem->online_type = online_type;
ret = device_online(&mem->dev);
@@ -342,7 +342,8 @@ static ssize_t valid_zones_show(struct device *dev,
}
nid = mem->nid;
- default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
+ default_zone = zone_for_pfn_range(MMOP_ONLINE, nid, start_pfn,
+ nr_pages);
strcat(buf, default_zone->name);
print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3195d11876ea..3aaf00db224c 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -47,9 +47,13 @@ enum {
/* Types for control the zone type of onlined and offlined memory */
enum {
+ /* Offline the memory. */
MMOP_OFFLINE = -1,
- MMOP_ONLINE_KEEP,
+ /* Online the memory. Zone depends, see default_zone_for_pfn(). */
+ MMOP_ONLINE,
+ /* Online the memory to ZONE_NORMAL. */
MMOP_ONLINE_KERNEL,
+ /* Online the memory to ZONE_MOVABLE. */
MMOP_ONLINE_MOVABLE,
};
--
2.24.1
> The name is misleading and it's not really clear what is "kept". Let's just
> name it like the online_type name we expose to user space ("online").
>
> Add some documentation to the types.
>
> Reviewed-by: Wei Yang <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> drivers/base/memory.c | 9 +++++----
> include/linux/memory_hotplug.h | 6 +++++-
> 2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 6448c9ece2cb..8c5ce42c0fc3 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -216,7 +216,7 @@ static int memory_subsys_online(struct device *dev)
> * attribute and need to set the online_type.
> */
> if (mem->online_type < 0)
> - mem->online_type = MMOP_ONLINE_KEEP;
> + mem->online_type = MMOP_ONLINE;
>
> ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
>
> @@ -251,7 +251,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> else if (sysfs_streq(buf, "online_movable"))
> online_type = MMOP_ONLINE_MOVABLE;
> else if (sysfs_streq(buf, "online"))
> - online_type = MMOP_ONLINE_KEEP;
> + online_type = MMOP_ONLINE;
> else if (sysfs_streq(buf, "offline"))
> online_type = MMOP_OFFLINE;
> else {
> @@ -262,7 +262,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> switch (online_type) {
> case MMOP_ONLINE_KERNEL:
> case MMOP_ONLINE_MOVABLE:
> - case MMOP_ONLINE_KEEP:
> + case MMOP_ONLINE:
> /* mem->online_type is protected by device_hotplug_lock */
> mem->online_type = online_type;
> ret = device_online(&mem->dev);
> @@ -342,7 +342,8 @@ static ssize_t valid_zones_show(struct device *dev,
> }
>
> nid = mem->nid;
> - default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
> + default_zone = zone_for_pfn_range(MMOP_ONLINE, nid, start_pfn,
> + nr_pages);
> strcat(buf, default_zone->name);
>
> print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3195d11876ea..3aaf00db224c 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -47,9 +47,13 @@ enum {
>
> /* Types for control the zone type of onlined and offlined memory */
> enum {
> + /* Offline the memory. */
> MMOP_OFFLINE = -1,
> - MMOP_ONLINE_KEEP,
> + /* Online the memory. Zone depends, see default_zone_for_pfn(). */
> + MMOP_ONLINE,
> + /* Online the memory to ZONE_NORMAL. */
> MMOP_ONLINE_KERNEL,
> + /* Online the memory to ZONE_MOVABLE. */
> MMOP_ONLINE_MOVABLE,
> };
>
> --
Looks good to me.
Acked-by: Pankaj Gupta <[email protected]>
> 2.24.1
>
>
> All in-tree users except the mm-core are gone. Let's drop the export.
>
> Reviewed-by: Wei Yang <[email protected]>
> Acked-by: Michal Hocko <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> mm/memory_hotplug.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index da6aab272c9b..e21a7d53ade5 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -71,7 +71,6 @@ bool memhp_auto_online;
> #else
> bool memhp_auto_online = true;
> #endif
> -EXPORT_SYMBOL_GPL(memhp_auto_online);
>
> static int __init setup_memhp_default_state(char *str)
> {
> --
> 2.24.1
Acked-by: Pankaj Gupta <[email protected]>
>
>
> ... and rename it to memhp_default_online_type. This is a preparation
> for more detailed default online behavior.
>
> Reviewed-by: Wei Yang <[email protected]>
> Acked-by: Michal Hocko <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> drivers/base/memory.c | 10 ++++------
> include/linux/memory_hotplug.h | 3 ++-
> mm/memory_hotplug.c | 11 ++++++-----
> 3 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 8a7f29c0bf97..8d3e16dab69f 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -386,10 +386,8 @@ static DEVICE_ATTR_RO(block_size_bytes);
> static ssize_t auto_online_blocks_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> - if (memhp_auto_online)
> - return sprintf(buf, "online\n");
> - else
> - return sprintf(buf, "offline\n");
> + return sprintf(buf, "%s\n",
> + online_type_to_str[memhp_default_online_type]);
> }
>
> static ssize_t auto_online_blocks_store(struct device *dev,
> @@ -397,9 +395,9 @@ static ssize_t auto_online_blocks_store(struct device *dev,
> const char *buf, size_t count)
> {
> if (sysfs_streq(buf, "online"))
> - memhp_auto_online = true;
> + memhp_default_online_type = MMOP_ONLINE;
> else if (sysfs_streq(buf, "offline"))
> - memhp_auto_online = false;
> + memhp_default_online_type = MMOP_OFFLINE;
> else
> return -EINVAL;
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 76f3c617a8ab..6d6f85bb66e9 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -118,7 +118,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
> struct mhp_params *params);
> extern u64 max_mem_size;
>
> -extern bool memhp_auto_online;
> +/* Default online_type (MMOP_*) when new memory blocks are added. */
> +extern int memhp_default_online_type;
> /* If movable_node boot option specified */
> extern bool movable_node_enabled;
> static inline bool movable_node_is_enabled(void)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index e21a7d53ade5..4efcf8cb9ac5 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -67,17 +67,17 @@ void put_online_mems(void)
> bool movable_node_enabled = false;
>
> #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
> -bool memhp_auto_online;
> +int memhp_default_online_type = MMOP_OFFLINE;
> #else
> -bool memhp_auto_online = true;
> +int memhp_default_online_type = MMOP_ONLINE;
> #endif
>
> static int __init setup_memhp_default_state(char *str)
> {
> if (!strcmp(str, "online"))
> - memhp_auto_online = true;
> + memhp_default_online_type = MMOP_ONLINE;
> else if (!strcmp(str, "offline"))
> - memhp_auto_online = false;
> + memhp_default_online_type = MMOP_OFFLINE;
>
> return 1;
> }
> @@ -993,6 +993,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
>
> static int online_memory_block(struct memory_block *mem, void *arg)
> {
> + mem->online_type = memhp_default_online_type;
> return device_online(&mem->dev);
> }
>
> @@ -1065,7 +1066,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
> mem_hotplug_done();
>
> /* online pages if requested */
> - if (memhp_auto_online)
> + if (memhp_default_online_type != MMOP_OFFLINE)
> walk_memory_blocks(start, size, NULL, online_memory_block);
>
> return ret;
> --
Acked-by: Pankaj Gupta <[email protected]>
> 2.24.1
>
>
On 03/19/20 at 02:12pm, David Hildenbrand wrote:
> Distributions nowadays use udev rules ([1] [2]) to specify if and
> how to online hotplugged memory. The rules seem to get more complex with
> many special cases. Due to the various special cases,
> CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used. All memory hotplug
> is handled via udev rules.
>
> Everytime we hotplug memory, the udev rule will come to the same
> conclusion. Especially Hyper-V (but also soon virtio-mem) add a lot of
> memory in separate memory blocks and wait for memory to get onlined by user
> space before continuing to add more memory blocks (to not add memory faster
> than it is getting onlined). This of course slows down the whole memory
> hotplug process.
>
> To make the job of distributions easier and to avoid udev rules that get
> more and more complicated, let's extend the mechanism provided by
> - /sys/devices/system/memory/auto_online_blocks
> - "memhp_default_state=" on the kernel cmdline
> to be able to specify also "online_movable" as well as "online_kernel"
>
> v2 -> v3:
> - "hv_balloon: don't check for memhp_auto_online manually"
> -- init_completion() before register_memory_notifier()
> - Minor typo fix
>
> v1 -> v2:
> - Tweaked some patch descriptions
> - Added
> -- "powernv/memtrace: always online added memory blocks"
> -- "hv_balloon: don't check for memhp_auto_online manually"
> -- "mm/memory_hotplug: unexport memhp_auto_online"
> - "mm/memory_hotplug: convert memhp_auto_online to store an online_type"
> -- No longer touches hv/memtrace code
Ack the series.
Reviewed-by: Baoquan He <[email protected]>