Date: Sat, 20 Nov 2010 18:28:38 -0800 (PST)
From: David Rientjes
To: Andrew Morton, Greg Kroah-Hartman
cc: Ingo Molnar, "H. Peter Anvin", Thomas Gleixner, Shaohui Zheng,
    Paul Mundt, Andi Kleen, Yinghai Lu, Haicheng Li, Randy Dunlap,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: [patch 2/2] mm: add node hotplug emulation
References: <20101117020759.016741414@intel.com>
    <20101117021000.568681101@intel.com> <20101117075128.GA30254@shaohui>
    <20101118041407.GA2408@shaohui> <20101118062715.GD17539@linux-sh.org>
    <20101118052750.GD2408@shaohui> <20101119003225.GB3327@shaohui>

Add an interface to allow new nodes to be added when performing memory
hot-add.  This provides a convenient interface to test memory hotplug
notifier callbacks and surrounding hotplug code when new nodes are
onlined without actually having a machine with such hotpluggable SRAT
entries.

This adds a new interface at /sys/devices/system/memory/add_node that
behaves in a similar way to the memory hot-add "probe" interface.
Its format is size@start, where "size" is the size of the new node to be
added and "start" is the physical address of the new memory.

The new node id is a currently offline, but possible, node.  The bit must
be set in node_possible_map so that nr_node_ids is sized appropriately.

For emulation on x86, for example, it would be possible to set aside
memory for hotplugged nodes (say, anything above 2G) and to add an
additional three nodes as being possible on boot with

	mem=2G numa=possible=3

and then creating a new 128M node at runtime:

	# echo 128M@0x80000000 > /sys/devices/system/memory/add_node
	On node 1 totalpages: 0
	init_memory_mapping: 0000000080000000-0000000088000000
	 0080000000 - 0088000000 page 2M

Once the new node has been added, its memory can be onlined.  If this
memory represents memory section 16, for example:

	# echo online > /sys/devices/system/memory/memory16/state
	Built 2 zonelists in Node order, mobility grouping on.  Total pages: 514846
	Policy zone: Normal

 [ The memory section(s) mapped to a particular node are visible via
   /sys/devices/system/node/node1, in this example. ]

The new node is now hotplugged and ready for testing.

Signed-off-by: David Rientjes
---
 Documentation/memory-hotplug.txt |   24 ++++++++++++++++++++++++
 drivers/base/memory.c            |   36 +++++++++++++++++++++++++++++++++++-
 2 files changed, 59 insertions(+), 1 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -18,6 +18,7 @@ be changed often.
 4. Physical memory hot-add phase
   4.1 Hardware(Firmware) Support
   4.2 Notify memory hot-add event by hand
+  4.3 Node hotplug emulation
 5. Logical Memory hot-add phase
   5.1. State of memory
   5.2. How to online memory
@@ -215,6 +216,29 @@ current implementation). You'll have to online memory by yourself.
 Please see "How to online memory" in this text.

+4.3 Node hotplug emulation
+------------
+It is possible to test node hotplug by assigning the newly added memory to a
+new node id when using a different interface with a similar behavior to
+"probe" described in section 4.2.  If a node id is possible (there are bits
+in /sys/devices/system/node/possible that are not online), then it may be
+used to emulate a newly added node as the result of memory hotplug by using
+the "add_node" interface.
+
+The add_node interface is located at
+/sys/devices/system/memory/add_node
+
+You can create a new node of a specified size starting at the physical
+address of new memory by
+
+% echo size@start_address_of_new_memory > /sys/devices/system/memory/add_node
+
+Where "size" can be represented in megabytes or gigabytes (for example,
+"128M" or "1G").  The minimum size is that of a memory section.
+
+Once the new node has been added, it is possible to online the memory by
+toggling the "state" of its memory section(s) as described in section 5.1.
+
 ------------------------------
 5. Logical Memory hot-add phase

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -353,10 +353,44 @@ memory_probe_store(struct class *class, struct class_attribute *attr,
 }
 static CLASS_ATTR(probe, S_IWUSR, NULL, memory_probe_store);

+static ssize_t
+memory_add_node_store(struct class *class, struct class_attribute *attr,
+		      const char *buf, size_t count)
+{
+	nodemask_t mask;
+	u64 start, size;
+	char *p;
+	int nid;
+	int ret;
+
+	size = memparse(buf, &p);
+	if (size < (PAGES_PER_SECTION << PAGE_SHIFT))
+		return -EINVAL;
+	if (*p != '@')
+		return -EINVAL;
+
+	start = simple_strtoull(p + 1, NULL, 0);
+
+	nodes_andnot(mask, node_possible_map, node_online_map);
+	nid = first_node(mask);
+	if (nid == MAX_NUMNODES)
+		return -EINVAL;
+
+	ret = add_memory(nid, start, size);
+	return ret ? ret : count;
+}
+static CLASS_ATTR(add_node, S_IWUSR, NULL, memory_add_node_store);
+
 static int memory_probe_init(void)
 {
-	return sysfs_create_file(&memory_sysdev_class.kset.kobj,
+	int err;
+
+	err = sysfs_create_file(&memory_sysdev_class.kset.kobj,
 				&class_attr_probe.attr);
+	if (err)
+		return err;
+	return sysfs_create_file(&memory_sysdev_class.kset.kobj,
+				&class_attr_add_node.attr);
 }
 #else
 static inline int memory_probe_init(void)