2009-11-18 00:19:33

by David Rientjes

[permalink] [raw]
Subject: [patch -mm] mm: slab allocate memory section nodemask for large systems

Nodemasks should not be allocated on the stack for large systems (when it
is larger than 256 bytes) since there is a threat of overflow.

This patch causes the unregister_mem_sect_under_nodes() nodemask to be
allocated on the stack for smaller systems and be allocated by slab for
larger systems.

GFP_KERNEL is used since remove_memory_block() can block.

Cc: Gary Hade <[email protected]>
Cc: Badari Pulavarty <[email protected]>
Cc: Alex Chiang <[email protected]>
Signed-off-by: David Rientjes <[email protected]>
---
drivers/base/node.c | 13 +++++++++----
1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -363,12 +363,16 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
/* unregister memory section under all nodes that it spans */
int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
{
- nodemask_t unlinked_nodes;
+ NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
unsigned long pfn, sect_start_pfn, sect_end_pfn;

- if (!mem_blk)
+ if (!mem_blk) {
+ NODEMASK_FREE(unlinked_nodes);
return -EFAULT;
- nodes_clear(unlinked_nodes);
+ }
+ if (!unlinked_nodes)
+ return -ENOMEM;
+ nodes_clear(*unlinked_nodes);
sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
@@ -379,13 +383,14 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
continue;
if (!node_online(nid))
continue;
- if (node_test_and_set(nid, unlinked_nodes))
+ if (node_test_and_set(nid, *unlinked_nodes))
continue;
sysfs_remove_link(&node_devices[nid].sysdev.kobj,
kobject_name(&mem_blk->sysdev.kobj));
sysfs_remove_link(&mem_blk->sysdev.kobj,
kobject_name(&node_devices[nid].sysdev.kobj));
}
+ NODEMASK_FREE(unlinked_nodes);
return 0;
}


2009-11-18 18:35:30

by Gary Hade

[permalink] [raw]
Subject: Re: [patch -mm] mm: slab allocate memory section nodemask for large systems

Hi David,

On Tue, Nov 17, 2009 at 04:19:30PM -0800, David Rientjes wrote:
> Nodemasks should not be allocated on the stack for large systems (when it
> is larger than 256 bytes) since there is a threat of overflow.
>
> This patch causes the unregister_mem_sect_under_nodes() nodemask to be
> allocated on the stack for smaller systems and be allocated by slab for
> larger systems.

I notice that there are many other functions that always allocate
nodemask_t objects on the stack. In addition to several that add
a single instance to the stack, cpuset_attach() in kernel/cpuset.c
adds 2 instances and all that are created by using SYSCALL_DEFINE4()
in mm/mempolicy.c add 3 instances. Are there plans to correct the
other functions as well or is there something about
unregister_mem_sect_under_nodes() that makes it more likely to
cause stack overflows than the others?

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc