2012-05-04 18:55:45

by Konrad Rzeszutek Wilk

Subject: [RFC PATCH] Expand memblock=debug to provide a bit more details (v1).

While trying to track down some memory allocation issues, I realized that
memblock=debug was giving some information, but for guests with 256GB or
so the majority of it was just:

memblock_reserve: [0x00003efeeea000-0x00003efeeeb000] __alloc_memory_core_early+0x5c/0x64

which really didn't tell me that much. With these patches I know it is:

memblock_reserve: [0x00003ffe724000-0x00003ffe725000] (4kB) vmemmap_pmd_populate+0x4b/0xa2

.. which isn't really that useful for the problem I was tracking down, but
it does help in figuring out which routines are using memblock.

Please see the patches - not sure what is in the future for memblock.c
so if they are running afoul of some future grand plans - I can rebase them.

Thanks!


2012-05-04 18:55:46

by Konrad Rzeszutek Wilk

Subject: [PATCH 2/2] bootmem/sparsemem: Have a new __alloc_bootmem_node_high

called "__alloc_bootmem_node_high_caller", which allows passing in
the IP of the caller. The particular user is sparse vmemmap:

memblock_reserve: [0x0000003fafb000-0x0000003fefb000] (4096kB) sparse_init+0x25/0x272
memblock_reserve: [0x0000003fafaf00-0x0000003fafafd8] (0kB) sparse_early_usemaps_alloc_node+0x34/0x7d
memblock_reserve: [0x0000003f6faf00-0x0000003fafaf00] (4096kB) sparse_init+0x104/0x272
-memblock_reserve: [0x0000003e400000-0x0000003f600000] (18432kB) sparse_mem_maps_populate_node+0x46/0x138
-memblock_reserve: [0x0000003f6f9000-0x0000003f6fa000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f8000-0x0000003f6f9000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f7000-0x0000003f6f8000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f6000-0x0000003f6f7000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f5000-0x0000003f6f6000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f4000-0x0000003f6f5000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f3000-0x0000003f6f4000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f2000-0x0000003f6f3000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f1000-0x0000003f6f2000] (4kB) vmemmap_alloc_block+0xde/0xe3
-memblock_reserve: [0x0000003f6f0000-0x0000003f6f1000] (4kB) vmemmap_alloc_block+0xde/0xe3
- memblock_free: [0x0000003f3c0000-0x0000003f600000] (2304kB) sparse_mem_maps_populate_node+0x113/0x138
+memblock_reserve: [0x0000003e400000-0x0000003f600000] (18432kB) sparse_mem_maps_populate_node+0x0/0x13f
+memblock_reserve: [0x0000003f6f9000-0x0000003f6fa000] (4kB) vmemmap_pgd_populate+0x2b/0x82
+memblock_reserve: [0x0000003f6f8000-0x0000003f6f9000] (4kB) vmemmap_pud_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f7000-0x0000003f6f8000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f6000-0x0000003f6f7000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f5000-0x0000003f6f6000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f4000-0x0000003f6f5000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f3000-0x0000003f6f4000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f2000-0x0000003f6f3000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f1000-0x0000003f6f2000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+memblock_reserve: [0x0000003f6f0000-0x0000003f6f1000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
+ memblock_free: [0x0000003f3c0000-0x0000003f600000] (2304kB) sparse_mem_maps_populate_node+0x11a/0x13f
memblock_free: [0x0000003f6faf00-0x0000003fafaf00] (4096kB) sparse_init+0x24e/0x272
memblock_free: [0x0000003fafb000-0x0000003fefb000] (4096kB) sparse_init+0x263/0x272

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
include/linux/bootmem.h | 5 +++++
mm/bootmem.c | 7 ++++++-
mm/nobootmem.c | 6 ++++++
mm/sparse-vmemmap.c | 9 +++++----
4 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 66d3e95..4c4ed3b 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -87,6 +87,11 @@ void *__alloc_bootmem_node_high(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal);
+void *__alloc_bootmem_node_high_caller(pg_data_t *pgdat,
+ unsigned long size,
+ unsigned long align,
+ unsigned long goal,
+ void *caller);
extern void *__alloc_bootmem_node_nopanic(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 0131170..89c792b 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -753,7 +753,12 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
return __alloc_bootmem_node(pgdat, size, align, goal);

}
-
+void * __init __alloc_bootmem_node_high_caller(pg_data_t *pgdat, unsigned long size,
+ unsigned long align, unsigned long goal,
+ void *caller)
+{
+ return __alloc_bootmem_node_high(pgdat, size, align, goal);
+}
#ifdef CONFIG_SPARSEMEM
/**
* alloc_bootmem_section - allocate boot memory from a specific section
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index fe9b251..0acc38e 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -326,6 +326,12 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
{
return ____alloc_bootmem_node(pgdat, size, align, goal, (void *)_RET_IP_);
}
+void * __init __alloc_bootmem_node_high_caller(pg_data_t *pgdat, unsigned long size,
+ unsigned long align, unsigned long goal,
+ void *caller)
+{
+ return ____alloc_bootmem_node(pgdat, size, align, goal, caller);
+}

#ifdef CONFIG_SPARSEMEM
/**
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 1b7e22a..00e3b2a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -38,9 +38,10 @@
static void * __init_refok __earlyonly_bootmem_alloc(int node,
unsigned long size,
unsigned long align,
- unsigned long goal)
+ unsigned long goal,
+ void *caller)
{
- return __alloc_bootmem_node_high(NODE_DATA(node), size, align, goal);
+ return __alloc_bootmem_node_high_caller(NODE_DATA(node), size, align, goal, caller);
}

static void *vmemmap_buf;
@@ -63,7 +64,7 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
return NULL;
} else
return __earlyonly_bootmem_alloc(node, size, size,
- __pa(MAX_DMA_ADDRESS));
+ __pa(MAX_DMA_ADDRESS), (void *)_RET_IP_);
}

/* need to make sure size is all the same during early stage */
@@ -195,7 +196,7 @@ void __init sparse_mem_maps_populate_node(struct page **map_map,

size = ALIGN(size, PMD_SIZE);
vmemmap_buf_start = __earlyonly_bootmem_alloc(nodeid, size * map_count,
- PMD_SIZE, __pa(MAX_DMA_ADDRESS));
+ PMD_SIZE, __pa(MAX_DMA_ADDRESS), (void *)_THIS_IP_);

if (vmemmap_buf_start) {
vmemmap_buf = vmemmap_buf_start;
--
1.7.7.5

2012-05-04 18:56:18

by Konrad Rzeszutek Wilk

Subject: [PATCH 1/2] memblock: Add _THIS_IP off the caller to memblock debug statements and size in kB.

-memblock_reserve: [0x0000003fefb000-0x0000003fefc000] __alloc_memory_core_early+0x5c/0x64
-memblock_reserve: [0x0000003fafb000-0x0000003fefb000] __alloc_memory_core_early+0x5c/0x64
-memblock_reserve: [0x0000003fafaf00-0x0000003fafafd8] __alloc_memory_core_early+0x5c/0x64
-memblock_reserve: [0x0000003f6faf00-0x0000003fafaf00] __alloc_memory_core_early+0x5c/0x64
-memblock_reserve: [0x0000003e400000-0x0000003f600000] __alloc_memory_core_early+0x5c/0x64
.. snip..
- memblock_free: [0x0000003f3c0000-0x0000003f600000] free_bootmem+0x9/0xb
- memblock_free: [0x0000003f6faf00-0x0000003fafaf00] free_bootmem+0x9/0xb
- memblock_free: [0x0000003fafb000-0x0000003fefb000] free_bootmem+0x9/0xb
+memblock_reserve: [0x0000003fefb000-0x0000003fefc000] (4kB) __alloc_memory_core_early+0x65/0x70
+memblock_reserve: [0x0000003fafb000-0x0000003fefb000] (4096kB) __alloc_memory_core_early+0x65/0x70
+memblock_reserve: [0x0000003fafaf00-0x0000003fafafd8] (0kB) __alloc_memory_core_early+0x65/0x70
+memblock_reserve: [0x0000003f6faf00-0x0000003fafaf00] (4096kB) __alloc_memory_core_early+0x65/0x70
+memblock_reserve: [0x0000003e400000-0x0000003f600000] (18432kB) __alloc_memory_core_early+0x65/0x70
+memblock_reserve: [0x0000003f6f9000-0x0000003f6fa000] (4kB) __alloc_memory_core_early+0x65/0x70
.. snip..
+ memblock_free: [0x0000003f3c0000-0x0000003f600000] (2304kB) free_bootmem+0xd/0xf
+ memblock_free: [0x0000003f6faf00-0x0000003fafaf00] (4096kB) free_bootmem+0xd/0xf
+ memblock_free: [0x0000003fafb000-0x0000003fefb000] (4096kB) free_bootmem+0xd/0xf

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
include/linux/memblock.h | 6 +++-
mm/memblock.c | 14 +++++++-----
mm/nobootmem.c | 50 ++++++++++++++++++++++++++-------------------
3 files changed, 41 insertions(+), 29 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index a6bb102..2a1ec82 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -57,8 +57,10 @@ void memblock_allow_resize(void);
int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
int memblock_add(phys_addr_t base, phys_addr_t size);
int memblock_remove(phys_addr_t base, phys_addr_t size);
-int memblock_free(phys_addr_t base, phys_addr_t size);
-int memblock_reserve(phys_addr_t base, phys_addr_t size);
+int __memblock_free(phys_addr_t base, phys_addr_t size, void *caller);
+#define memblock_free(base, size) __memblock_free(base, size, (void *)_RET_IP_)
+int __memblock_reserve(phys_addr_t base, phys_addr_t size, void *caller);
+#define memblock_reserve(base, size) __memblock_reserve(base, size, (void *)_RET_IP_)

#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
diff --git a/mm/memblock.c b/mm/memblock.c
index a44eab3..3e97b07 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -502,24 +502,26 @@ int __init_memblock memblock_remove(phys_addr_t base, phys_addr_t size)
return __memblock_remove(&memblock.memory, base, size);
}

-int __init_memblock memblock_free(phys_addr_t base, phys_addr_t size)
+int __init_memblock __memblock_free(phys_addr_t base, phys_addr_t size, void *caller)
{
- memblock_dbg(" memblock_free: [%#016llx-%#016llx] %pF\n",
+ memblock_dbg(" memblock_free: [%#016llx-%#016llx] (%lukB) %pF\n",
(unsigned long long)base,
(unsigned long long)base + size,
- (void *)_RET_IP_);
+ (unsigned long)(size >> 10),
+ caller);

return __memblock_remove(&memblock.reserved, base, size);
}

-int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)
+int __init_memblock __memblock_reserve(phys_addr_t base, phys_addr_t size, void *caller)
{
struct memblock_type *_rgn = &memblock.reserved;

- memblock_dbg("memblock_reserve: [%#016llx-%#016llx] %pF\n",
+ memblock_dbg("memblock_reserve: [%#016llx-%#016llx] (%lukB) %pF\n",
(unsigned long long)base,
(unsigned long long)base + size,
- (void *)_RET_IP_);
+ (unsigned long)(size >> 10),
+ caller);

return memblock_add_region(_rgn, base, size, MAX_NUMNODES);
}
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index e53bb8a..fe9b251 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -16,6 +16,7 @@
#include <linux/kmemleak.h>
#include <linux/range.h>
#include <linux/memblock.h>
+#include <linux/kernel.h>

#include <asm/bug.h>
#include <asm/io.h>
@@ -33,7 +34,7 @@ unsigned long min_low_pfn;
unsigned long max_pfn;

static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
- u64 goal, u64 limit)
+ u64 goal, u64 limit, void *caller)
{
void *ptr;
u64 addr;
@@ -47,7 +48,7 @@ static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,

ptr = phys_to_virt(addr);
memset(ptr, 0, size);
- memblock_reserve(addr, size);
+ __memblock_reserve(addr, size, caller);
/*
* The min_count is set to 0 so that bootmem allocated blocks
* are never reported as leaks.
@@ -175,7 +176,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
unsigned long size)
{
kmemleak_free_part(__va(physaddr), size);
- memblock_free(physaddr, size);
+ __memblock_free(physaddr, size, (void *)_RET_IP_);
}

/**
@@ -190,13 +191,14 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
void __init free_bootmem(unsigned long addr, unsigned long size)
{
kmemleak_free_part(__va(addr), size);
- memblock_free(addr, size);
+ __memblock_free(addr, size, (void *)_RET_IP_);
}

static void * __init ___alloc_bootmem_nopanic(unsigned long size,
unsigned long align,
unsigned long goal,
- unsigned long limit)
+ unsigned long limit,
+ void *caller)
{
void *ptr;

@@ -205,7 +207,7 @@ static void * __init ___alloc_bootmem_nopanic(unsigned long size,

restart:

- ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align, goal, limit);
+ ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align, goal, limit, caller);

if (ptr)
return ptr;
@@ -236,13 +238,14 @@ void * __init __alloc_bootmem_nopanic(unsigned long size, unsigned long align,
{
unsigned long limit = -1UL;

- return ___alloc_bootmem_nopanic(size, align, goal, limit);
+ return ___alloc_bootmem_nopanic(size, align, goal, limit, (void *)_RET_IP_);
}

static void * __init ___alloc_bootmem(unsigned long size, unsigned long align,
- unsigned long goal, unsigned long limit)
+ unsigned long goal, unsigned long limit,
+ void *caller)
{
- void *mem = ___alloc_bootmem_nopanic(size, align, goal, limit);
+ void *mem = ___alloc_bootmem_nopanic(size, align, goal, limit, caller);

if (mem)
return mem;
@@ -271,8 +274,7 @@ void * __init __alloc_bootmem(unsigned long size, unsigned long align,
unsigned long goal)
{
unsigned long limit = -1UL;
-
- return ___alloc_bootmem(size, align, goal, limit);
+ return ___alloc_bootmem(size, align, goal, limit, (void *)_RET_IP_);
}

/**
@@ -290,8 +292,9 @@ void * __init __alloc_bootmem(unsigned long size, unsigned long align,
*
* The function panics if the request can not be satisfied.
*/
-void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
- unsigned long align, unsigned long goal)
+void * __init ____alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
+ unsigned long align, unsigned long goal,
+ void *caller)
{
void *ptr;

@@ -300,23 +303,28 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,

again:
ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
- goal, -1ULL);
+ goal, -1ULL, caller);
if (ptr)
return ptr;

ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align,
- goal, -1ULL);
+ goal, -1ULL, caller);
if (!ptr && goal) {
goal = 0;
goto again;
}
return ptr;
}
+void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
+ unsigned long align, unsigned long goal)
+{
+ return ____alloc_bootmem_node(pgdat, size, align, goal, (void *)_RET_IP_);
+}

void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
unsigned long align, unsigned long goal)
{
- return __alloc_bootmem_node(pgdat, size, align, goal);
+ return ____alloc_bootmem_node(pgdat, size, align, goal, (void *)_RET_IP_);
}

#ifdef CONFIG_SPARSEMEM
@@ -337,7 +345,7 @@ void * __init alloc_bootmem_section(unsigned long size,
limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;

return __alloc_memory_core_early(early_pfn_to_nid(pfn), size,
- SMP_CACHE_BYTES, goal, limit);
+ SMP_CACHE_BYTES, goal, limit, (void *)_RET_IP_);
}
#endif

@@ -350,7 +358,7 @@ void * __init __alloc_bootmem_node_nopanic(pg_data_t *pgdat, unsigned long size,
return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);

ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
- goal, -1ULL);
+ goal, -1ULL, (void *)_RET_IP_);
if (ptr)
return ptr;

@@ -377,7 +385,7 @@ void * __init __alloc_bootmem_node_nopanic(pg_data_t *pgdat, unsigned long size,
void * __init __alloc_bootmem_low(unsigned long size, unsigned long align,
unsigned long goal)
{
- return ___alloc_bootmem(size, align, goal, ARCH_LOW_ADDRESS_LIMIT);
+ return ___alloc_bootmem(size, align, goal, ARCH_LOW_ADDRESS_LIMIT, (void *)_RET_IP_);
}

/**
@@ -404,10 +412,10 @@ void * __init __alloc_bootmem_low_node(pg_data_t *pgdat, unsigned long size,
return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);

ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
- goal, ARCH_LOW_ADDRESS_LIMIT);
+ goal, ARCH_LOW_ADDRESS_LIMIT, (void *)_RET_IP_);
if (ptr)
return ptr;

return __alloc_memory_core_early(MAX_NUMNODES, size, align,
- goal, ARCH_LOW_ADDRESS_LIMIT);
+ goal, ARCH_LOW_ADDRESS_LIMIT, (void *)_RET_IP_);
}
--
1.7.7.5

2012-05-04 19:23:01

by Yinghai Lu

Subject: Re: [RFC PATCH] Expand memblock=debug to provide a bit more details (v1).

On Fri, May 4, 2012 at 11:49 AM, Konrad Rzeszutek Wilk
<[email protected]> wrote:
> While trying to track down some memory allocation issues, I realized that
> memblock=debug was giving some information, but for guests with 256GB or
> so the majority of it was just:
>
> memblock_reserve: [0x00003efeeea000-0x00003efeeeb000] __alloc_memory_core_early+0x5c/0x64
>
> which really didn't tell me that much. With these patches I know it is:
>
> memblock_reserve: [0x00003ffe724000-0x00003ffe725000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
>
> .. which isn't really that useful for the problem I was tracking down, but
> it does help in figuring out which routines are using memblock.
>

that RET_IP is not very helpful for debugging.

Actually I have local debug patch for memblock. please check if that
is going to help debugging.

Thanks

Yinghai


Attachments:
nobootmem_name_1.patch (12.83 kB)
nobootmem_name_2.patch (21.35 kB)

2012-05-04 19:30:51

by Konrad Rzeszutek Wilk

Subject: Re: [RFC PATCH] Expand memblock=debug to provide a bit more details (v1).

On Fri, May 04, 2012 at 12:22:58PM -0700, Yinghai Lu wrote:
> On Fri, May 4, 2012 at 11:49 AM, Konrad Rzeszutek Wilk
> <[email protected]> wrote:
> > While trying to track down some memory allocation issues, I realized that
> > memblock=debug was giving some information, but for guests with 256GB or
> > so the majority of it was just:
> >
> > memblock_reserve: [0x00003efeeea000-0x00003efeeeb000] __alloc_memory_core_early+0x5c/0x64
> >
> > which really didn't tell me that much. With these patches I know it is:
> >
> > memblock_reserve: [0x00003ffe724000-0x00003ffe725000] (4kB) vmemmap_pmd_populate+0x4b/0xa2
> >
> > .. which isn't really that useful for the problem I was tracking down, but
> > it does help in figuring out which routines are using memblock.
> >
>
> that RET_IP is not very helpful for debugging.

Is there a better way of doing it that is automatic?
>
> Actually I have local debug patch for memblock. please check if that
> is going to help debugging.
>
> Thanks
>
> Yinghai


2012-05-08 17:35:57

by H. Peter Anvin

Subject: Re: [RFC PATCH] Expand memblock=debug to provide a bit more details (v1).

On 05/04/2012 12:24 PM, Konrad Rzeszutek Wilk wrote:
>>
>> that RET_IP is not very helpful for debugging.
>
> Is there a better way of doing it that is automatic?
>>

It depends on what "it" is. You could do a full stack backtrace, or use
__builtin_return_address(N).

-hpa
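
For readers following the thread, here is a minimal userspace sketch of the
pattern being discussed: an outer wrapper captures its own return address
with __builtin_return_address(0) (a GCC/Clang builtin) and hands it to an
inner logging function, so the log names the real caller instead of the
wrapper. All demo_* names are hypothetical stand-ins and are not part of the
patches; the format string only mimics the memblock_dbg() line from the
patch, and the address is printed raw with %p because userspace has no
equivalent of the kernel's %pF symbol lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: remembers the last caller address that was logged. */
void *demo_last_caller;

/* Size in kB, rounded down, as the patch does with (size >> 10). */
unsigned long demo_size_kb(unsigned long long size)
{
	return (unsigned long)(size >> 10);
}

/*
 * Inner function: stands in for __memblock_reserve() in the patch.
 * It does not look up its own return address; it logs whatever the
 * outer wrapper passed down.
 */
__attribute__((noinline))
int demo_reserve_caller(unsigned long long base, unsigned long long size,
			void *caller)
{
	assert(caller != NULL);
	demo_last_caller = caller;
	printf("demo_reserve: [%#016llx-%#016llx] (%lukB) %p\n",
	       base, base + size, demo_size_kb(size), caller);
	return 0;
}

/*
 * Outer wrapper: stands in for the public entry point. noinline keeps
 * it a real function call, so __builtin_return_address(0) is the
 * address in whoever called demo_reserve().
 */
__attribute__((noinline))
int demo_reserve(unsigned long long base, unsigned long long size)
{
	return demo_reserve_caller(base, size, __builtin_return_address(0));
}
```

Calling demo_reserve(0x3f6f9000ULL, 4096) from main() prints one line whose
last field is an address inside main(). Levels above 0, as in
__builtin_return_address(N), walk further up the stack, but the GCC
documentation warns that nonzero levels can be unreliable or crash on some
targets, which is one reason to pass the address down explicitly as the
patches do.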