2013-04-06 13:55:39

by Jiang Liu

Subject: [PATCH v4, part3 00/15] accurately calculate memory statistic information

The original goal of this patchset is to fix the bug reported at
https://bugzilla.kernel.org/show_bug.cgi?id=53501
It has since been expanded to also consolidate common code used by memory
initialization.

This is the third part; the previous two patch sets can be found at:
http://marc.info/?l=linux-mm&m=136289696323825&w=2
http://marc.info/?l=linux-mm&m=136290291524901&w=2

This patchset applies to
git://git.cmpxchg.org/linux-mmotm.git fc374c1f9d7bdcfb851b15b86e58ac5e1f645e32
which is based on mmotm-2013-03-26-15-09.

V2->V4:
1) rebase to git://git.cmpxchg.org/linux-mmotm.git
2) fix some build warnings and other minor bugs of previous patches

We have only tested this patchset on x86 platforms, and have done basic
compilation tests using cross-compilers from ftp.kernel.org. That means
some code may not compile on some architectures, so any help testing
this patchset is welcome!

Patch 1-7:
Bugfixes and more work for part1 and part2
Patch 8-9:
Fix typo and minor bugs in mm core
Patch 10-14:
Enhance the way to manage totalram_pages, totalhigh_pages and
zone->managed_pages.
Patch 15:
Report available pages within the node as "MemTotal" for sysfs
interface /sys/.../node/nodex/meminfo

Jiang Liu (15):
mm: fix build warnings caused by free_reserved_area()
mm: enhance free_reserved_area() to support poisoning memory with
zero
mm/ARM64: kill poison_init_mem()
mm/x86: use free_reserved_area() to simplify code
mm/tile: use common help functions to free reserved pages
mm, powertv: use free_reserved_area() to simplify code
mm, acornfb: use free_reserved_area() to simplify code
mm: fix some trivial typos in comments
mm: use managed_pages to calculate default zonelist order
mm: accurately calculate zone->managed_pages for highmem zones
mm: use a dedicated lock to protect totalram_pages and
zone->managed_pages
mm: make __free_pages_bootmem() only available at boot time
mm: correctly update zone->managed_pages
mm: concentrate modification of totalram_pages into the mm core
mm: report available pages as "MemTotal" for each NUMA node

arch/alpha/kernel/sys_nautilus.c | 2 +-
arch/alpha/mm/init.c | 6 ++--
arch/alpha/mm/numa.c | 2 +-
arch/arc/mm/init.c | 2 +-
arch/arm/mm/init.c | 13 ++++----
arch/arm64/mm/init.c | 15 ++-------
arch/avr32/mm/init.c | 6 ++--
arch/blackfin/mm/init.c | 6 ++--
arch/c6x/mm/init.c | 6 ++--
arch/cris/mm/init.c | 4 +--
arch/frv/mm/init.c | 6 ++--
arch/h8300/mm/init.c | 6 ++--
arch/hexagon/mm/init.c | 3 +-
arch/ia64/mm/init.c | 4 +--
arch/m32r/mm/init.c | 6 ++--
arch/m68k/mm/init.c | 8 ++---
arch/metag/mm/init.c | 11 ++++---
arch/microblaze/mm/init.c | 6 ++--
arch/mips/mm/init.c | 2 +-
arch/mips/powertv/asic/asic_devices.c | 13 ++------
arch/mips/sgi-ip27/ip27-memory.c | 2 +-
arch/mn10300/mm/init.c | 2 +-
arch/openrisc/mm/init.c | 6 ++--
arch/parisc/mm/init.c | 8 ++---
arch/powerpc/kernel/kvm.c | 2 +-
arch/powerpc/mm/mem.c | 7 ++---
arch/s390/mm/init.c | 4 +--
arch/score/mm/init.c | 2 +-
arch/sh/mm/init.c | 6 ++--
arch/sparc/mm/init_32.c | 3 +-
arch/sparc/mm/init_64.c | 2 +-
arch/tile/mm/init.c | 9 ++----
arch/um/kernel/mem.c | 4 +--
arch/unicore32/mm/init.c | 6 ++--
arch/x86/mm/highmem_32.c | 6 ++++
arch/x86/mm/init.c | 14 ++-------
arch/x86/mm/init_32.c | 2 +-
arch/x86/mm/init_64.c | 25 +++------------
arch/xtensa/mm/init.c | 6 ++--
drivers/video/acornfb.c | 28 ++---------------
drivers/virtio/virtio_balloon.c | 8 +++--
drivers/xen/balloon.c | 23 +++-----------
include/linux/bootmem.h | 1 +
include/linux/mm.h | 17 +++++-----
include/linux/mmzone.h | 14 ++++++---
mm/bootmem.c | 41 +++++++++++++++---------
mm/hugetlb.c | 2 +-
mm/memory_hotplug.c | 33 ++++----------------
mm/nobootmem.c | 35 ++++++++++++---------
mm/page_alloc.c | 55 +++++++++++++++++++++------------
50 files changed, 222 insertions(+), 278 deletions(-)

--
1.7.9.5


2013-04-06 13:55:48

by Jiang Liu

Subject: [PATCH v4, part3 01/15] mm: fix build warnings caused by free_reserved_area()

Fix the following build warnings caused by free_reserved_area():

arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,

mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]

Signed-off-by: Jiang Liu <[email protected]>
Reported-by: Arnd Bergmann <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
arch/arm/mm/init.c | 6 ++++--
mm/page_alloc.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 9a5cdc0..7a82fcd 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -600,7 +600,8 @@ void __init mem_init(void)

#ifdef CONFIG_SA1111
/* now that our DMA memory is actually so designated, we can free it */
- free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
+ free_reserved_area((unsigned long)__va(PHYS_PFN_OFFSET),
+ (unsigned long)swapper_pg_dir, 0, NULL);
#endif

free_highpages();
@@ -728,7 +729,8 @@ void free_initmem(void)
extern char __tcm_start, __tcm_end;

poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
- free_reserved_area(&__tcm_start, &__tcm_end, 0, "TCM link");
+ free_reserved_area((unsigned long)&__tcm_start,
+ (unsigned long)&__tcm_end, 0, "TCM link");
#endif

poison_init_mem(__init_begin, __init_end - __init_begin);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4923e9..8bf7956 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5196,7 +5196,7 @@ unsigned long free_reserved_area(unsigned long start, unsigned long end,
for (pages = 0; pos < end; pos += PAGE_SIZE, pages++) {
if (poison)
memset((void *)pos, poison, PAGE_SIZE);
- free_reserved_page(virt_to_page(pos));
+ free_reserved_page(virt_to_page((void *)pos));
}

if (pages && s)
--
1.7.9.5
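
The warnings fixed above come from passing pointers to a function whose range parameters are `unsigned long`. As a rough userspace sketch of the same calling pattern, the block below uses a hypothetical stand-in, `count_whole_pages()`, and an assumed 4 KiB page size; neither name nor value comes from the patch, and it only counts pages rather than freeing anything.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */

/*
 * Hypothetical stand-in for a helper that, like free_reserved_area(),
 * takes its range as integers rather than pointers. It returns the number
 * of whole pages in [round_up(start), round_down(end)).
 */
unsigned long count_whole_pages(unsigned long start, unsigned long end)
{
    unsigned long first, last;

    assert(start <= end);
    first = (start + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
    last = end & ~(SKETCH_PAGE_SIZE - 1);
    return last > first ? (last - first) / SKETCH_PAGE_SIZE : 0;
}

/* Illustrative buffer standing in for a linker-provided region. */
static char region[3 * 4096];

unsigned long count_region_pages(void)
{
    /*
     * Passing `region` directly would trigger the "makes integer from
     * pointer without a cast" warning; the explicit casts mirror the fix.
     * This assumes unsigned long is wide enough to hold a pointer, as it
     * is in the kernel.
     */
    return count_whole_pages((unsigned long)region,
                             (unsigned long)(region + sizeof(region)));
}
```

Because the buffer may or may not start on a page boundary, `count_region_pages()` yields either 2 or 3 here.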

2013-04-06 13:55:58

by Jiang Liu

Subject: [PATCH v4, part3 02/15] mm: enhance free_reserved_area() to support poisoning memory with zero

Address more review comments from the last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This can be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch disabled memory poisoning for initmem on s390
by mistake, so restore the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().

Signed-off-by: Jiang Liu <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
---
arch/alpha/kernel/sys_nautilus.c | 2 +-
arch/alpha/mm/init.c | 4 ++--
arch/arm/mm/init.c | 8 ++++----
arch/arm64/mm/init.c | 4 ++--
arch/avr32/mm/init.c | 4 ++--
arch/blackfin/mm/init.c | 4 ++--
arch/c6x/mm/init.c | 4 ++--
arch/cris/mm/init.c | 2 +-
arch/frv/mm/init.c | 4 ++--
arch/h8300/mm/init.c | 4 ++--
arch/ia64/mm/init.c | 2 +-
arch/m32r/mm/init.c | 4 ++--
arch/m68k/mm/init.c | 4 ++--
arch/microblaze/mm/init.c | 4 ++--
arch/openrisc/mm/init.c | 4 ++--
arch/parisc/mm/init.c | 4 ++--
arch/powerpc/kernel/kvm.c | 2 +-
arch/powerpc/mm/mem.c | 2 +-
arch/s390/mm/init.c | 2 +-
arch/sh/mm/init.c | 4 ++--
arch/um/kernel/mem.c | 2 +-
arch/unicore32/mm/init.c | 4 ++--
arch/xtensa/mm/init.c | 4 ++--
include/linux/mm.h | 11 ++++++-----
mm/page_alloc.c | 2 +-
25 files changed, 48 insertions(+), 47 deletions(-)

diff --git a/arch/alpha/kernel/sys_nautilus.c b/arch/alpha/kernel/sys_nautilus.c
index a8b9d66..7f4e7bf 100644
--- a/arch/alpha/kernel/sys_nautilus.c
+++ b/arch/alpha/kernel/sys_nautilus.c
@@ -234,7 +234,7 @@ nautilus_init_pci(void)
memtop = pci_mem;
if (memtop > alpha_mv.min_mem_address) {
free_reserved_area((unsigned long)__va(alpha_mv.min_mem_address),
- (unsigned long)__va(memtop), 0, NULL);
+ (unsigned long)__va(memtop), -1, NULL);
printk("nautilus_init_pci: %ldk freed\n",
(memtop - alpha_mv.min_mem_address) >> 10);
}
diff --git a/arch/alpha/mm/init.c b/arch/alpha/mm/init.c
index 0ba85ee..9930837 100644
--- a/arch/alpha/mm/init.c
+++ b/arch/alpha/mm/init.c
@@ -319,13 +319,13 @@ mem_init(void)
void
free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
void
free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 7a82fcd..a2ab290 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -601,7 +601,7 @@ void __init mem_init(void)
#ifdef CONFIG_SA1111
/* now that our DMA memory is actually so designated, we can free it */
free_reserved_area((unsigned long)__va(PHYS_PFN_OFFSET),
- (unsigned long)swapper_pg_dir, 0, NULL);
+ (unsigned long)swapper_pg_dir, -1, NULL);
#endif

free_highpages();
@@ -730,12 +730,12 @@ void free_initmem(void)

poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
free_reserved_area((unsigned long)&__tcm_start,
- (unsigned long)&__tcm_end, 0, "TCM link");
+ (unsigned long)&__tcm_end, -1, "TCM link");
#endif

poison_init_mem(__init_begin, __init_end - __init_begin);
if (!machine_is_integrator() && !machine_is_cintegrator())
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
@@ -746,7 +746,7 @@ void free_initrd_mem(unsigned long start, unsigned long end)
{
if (!keep_initrd) {
poison_init_mem((void *)start, PAGE_ALIGN(end) - start);
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
}

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f497ca7..e58dd7f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -387,7 +387,7 @@ void __init mem_init(void)
void free_initmem(void)
{
poison_init_mem(__init_begin, __init_end - __init_begin);
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
@@ -398,7 +398,7 @@ void free_initrd_mem(unsigned long start, unsigned long end)
{
if (!keep_initrd) {
poison_init_mem((void *)start, PAGE_ALIGN(end) - start);
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
}

diff --git a/arch/avr32/mm/init.c b/arch/avr32/mm/init.c
index e66e840..871f98a 100644
--- a/arch/avr32/mm/init.c
+++ b/arch/avr32/mm/init.c
@@ -148,12 +148,12 @@ void __init mem_init(void)

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif
diff --git a/arch/blackfin/mm/init.c b/arch/blackfin/mm/init.c
index 82d01a7..e64286b 100644
--- a/arch/blackfin/mm/init.c
+++ b/arch/blackfin/mm/init.c
@@ -133,7 +133,7 @@ void __init mem_init(void)
void __init free_initrd_mem(unsigned long start, unsigned long end)
{
#ifndef CONFIG_MPU
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
#endif
}
#endif
@@ -141,7 +141,7 @@ void __init free_initrd_mem(unsigned long start, unsigned long end)
void __init_refok free_initmem(void)
{
#if defined CONFIG_RAMKERNEL && !defined CONFIG_MPU
- free_initmem_default(0);
+ free_initmem_default(-1);
if (memory_start == (unsigned long)(&__init_end))
memory_start = (unsigned long)(&__init_begin);
#endif
diff --git a/arch/c6x/mm/init.c b/arch/c6x/mm/init.c
index a9fcd89..ce39b48 100644
--- a/arch/c6x/mm/init.c
+++ b/arch/c6x/mm/init.c
@@ -77,11 +77,11 @@ void __init mem_init(void)
#ifdef CONFIG_BLK_DEV_INITRD
void __init free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

void __init free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}
diff --git a/arch/cris/mm/init.c b/arch/cris/mm/init.c
index 9ac8094..8fec263 100644
--- a/arch/cris/mm/init.c
+++ b/arch/cris/mm/init.c
@@ -65,5 +65,5 @@ mem_init(void)
void
free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}
diff --git a/arch/frv/mm/init.c b/arch/frv/mm/init.c
index dee354f..a421948 100644
--- a/arch/frv/mm/init.c
+++ b/arch/frv/mm/init.c
@@ -162,7 +162,7 @@ void __init mem_init(void)
void free_initmem(void)
{
#if defined(CONFIG_RAMKERNEL) && !defined(CONFIG_PROTECT_KERNEL)
- free_initmem_default(0);
+ free_initmem_default(-1);
#endif
} /* end free_initmem() */

@@ -173,6 +173,6 @@ void free_initmem(void)
#ifdef CONFIG_BLK_DEV_INITRD
void __init free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
} /* end free_initrd_mem() */
#endif
diff --git a/arch/h8300/mm/init.c b/arch/h8300/mm/init.c
index ff349d7..488e2a3 100644
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -161,7 +161,7 @@ void __init mem_init(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

@@ -169,7 +169,7 @@ void
free_initmem(void)
{
#ifdef CONFIG_RAMKERNEL
- free_initmem_default(0);
+ free_initmem_default(-1);
#endif
}

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index d1fe4b4..941568a 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -156,7 +156,7 @@ free_initmem (void)
{
free_reserved_area((unsigned long)ia64_imva(__init_begin),
(unsigned long)ia64_imva(__init_end),
- 0, "unused kernel");
+ -1, "unused kernel");
}

void __init
diff --git a/arch/m32r/mm/init.c b/arch/m32r/mm/init.c
index ab4cbce..58ea4d6 100644
--- a/arch/m32r/mm/init.c
+++ b/arch/m32r/mm/init.c
@@ -181,7 +181,7 @@ void __init mem_init(void)
*======================================================================*/
void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
@@ -191,6 +191,6 @@ void free_initmem(void)
*======================================================================*/
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif
diff --git a/arch/m68k/mm/init.c b/arch/m68k/mm/init.c
index 1af2ca3..75e1cbf 100644
--- a/arch/m68k/mm/init.c
+++ b/arch/m68k/mm/init.c
@@ -110,7 +110,7 @@ void __init paging_init(void)
void free_initmem(void)
{
#ifndef CONFIG_MMU_SUN3
- free_initmem_default(0);
+ free_initmem_default(-1);
#endif /* CONFIG_MMU_SUN3 */
}

@@ -202,6 +202,6 @@ void __init mem_init(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 4ec137d..53383e4 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -235,13 +235,13 @@ void __init setup_memory(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

void __init mem_init(void)
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index b3cbc67..d19950c 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -261,11 +261,11 @@ void __init mem_init(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index 157b931..27f3f88 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -532,7 +532,7 @@ void free_initmem(void)
* pages are no-longer executable */
flush_icache_range(init_begin, init_end);

- num_physpages += free_initmem_default(0);
+ num_physpages += free_initmem_default(-1);

/* set up a new led state on systems shipped LED State panel */
pdc_chassis_send_status(PDC_CHASSIS_DIRECT_BCOMPLETE);
@@ -1099,6 +1099,6 @@ void flush_tlb_all(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- num_physpages += free_reserved_area(start, end, 0, "initrd");
+ num_physpages += free_reserved_area(start, end, -1, "initrd");
}
#endif
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 6782221..4d3e37d 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -756,7 +756,7 @@ static __init void kvm_free_tmp(void)
end = (ulong)&kvm_tmp[ARRAY_SIZE(kvm_tmp)] & PAGE_MASK;

/* Free the tmp space we don't need */
- free_reserved_area(start, end, 0, NULL);
+ free_reserved_area(start, end, -1, NULL);
}

static int __init kvm_guest_init(void)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index cd76c45..2e912ca 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -408,7 +408,7 @@ void free_initmem(void)
#ifdef CONFIG_BLK_DEV_INITRD
void __init free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 0b09b23..275345e 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -164,7 +164,7 @@ void __init mem_init(void)

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(POISON_FREE_INITMEM);
}

#ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 20f9ead..31294f1 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -499,13 +499,13 @@ void __init mem_init(void)

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 9df292b..1e84189 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -244,7 +244,7 @@ void free_initmem(void)
#ifdef CONFIG_BLK_DEV_INITRD
void free_initrd_mem(unsigned long start, unsigned long end)
{
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
index 63df12d..5614b05 100644
--- a/arch/unicore32/mm/init.c
+++ b/arch/unicore32/mm/init.c
@@ -476,7 +476,7 @@ void __init mem_init(void)

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}

#ifdef CONFIG_BLK_DEV_INITRD
@@ -486,7 +486,7 @@ static int keep_initrd;
void free_initrd_mem(unsigned long start, unsigned long end)
{
if (!keep_initrd)
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}

static int __init keepinitrd_setup(char *__unused)
diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c
index bba125b..6f70647 100644
--- a/arch/xtensa/mm/init.c
+++ b/arch/xtensa/mm/init.c
@@ -214,11 +214,11 @@ extern int initrd_is_mapped;
void free_initrd_mem(unsigned long start, unsigned long end)
{
if (initrd_is_mapped)
- free_reserved_area(start, end, 0, "initrd");
+ free_reserved_area(start, end, -1, "initrd");
}
#endif

void free_initmem(void)
{
- free_initmem_default(0);
+ free_initmem_default(-1);
}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index da099bc..1f03b0e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1296,7 +1296,7 @@ extern void free_initmem(void);
/*
* Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
* into the buddy system. The freed pages will be poisoned with pattern
- * "poison" if it's non-zero.
+ * "poison" if it's within range [0, UCHAR_MAX].
* Return pages freed into the buddy system.
*/
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
@@ -1336,15 +1336,16 @@ static inline void mark_page_reserved(struct page *page)

/*
* Default method to free all the __init memory into the buddy system.
- * The freed pages will be poisoned with pattern "poison" if it is
- * non-zero. Return pages freed into the buddy system.
+ * The freed pages will be poisoned with pattern "poison" if it's within
+ * range [0, UCHAR_MAX].
+ * Return pages freed into the buddy system.
*/
static inline unsigned long free_initmem_default(int poison)
{
extern char __init_begin[], __init_end[];

- return free_reserved_area(PAGE_ALIGN((unsigned long)&__init_begin) ,
- ((unsigned long)&__init_end) & PAGE_MASK,
+ return free_reserved_area((unsigned long)&__init_begin ,
+ (unsigned long)&__init_end,
poison, "unused kernel");
}

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8bf7956..6bd697c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5194,7 +5194,7 @@ unsigned long free_reserved_area(unsigned long start, unsigned long end,
pos = start = PAGE_ALIGN(start);
end &= PAGE_MASK;
for (pages = 0; pos < end; pos += PAGE_SIZE, pages++) {
- if (poison)
+ if ((unsigned int)poison <= 0xFF)
memset((void *)pos, poison, PAGE_SIZE);
free_reserved_page(virt_to_page((void *)pos));
}
--
1.7.9.5
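
The changed check in mm/page_alloc.c means a poison value of 0 now poisons the page, while any value outside [0, UCHAR_MAX], such as -1, skips poisoning. A small userspace sketch of that test follows; `poisoned_byte_after()`, the 0xAA fill byte, and the 4 KiB page size are illustrative assumptions, not kernel code.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */
#define SKETCH_FILL      0xAA    /* arbitrary marker for "untouched" memory */

/*
 * Hypothetical helper: fill a scratch page with SKETCH_FILL, apply the
 * same range test the patch adds, and return the first byte afterwards.
 */
int poisoned_byte_after(int poison)
{
    static unsigned char page[SKETCH_PAGE_SIZE];

    memset(page, SKETCH_FILL, sizeof(page));

    /* Casting to unsigned turns negative values into large ones, so a
     * single comparison accepts exactly the values 0..255. */
    if ((unsigned int)poison <= 0xFF)
        memset(page, poison, sizeof(page));

    /* Sanity check: the whole page holds one value either way. */
    assert(page[0] == page[SKETCH_PAGE_SIZE - 1]);
    return page[0];
}
```

With this sketch, `poisoned_byte_after(-1)` leaves the marker byte in place, whereas `poisoned_byte_after(0)` zeroes the page, which is the behaviour the ARM64 conversion relies on.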

2013-04-06 13:56:06

by Jiang Liu

Subject: [PATCH v4, part3 03/15] mm/ARM64: kill poison_init_mem()

Use free_reserved_area() to poison initmem memory pages and kill
poison_init_mem() on ARM64.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/arm64/mm/init.c | 17 +++--------------
1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index e58dd7f..b87bdb8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -197,14 +197,6 @@ void __init bootmem_init(void)
max_pfn = max_low_pfn = max;
}

-/*
- * Poison init memory with an undefined instruction (0x0).
- */
-static inline void poison_init_mem(void *s, size_t count)
-{
- memset(s, 0, count);
-}
-
#ifndef CONFIG_SPARSEMEM_VMEMMAP
static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
{
@@ -386,8 +378,7 @@ void __init mem_init(void)

void free_initmem(void)
{
- poison_init_mem(__init_begin, __init_end - __init_begin);
- free_initmem_default(-1);
+ free_initmem_default(0);
}

#ifdef CONFIG_BLK_DEV_INITRD
@@ -396,10 +387,8 @@ static int keep_initrd;

void free_initrd_mem(unsigned long start, unsigned long end)
{
- if (!keep_initrd) {
- poison_init_mem((void *)start, PAGE_ALIGN(end) - start);
- free_reserved_area(start, end, -1, "initrd");
- }
+ if (!keep_initrd)
+ free_reserved_area(start, end, 0, "initrd");
}

static int __init keepinitrd_setup(char *__unused)
--
1.7.9.5

2013-04-06 13:56:18

by Jiang Liu

Subject: [PATCH v4, part3 04/15] mm/x86: use free_reserved_area() to simplify code

Use the common helper function free_reserved_area() to simplify code.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Yinghai Lu <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Jianguo Wu <[email protected]>
Cc: [email protected]
---
arch/x86/mm/init.c | 14 +++-----------
arch/x86/mm/init_64.c | 5 ++---
2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index fdc5dca..6738e1b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -477,7 +477,6 @@ int devmem_is_allowed(unsigned long pagenr)

void free_init_pages(char *what, unsigned long begin, unsigned long end)
{
- unsigned long addr;
unsigned long begin_aligned, end_aligned;

/* Make sure boundaries are page aligned */
@@ -492,8 +491,6 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
if (begin >= end)
return;

- addr = begin;
-
/*
* If debugging page accesses then do not free this memory but
* mark them not present - any buggy init-section access will
@@ -512,18 +509,13 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
set_memory_nx(begin, (end - begin) >> PAGE_SHIFT);
set_memory_rw(begin, (end - begin) >> PAGE_SHIFT);

- printk(KERN_INFO "Freeing %s: %luk freed\n", what, (end - begin) >> 10);
-
- for (; addr < end; addr += PAGE_SIZE) {
- memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
- free_reserved_page(virt_to_page(addr));
- }
+ free_reserved_area(begin, end, POISON_FREE_INITMEM, what);
#endif
}

void free_initmem(void)
{
- free_init_pages("unused kernel memory",
+ free_init_pages("unused kernel",
(unsigned long)(&__init_begin),
(unsigned long)(&__init_end));
}
@@ -549,7 +541,7 @@ void __init free_initrd_mem(unsigned long start, unsigned long end)
* - relocate_initrd()
* So here We can do PAGE_ALIGN() safely to get partial page to be freed
*/
- free_init_pages("initrd memory", start, PAGE_ALIGN(end));
+ free_init_pages("initrd", start, PAGE_ALIGN(end));
}
#endif

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index caad9a0..0c6efb8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1165,11 +1165,10 @@ void mark_rodata_ro(void)
set_memory_ro(start, (end-start) >> PAGE_SHIFT);
#endif

- free_init_pages("unused kernel memory",
+ free_init_pages("unused kernel",
(unsigned long) __va(__pa_symbol(text_end)),
(unsigned long) __va(__pa_symbol(rodata_start)));
-
- free_init_pages("unused kernel memory",
+ free_init_pages("unused kernel",
(unsigned long) __va(__pa_symbol(rodata_end)),
(unsigned long) __va(__pa_symbol(_sdata)));
}
--
1.7.9.5

2013-04-06 13:56:28

by Jiang Liu

Subject: [PATCH v4, part3 05/15] mm/tile: use common help functions to free reserved pages

Use common helper functions to free reserved pages.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: [email protected]
---
arch/tile/mm/init.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index 2749515..ccfeb3f 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -720,7 +720,7 @@ static void __init init_free_pfn_range(unsigned long start, unsigned long end)
}
init_page_count(page);
__free_pages(page, order);
- totalram_pages += count;
+ adjust_managed_page_count(page, count);

page += count;
pfn += count;
@@ -1024,16 +1024,13 @@ static void free_init_pages(char *what, unsigned long begin, unsigned long end)
pte_clear(&init_mm, addr, ptep);
continue;
}
- __ClearPageReserved(page);
- init_page_count(page);
if (pte_huge(*ptep))
BUG_ON(!kdata_huge);
else
set_pte_at(&init_mm, addr, ptep,
pfn_pte(pfn, PAGE_KERNEL));
memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
- free_page(addr);
- totalram_pages++;
+ free_reserved_page(page);
}
pr_info("Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
}
--
1.7.9.5

2013-04-06 13:56:35

by Jiang Liu

Subject: [PATCH v4, part3 06/15] mm, powertv: use free_reserved_area() to simplify code

Use the common helper function free_reserved_area() to simplify code.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/mips/powertv/asic/asic_devices.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/arch/mips/powertv/asic/asic_devices.c b/arch/mips/powertv/asic/asic_devices.c
index d38b095..9f64c23 100644
--- a/arch/mips/powertv/asic/asic_devices.c
+++ b/arch/mips/powertv/asic/asic_devices.c
@@ -529,17 +529,8 @@ EXPORT_SYMBOL(asic_resource_get);
*/
void platform_release_memory(void *ptr, int size)
{
- unsigned long addr;
- unsigned long end;
-
- addr = ((unsigned long)ptr + (PAGE_SIZE - 1)) & PAGE_MASK;
- end = ((unsigned long)ptr + size) & PAGE_MASK;
-
- for (; addr < end; addr += PAGE_SIZE) {
- ClearPageReserved(virt_to_page(__va(addr)));
- init_page_count(virt_to_page(__va(addr)));
- free_page((unsigned long)__va(addr));
- }
+ free_reserved_area((unsigned long)ptr, (unsigned long)(ptr + size),
+ -1, NULL);
}
EXPORT_SYMBOL(platform_release_memory);

--
1.7.9.5
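
The open-coded rounding removed above and the rounding done inside free_reserved_area() (PAGE_ALIGN on the start, PAGE_MASK on the end) select the same pages. The userspace sketch below compares only that arithmetic; the `SKETCH_*` macros and both function names are hypothetical, a 4 KiB page size is assumed, and the address translation and actual freeing are left out.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PAGE_ALIGN(x) (((x) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/* Rounding as written in the removed platform_release_memory() loop. */
unsigned long pages_open_coded(unsigned long ptr, unsigned long size)
{
    unsigned long addr = (ptr + (SKETCH_PAGE_SIZE - 1)) & SKETCH_PAGE_MASK;
    unsigned long end = (ptr + size) & SKETCH_PAGE_MASK;
    unsigned long pages = 0;

    for (; addr < end; addr += SKETCH_PAGE_SIZE)
        pages++;
    return pages;
}

/* Rounding as done at the top of free_reserved_area(). */
unsigned long pages_via_helper(unsigned long start, unsigned long end)
{
    unsigned long pos, pages;

    assert(start <= end);
    pos = SKETCH_PAGE_ALIGN(start);
    end &= SKETCH_PAGE_MASK;
    for (pages = 0; pos < end; pos += SKETCH_PAGE_SIZE)
        pages++;
    return pages;
}
```

For any `ptr` and `size`, `pages_open_coded(ptr, size)` and `pages_via_helper(ptr, ptr + size)` should agree, which is why callers no longer need to round the range themselves.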

2013-04-06 13:56:45

by Jiang Liu

Subject: [PATCH v4, part3 07/15] mm, acornfb: use free_reserved_area() to simplify code

Use the common helper function free_reserved_area() to simplify code.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Florian Tobias Schandinat <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/video/acornfb.c | 28 ++--------------------------
1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/video/acornfb.c b/drivers/video/acornfb.c
index 6488a73..344f2bb 100644
--- a/drivers/video/acornfb.c
+++ b/drivers/video/acornfb.c
@@ -1188,32 +1188,8 @@ static int acornfb_detect_monitortype(void)
static inline void
free_unused_pages(unsigned int virtual_start, unsigned int virtual_end)
{
- int mb_freed = 0;
-
- /*
- * Align addresses
- */
- virtual_start = PAGE_ALIGN(virtual_start);
- virtual_end = PAGE_ALIGN(virtual_end);
-
- while (virtual_start < virtual_end) {
- struct page *page;
-
- /*
- * Clear page reserved bit,
- * set count to 1, and free
- * the page.
- */
- page = virt_to_page(virtual_start);
- ClearPageReserved(page);
- init_page_count(page);
- free_page(virtual_start);
-
- virtual_start += PAGE_SIZE;
- mb_freed += PAGE_SIZE / 1024;
- }
-
- printk("acornfb: freed %dK memory\n", mb_freed);
+ free_reserved_area(virtual_start, PAGE_ALIGN(virtual_end),
+ -1, "acornfb");
}

static int acornfb_probe(struct platform_device *dev)
--
1.7.9.5

2013-04-06 13:56:55

by Jiang Liu

Subject: [PATCH v4, part3 08/15] mm: fix some trivial typos in comments

Fix some trivial typos in comments.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
mm/memory_hotplug.c | 2 +-
mm/page_alloc.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 57decb2..a5b8fde 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -309,7 +309,7 @@ static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
/* can't move pfns which are higher than @z2 */
if (end_pfn > zone_end_pfn(z2))
goto out_fail;
- /* the move out part mast at the left most of @z2 */
+ /* the move out part must at the left most of @z2 */
if (start_pfn > z2->zone_start_pfn)
goto out_fail;
/* must included/overlap */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6bd697c..c3c3eda 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2863,7 +2863,7 @@ EXPORT_SYMBOL(free_pages_exact);
* nr_free_zone_pages() counts the number of counts pages which are beyond the
* high watermark within all zones at or below a given zone index. For each
* zone, the number of pages is calculated as:
- * present_pages - high_pages
+ * managed_pages - high_pages
*/
static unsigned long nr_free_zone_pages(int offset)
{
--
1.7.9.5

2013-04-06 13:57:08

by Jiang Liu

Subject: [PATCH v4, part3 09/15] mm: use managed_pages to calculate default zonelist order

Use zone->managed_pages instead of zone->present_pages to calculate the
default zonelist order, because managed_pages counts the pages actually
managed by the buddy system, i.e. the allocatable pages.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c3c3eda..1f94380 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3449,8 +3449,8 @@ static int default_zonelist_order(void)
z = &NODE_DATA(nid)->node_zones[zone_type];
if (populated_zone(z)) {
if (zone_type < ZONE_NORMAL)
- low_kmem_size += z->present_pages;
- total_size += z->present_pages;
+ low_kmem_size += z->managed_pages;
+ total_size += z->managed_pages;
} else if (zone_type == ZONE_NORMAL) {
/*
* If any node has only lowmem, then node order
--
1.7.9.5

2013-04-06 13:57:20

by Jiang Liu

Subject: [PATCH v4, part3 10/15] mm: accurately calculate zone->managed_pages for highmem zones

Commit "mm: introduce new field 'managed_pages' to struct zone" assumes
that all highmem pages will be freed into the buddy system by function
mem_init(). But that's not always true: some architectures may reserve
some highmem pages during boot. For example, PPC may allocate highmem
pages for gigantic HugeTLB pages, and several architectures have code
that checks the PageReserved flag to exclude highmem pages allocated
during boot when freeing highmem pages into the buddy system.

So treat highmem pages in the same way as normal pages, that is to:
1) reset zone->managed_pages to zero in mem_init().
2) recalculate managed_pages when freeing pages into the buddy system.
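
As a rough user-space illustration of the two steps above (not kernel
code; struct toy_zone, struct toy_page and the toy_* helpers are made
up for this sketch), resetting first and then counting only the pages
that are really freed gives a managed_pages value that excludes pages
reserved during boot:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct zone and struct page. */
struct toy_zone {
	unsigned long present_pages;	/* pages physically present */
	unsigned long managed_pages;	/* pages handed to the allocator */
};

struct toy_page {
	struct toy_zone *zone;
	bool reserved;			/* models the PageReserved flag */
};

/* Step 1: drop the boot-time estimate before any page is freed. */
void toy_reset_managed_pages(struct toy_zone *zones, size_t nr_zones)
{
	for (size_t i = 0; i < nr_zones; i++)
		zones[i].managed_pages = 0;
}

/* Step 2: count a page only when it is really freed to the allocator. */
unsigned long toy_free_boot_pages(struct toy_page *pages, size_t nr_pages)
{
	unsigned long freed = 0;

	for (size_t i = 0; i < nr_pages; i++) {
		assert(pages[i].zone != NULL);
		if (pages[i].reserved)
			continue;	/* reserved during boot, stays unmanaged */
		pages[i].zone->managed_pages++;
		freed++;
	}
	return freed;
}
```

In this model a zone with four present pages, two of them reserved,
ends up with managed_pages == 2 while present_pages stays at 4.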

Signed-off-by: Jiang Liu <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Tejun Heo <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Kamezawa Hiroyuki <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/metag/mm/init.c | 6 ++++++
arch/x86/mm/highmem_32.c | 6 ++++++
include/linux/bootmem.h | 1 +
mm/bootmem.c | 32 ++++++++++++++++++--------------
mm/nobootmem.c | 30 ++++++++++++++++--------------
mm/page_alloc.c | 1 +
6 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/arch/metag/mm/init.c b/arch/metag/mm/init.c
index d05b845..58a36f3 100644
--- a/arch/metag/mm/init.c
+++ b/arch/metag/mm/init.c
@@ -380,6 +380,12 @@ void __init mem_init(void)

#ifdef CONFIG_HIGHMEM
unsigned long tmp;
+
+ /*
+ * Explicitly reset zone->managed_pages because highmem pages are
+ * freed before calling free_all_bootmem_node();
+ */
+ reset_all_zones_managed_pages();
for (tmp = highstart_pfn; tmp < highend_pfn; tmp++)
free_highmem_page(pfn_to_page(tmp));
num_physpages += totalhigh_pages;
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 252b8f5..4500142 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -1,6 +1,7 @@
#include <linux/highmem.h>
#include <linux/module.h>
#include <linux/swap.h> /* for totalram_pages */
+#include <linux/bootmem.h>

void *kmap(struct page *page)
{
@@ -121,6 +122,11 @@ void __init set_highmem_pages_init(void)
struct zone *zone;
int nid;

+ /*
+ * Explicitly reset zone->managed_pages because set_highmem_pages_init()
+ * is invoked before free_all_bootmem()
+ */
+ reset_all_zones_managed_pages();
for_each_zone(zone) {
unsigned long zone_start_pfn, zone_end_pfn;

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index cdc3bab..b585f57 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -47,6 +47,7 @@ extern unsigned long init_bootmem(unsigned long addr, unsigned long memend);
extern unsigned long free_low_memory_core_early(int nodeid);
extern unsigned long free_all_bootmem_node(pg_data_t *pgdat);
extern unsigned long free_all_bootmem(void);
+extern void reset_all_zones_managed_pages(void);

extern void free_bootmem_node(pg_data_t *pgdat,
unsigned long addr,
diff --git a/mm/bootmem.c b/mm/bootmem.c
index d937107..39e6156 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -242,20 +242,26 @@ static unsigned long __init free_all_bootmem_core(bootmem_data_t *bdata)
return count;
}

-static void reset_node_lowmem_managed_pages(pg_data_t *pgdat)
+static int reset_managed_pages_done __initdata;
+
+static inline void __init reset_node_managed_pages(pg_data_t *pgdat)
{
struct zone *z;

- /*
- * In free_area_init_core(), highmem zone's managed_pages is set to
- * present_pages, and bootmem allocator doesn't allocate from highmem
- * zones. So there's no need to recalculate managed_pages because all
- * highmem pages will be managed by the buddy system. Here highmem
- * zone also includes highmem movable zone.
- */
+ if (reset_managed_pages_done)
+ return;
+
for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
- if (!is_highmem(z))
- z->managed_pages = 0;
+ z->managed_pages = 0;
+}
+
+void __init reset_all_zones_managed_pages(void)
+{
+ struct pglist_data *pgdat;
+
+ for_each_online_pgdat(pgdat)
+ reset_node_managed_pages(pgdat);
+ reset_managed_pages_done = 1;
}

/**
@@ -267,7 +273,7 @@ static void reset_node_lowmem_managed_pages(pg_data_t *pgdat)
unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
{
register_page_bootmem_info_node(pgdat);
- reset_node_lowmem_managed_pages(pgdat);
+ reset_node_managed_pages(pgdat);
return free_all_bootmem_core(pgdat->bdata);
}

@@ -280,10 +286,8 @@ unsigned long __init free_all_bootmem(void)
{
unsigned long total_pages = 0;
bootmem_data_t *bdata;
- struct pglist_data *pgdat;

- for_each_online_pgdat(pgdat)
- reset_node_lowmem_managed_pages(pgdat);
+ reset_all_zones_managed_pages();

list_for_each_entry(bdata, &bdata_list, list)
total_pages += free_all_bootmem_core(bdata);
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 5e07d36..fa584ff 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -137,20 +137,25 @@ unsigned long __init free_low_memory_core_early(int nodeid)
return count;
}

-static void reset_node_lowmem_managed_pages(pg_data_t *pgdat)
+static int reset_managed_pages_done __initdata;
+
+static inline void __init reset_node_managed_pages(pg_data_t *pgdat)
{
struct zone *z;

- /*
- * In free_area_init_core(), highmem zone's managed_pages is set to
- * present_pages, and bootmem allocator doesn't allocate from highmem
- * zones. So there's no need to recalculate managed_pages because all
- * highmem pages will be managed by the buddy system. Here highmem
- * zone also includes highmem movable zone.
- */
+ if (reset_managed_pages_done)
+ return;
for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
- if (!is_highmem(z))
- z->managed_pages = 0;
+ z->managed_pages = 0;
+}
+
+void __init reset_all_zones_managed_pages(void)
+{
+ struct pglist_data *pgdat;
+
+ for_each_online_pgdat(pgdat)
+ reset_node_managed_pages(pgdat);
+ reset_managed_pages_done = 1;
}

/**
@@ -160,10 +165,7 @@ static void reset_node_lowmem_managed_pages(pg_data_t *pgdat)
*/
unsigned long __init free_all_bootmem(void)
{
- struct pglist_data *pgdat;
-
- for_each_online_pgdat(pgdat)
- reset_node_lowmem_managed_pages(pgdat);
+ reset_all_zones_managed_pages();

/*
* We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1f94380..45be58c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5211,6 +5211,7 @@ void free_highmem_page(struct page *page)
{
__free_reserved_page(page);
totalram_pages++;
+ page_zone(page)->managed_pages++;
totalhigh_pages++;
}
#endif
--
1.7.9.5

2013-04-06 13:57:31

by Jiang Liu

Subject: [PATCH v4, part3 11/15] mm: use a dedicated lock to protect totalram_pages and zone->managed_pages

Currently lock_memory_hotplug()/unlock_memory_hotplug() are used to
protect totalram_pages and zone->managed_pages. Other than the memory
hotplug driver, totalram_pages and zone->managed_pages may also be
modified at runtime by other drivers, such as Xen balloon,
virtio_balloon etc. For those cases, memory hotplug lock is a little
too heavy, so introduce a dedicated lock to protect totalram_pages
and zone->managed_pages.

Now we have simplified locking rules for totalram_pages and
zone->managed_pages:
1) no locking for read accesses because they are unsigned long.
2) no locking for write accesses at boot time in single-threaded context.
3) serialize write accesses at runtime by acquiring the dedicated
managed_page_count_lock.

Also adjust zone->managed_pages when freeing reserved pages into the
buddy system, to keep totalram_pages and zone->managed_pages
consistent.
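
The three locking rules can be modelled in user space roughly as
follows. This is only a sketch: the toy_* names are hypothetical, a
pthread mutex stands in for the spinlock, and a plain bool stands in
for the system_state check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel symbols. */
struct toy_zone {
	unsigned long managed_pages;
};

unsigned long toy_totalram_pages;
bool toy_booting = true;	/* models system_state == SYSTEM_BOOTING */
pthread_mutex_t toy_count_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Rule 1: readers just load the word-sized counter without a lock.
 * (Strictly portable C would want an atomic load here; the sketch
 * mirrors the reasoning of the patch instead.)
 */
unsigned long toy_read_totalram(void)
{
	return toy_totalram_pages;
}

/* Rules 2 and 3: writers skip the lock during boot, take it afterwards. */
void toy_adjust_managed_page_count(struct toy_zone *zone, long count)
{
	bool lock = !toy_booting;

	if (lock)
		pthread_mutex_lock(&toy_count_lock);

	/* A negative count must not take the zone below zero. */
	assert(count >= 0 || zone->managed_pages >= (unsigned long)-count);
	zone->managed_pages += count;
	toy_totalram_pages += count;

	if (lock)
		pthread_mutex_unlock(&toy_count_lock);
}
```

Both counters are updated inside the same critical section, so once
toy_booting is cleared, concurrent writers cannot leave them
disagreeing with each other.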

Signed-off-by: Jiang Liu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: [email protected] (open list:MEMORY MANAGEMENT)
Cc: [email protected] (open list)
---
include/linux/mm.h | 6 ++----
include/linux/mmzone.h | 14 ++++++++++----
mm/page_alloc.c | 19 +++++++++++++++++++
3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1f03b0e..da3ffb0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1301,6 +1301,7 @@ extern void free_initmem(void);
*/
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
int poison, char *s);
+
#ifdef CONFIG_HIGHMEM
/*
* Free a highmem page into the buddy system, adjusting totalhigh_pages
@@ -1309,10 +1310,7 @@ extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
extern void free_highmem_page(struct page *page);
#endif

-static inline void adjust_managed_page_count(struct page *page, long count)
-{
- totalram_pages += count;
-}
+extern void adjust_managed_page_count(struct page *page, long count);

/* Free the reserved page into the buddy system, so it gets managed. */
static inline void __free_reserved_page(struct page *page)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 72e1cb5..dc9c6ca 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -474,10 +474,16 @@ struct zone {
* frequently read in proximity to zone->lock. It's good to
* give them a chance of being in the same cacheline.
*
- * Write access to present_pages and managed_pages at runtime should
- * be protected by lock_memory_hotplug()/unlock_memory_hotplug().
- * Any reader who can't tolerant drift of present_pages and
- * managed_pages should hold memory hotplug lock to get a stable value.
+ * Write access to present_pages at runtime should be protected by
+ * lock_memory_hotplug()/unlock_memory_hotplug(). Any reader who can't
+ * tolerant drift of present_pages should hold memory hotplug lock to
+ * get a stable value.
+ *
+ * Read access to managed_pages should be safe because it's unsigned
+ * long. Write access to zone->managed_pages and totalram_pages are
+ * protected by managed_page_count_lock at runtime. Idealy only
+ * adjust_managed_page_count() should be used instead of directly
+ * touching zone->managed_pages and totalram_pages.
*/
unsigned long spanned_pages;
unsigned long present_pages;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 45be58c..ca1a6ce 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -100,6 +100,9 @@ nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
};
EXPORT_SYMBOL(node_states);

+/* Protect totalram_pages and zone->managed_pages */
+static DEFINE_SPINLOCK(managed_page_count_lock);
+
unsigned long totalram_pages __read_mostly;
unsigned long totalreserve_pages __read_mostly;
/*
@@ -5186,6 +5189,22 @@ early_param("movablecore", cmdline_parse_movablecore);

#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */

+void adjust_managed_page_count(struct page *page, long count)
+{
+ bool lock = (system_state != SYSTEM_BOOTING);
+
+ /* No need to acquire the lock during boot */
+ if (lock)
+ spin_lock(&managed_page_count_lock);
+
+ page_zone(page)->managed_pages += count;
+ totalram_pages += count;
+
+ if (lock)
+ spin_unlock(&managed_page_count_lock);
+}
+EXPORT_SYMBOL(adjust_managed_page_count);
+
unsigned long free_reserved_area(unsigned long start, unsigned long end,
int poison, char *s)
{
--
1.7.9.5

2013-04-06 13:57:40

by Jiang Liu

Subject: [PATCH v4, part3 12/15] mm: make __free_pages_bootmem() only available at boot time

In order to simplify management of totalram_pages and
zone->managed_pages, make __free_pages_bootmem() only available
at boot time. With this change applied, __free_pages_bootmem()
will only be used by bootmem.c and nobootmem.c at boot time,
so mark it as __init. Other callers of __free_pages_bootmem()
have been converted to use free_reserved_page(), which handles
totalram_pages and zone->managed_pages in a safer way.

This patch also fixes a bug in free_pagetable() for x86_64, which
should increase zone->managed_pages instead of zone->present_pages
when freeing reserved pages.

And now we have managed_page_count_lock to protect totalram_pages
and zone->managed_pages, so remove the redundant ppb_lock in
put_page_bootmem(). This greatly simplifies the locking rules.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Cc: Wen Congyang <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/x86/mm/init_64.c | 18 ++----------------
mm/memory_hotplug.c | 16 ++--------------
mm/page_alloc.c | 9 +--------
3 files changed, 5 insertions(+), 38 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0c6efb8..6ab46f2 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -711,36 +711,22 @@ EXPORT_SYMBOL_GPL(arch_add_memory);

static void __meminit free_pagetable(struct page *page, int order)
{
- struct zone *zone;
- bool bootmem = false;
unsigned long magic;
unsigned int nr_pages = 1 << order;

/* bootmem page has reserved flag */
if (PageReserved(page)) {
__ClearPageReserved(page);
- bootmem = true;

magic = (unsigned long)page->lru.next;
if (magic == SECTION_INFO || magic == MIX_SECTION_INFO) {
while (nr_pages--)
put_page_bootmem(page++);
} else
- __free_pages_bootmem(page, order);
+ while (nr_pages--)
+ free_reserved_page(page++);
} else
free_pages((unsigned long)page_address(page), order);
-
- /*
- * SECTION_INFO pages and MIX_SECTION_INFO pages
- * are all allocated by bootmem.
- */
- if (bootmem) {
- zone = page_zone(page);
- zone_span_writelock(zone);
- zone->present_pages += nr_pages;
- zone_span_writeunlock(zone);
- totalram_pages += nr_pages;
- }
}

static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a5b8fde..6a600f1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -101,12 +101,9 @@ void get_page_bootmem(unsigned long info, struct page *page,
atomic_inc(&page->_count);
}

-/* reference to __meminit __free_pages_bootmem is valid
- * so use __ref to tell modpost not to generate a warning */
-void __ref put_page_bootmem(struct page *page)
+void put_page_bootmem(struct page *page)
{
unsigned long type;
- static DEFINE_MUTEX(ppb_lock);

type = (unsigned long) page->lru.next;
BUG_ON(type < MEMORY_HOTPLUG_MIN_BOOTMEM_TYPE ||
@@ -116,17 +113,8 @@ void __ref put_page_bootmem(struct page *page)
ClearPagePrivate(page);
set_page_private(page, 0);
INIT_LIST_HEAD(&page->lru);
-
- /*
- * Please refer to comment for __free_pages_bootmem()
- * for why we serialize here.
- */
- mutex_lock(&ppb_lock);
- __free_pages_bootmem(page, 0);
- mutex_unlock(&ppb_lock);
- totalram_pages++;
+ free_reserved_page(page);
}
-
}

#ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca1a6ce..b87596f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -745,14 +745,7 @@ static void __free_pages_ok(struct page *page, unsigned int order)
local_irq_restore(flags);
}

-/*
- * Read access to zone->managed_pages is safe because it's unsigned long,
- * but we still need to serialize writers. Currently all callers of
- * __free_pages_bootmem() except put_page_bootmem() should only be used
- * at boot time. So for shorter boot time, we shift the burden to
- * put_page_bootmem() to serialize writers.
- */
-void __meminit __free_pages_bootmem(struct page *page, unsigned int order)
+void __init __free_pages_bootmem(struct page *page, unsigned int order)
{
unsigned int nr_pages = 1 << order;
unsigned int loop;
--
1.7.9.5

2013-04-06 13:57:53

by Jiang Liu

Subject: [PATCH v4, part3 13/15] mm: correctly update zone->managed_pages

Enhance adjust_managed_page_count() to adjust totalhigh_pages for
highmem pages, and change code which directly adjusts totalram_pages
to use adjust_managed_page_count(), because it adjusts totalram_pages,
totalhigh_pages and zone->managed_pages together in a safe way.

Remove inc_totalhigh_pages() and dec_totalhigh_pages() from the
xen/balloon driver because adjust_managed_page_count() already adjusts
totalhigh_pages.

This patch also fixes two bugs:
1) enhance the virtio_balloon driver to adjust totalhigh_pages when
reserving/unreserving pages.
2) enhance memory_hotplug.c to adjust totalhigh_pages when hot-removing
memory.

We still need to deal with modifications of totalram_pages in file
arch/powerpc/platforms/pseries/cmm.c, but that needs help from PPC experts.
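
To illustrate why routing every caller through one helper keeps the
three counters in step, here is a small user-space model. It is a
sketch only: struct toy_zone, struct toy_totals and the toy_* functions
are hypothetical names, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct zone and the global totals. */
struct toy_zone {
	bool highmem;
	unsigned long managed_pages;
};

struct toy_totals {
	unsigned long totalram_pages;
	unsigned long totalhigh_pages;
};

/* One helper moves all three counters, so callers cannot forget one. */
void toy_adjust_managed_page_count(struct toy_totals *t,
				   struct toy_zone *zone, long count)
{
	zone->managed_pages += count;
	t->totalram_pages += count;
	if (zone->highmem)
		t->totalhigh_pages += count;
}

/* Balloon inflate: the guest gives one page away. */
void toy_balloon_inflate(struct toy_totals *t, struct toy_zone *zone)
{
	assert(zone->managed_pages > 0);
	toy_adjust_managed_page_count(t, zone, -1);
}

/* Balloon deflate: the guest gets one page back. */
void toy_balloon_deflate(struct toy_totals *t, struct toy_zone *zone)
{
	toy_adjust_managed_page_count(t, zone, 1);
}

/* Check that the global totals match the sum of the per-zone counters. */
bool toy_counters_consistent(const struct toy_totals *t,
			     const struct toy_zone *zones, size_t nr_zones)
{
	unsigned long ram = 0, high = 0;

	for (size_t i = 0; i < nr_zones; i++) {
		ram += zones[i].managed_pages;
		if (zones[i].highmem)
			high += zones[i].managed_pages;
	}
	return ram == t->totalram_pages && high == t->totalhigh_pages;
}
```

A caller that only did "totalram_pages--" on a highmem page, as
virtio_balloon did before this patch, would make the consistency check
in this model fail.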

Signed-off-by: Jiang Liu <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/virtio/virtio_balloon.c | 8 +++++---
drivers/xen/balloon.c | 23 +++++------------------
mm/hugetlb.c | 2 +-
mm/memory_hotplug.c | 15 +++------------
mm/page_alloc.c | 10 +++++-----
5 files changed, 19 insertions(+), 39 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index bd3ae32..6649968 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -148,7 +148,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
}
set_page_pfns(vb->pfns + vb->num_pfns, page);
vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
- totalram_pages--;
+ adjust_managed_page_count(page, -1);
}

/* Did we get any? */
@@ -160,11 +160,13 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
{
unsigned int i;
+ struct page *page;

/* Find pfns pointing at start of each page, get pages and free them. */
for (i = 0; i < num; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
- balloon_page_free(balloon_pfn_to_page(pfns[i]));
- totalram_pages++;
+ page = balloon_pfn_to_page(pfns[i]);
+ balloon_page_free(page);
+ adjust_managed_page_count(page, 1);
}
}

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index d42da3b..a453c05 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -89,14 +89,6 @@ EXPORT_SYMBOL_GPL(balloon_stats);
/* We increase/decrease in batches which fit in a page */
static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];

-#ifdef CONFIG_HIGHMEM
-#define inc_totalhigh_pages() (totalhigh_pages++)
-#define dec_totalhigh_pages() (totalhigh_pages--)
-#else
-#define inc_totalhigh_pages() do {} while (0)
-#define dec_totalhigh_pages() do {} while (0)
-#endif
-
/* List of ballooned pages, threaded through the mem_map array. */
static LIST_HEAD(ballooned_pages);

@@ -132,9 +124,7 @@ static void __balloon_append(struct page *page)
static void balloon_append(struct page *page)
{
__balloon_append(page);
- if (PageHighMem(page))
- dec_totalhigh_pages();
- totalram_pages--;
+ adjust_managed_page_count(page, -1);
}

/* balloon_retrieve: rescue a page from the balloon, if it is not empty. */
@@ -151,13 +141,12 @@ static struct page *balloon_retrieve(bool prefer_highmem)
page = list_entry(ballooned_pages.next, struct page, lru);
list_del(&page->lru);

- if (PageHighMem(page)) {
+ if (PageHighMem(page))
balloon_stats.balloon_high--;
- inc_totalhigh_pages();
- } else
+ else
balloon_stats.balloon_low--;

- totalram_pages++;
+ adjust_managed_page_count(page, 1);

return page;
}
@@ -374,9 +363,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
#endif

/* Relinquish the page back to the allocator. */
- ClearPageReserved(page);
- init_page_count(page);
- __free_page(page);
+ __free_reserved_page(page);
}

balloon_stats.current_pages += rc;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bacdf38..28757b7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1246,7 +1246,7 @@ static void __init gather_bootmem_prealloc(void)
* side-effects, like CommitLimit going negative.
*/
if (h->order > (MAX_ORDER - 1))
- totalram_pages += 1 << h->order;
+ adjust_managed_page_count(page, 1 << h->order);
}
}

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6a600f1..f3b12d70 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -760,20 +760,13 @@ EXPORT_SYMBOL_GPL(__online_page_set_limits);

void __online_page_increment_counters(struct page *page)
{
- totalram_pages++;
-
-#ifdef CONFIG_HIGHMEM
- if (PageHighMem(page))
- totalhigh_pages++;
-#endif
+ adjust_managed_page_count(page, 1);
}
EXPORT_SYMBOL_GPL(__online_page_increment_counters);

void __online_page_free(struct page *page)
{
- ClearPageReserved(page);
- init_page_count(page);
- __free_page(page);
+ __free_reserved_page(page);
}
EXPORT_SYMBOL_GPL(__online_page_free);

@@ -970,7 +963,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
return ret;
}

- zone->managed_pages += onlined_pages;
zone->present_pages += onlined_pages;
zone->zone_pgdat->node_present_pages += onlined_pages;
if (onlined_pages) {
@@ -1554,10 +1546,9 @@ repeat:
/* reset pagetype flags and makes migrate type to be MOVABLE */
undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
/* removal success */
- zone->managed_pages -= offlined_pages;
+ adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
zone->present_pages -= offlined_pages;
zone->zone_pgdat->node_present_pages -= offlined_pages;
- totalram_pages -= offlined_pages;

init_per_zone_wmark_min();

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b87596f..4257a63 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -780,11 +780,7 @@ void __init init_cma_reserved_pageblock(struct page *page)
set_page_refcounted(page);
set_pageblock_migratetype(page, MIGRATE_CMA);
__free_pages(page, pageblock_order);
- totalram_pages += pageblock_nr_pages;
-#ifdef CONFIG_HIGHMEM
- if (PageHighMem(page))
- totalhigh_pages += pageblock_nr_pages;
-#endif
+ adjust_managed_page_count(page, pageblock_nr_pages);
}
#endif

@@ -5192,6 +5188,10 @@ void adjust_managed_page_count(struct page *page, long count)

page_zone(page)->managed_pages += count;
totalram_pages += count;
+#ifdef CONFIG_HIGHMEM
+ if (PageHighMem(page))
+ totalhigh_pages += count;
+#endif

if (lock)
spin_unlock(&managed_page_count_lock);
--
1.7.9.5

2013-04-06 13:58:02

by Jiang Liu

Subject: [PATCH v4, part3 14/15] mm: concentrate modification of totalram_pages into the mm core

Concentrate the code that modifies totalram_pages into the mm core, so
the arch memory initialization code doesn't need to take care of it.
With these changes applied, only the following functions from the mm
core modify the global variable totalram_pages:
free_bootmem_late(), free_all_bootmem(), free_all_bootmem_node(),
adjust_managed_page_count().

With this patch applied, it will be much easier for us to keep
totalram_pages and zone->managed_pages consistent.
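
The calling convention after this change can be modelled as below.
This is a user-space sketch with hypothetical toy_* names, and it
assumes the core helper accumulates the total itself while still
returning the page count:

```c
#include <assert.h>
#include <stddef.h>

unsigned long toy_totalram_pages;	/* stand-in for totalram_pages */

/*
 * Models free_all_bootmem(): release every node's free boot pages and
 * account for them here, in the core, rather than in each caller.
 */
unsigned long toy_free_all_bootmem(const unsigned long *node_free_pages,
				   size_t nr_nodes)
{
	unsigned long pages = 0;

	assert(node_free_pages != NULL || nr_nodes == 0);
	for (size_t i = 0; i < nr_nodes; i++)
		pages += node_free_pages[i];

	toy_totalram_pages += pages;	/* the only place the total changes */
	return pages;
}

/* Models an architecture's mem_init(): the return value is just ignored. */
void toy_arch_mem_init(const unsigned long *node_free_pages, size_t nr_nodes)
{
	toy_free_all_bootmem(node_free_pages, nr_nodes);
}
```

With the accounting in one place, it no longer matters whether an
architecture used to write "totalram_pages =" or "totalram_pages +=";
both variants in the arch code simply become a plain call.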

Signed-off-by: Jiang Liu <[email protected]>
---
arch/alpha/mm/init.c | 2 +-
arch/alpha/mm/numa.c | 2 +-
arch/arc/mm/init.c | 2 +-
arch/arm/mm/init.c | 3 +--
arch/arm64/mm/init.c | 2 +-
arch/avr32/mm/init.c | 2 --
arch/blackfin/mm/init.c | 2 +-
arch/c6x/mm/init.c | 2 +-
arch/cris/mm/init.c | 2 +-
arch/frv/mm/init.c | 2 +-
arch/h8300/mm/init.c | 2 +-
arch/hexagon/mm/init.c | 3 +--
arch/ia64/mm/init.c | 2 +-
arch/m32r/mm/init.c | 2 +-
arch/m68k/mm/init.c | 4 ++--
arch/metag/mm/init.c | 5 +----
arch/microblaze/mm/init.c | 2 +-
arch/mips/mm/init.c | 2 +-
arch/mips/sgi-ip27/ip27-memory.c | 2 +-
arch/mn10300/mm/init.c | 2 +-
arch/openrisc/mm/init.c | 2 +-
arch/parisc/mm/init.c | 4 ++--
arch/powerpc/mm/mem.c | 5 ++---
arch/s390/mm/init.c | 2 +-
arch/score/mm/init.c | 2 +-
arch/sh/mm/init.c | 2 +-
arch/sparc/mm/init_32.c | 3 +--
arch/sparc/mm/init_64.c | 2 +-
arch/tile/mm/init.c | 2 +-
arch/um/kernel/mem.c | 2 +-
arch/unicore32/mm/init.c | 2 +-
arch/x86/mm/init_32.c | 2 +-
arch/x86/mm/init_64.c | 2 +-
arch/xtensa/mm/init.c | 2 +-
mm/bootmem.c | 9 ++++++++-
mm/nobootmem.c | 7 ++++++-
36 files changed, 50 insertions(+), 47 deletions(-)

diff --git a/arch/alpha/mm/init.c b/arch/alpha/mm/init.c
index 9930837..ca07a97 100644
--- a/arch/alpha/mm/init.c
+++ b/arch/alpha/mm/init.c
@@ -309,7 +309,7 @@ void __init
mem_init(void)
{
max_mapnr = num_physpages = max_low_pfn;
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);

printk_memory_info();
diff --git a/arch/alpha/mm/numa.c b/arch/alpha/mm/numa.c
index 3388504..857452c 100644
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -334,7 +334,7 @@ void __init mem_init(void)
/*
* This will free up the bootmem, ie, slot 0 memory
*/
- totalram_pages += free_all_bootmem_node(NODE_DATA(nid));
+ free_all_bootmem_node(NODE_DATA(nid));

pfn = NODE_DATA(nid)->node_start_pfn;
for (i = 0; i < node_spanned_pages(nid); i++, pfn++)
diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 4a17736..78d8c31 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -111,7 +111,7 @@ void __init mem_init(void)

high_memory = (void *)(CONFIG_LINUX_LINK_BASE + arc_mem_sz);

- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

/* count all reserved pages [kernel code/data/mem_map..] */
reserved_pages = 0;
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a2ab290..add4fcb 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -595,8 +595,7 @@ void __init mem_init(void)

/* this will put all unused low memory onto the freelists */
free_unused_memmap(&meminfo);
-
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

#ifdef CONFIG_SA1111
/* now that our DMA memory is actually so designated, we can free it */
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b87bdb8..0f2cf5d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -284,7 +284,7 @@ void __init mem_init(void)
free_unused_memmap();
#endif

- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

reserved_pages = free_pages = 0;

diff --git a/arch/avr32/mm/init.c b/arch/avr32/mm/init.c
index 871f98a..7e8d55a 100644
--- a/arch/avr32/mm/init.c
+++ b/arch/avr32/mm/init.c
@@ -117,8 +117,6 @@ void __init mem_init(void)
if (pgdat->node_spanned_pages != 0)
node_pages = free_all_bootmem_node(pgdat);

- totalram_pages += node_pages;
-
for (i = 0; i < node_pages; i++)
if (PageReserved(pgdat->node_mem_map + i))
reservedpages++;
diff --git a/arch/blackfin/mm/init.c b/arch/blackfin/mm/init.c
index e64286b..1cc8607 100644
--- a/arch/blackfin/mm/init.c
+++ b/arch/blackfin/mm/init.c
@@ -104,7 +104,7 @@ void __init mem_init(void)
printk(KERN_DEBUG "Kernel managed physical pages: %lu\n", num_physpages);

/* This will put all low memory onto the freelists. */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

reservedpages = 0;
for (tmp = ARCH_PFN_OFFSET; tmp < max_mapnr; tmp++)
diff --git a/arch/c6x/mm/init.c b/arch/c6x/mm/init.c
index ce39b48..2c51474 100644
--- a/arch/c6x/mm/init.c
+++ b/arch/c6x/mm/init.c
@@ -64,7 +64,7 @@ void __init mem_init(void)
high_memory = (void *)(memory_end & PAGE_MASK);

/* this will put all memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

codek = (_etext - _stext) >> 10;
datak = (_end - _sdata) >> 10;
diff --git a/arch/cris/mm/init.c b/arch/cris/mm/init.c
index 8fec263..52b8b56 100644
--- a/arch/cris/mm/init.c
+++ b/arch/cris/mm/init.c
@@ -33,7 +33,7 @@ mem_init(void)
max_mapnr = num_physpages = max_low_pfn - min_low_pfn;

/* this will put all memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

reservedpages = 0;
for (tmp = 0; tmp < max_mapnr; tmp++) {
diff --git a/arch/frv/mm/init.c b/arch/frv/mm/init.c
index a421948..4215822 100644
--- a/arch/frv/mm/init.c
+++ b/arch/frv/mm/init.c
@@ -123,7 +123,7 @@ void __init mem_init(void)
int codek = 0, datak = 0;

/* this will put all low memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

#ifdef CONFIG_MMU
for (loop = 0 ; loop < npages ; loop++)
diff --git a/arch/h8300/mm/init.c b/arch/h8300/mm/init.c
index 488e2a3..22fd869 100644
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -140,7 +140,7 @@ void __init mem_init(void)
max_mapnr = num_physpages = MAP_NR(high_memory);

/* this will put all low memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

codek = (_etext - _stext) >> 10;
datak = (__bss_stop - _sdata) >> 10;
diff --git a/arch/hexagon/mm/init.c b/arch/hexagon/mm/init.c
index 69ffcfd..c048d06e 100644
--- a/arch/hexagon/mm/init.c
+++ b/arch/hexagon/mm/init.c
@@ -69,8 +69,7 @@ unsigned long long kmap_generation;
*/
void __init mem_init(void)
{
- /* No idea where this is actually declared. Seems to evade LXR. */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
num_physpages = bootmem_lastpg; /* seriously, what? */

printk(KERN_INFO "totalram_pages = %ld\n", totalram_pages);
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 941568a..b5b71e8 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -623,7 +623,7 @@ mem_init (void)

for_each_online_pgdat(pgdat)
if (pgdat->bdata->node_bootmem_map)
- totalram_pages += free_all_bootmem_node(pgdat);
+ free_all_bootmem_node(pgdat);

reserved_pages = 0;
efi_memmap_walk(count_reserved_pages, &reserved_pages);
diff --git a/arch/m32r/mm/init.c b/arch/m32r/mm/init.c
index 58ea4d6..c421c31 100644
--- a/arch/m32r/mm/init.c
+++ b/arch/m32r/mm/init.c
@@ -158,7 +158,7 @@ void __init mem_init(void)

/* this will put all low memory onto the freelists */
for_each_online_node(nid)
- totalram_pages += free_all_bootmem_node(NODE_DATA(nid));
+ free_all_bootmem_node(NODE_DATA(nid));

reservedpages = reservedpages_count() - hole_pages;
codesize = (unsigned long) &_etext - (unsigned long)&_text;
diff --git a/arch/m68k/mm/init.c b/arch/m68k/mm/init.c
index 75e1cbf..2485a8c 100644
--- a/arch/m68k/mm/init.c
+++ b/arch/m68k/mm/init.c
@@ -155,11 +155,11 @@ void __init mem_init(void)
int i;

/* this will put all memory onto the freelists */
- totalram_pages = num_physpages = 0;
+ num_physpages = 0;
for_each_online_pgdat(pgdat) {
num_physpages += pgdat->node_present_pages;

- totalram_pages += free_all_bootmem_node(pgdat);
+ free_all_bootmem_node(pgdat);
for (i = 0; i < pgdat->node_spanned_pages; i++) {
struct page *page = pgdat->node_mem_map + i;
char *addr = page_to_virt(page);
diff --git a/arch/metag/mm/init.c b/arch/metag/mm/init.c
index 58a36f3..279d701 100644
--- a/arch/metag/mm/init.c
+++ b/arch/metag/mm/init.c
@@ -393,14 +393,11 @@ void __init mem_init(void)

for_each_online_node(nid) {
pg_data_t *pgdat = NODE_DATA(nid);
- unsigned long node_pages = 0;

num_physpages += pgdat->node_present_pages;

if (pgdat->node_spanned_pages)
- node_pages = free_all_bootmem_node(pgdat);
-
- totalram_pages += node_pages;
+ free_all_bootmem_node(pgdat);
}

pr_info("Memory: %luk/%luk available\n",
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 53383e4..fb7f248 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -252,7 +252,7 @@ void __init mem_init(void)
high_memory = (void *)__va(memory_start + lowmem_size - 1);

/* this will put all memory onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

for_each_online_pgdat(pgdat) {
unsigned long i;
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 3d0346d..de4ff2f 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -373,7 +373,7 @@ void __init mem_init(void)
#endif
high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);

- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
setup_zero_pages(); /* Setup zeroed pages. */

reservedpages = ram = 0;
diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c
index 5f2bddb..936e617 100644
--- a/arch/mips/sgi-ip27/ip27-memory.c
+++ b/arch/mips/sgi-ip27/ip27-memory.c
@@ -489,7 +489,7 @@ void __init mem_init(void)
/*
* This will free up the bootmem, ie, slot 0 memory.
*/
- totalram_pages += free_all_bootmem_node(NODE_DATA(node));
+ free_all_bootmem_node(NODE_DATA(node));
}

setup_zero_pages(); /* This comes from node 0 */
diff --git a/arch/mn10300/mm/init.c b/arch/mn10300/mm/init.c
index 5a8ace6..d7312aa 100644
--- a/arch/mn10300/mm/init.c
+++ b/arch/mn10300/mm/init.c
@@ -114,7 +114,7 @@ void __init mem_init(void)
memset(empty_zero_page, 0, PAGE_SIZE);

/* this will put all low memory onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

reservedpages = 0;
for (tmp = 0; tmp < num_physpages; tmp++)
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index d19950c..da26482 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -207,7 +207,7 @@ static int __init free_pages_init(void)
int reservedpages, pfn;

/* this will put all low memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

reservedpages = 0;
for (pfn = 0; pfn < max_low_pfn; pfn++) {
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index 27f3f88..1fe9d841 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -593,13 +593,13 @@ void __init mem_init(void)

#ifndef CONFIG_DISCONTIGMEM
max_mapnr = page_to_pfn(virt_to_page(high_memory - 1)) + 1;
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
#else
{
int i;

for (i = 0; i < npmem_ranges; i++)
- totalram_pages += free_all_bootmem_node(NODE_DATA(i));
+ free_all_bootmem_node(NODE_DATA(i));
}
#endif

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 2e912ca..8ddef0a 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -319,13 +319,12 @@ void __init mem_init(void)
for_each_online_node(nid) {
if (NODE_DATA(nid)->node_spanned_pages != 0) {
printk("freeing bootmem node %d\n", nid);
- totalram_pages +=
- free_all_bootmem_node(NODE_DATA(nid));
+ free_all_bootmem_node(NODE_DATA(nid));
}
}
#else
max_mapnr = max_pfn;
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
#endif
for_each_online_pgdat(pgdat) {
for (i = 0; i < pgdat->node_spanned_pages; i++) {
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 275345e..24d52aa 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -142,7 +142,7 @@ void __init mem_init(void)
cmma_init();

/* this will put all low memory onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
setup_zero_pages(); /* Setup zeroed pages. */

reservedpages = 0;
diff --git a/arch/score/mm/init.c b/arch/score/mm/init.c
index 1592aad..579fc4e 100644
--- a/arch/score/mm/init.c
+++ b/arch/score/mm/init.c
@@ -81,7 +81,7 @@ void __init mem_init(void)
unsigned long tmp, ram = 0;

high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();
setup_zero_page(); /* Setup zeroed pages. */
reservedpages = 0;

diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 31294f1..aecd913 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -422,7 +422,7 @@ void __init mem_init(void)
num_physpages += pgdat->node_present_pages;

if (pgdat->node_spanned_pages)
- totalram_pages += free_all_bootmem_node(pgdat);
+ free_all_bootmem_node(pgdat);


node_high_memory = (void *)__va((pgdat->node_start_pfn +
diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index af472cf..e96afed 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -323,8 +323,7 @@ void __init mem_init(void)

max_mapnr = last_valid_pfn - pfn_base;
high_memory = __va(max_low_pfn << PAGE_SHIFT);
-
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

for (i = 0; sp_banks[i].num_bytes != 0; i++) {
unsigned long start_pfn = sp_banks[i].base_addr >> PAGE_SHIFT;
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 63ddf47..541a3bc 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2055,7 +2055,7 @@ void __init mem_init(void)
high_memory = __va(last_valid_pfn << PAGE_SHIFT);

register_page_bootmem_info();
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

/* We subtract one to account for the mem_map_zero page
* allocated below.
diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index ccfeb3f..45ce26d 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -846,7 +846,7 @@ void __init mem_init(void)
set_max_mapnr_init();

/* this will put all bootmem onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

#ifndef CONFIG_64BIT
/* count all remaining LOWMEM and give all HIGHMEM to page allocator */
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 1e84189..a7dc6c1 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -65,7 +65,7 @@ void __init mem_init(void)
uml_reserved = brk_end;

/* this will put all low memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();
max_low_pfn = totalram_pages;
#ifdef CONFIG_HIGHMEM
setup_highmem(end_iomem, highmem);
diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
index 5614b05..119b9e8 100644
--- a/arch/unicore32/mm/init.c
+++ b/arch/unicore32/mm/init.c
@@ -392,7 +392,7 @@ void __init mem_init(void)
free_unused_memmap(&meminfo);

/* this will put all unused low memory onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

reserved_pages = free_pages = 0;

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3ac7e31..9fa46ba 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -759,7 +759,7 @@ void __init mem_init(void)
set_highmem_pages_init();

/* this will put all low memory onto the freelists */
- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

reservedpages = 0;
for (tmp = 0; tmp < max_low_pfn; tmp++)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 6ab46f2..65116e5 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1053,7 +1053,7 @@ void __init mem_init(void)
register_page_bootmem_info();

/* this will put all memory onto the freelists */
- totalram_pages = free_all_bootmem();
+ free_all_bootmem();

absent_pages = absent_pages_in_range(0, max_pfn);
reservedpages = max_pfn - totalram_pages - absent_pages;
diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c
index 6f70647..dc6e009 100644
--- a/arch/xtensa/mm/init.c
+++ b/arch/xtensa/mm/init.c
@@ -184,7 +184,7 @@ void __init mem_init(void)
#error HIGHGMEM not implemented in init.c
#endif

- totalram_pages += free_all_bootmem();
+ free_all_bootmem();

reservedpages = ram = 0;
for (tmp = 0; tmp < max_mapnr; tmp++) {
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 39e6156..a19404b 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -272,9 +272,14 @@ void __init reset_all_zones_managed_pages(void)
*/
unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
{
+ unsigned long pages;
+
register_page_bootmem_info_node(pgdat);
reset_node_managed_pages(pgdat);
- return free_all_bootmem_core(pgdat->bdata);
+ pages = free_all_bootmem_core(pgdat->bdata);
+ totalram_pages += pages;
+
+ return pages;
}

/**
@@ -292,6 +297,8 @@ unsigned long __init free_all_bootmem(void)
list_for_each_entry(bdata, &bdata_list, list)
total_pages += free_all_bootmem_core(bdata);

+ totalram_pages += total_pages;
+
return total_pages;
}

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index fa584ff..6b63cd6 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -165,6 +165,8 @@ void __init reset_all_zones_managed_pages(void)
*/
unsigned long __init free_all_bootmem(void)
{
+ unsigned long pages;
+
reset_all_zones_managed_pages();

/*
@@ -172,7 +174,10 @@ unsigned long __init free_all_bootmem(void)
* because in some case like Node0 doesn't have RAM installed
* low ram will be on Node1
*/
- return free_low_memory_core_early(MAX_NUMNODES);
+ pages = free_low_memory_core_early(MAX_NUMNODES);
+ totalram_pages += pages;
+
+ return pages;
}

/**
--
1.7.9.5
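
As an illustration (not part of the posted patch), the change above moves the totalram_pages update out of every architecture's mem_init() and into the bootmem helpers. Below is a minimal user-space sketch of that pattern. release_node_pages(), free_node() and mem_init_model() are hypothetical stand-ins rather than kernel functions, and the counter is a plain variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's global totalram_pages counter. */
static unsigned long totalram_pages;

/*
 * Hypothetical stand-in for free_all_bootmem_core(): pretends to hand
 * npages pages to the page allocator and reports how many were released.
 */
static unsigned long release_node_pages(unsigned long npages)
{
	return npages;
}

/*
 * Models the patched free_all_bootmem_node(): the helper accounts for the
 * released pages itself and still returns the count to the caller.
 */
static unsigned long free_node(unsigned long npages)
{
	unsigned long pages = release_node_pages(npages);

	totalram_pages += pages;
	return pages;
}

/* Models an arch mem_init(): it no longer touches totalram_pages. */
static void mem_init_model(const unsigned long *node_pages, size_t nr_nodes)
{
	size_t i;

	assert(node_pages != NULL || nr_nodes == 0);
	for (i = 0; i < nr_nodes; i++)
		free_node(node_pages[i]);
}
```

With the accounting in one place, a caller that previously wrote "totalram_pages = ..." and one that wrote "totalram_pages += ..." now behave the same way, which is the inconsistency the diff removes.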

2013-04-06 13:58:11

by Jiang Liu

Subject: [PATCH v4, part3 15/15] mm: report available pages as "MemTotal" for each NUMA node

As reported by https://bugzilla.kernel.org/show_bug.cgi?id=53501,
"MemTotal" from /proc/meminfo means memory pages managed by the buddy
system (managed_pages), but "MemTotal" from /sys/.../node/nodex/meminfo
means physical pages present (present_pages) within the NUMA node.
There's a difference between managed_pages and present_pages due to
the bootmem allocator and reserved pages.

And Documentation/filesystems/proc.txt says
MemTotal: Total usable ram (i.e. physical ram minus a few reserved
bits and the kernel binary code)

So change /sys/.../node/nodex/meminfo to report available pages within
the node as "MemTotal".

Signed-off-by: Jiang Liu <[email protected]>
Reported-by: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
mm/page_alloc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4257a63..74b0faf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2922,9 +2922,13 @@ EXPORT_SYMBOL(si_meminfo);
#ifdef CONFIG_NUMA
void si_meminfo_node(struct sysinfo *val, int nid)
{
+ int zone_type; /* needs to be signed */
+ unsigned long managed_pages = 0;
pg_data_t *pgdat = NODE_DATA(nid);

- val->totalram = pgdat->node_present_pages;
+ for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
+ managed_pages += pgdat->node_zones[zone_type].managed_pages;
+ val->totalram = managed_pages;
val->freeram = node_page_state(nid, NR_FREE_PAGES);
#ifdef CONFIG_HIGHMEM
val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
--
1.7.9.5
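
As an illustration (not part of the posted patch), the difference between the two "MemTotal" values can be modelled in user space. The struct layouts, MODEL_NR_ZONES and both helper names below are hypothetical simplifications. Only the idea of summing managed_pages over a node's zones comes from the diff above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_ZONES 3	/* hypothetical; the kernel uses MAX_NR_ZONES */

/* Simplified model of the two per-zone counters discussed in the patch. */
struct zone_model {
	unsigned long present_pages;	/* physical pages present in the zone */
	unsigned long managed_pages;	/* pages handed to the buddy allocator */
};

struct node_model {
	struct zone_model zones[MODEL_NR_ZONES];
};

/* Old behaviour: report every page that is physically present. */
static unsigned long node_present_pages_model(const struct node_model *node)
{
	unsigned long total = 0;
	int i;

	assert(node != NULL);
	for (i = 0; i < MODEL_NR_ZONES; i++)
		total += node->zones[i].present_pages;
	return total;
}

/* New behaviour: report only pages managed by the buddy allocator. */
static unsigned long node_managed_pages_model(const struct node_model *node)
{
	unsigned long total = 0;
	int i;

	assert(node != NULL);
	for (i = 0; i < MODEL_NR_ZONES; i++)
		total += node->zones[i].managed_pages;
	return total;
}
```

The gap between the two results corresponds to the reserved and bootmem-allocated pages mentioned in the commit message.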

2013-04-06 14:37:32

by Sergei Shtylyov

Subject: Re: [PATCH v4, part3 08/15] mm: fix some trivial typos in comments

Hello.

On 06-04-2013 17:55, Jiang Liu wrote:

> Fix some trivial typos in comments.

> Signed-off-by: Jiang Liu <[email protected]>
> Cc: Wen Congyang <[email protected]>
> Cc: Tang Chen <[email protected]>
> Cc: Jiang Liu <[email protected]>
> Cc: Yasuaki Ishimatsu <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Marek Szyprowski <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
[...]

> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 57decb2..a5b8fde 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,7 +309,7 @@ static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
> /* can't move pfns which are higher than @z2 */
> if (end_pfn > zone_end_pfn(z2))
> goto out_fail;
> - /* the move out part mast at the left most of @z2 */
> + /* the move out part must at the left most of @z2 */

Maybe "must be"?

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6bd697c..c3c3eda 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2863,7 +2863,7 @@ EXPORT_SYMBOL(free_pages_exact);
> * nr_free_zone_pages() counts the number of counts pages which are beyond the
> * high watermark within all zones at or below a given zone index. For each
> * zone, the number of pages is calculated as:
> - * present_pages - high_pages
> + * managed_pages - high_pages

I'm not sure it's that trivial.

WBR, Sergei

2013-04-06 14:46:05

by Jiang Liu

Subject: Re: [PATCH v4, part3 08/15] mm: fix some trivial typos in comments

On 04/06/2013 10:36 PM, Sergei Shtylyov wrote:
> Hello.
>
> On 06-04-2013 17:55, Jiang Liu wrote:
>
>> Fix some trivial typos in comments.
>
>> Signed-off-by: Jiang Liu <[email protected]>
>> Cc: Wen Congyang <[email protected]>
>> Cc: Tang Chen <[email protected]>
>> Cc: Jiang Liu <[email protected]>
>> Cc: Yasuaki Ishimatsu <[email protected]>
>> Cc: Mel Gorman <[email protected]>
>> Cc: Minchan Kim <[email protected]>
>> Cc: Marek Szyprowski <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
> [...]
>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 57decb2..a5b8fde 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -309,7 +309,7 @@ static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
>> /* can't move pfns which are higher than @z2 */
>> if (end_pfn > zone_end_pfn(z2))
>> goto out_fail;
>> - /* the move out part mast at the left most of @z2 */
>> + /* the move out part must at the left most of @z2 */
>
> Maybe "must be"?
Good catch!

>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6bd697c..c3c3eda 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2863,7 +2863,7 @@ EXPORT_SYMBOL(free_pages_exact);
>> * nr_free_zone_pages() counts the number of counts pages which are beyond the
>> * high watermark within all zones at or below a given zone index. For each
>> * zone, the number of pages is calculated as:
>> - * present_pages - high_pages
>> + * managed_pages - high_pages
>
> I'm not sure it's that trivial.
We just changed the comment to follow the code, so I marked it as "trivial".

Regards!
Gerry

>
> WBR, Sergei
>
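
To illustrate the formula in the comment under discussion, here is a minimal user-space sketch of the per-zone calculation "managed_pages - high_pages". The struct, the function name and the skipping of zones that sit at or below their high watermark are assumptions made for this model. It is not a copy of nr_free_zone_pages().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the two per-zone values used by the formula. */
struct zone_model {
	unsigned long managed_pages;	/* pages managed by the buddy allocator */
	unsigned long high_wmark;	/* the zone's high watermark, in pages */
};

/*
 * Sum "managed_pages - high_wmark" over all zones. Zones at or below
 * their high watermark contribute nothing (an assumption of this model,
 * made so the unsigned subtraction cannot wrap around).
 */
static unsigned long nr_free_zone_pages_model(const struct zone_model *zones,
					      size_t nr_zones)
{
	unsigned long sum = 0;
	size_t i;

	assert(zones != NULL || nr_zones == 0);
	for (i = 0; i < nr_zones; i++) {
		if (zones[i].managed_pages > zones[i].high_wmark)
			sum += zones[i].managed_pages - zones[i].high_wmark;
	}
	return sum;
}
```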

2013-04-07 01:34:28

by Simon Jeons

Subject: Re: [PATCH v4, part3 00/15] accurately calculate memory statisitic information

Hi Jiang,
On 04/06/2013 09:54 PM, Jiang Liu wrote:
> The original goal of this patchset is to fix the bug reported by
> https://bugzilla.kernel.org/show_bug.cgi?id=53501
> Now it has also been expanded to reduce common code used by memory
> initialization.
>
> This is the third part; the previous two patch sets can be found at:
> http://marc.info/?l=linux-mm&m=136289696323825&w=2
> http://marc.info/?l=linux-mm&m=136290291524901&w=2
>
> This patchset applies to
> git://git.cmpxchg.org/linux-mmotm.git fc374c1f9d7bdcfb851b15b86e58ac5e1f645e32
> which is based on mmotm-2013-03-26-15-09.
>
> V2->V4:
> 1) rebase to git://git.cmpxchg.org/linux-mmotm.git
> 2) fix some build warnings and other minor bugs of previous patches
>
> We have only tested this patchset on x86 platforms, and have done basic
> compilation tests using cross-compilers from ftp.kernel.org. That means
> some code may not pass compilation on some architectures. So any help
> testing this patchset is welcome!
>
> Patch 1-7:
> Bugfixes and more work for part1 and part2
> Patch 8-9:
> Fix typo and minor bugs in mm core
> Patch 10-14:
> Enhance the way to manage totalram_pages, totalhigh_pages and
> zone->managed_pages.
> Patch 15:
> Report available pages within the node as "MemTotal" for sysfs
> interface /sys/.../node/nodex/meminfo
>
> Jiang Liu (15):
> mm: fix build warnings caused by free_reserved_area()
> mm: enhance free_reserved_area() to support poisoning memory with
> zero
> mm/ARM64: kill poison_init_mem()
> mm/x86: use free_reserved_area() to simplify code
> mm/tile: use common help functions to free reserved pages
> mm, powertv: use free_reserved_area() to simplify code
> mm, acornfb: use free_reserved_area() to simplify code
> mm: fix some trivial typos in comments
> mm: use managed_pages to calculate default zonelist order
> mm: accurately calculate zone->managed_pages for highmem zones
> mm: use a dedicated lock to protect totalram_pages and
> zone->managed_pages
> mm: make __free_pages_bootmem() only available at boot time
> mm: correctly update zone->mamaged_pages
> mm: concentrate modification of totalram_pages into the mm core
> mm: report available pages as "MemTotal" for each NUMA node

What I'm interested in is how you test the different platforms. I don't
think you can have all of the physical platforms.

>
> arch/alpha/kernel/sys_nautilus.c | 2 +-
> arch/alpha/mm/init.c | 6 ++--
> arch/alpha/mm/numa.c | 2 +-
> arch/arc/mm/init.c | 2 +-
> arch/arm/mm/init.c | 13 ++++----
> arch/arm64/mm/init.c | 15 ++-------
> arch/avr32/mm/init.c | 6 ++--
> arch/blackfin/mm/init.c | 6 ++--
> arch/c6x/mm/init.c | 6 ++--
> arch/cris/mm/init.c | 4 +--
> arch/frv/mm/init.c | 6 ++--
> arch/h8300/mm/init.c | 6 ++--
> arch/hexagon/mm/init.c | 3 +-
> arch/ia64/mm/init.c | 4 +--
> arch/m32r/mm/init.c | 6 ++--
> arch/m68k/mm/init.c | 8 ++---
> arch/metag/mm/init.c | 11 ++++---
> arch/microblaze/mm/init.c | 6 ++--
> arch/mips/mm/init.c | 2 +-
> arch/mips/powertv/asic/asic_devices.c | 13 ++------
> arch/mips/sgi-ip27/ip27-memory.c | 2 +-
> arch/mn10300/mm/init.c | 2 +-
> arch/openrisc/mm/init.c | 6 ++--
> arch/parisc/mm/init.c | 8 ++---
> arch/powerpc/kernel/kvm.c | 2 +-
> arch/powerpc/mm/mem.c | 7 ++---
> arch/s390/mm/init.c | 4 +--
> arch/score/mm/init.c | 2 +-
> arch/sh/mm/init.c | 6 ++--
> arch/sparc/mm/init_32.c | 3 +-
> arch/sparc/mm/init_64.c | 2 +-
> arch/tile/mm/init.c | 9 ++----
> arch/um/kernel/mem.c | 4 +--
> arch/unicore32/mm/init.c | 6 ++--
> arch/x86/mm/highmem_32.c | 6 ++++
> arch/x86/mm/init.c | 14 ++-------
> arch/x86/mm/init_32.c | 2 +-
> arch/x86/mm/init_64.c | 25 +++------------
> arch/xtensa/mm/init.c | 6 ++--
> drivers/video/acornfb.c | 28 ++---------------
> drivers/virtio/virtio_balloon.c | 8 +++--
> drivers/xen/balloon.c | 23 +++-----------
> include/linux/bootmem.h | 1 +
> include/linux/mm.h | 17 +++++-----
> include/linux/mmzone.h | 14 ++++++---
> mm/bootmem.c | 41 +++++++++++++++---------
> mm/hugetlb.c | 2 +-
> mm/memory_hotplug.c | 33 ++++----------------
> mm/nobootmem.c | 35 ++++++++++++---------
> mm/page_alloc.c | 55 +++++++++++++++++++++------------
> 50 files changed, 222 insertions(+), 278 deletions(-)
>

2013-04-07 15:09:11

by Jiang Liu

Subject: Re: [PATCH v4, part3 00/15] accurately calculate memory statisitic information

On 04/07/2013 09:34 AM, Simon Jeons wrote:
> Hi Jiang,
> On 04/06/2013 09:54 PM, Jiang Liu wrote:
>> Jiang Liu (15):
>> mm: fix build warnings caused by free_reserved_area()
>> mm: enhance free_reserved_area() to support poisoning memory with
>> zero
>> mm/ARM64: kill poison_init_mem()
>> mm/x86: use free_reserved_area() to simplify code
>> mm/tile: use common help functions to free reserved pages
>> mm, powertv: use free_reserved_area() to simplify code
>> mm, acornfb: use free_reserved_area() to simplify code
>> mm: fix some trivial typos in comments
>> mm: use managed_pages to calculate default zonelist order
>> mm: accurately calculate zone->managed_pages for highmem zones
>> mm: use a dedicated lock to protect totalram_pages and
>> zone->managed_pages
>> mm: make __free_pages_bootmem() only available at boot time
>> mm: correctly update zone->mamaged_pages
>> mm: concentrate modification of totalram_pages into the mm core
>> mm: report available pages as "MemTotal" for each NUMA node
>
> What I'm interested in is how you test the different platforms. I don't think you can have all of the physical platforms.
>
Hi Simon,
That's one issue I'm facing: I only have a limited set of hardware platforms for testing,
so I have to ask the community for help reviewing and testing the patch series.
Regards!
Gerry

2013-04-08 13:40:34

by Rik van Riel

Subject: Re: [PATCH v4, part3 11/15] mm: use a dedicated lock to protect totalram_pages and zone->managed_pages

On 04/06/2013 09:55 AM, Jiang Liu wrote:

> @@ -5186,6 +5189,22 @@ early_param("movablecore", cmdline_parse_movablecore);
>
> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> +void adjust_managed_page_count(struct page *page, long count)
> +{
> + bool lock = (system_state != SYSTEM_BOOTING);
> +
> + /* No need to acquire the lock during boot */
> + if (lock)
> + spin_lock(&managed_page_count_lock);
> +
> + page_zone(page)->managed_pages += count;
> + totalram_pages += count;
> +
> + if (lock)
> + spin_unlock(&managed_page_count_lock);
> +}

While I agree the boot code currently does not need the lock, is
there any harm in removing that conditional?

That would simplify the code, and protect against possible future
cleverness of initializing multiple memory things simultaneously.

--
All rights reversed

2013-04-08 16:06:09

by Jiang Liu

Subject: Re: [PATCH v4, part3 11/15] mm: use a dedicated lock to protect totalram_pages and zone->managed_pages

On 04/08/2013 09:39 PM, Rik van Riel wrote:
> On 04/06/2013 09:55 AM, Jiang Liu wrote:
>
>> @@ -5186,6 +5189,22 @@ early_param("movablecore", cmdline_parse_movablecore);
>>
>> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>>
>> +void adjust_managed_page_count(struct page *page, long count)
>> +{
>> + bool lock = (system_state != SYSTEM_BOOTING);
>> +
>> + /* No need to acquire the lock during boot */
>> + if (lock)
>> + spin_lock(&managed_page_count_lock);
>> +
>> + page_zone(page)->managed_pages += count;
>> + totalram_pages += count;
>> +
>> + if (lock)
>> + spin_unlock(&managed_page_count_lock);
>> +}
>
> While I agree the boot code currently does not need the lock, is
> there any harm in removing that conditional?
>
> That would simplify the code, and protect against possible future
> cleverness of initializing multiple memory things simultaneously.
>
Hi Rik,
Thanks for your comments.
I'm OK with that. Acquiring/releasing the lock should be lightweight
because there shouldn't be contention during boot. I will remove the
conditional in the next version.
Regards!
Gerry
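
As an illustration of the simplification agreed on above (not the actual follow-up patch), here is a user-space sketch in which the lock is always taken. A C11 atomic_flag spinlock stands in for the kernel's spin_lock()/spin_unlock(). The single zone counter and all names ending in _model are assumptions of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>

/* User-space stand-in for managed_page_count_lock. */
static atomic_flag managed_page_count_lock = ATOMIC_FLAG_INIT;

/* Hypothetical stand-ins for zone->managed_pages and totalram_pages. */
static unsigned long zone_managed_pages;
static unsigned long totalram_pages;

static void model_spin_lock(atomic_flag *lock)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void model_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* The lock is taken unconditionally: no system_state check is needed. */
static void adjust_managed_page_count_model(long count)
{
	model_spin_lock(&managed_page_count_lock);

	/* Model-only sanity check: never remove more pages than are counted. */
	if (count < 0)
		assert(totalram_pages >= 0UL - (unsigned long)count);

	/* A negative count wraps modulo 2^N, so the addition subtracts. */
	zone_managed_pages += count;
	totalram_pages += count;

	model_spin_unlock(&managed_page_count_lock);
}
```

An uncontended acquire and release costs one atomic operation each, which is why dropping the boot-time special case should be cheap.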

2013-04-19 16:54:16

by Russell King - ARM Linux

Subject: Re: [PATCH v4, part3 01/15] mm: fix build warnings caused by free_reserved_area()

On Sat, Apr 06, 2013 at 09:54:55PM +0800, Jiang Liu wrote:
> Fix the following build warnings caused by free_reserved_area():
>
> arch/arm/mm/init.c: In function 'mem_init':
> arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
> free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
> ^
> In file included from include/linux/mman.h:4:0,
> from arch/arm/mm/init.c:15:
> include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
> extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
>
> mm/page_alloc.c: In function 'free_reserved_area':
> >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
> In file included from arch/mips/include/asm/page.h:49:0,
> from include/linux/mmzone.h:20,
> from include/linux/gfp.h:4,
> from include/linux/mm.h:8,
> from mm/page_alloc.c:18:
> arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
> mm/page_alloc.c: In function 'free_area_init_nodes':
> mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
>
> Signed-off-by: Jiang Liu <[email protected]>
> Reported-by: Arnd Bergmann <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> arch/arm/mm/init.c | 6 ++++--
> mm/page_alloc.c | 2 +-
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 9a5cdc0..7a82fcd 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -600,7 +600,8 @@ void __init mem_init(void)
>
> #ifdef CONFIG_SA1111
> /* now that our DMA memory is actually so designated, we can free it */
> - free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
> + free_reserved_area((unsigned long)__va(PHYS_PFN_OFFSET),
> + (unsigned long)swapper_pg_dir, 0, NULL);
> #endif
>
> free_highpages();
> @@ -728,7 +729,8 @@ void free_initmem(void)
> extern char __tcm_start, __tcm_end;
>
> poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
> - free_reserved_area(&__tcm_start, &__tcm_end, 0, "TCM link");
> + free_reserved_area((unsigned long)&__tcm_start,
> + (unsigned long)&__tcm_end, 0, "TCM link");
> #endif
>
> poison_init_mem(__init_begin, __init_end - __init_begin);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4923e9..8bf7956 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5196,7 +5196,7 @@ unsigned long free_reserved_area(unsigned long start, unsigned long end,
> for (pages = 0; pos < end; pos += PAGE_SIZE, pages++) {
> if (poison)
> memset((void *)pos, poison, PAGE_SIZE);
> - free_reserved_page(virt_to_page(pos));
> + free_reserved_page(virt_to_page((void *)pos));

Don't all these casts suggest to you that you may have the type wrong
in the first place?

2013-04-20 15:34:43

by Jiang Liu

Subject: Re: [PATCH v4, part3 01/15] mm: fix build warnings caused by free_reserved_area()

On 04/20/2013 12:52 AM, Russell King - ARM Linux wrote:
> On Sat, Apr 06, 2013 at 09:54:55PM +0800, Jiang Liu wrote:
>> Fix the following build warnings caused by free_reserved_area():
>>
>> arch/arm/mm/init.c: In function 'mem_init':
>> arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
>> free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
>> ^
>> In file included from include/linux/mman.h:4:0,
>> from arch/arm/mm/init.c:15:
>> include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
>> extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
>>
>> mm/page_alloc.c: In function 'free_reserved_area':
>>>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
>> In file included from arch/mips/include/asm/page.h:49:0,
>> from include/linux/mmzone.h:20,
>> from include/linux/gfp.h:4,
>> from include/linux/mm.h:8,
>> from mm/page_alloc.c:18:
>> arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
>> mm/page_alloc.c: In function 'free_area_init_nodes':
>> mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
>>
>> Signed-off-by: Jiang Liu <[email protected]>
>> Reported-by: Arnd Bergmann <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> ---
>> arch/arm/mm/init.c | 6 ++++--
>> mm/page_alloc.c | 2 +-
>> 2 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index 9a5cdc0..7a82fcd 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -600,7 +600,8 @@ void __init mem_init(void)
>>
>> #ifdef CONFIG_SA1111
>> /* now that our DMA memory is actually so designated, we can free it */
>> - free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
>> + free_reserved_area((unsigned long)__va(PHYS_PFN_OFFSET),
>> + (unsigned long)swapper_pg_dir, 0, NULL);
>> #endif
>>
>> free_highpages();
>> @@ -728,7 +729,8 @@ void free_initmem(void)
>> extern char __tcm_start, __tcm_end;
>>
>> poison_init_mem(&__tcm_start, &__tcm_end - &__tcm_start);
>> - free_reserved_area(&__tcm_start, &__tcm_end, 0, "TCM link");
>> + free_reserved_area((unsigned long)&__tcm_start,
>> + (unsigned long)&__tcm_end, 0, "TCM link");
>> #endif
>>
>> poison_init_mem(__init_begin, __init_end - __init_begin);
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index e4923e9..8bf7956 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5196,7 +5196,7 @@ unsigned long free_reserved_area(unsigned long start, unsigned long end,
>> for (pages = 0; pos < end; pos += PAGE_SIZE, pages++) {
>> if (poison)
>> memset((void *)pos, poison, PAGE_SIZE);
>> - free_reserved_page(virt_to_page(pos));
>> + free_reserved_page(virt_to_page((void *)pos));
>
> Don't all these casts suggest to you that you may have the type wrong
> in the first place?
>
Hi Russell,
Good question!
free_reserved_area() was originally designed to simplify free_initrd_mem(), which
is declared as:
void free_initrd_mem(unsigned long start, unsigned long end)
So I chose "unsigned long" for the first and second parameters of
free_reserved_area(); otherwise free_initrd_mem() would need many more type casts.
For code purity we should use "void *" instead. So should we change to "void *" here?
Regards!
Gerry
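
To make the trade-off concrete, here is a user-space sketch of the "void *" alternative (an illustration only, not a proposed patch). The pointer-typed function takes the addresses directly, and the one "unsigned long" caller casts once at its boundary. MODEL_PAGE_SIZE, both function names and the rounding of the range to page boundaries are assumptions of this model. Pages are only counted, never touched.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL	/* hypothetical page size for the model */

/*
 * Pointer-typed variant: callers that hold linker symbols or __va()
 * results can pass them without a cast. Returns the number of whole
 * pages inside [start, end).
 */
static unsigned long free_reserved_area_model(void *start, void *end)
{
	uintptr_t pos, stop;
	unsigned long pages = 0;

	/* The masks below only work for a power-of-two page size. */
	assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0);

	/* Round start up and end down to page boundaries. */
	pos = ((uintptr_t)start + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
	stop = (uintptr_t)end & ~(MODEL_PAGE_SIZE - 1);

	for (; pos < stop; pos += MODEL_PAGE_SIZE)
		pages++;	/* the kernel would free the page here */

	return pages;
}

/* An "unsigned long" caller such as free_initrd_mem() casts in one place. */
static unsigned long free_initrd_mem_model(unsigned long start,
					   unsigned long end)
{
	return free_reserved_area_model((void *)start, (void *)end);
}
```

Whichever type is chosen, the casts gather at the callers that use the other representation, so the choice comes down to which kind of caller is more common.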