2020-08-02 16:39:17

by Mike Rapoport

Subject: [PATCH v2 00/17] memblock: seasonal cleaning^w cleanup

From: Mike Rapoport <[email protected]>

Hi,

These patches simplify several uses of memblock iterators and hide some of
the memblock implementation details from the rest of the system.

The patches are on top of v5.8-rc7 + cherry-pick of "mm/sparse: cleanup the
code surrounding memory_present()" [1] from mmotm tree.

v2 changes:
* replace for_each_memblock() with two versions, one for memblock.memory
and another one for memblock.reserved
* fix overzealous cleanup of powerpc fadump: keep the traversal over the
memblocks, but use better suited iterators
* don't remove traversal over memblock.reserved in x86 numa cleanup but
replace for_each_memblock() with new for_each_reserved_mem_region()
* simplify ramdisk and crash kernel allocations on x86
* drop more redundant and unused code: __next_reserved_mem_region() and
memblock_mem_size()
* add description of numa initialization fix on arm64 (thanks Jonathan)
* add Acked-by and Reviewed-by tags

[1] http://lkml.kernel.org/r/[email protected]

Mike Rapoport (17):
KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
dma-contiguous: simplify cma_early_percent_memory()
arm, xtensa: simplify initialization of high memory pages
arm64: numa: simplify dummy_numa_init()
h8300, nds32, openrisc: simplify detection of memory extents
riscv: drop unneeded node initialization
microblaze: drop unneeded NUMA and sparsemem initializations
memblock: make for_each_memblock_type() iterator private
memblock: make memblock_debug and related functionality private
memblock: reduce number of parameters in for_each_mem_range()
arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
arch, drivers: replace for_each_memblock() with for_each_mem_range()
x86/setup: simplify initrd relocation and reservation
x86/setup: simplify reserve_crashkernel()
memblock: remove unused memblock_mem_size()
memblock: implement for_each_reserved_mem_region() using __next_mem_region()
memblock: use separate iterators for memory and reserved regions

.clang-format | 4 +-
arch/arm/kernel/setup.c | 18 +++--
arch/arm/mm/init.c | 59 ++++------------
arch/arm/mm/mmu.c | 39 ++++-------
arch/arm/mm/pmsa-v7.c | 20 +++---
arch/arm/mm/pmsa-v8.c | 17 +++--
arch/arm/xen/mm.c | 7 +-
arch/arm64/kernel/machine_kexec_file.c | 6 +-
arch/arm64/kernel/setup.c | 4 +-
arch/arm64/mm/init.c | 11 ++-
arch/arm64/mm/kasan_init.c | 10 +--
arch/arm64/mm/mmu.c | 11 +--
arch/arm64/mm/numa.c | 15 ++---
arch/c6x/kernel/setup.c | 9 +--
arch/h8300/kernel/setup.c | 8 +--
arch/microblaze/mm/init.c | 24 ++-----
arch/mips/cavium-octeon/dma-octeon.c | 12 ++--
arch/mips/kernel/setup.c | 31 +++++----
arch/mips/netlogic/xlp/setup.c | 2 +-
arch/nds32/kernel/setup.c | 8 +--
arch/openrisc/kernel/setup.c | 9 +--
arch/openrisc/mm/init.c | 8 ++-
arch/powerpc/kernel/fadump.c | 57 ++++++++--------
arch/powerpc/kvm/book3s_hv_builtin.c | 11 +--
arch/powerpc/mm/book3s64/hash_utils.c | 16 ++---
arch/powerpc/mm/book3s64/radix_pgtable.c | 11 ++-
arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
arch/powerpc/mm/mem.c | 33 +++++----
arch/powerpc/mm/numa.c | 7 +-
arch/powerpc/mm/pgtable_32.c | 8 +--
arch/riscv/mm/init.c | 34 +++-------
arch/riscv/mm/kasan_init.c | 10 +--
arch/s390/kernel/crash_dump.c | 8 +--
arch/s390/kernel/setup.c | 31 +++++----
arch/s390/mm/page-states.c | 6 +-
arch/s390/mm/vmem.c | 16 +++--
arch/sh/mm/init.c | 9 +--
arch/sparc/mm/init_64.c | 12 ++--
arch/x86/kernel/setup.c | 56 +++++-----------
arch/x86/mm/numa.c | 2 +-
arch/xtensa/mm/init.c | 55 +++------------
drivers/bus/mvebu-mbus.c | 12 ++--
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/s390/char/zcore.c | 9 +--
include/linux/memblock.h | 65 +++++++++---------
kernel/dma/contiguous.c | 11 +--
mm/memblock.c | 85 ++++++++----------------
mm/page_alloc.c | 11 ++-
mm/sparse.c | 10 ++-
49 files changed, 366 insertions(+), 561 deletions(-)

--
2.26.2


2020-08-02 16:39:35

by Mike Rapoport

Subject: [PATCH v2 12/17] arch, drivers: replace for_each_memblock() with for_each_mem_range()

From: Mike Rapoport <[email protected]>

There are several occurrences of the following pattern:

for_each_memblock(memory, reg) {
start = __pfn_to_phys(memblock_region_memory_base_pfn(reg));
end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

/* do something with start and end */
}

Using the for_each_mem_range() iterator is more appropriate in such cases
and allows for simpler and cleaner code.
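
For illustration, the loop above converts to (a sketch, not taken from any
of the hunks below; note that the iterator requires a u64 loop variable
and phys_addr_t range bounds):

	u64 i;
	phys_addr_t start, end;

	for_each_mem_range(i, &start, &end) {
		/* do something with start and end */
	}

Since for_each_mem_range() walks memblock.memory with MEMBLOCK_NONE flags,
it skips MEMBLOCK_NOMAP regions, which is why several explicit
memblock_is_nomap() checks disappear in the conversions below.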

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/arm/kernel/setup.c | 18 ++++++---
arch/arm/mm/mmu.c | 39 ++++++------------
arch/arm/mm/pmsa-v7.c | 20 +++++-----
arch/arm/mm/pmsa-v8.c | 17 ++++----
arch/arm/xen/mm.c | 7 ++--
arch/arm64/mm/kasan_init.c | 10 ++---
arch/arm64/mm/mmu.c | 11 ++----
arch/c6x/kernel/setup.c | 9 +++--
arch/microblaze/mm/init.c | 9 +++--
arch/mips/cavium-octeon/dma-octeon.c | 12 +++---
arch/mips/kernel/setup.c | 31 +++++++--------
arch/openrisc/mm/init.c | 8 ++--
arch/powerpc/kernel/fadump.c | 50 +++++++++++-------------
arch/powerpc/mm/book3s64/hash_utils.c | 16 ++++----
arch/powerpc/mm/book3s64/radix_pgtable.c | 11 +++---
arch/powerpc/mm/kasan/kasan_init_32.c | 8 ++--
arch/powerpc/mm/mem.c | 16 +++++---
arch/powerpc/mm/pgtable_32.c | 8 ++--
arch/riscv/mm/init.c | 25 +++++-------
arch/riscv/mm/kasan_init.c | 10 ++---
arch/s390/kernel/setup.c | 27 ++++++++-----
arch/s390/mm/vmem.c | 16 ++++----
arch/sparc/mm/init_64.c | 12 ++----
drivers/bus/mvebu-mbus.c | 12 +++---
drivers/s390/char/zcore.c | 9 +++--
25 files changed, 200 insertions(+), 211 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index d8e18cdd96d3..3f65d0ac9f63 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -843,19 +843,25 @@ early_param("mem", early_mem);

static void __init request_standard_resources(const struct machine_desc *mdesc)
{
- struct memblock_region *region;
+ phys_addr_t start, end, res_end;
struct resource *res;
+ u64 i;

kernel_code.start = virt_to_phys(_text);
kernel_code.end = virt_to_phys(__init_begin - 1);
kernel_data.start = virt_to_phys(_sdata);
kernel_data.end = virt_to_phys(_end - 1);

- for_each_memblock(memory, region) {
- phys_addr_t start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
- phys_addr_t end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
+ for_each_mem_range(i, &start, &end) {
unsigned long boot_alias_start;

+ /*
+ * In memblock, end points to the first byte after the
+ * range while in resources, end points to the last byte in
+ * the range.
+ */
+ res_end = end - 1;
+
/*
* Some systems have a special memory alias which is only
* used for booting. We need to advertise this region to
@@ -869,7 +875,7 @@ static void __init request_standard_resources(const struct machine_desc *mdesc)
__func__, sizeof(*res));
res->name = "System RAM (boot alias)";
res->start = boot_alias_start;
- res->end = phys_to_idmap(end);
+ res->end = phys_to_idmap(res_end);
res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
request_resource(&iomem_resource, res);
}
@@ -880,7 +886,7 @@ static void __init request_standard_resources(const struct machine_desc *mdesc)
sizeof(*res));
res->name = "System RAM";
res->start = start;
- res->end = end;
+ res->end = res_end;
res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;

request_resource(&iomem_resource, res);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 628028bfbb92..a149d9cb4fdb 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1155,9 +1155,8 @@ phys_addr_t arm_lowmem_limit __initdata = 0;

void __init adjust_lowmem_bounds(void)
{
- phys_addr_t memblock_limit = 0;
- u64 vmalloc_limit;
- struct memblock_region *reg;
+ phys_addr_t block_start, block_end, memblock_limit = 0;
+ u64 vmalloc_limit, i;
phys_addr_t lowmem_limit = 0;

/*
@@ -1173,26 +1172,18 @@ void __init adjust_lowmem_bounds(void)
* The first usable region must be PMD aligned. Mark its start
* as MEMBLOCK_NOMAP if it isn't
*/
- for_each_memblock(memory, reg) {
- if (!memblock_is_nomap(reg)) {
- if (!IS_ALIGNED(reg->base, PMD_SIZE)) {
- phys_addr_t len;
+ for_each_mem_range(i, &block_start, &block_end) {
+ if (!IS_ALIGNED(block_start, PMD_SIZE)) {
+ phys_addr_t len;

- len = round_up(reg->base, PMD_SIZE) - reg->base;
- memblock_mark_nomap(reg->base, len);
- }
- break;
+ len = round_up(block_start, PMD_SIZE) - block_start;
+ memblock_mark_nomap(block_start, len);
}
+ break;
}

- for_each_memblock(memory, reg) {
- phys_addr_t block_start = reg->base;
- phys_addr_t block_end = reg->base + reg->size;
-
- if (memblock_is_nomap(reg))
- continue;
-
- if (reg->base < vmalloc_limit) {
+ for_each_mem_range(i, &block_start, &block_end) {
+ if (block_start < vmalloc_limit) {
if (block_end > lowmem_limit)
/*
* Compare as u64 to ensure vmalloc_limit does
@@ -1441,19 +1432,15 @@ static void __init kmap_init(void)

static void __init map_lowmem(void)
{
- struct memblock_region *reg;
phys_addr_t kernel_x_start = round_down(__pa(KERNEL_START), SECTION_SIZE);
phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
+ phys_addr_t start, end;
+ u64 i;

/* Map all the lowmem memory banks. */
- for_each_memblock(memory, reg) {
- phys_addr_t start = reg->base;
- phys_addr_t end = start + reg->size;
+ for_each_mem_range(i, &start, &end) {
struct map_desc map;

- if (memblock_is_nomap(reg))
- continue;
-
if (end > arm_lowmem_limit)
end = arm_lowmem_limit;
if (start >= end)
diff --git a/arch/arm/mm/pmsa-v7.c b/arch/arm/mm/pmsa-v7.c
index 699fa2e88725..44b7644a4237 100644
--- a/arch/arm/mm/pmsa-v7.c
+++ b/arch/arm/mm/pmsa-v7.c
@@ -231,10 +231,9 @@ static int __init allocate_region(phys_addr_t base, phys_addr_t size,
void __init pmsav7_adjust_lowmem_bounds(void)
{
phys_addr_t specified_mem_size = 0, total_mem_size = 0;
- struct memblock_region *reg;
bool first = true;
phys_addr_t mem_start;
phys_addr_t mem_end;
+ phys_addr_t reg_start, reg_end;
unsigned int mem_max_regions;
- int num, i;
+ int num;
+ u64 i;

@@ -262,20 +261,19 @@ void __init pmsav7_adjust_lowmem_bounds(void)
mem_max_regions -= num;
#endif

- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &reg_start, &reg_end) {
if (first) {
phys_addr_t phys_offset = PHYS_OFFSET;

/*
* Initially only use memory continuous from
* PHYS_OFFSET */
- if (reg->base != phys_offset)
+ if (reg_start != phys_offset)
panic("First memory bank must be contiguous from PHYS_OFFSET");

- mem_start = reg->base;
- mem_end = reg->base + reg->size;
- specified_mem_size = reg->size;
+ mem_start = reg_start;
+ mem_end = reg_end;
+ specified_mem_size = mem_end - mem_start;
first = false;
} else {
/*
* memblock auto merges contiguous blocks, remove
@@ -283,8 +281,8 @@ void __init pmsav7_adjust_lowmem_bounds(void)
* blocks separately while iterating)
*/
pr_notice("Ignoring RAM after %pa, memory at %pa ignored\n",
- &mem_end, &reg->base);
- memblock_remove(reg->base, 0 - reg->base);
+ &mem_end, &reg_start);
+ memblock_remove(reg_start, 0 - reg_start);
break;
}
}
diff --git a/arch/arm/mm/pmsa-v8.c b/arch/arm/mm/pmsa-v8.c
index 0d7d5fb59247..b39e74b48437 100644
--- a/arch/arm/mm/pmsa-v8.c
+++ b/arch/arm/mm/pmsa-v8.c
@@ -94,20 +94,19 @@ static __init bool is_region_fixed(int number)
void __init pmsav8_adjust_lowmem_bounds(void)
{
phys_addr_t mem_end;
- struct memblock_region *reg;
+ phys_addr_t reg_start, reg_end;
bool first = true;
+ u64 i;

- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &reg_start, &reg_end) {
if (first) {
phys_addr_t phys_offset = PHYS_OFFSET;

/*
* Initially only use memory continuous from
* PHYS_OFFSET */
- if (reg->base != phys_offset)
+ if (reg_start != phys_offset)
panic("First memory bank must be contiguous from PHYS_OFFSET");
- mem_end = reg->base + reg->size;
+ mem_end = reg_end;
first = false;
} else {
/*
* memblock auto merges contiguous blocks, remove
@@ -115,8 +114,8 @@ void __init pmsav8_adjust_lowmem_bounds(void)
* blocks separately while iterating)
*/
pr_notice("Ignoring RAM after %pa, memory at %pa ignored\n",
- &mem_end, &reg->base);
- memblock_remove(reg->base, 0 - reg->base);
+ &mem_end, &reg_start);
+ memblock_remove(reg_start, 0 - reg_start);
break;
}
}
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index d40e9e5fc52b..05f24ff41e36 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -24,11 +24,12 @@

unsigned long xen_get_swiotlb_free_pages(unsigned int order)
{
- struct memblock_region *reg;
+ phys_addr_t base;
gfp_t flags = __GFP_NOWARN|__GFP_KSWAPD_RECLAIM;
+ u64 i;

- for_each_memblock(memory, reg) {
- if (reg->base < (phys_addr_t)0xffffffff) {
+ for_each_mem_range(i, &base, NULL) {
+ if (base < (phys_addr_t)0xffffffff) {
if (IS_ENABLED(CONFIG_ZONE_DMA32))
flags |= __GFP_DMA32;
else
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 7291b26ce788..b24e43d20667 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -212,8 +212,8 @@ void __init kasan_init(void)
{
u64 kimg_shadow_start, kimg_shadow_end;
u64 mod_shadow_start, mod_shadow_end;
- struct memblock_region *reg;
- int i;
+ phys_addr_t pa_start, pa_end;
+ u64 i;

kimg_shadow_start = (u64)kasan_mem_to_shadow(_text) & PAGE_MASK;
kimg_shadow_end = PAGE_ALIGN((u64)kasan_mem_to_shadow(_end));
@@ -246,9 +246,9 @@ void __init kasan_init(void)
kasan_populate_early_shadow((void *)mod_shadow_end,
(void *)kimg_shadow_start);

- for_each_memblock(memory, reg) {
- void *start = (void *)__phys_to_virt(reg->base);
- void *end = (void *)__phys_to_virt(reg->base + reg->size);
+ for_each_mem_range(i, &pa_start, &pa_end) {
+ void *start = (void *)__phys_to_virt(pa_start);
+ void *end = (void *)__phys_to_virt(pa_end);

if (start >= end)
break;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 1df25f26571d..327264fb83fb 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -461,8 +461,9 @@ static void __init map_mem(pgd_t *pgdp)
{
phys_addr_t kernel_start = __pa_symbol(_text);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
- struct memblock_region *reg;
+ phys_addr_t start, end;
int flags = 0;
+ u64 i;

if (rodata_full || debug_pagealloc_enabled())
flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
@@ -481,15 +482,9 @@ static void __init map_mem(pgd_t *pgdp)
#endif

/* map all the memory banks */
- for_each_memblock(memory, reg) {
- phys_addr_t start = reg->base;
- phys_addr_t end = start + reg->size;
-
+ for_each_mem_range(i, &start, &end) {
if (start >= end)
break;
- if (memblock_is_nomap(reg))
- continue;
-
__map_memblock(pgdp, start, end, PAGE_KERNEL, flags);
}

diff --git a/arch/c6x/kernel/setup.c b/arch/c6x/kernel/setup.c
index 8ef35131f999..9254c3b794a5 100644
--- a/arch/c6x/kernel/setup.c
+++ b/arch/c6x/kernel/setup.c
@@ -287,7 +287,8 @@ notrace void __init machine_init(unsigned long dt_ptr)

void __init setup_arch(char **cmdline_p)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
+ u64 i;

printk(KERN_INFO "Initializing kernel\n");

@@ -351,9 +352,9 @@ void __init setup_arch(char **cmdline_p)
disable_caching(ram_start, ram_end - 1);

/* Set caching of external RAM used by Linux */
- for_each_memblock(memory, reg)
- enable_caching(CACHE_REGION_START(reg->base),
- CACHE_REGION_START(reg->base + reg->size - 1));
+ for_each_mem_range(i, &start, &end)
+ enable_caching(CACHE_REGION_START(start),
+ CACHE_REGION_START(end - 1));

#ifdef CONFIG_BLK_DEV_INITRD
/*
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 49e0c241f9b1..15403b5adfcf 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -106,13 +106,14 @@ static void __init paging_init(void)
void __init setup_memory(void)
{
#ifndef CONFIG_MMU
- struct memblock_region *reg;
u32 kernel_align_start, kernel_align_size;
+ phys_addr_t start, end;
+ u64 i;

/* Find main memory where is the kernel */
- for_each_memblock(memory, reg) {
- memory_start = (u32)reg->base;
- lowmem_size = reg->size;
+ for_each_mem_range(i, &start, &end) {
+ memory_start = start;
+ lowmem_size = end - start;
if ((memory_start <= (u32)_text) &&
((u32)_text <= (memory_start + lowmem_size - 1))) {
memory_size = lowmem_size;
diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index 14ea680d180e..d938c1f7c1e1 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -190,25 +190,25 @@ char *octeon_swiotlb;

void __init plat_swiotlb_setup(void)
{
- struct memblock_region *mem;
+ phys_addr_t start, end;
phys_addr_t max_addr;
phys_addr_t addr_size;
size_t swiotlbsize;
unsigned long swiotlb_nslabs;
+ u64 i;

max_addr = 0;
addr_size = 0;

- for_each_memblock(memory, mem) {
+ for_each_mem_range(i, &start, &end) {
/* These addresses map low for PCI. */
- if (mem->base > 0x410000000ull && !OCTEON_IS_OCTEON2())
+ if (start > 0x410000000ull && !OCTEON_IS_OCTEON2())
continue;

- addr_size += mem->size;
-
- if (max_addr < mem->base + mem->size)
- max_addr = mem->base + mem->size;
+ addr_size += (end - start);

+ if (max_addr < end)
+ max_addr = end;
}

swiotlbsize = PAGE_SIZE;
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 7b537fa2035d..eaac1b66026d 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -300,8 +300,9 @@ static void __init bootmem_init(void)

static void __init bootmem_init(void)
{
- struct memblock_region *mem;
phys_addr_t ramstart, ramend;
+ phys_addr_t start, end;
+ u64 i;

ramstart = memblock_start_of_DRAM();
ramend = memblock_end_of_DRAM();
@@ -338,18 +339,13 @@ static void __init bootmem_init(void)

min_low_pfn = ARCH_PFN_OFFSET;
max_pfn = PFN_DOWN(ramend);
- for_each_memblock(memory, mem) {
- unsigned long start = memblock_region_memory_base_pfn(mem);
- unsigned long end = memblock_region_memory_end_pfn(mem);
-
+ for_each_mem_range(i, &start, &end) {
/*
* Skip highmem here so we get an accurate max_low_pfn if low
* memory stops short of high memory.
* If the region overlaps HIGHMEM_START, end is clipped so
* max_pfn excludes the highmem portion.
*/
- if (memblock_is_nomap(mem))
- continue;
if (start >= PFN_DOWN(HIGHMEM_START))
continue;
if (end > PFN_DOWN(HIGHMEM_START))
@@ -458,13 +454,12 @@ early_param("memmap", early_parse_memmap);
unsigned long setup_elfcorehdr, setup_elfcorehdr_size;
static int __init early_parse_elfcorehdr(char *p)
{
- struct memblock_region *mem;
+ phys_addr_t start, end;
+ u64 i;

setup_elfcorehdr = memparse(p, &p);

- for_each_memblock(memory, mem) {
- unsigned long start = mem->base;
- unsigned long end = start + mem->size;
+ for_each_mem_range(i, &start, &end) {
if (setup_elfcorehdr >= start && setup_elfcorehdr < end) {
/*
* Reserve from the elf core header to the end of
@@ -728,7 +723,8 @@ static void __init arch_mem_init(char **cmdline_p)

static void __init resource_init(void)
{
- struct memblock_region *region;
+ phys_addr_t start, end;
+ u64 i;

if (UNCAC_BASE != IO_BASE)
return;
@@ -740,9 +736,7 @@ static void __init resource_init(void)
bss_resource.start = __pa_symbol(&__bss_start);
bss_resource.end = __pa_symbol(&__bss_stop) - 1;

- for_each_memblock(memory, region) {
- phys_addr_t start = PFN_PHYS(memblock_region_memory_base_pfn(region));
- phys_addr_t end = PFN_PHYS(memblock_region_memory_end_pfn(region)) - 1;
+ for_each_mem_range(i, &start, &end) {
struct resource *res;

res = memblock_alloc(sizeof(struct resource), SMP_CACHE_BYTES);
@@ -751,7 +745,12 @@ static void __init resource_init(void)
sizeof(struct resource));

res->start = start;
- res->end = end;
+ /*
+ * In memblock, end points to the first byte after the
+ * range while in resources, end points to the last byte in
+ * the range.
+ */
+ res->end = end - 1;
res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
res->name = "System RAM";

diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index 3d7c79c7745d..8348feaaf46e 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -64,6 +64,7 @@ extern const char _s_kernel_ro[], _e_kernel_ro[];
*/
static void __init map_ram(void)
{
+ phys_addr_t start, end;
unsigned long v, p, e;
pgprot_t prot;
pgd_t *pge;
@@ -71,6 +72,7 @@ static void __init map_ram(void)
pud_t *pue;
pmd_t *pme;
pte_t *pte;
+ u64 i;
/* These mark extents of read-only kernel pages...
* ...from vmlinux.lds.S
*/
@@ -78,9 +80,9 @@ static void __init map_ram(void)

v = PAGE_OFFSET;

- for_each_memblock(memory, region) {
- p = (u32) region->base & PAGE_MASK;
- e = p + (u32) region->size;
+ for_each_mem_range(i, &start, &end) {
+ p = (u32) start & PAGE_MASK;
+ e = (u32) end;

v = (u32) __va(p);
pge = pgd_offset_k(v);
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index fc85cbc66839..b0f1db443251 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -180,13 +180,13 @@ int is_fadump_active(void)
*/
static bool is_fadump_mem_area_contiguous(u64 d_start, u64 d_end)
{
- struct memblock_region *reg;
+ phys_addr_t reg_start, reg_end;
bool ret = false;
- u64 start, end;
+ u64 i, start, end;

- for_each_memblock(memory, reg) {
- start = max_t(u64, d_start, reg->base);
- end = min_t(u64, d_end, (reg->base + reg->size));
+ for_each_mem_range(i, &reg_start, &reg_end) {
+ start = max_t(u64, d_start, reg_start);
+ end = min_t(u64, d_end, reg_end);
if (d_start < end) {
/* Memory hole from d_start to start */
if (start > d_start)
@@ -411,34 +411,34 @@ static int __init add_boot_mem_regions(unsigned long mstart,

static int __init fadump_get_boot_mem_regions(void)
{
- unsigned long base, size, cur_size, hole_size, last_end;
+ unsigned long size, cur_size, hole_size, last_end;
unsigned long mem_size = fw_dump.boot_memory_size;
- struct memblock_region *reg;
+ phys_addr_t reg_start, reg_end;
int ret = 1;
+ u64 i;

fw_dump.boot_mem_regs_cnt = 0;

last_end = 0;
hole_size = 0;
cur_size = 0;
- for_each_memblock(memory, reg) {
- base = reg->base;
- size = reg->size;
- hole_size += (base - last_end);
+ for_each_mem_range(i, &reg_start, &reg_end) {
+ size = reg_end - reg_start;
+ hole_size += (reg_start - last_end);

if ((cur_size + size) >= mem_size) {
size = (mem_size - cur_size);
- ret = add_boot_mem_regions(base, size);
+ ret = add_boot_mem_regions(reg_start, size);
break;
}

mem_size -= size;
cur_size += size;
- ret = add_boot_mem_regions(base, size);
+ ret = add_boot_mem_regions(reg_start, size);
if (!ret)
break;

- last_end = base + size;
+ last_end = reg_end;
}
fw_dump.boot_mem_top = PAGE_ALIGN(fw_dump.boot_memory_size + hole_size);

@@ -959,9 +959,8 @@ static int fadump_init_elfcore_header(char *bufp)
*/
static int fadump_setup_crash_memory_ranges(void)
{
- struct memblock_region *reg;
- u64 start, end;
- int i, ret;
+ u64 i, start, end;
+ int ret;

pr_debug("Setup crash memory ranges.\n");
crash_mrange_info.mem_range_cnt = 0;
@@ -979,10 +978,7 @@ static int fadump_setup_crash_memory_ranges(void)
return ret;
}

- for_each_memblock(memory, reg) {
- start = (u64)reg->base;
- end = start + (u64)reg->size;
-
+ for_each_mem_range(i, &start, &end) {
/*
* skip the memory chunk that is already added
* (0 through boot_memory_top).
@@ -1216,7 +1212,9 @@ static void fadump_free_reserved_memory(unsigned long start_pfn,
*/
static void fadump_release_reserved_area(u64 start, u64 end)
{
- u64 tstart, tend, spfn, epfn, reg_spfn, reg_epfn, i;
+ unsigned long reg_spfn, reg_epfn;
+ u64 tstart, tend, spfn, epfn;
+ int i;

spfn = PHYS_PFN(start);
epfn = PHYS_PFN(end);
@@ -1659,12 +1657,10 @@ int __init fadump_reserve_mem(void)
/* Preserve everything above the base address */
static void __init fadump_reserve_crash_area(u64 base)
{
- struct memblock_region *reg;
- u64 mstart, msize;
+ u64 i, mstart, mend, msize;

- for_each_memblock(memory, reg) {
- mstart = reg->base;
- msize = reg->size;
+ for_each_mem_range(i, &mstart, &mend) {
+ msize = mend - mstart;

if ((mstart + msize) < base)
continue;
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 468169e33c86..9ba76b075b11 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -7,7 +7,7 @@
*
* SMP scalability work:
* Copyright (C) 2001 Anton Blanchard <[email protected]>, IBM
- *
+ *
* Module name: htab.c
*
* Description:
@@ -862,8 +862,8 @@ static void __init htab_initialize(void)
unsigned long table;
unsigned long pteg_count;
unsigned long prot;
- unsigned long base = 0, size = 0;
- struct memblock_region *reg;
+ phys_addr_t base = 0, size = 0, end;
+ u64 i;

DBG(" -> htab_initialize()\n");

@@ -879,7 +879,7 @@ static void __init htab_initialize(void)
/*
* Calculate the required size of the htab. We want the number of
* PTEGs to equal one half the number of real pages.
- */
+ */
htab_size_bytes = htab_get_table_size();
pteg_count = htab_size_bytes >> 7;

@@ -889,7 +889,7 @@ static void __init htab_initialize(void)
firmware_has_feature(FW_FEATURE_PS3_LV1)) {
/* Using a hypervisor which owns the htab */
htab_address = NULL;
- _SDR1 = 0;
+ _SDR1 = 0;
#ifdef CONFIG_FA_DUMP
/*
* If firmware assisted dump is active firmware preserves
@@ -955,9 +955,9 @@ static void __init htab_initialize(void)
#endif /* CONFIG_DEBUG_PAGEALLOC */

/* create bolted the linear mapping in the hash table */
- for_each_memblock(memory, reg) {
- base = (unsigned long)__va(reg->base);
- size = reg->size;
+ for_each_mem_range(i, &base, &end) {
+ size = end - base;
+ base = (unsigned long)__va(base);

DBG("creating mapping for region: %lx..%lx (prot: %lx)\n",
base, size, prot);
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index bb00e0cba119..65657b920847 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -318,28 +318,27 @@ static int __meminit create_physical_mapping(unsigned long start,
static void __init radix_init_pgtable(void)
{
unsigned long rts_field;
- struct memblock_region *reg;
+ phys_addr_t start, end;
+ u64 i;

/* We don't support slb for radix */
mmu_slb_size = 0;
/*
* Create the linear mapping, using standard page size for now
*/
- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &start, &end) {
/*
* The memblock allocator is up at this point, so the
* page tables will be allocated within the range. No
* need or a node (which we don't have yet).
*/

- if ((reg->base + reg->size) >= RADIX_VMALLOC_START) {
+ if (end >= RADIX_VMALLOC_START) {
pr_warn("Outside the supported range\n");
continue;
}

- WARN_ON(create_physical_mapping(reg->base,
- reg->base + reg->size,
- -1, PAGE_KERNEL));
+ WARN_ON(create_physical_mapping(start, end, -1, PAGE_KERNEL));
}

/* Find out how many PID bits are supported */
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 0760e1e754e4..6e73434e4e41 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -120,11 +120,11 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
static void __init kasan_mmu_init(void)
{
int ret;
- struct memblock_region *reg;
+ phys_addr_t base, end;
+ u64 i;

- for_each_memblock(memory, reg) {
- phys_addr_t base = reg->base;
- phys_addr_t top = min(base + reg->size, total_lowmem);
+ for_each_mem_range(i, &base, &end) {
+ phys_addr_t top = min(end, total_lowmem);

if (base >= top)
continue;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 1364dd532107..fd78de08506f 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -593,20 +593,24 @@ void flush_icache_user_page(struct vm_area_struct *vma, struct page *page,
*/
static int __init add_system_ram_resources(void)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
+ u64 i;

- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &start, &end) {
struct resource *res;
- unsigned long base = reg->base;
- unsigned long size = reg->size;

res = kzalloc(sizeof(struct resource), GFP_KERNEL);
WARN_ON(!res);

if (res) {
res->name = "System RAM";
- res->start = base;
- res->end = base + size - 1;
+ res->start = start;
+ /*
+ * In memblock, end points to the first byte after
+ * the range while in resources, end points to the
+ * last byte in the range.
+ */
+ res->end = end - 1;
res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
WARN_ON(request_resource(&iomem_resource, res) < 0);
}
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 6eb4eab79385..079159e97bca 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -123,11 +123,11 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top)

void __init mapin_ram(void)
{
- struct memblock_region *reg;
+ phys_addr_t base, end;
+ u64 i;

- for_each_memblock(memory, reg) {
- phys_addr_t base = reg->base;
- phys_addr_t top = min(base + reg->size, total_lowmem);
+ for_each_mem_range(i, &base, &end) {
+ phys_addr_t top = min(end, total_lowmem);

if (base >= top)
continue;
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 7440ba2cdaaa..8479a703dc06 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -145,21 +145,21 @@ static phys_addr_t dtb_early_pa __initdata;

void __init setup_bootmem(void)
{
- struct memblock_region *reg;
phys_addr_t mem_size = 0;
phys_addr_t total_mem = 0;
- phys_addr_t mem_start, end = 0;
+ phys_addr_t mem_start, start, end = 0;
phys_addr_t vmlinux_end = __pa_symbol(&_end);
phys_addr_t vmlinux_start = __pa_symbol(&_start);
+ u64 i;

/* Find the memory region containing the kernel */
- for_each_memblock(memory, reg) {
- end = reg->base + reg->size;
+ for_each_mem_range(i, &start, &end) {
+ phys_addr_t size = end - start;
if (!total_mem)
- mem_start = reg->base;
- if (reg->base <= vmlinux_start && vmlinux_end <= end)
- BUG_ON(reg->size == 0);
- total_mem = total_mem + reg->size;
+ mem_start = start;
+ if (start <= vmlinux_start && vmlinux_end <= end)
+ BUG_ON(size == 0);
+ total_mem = total_mem + size;
}

/*
@@ -456,7 +456,7 @@ static void __init setup_vm_final(void)
{
uintptr_t va, map_size;
phys_addr_t pa, start, end;
- struct memblock_region *reg;
+ u64 i;

/* Set mmu_enabled flag */
mmu_enabled = true;
@@ -467,14 +467,9 @@ static void __init setup_vm_final(void)
PGDIR_SIZE, PAGE_TABLE);

/* Map all memory banks */
- for_each_memblock(memory, reg) {
- start = reg->base;
- end = start + reg->size;
-
+ for_each_mem_range(i, &start, &end) {
if (start >= end)
break;
- if (memblock_is_nomap(reg))
- continue;
if (start <= __pa(PAGE_OFFSET) &&
__pa(PAGE_OFFSET) < end)
start = __pa(PAGE_OFFSET);
diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index 87b4ab3d3c77..12ddd1f6bf70 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -85,16 +85,16 @@ static void __init populate(void *start, void *end)

void __init kasan_init(void)
{
- struct memblock_region *reg;
- unsigned long i;
+ phys_addr_t _start, _end;
+ u64 i;

kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
(void *)kasan_mem_to_shadow((void *)
VMALLOC_END));

- for_each_memblock(memory, reg) {
- void *start = (void *)__va(reg->base);
- void *end = (void *)__va(reg->base + reg->size);
+ for_each_mem_range(i, &_start, &_end) {
+ void *start = (void *)_start;
+ void *end = (void *)_end;

if (start >= end)
break;
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 8b284cf6e199..b6c4a0c5ff86 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -198,7 +198,7 @@ static void __init conmode_default(void)
cpcmd("QUERY TERM", query_buffer, 1024, NULL);
ptr = strstr(query_buffer, "CONMODE");
/*
- * Set the conmode to 3215 so that the device recognition
+ * Set the conmode to 3215 so that the device recognition
* will set the cu_type of the console to 3215. If the
* conmode is 3270 and we don't set it back then both
* 3215 and the 3270 driver will try to access the console
@@ -258,7 +258,7 @@ static inline void setup_zfcpdump(void) {}

/*
* Reboot, halt and power_off stubs. They just call _machine_restart,
- * _machine_halt or _machine_power_off.
+ * _machine_halt or _machine_power_off.
*/

void machine_restart(char *command)
@@ -484,8 +484,9 @@ static struct resource __initdata *standard_resources[] = {
static void __init setup_resources(void)
{
struct resource *res, *std_res, *sub_res;
- struct memblock_region *reg;
+ phys_addr_t start, end;
int j;
+ u64 i;

code_resource.start = (unsigned long) _text;
code_resource.end = (unsigned long) _etext - 1;
@@ -494,7 +495,7 @@ static void __init setup_resources(void)
bss_resource.start = (unsigned long) __bss_start;
bss_resource.end = (unsigned long) __bss_stop - 1;

- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &start, &end) {
res = memblock_alloc(sizeof(*res), 8);
if (!res)
panic("%s: Failed to allocate %zu bytes align=0x%x\n",
@@ -502,8 +503,13 @@ static void __init setup_resources(void)
res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;

res->name = "System RAM";
- res->start = reg->base;
- res->end = reg->base + reg->size - 1;
+ res->start = start;
+ /*
+ * In memblock, end points to the first byte after the
+ * range while in resources, end points to the last byte in
+ * the range.
+ */
+ res->end = end - 1;
request_resource(&iomem_resource, res);

for (j = 0; j < ARRAY_SIZE(standard_resources); j++) {
@@ -819,14 +825,15 @@ static void __init reserve_kernel(void)

static void __init setup_memory(void)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
+ u64 i;

/*
* Init storage key for present memory
*/
- for_each_memblock(memory, reg) {
- storage_key_init_range(reg->base, reg->base + reg->size);
- }
+ for_each_mem_range(i, &start, &end)
+ storage_key_init_range(start, end);
+
psw_set_key(PAGE_DEFAULT_KEY);

/* Only cosmetics */
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index 8b6282cf7d13..30076ecc3eb7 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -399,10 +399,11 @@ int vmem_add_mapping(unsigned long start, unsigned long size)
*/
void __init vmem_map_init(void)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
+ u64 i;

- for_each_memblock(memory, reg)
- vmem_add_mem(reg->base, reg->size);
+ for_each_mem_range(i, &start, &end)
+ vmem_add_mem(start, end - start);
__set_memory((unsigned long)_stext,
(unsigned long)(_etext - _stext) >> PAGE_SHIFT,
SET_MEMORY_RO | SET_MEMORY_X);
@@ -428,16 +429,17 @@ void __init vmem_map_init(void)
*/
static int __init vmem_convert_memory_chunk(void)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
struct memory_segment *seg;
+ u64 i;

mutex_lock(&vmem_mutex);
- for_each_memblock(memory, reg) {
+ for_each_mem_range(i, &start, &end) {
seg = kzalloc(sizeof(*seg), GFP_KERNEL);
if (!seg)
panic("Out of memory...\n");
- seg->start = reg->base;
- seg->size = reg->size;
+ seg->start = start;
+ seg->size = end - start;
insert_memory_segment(seg);
}
mutex_unlock(&vmem_mutex);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 02e6e5e0f106..de63c002638e 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1192,18 +1192,14 @@ int of_node_to_nid(struct device_node *dp)

static void __init add_node_ranges(void)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
unsigned long prev_max;
+ u64 i;

memblock_resized:
prev_max = memblock.memory.max;

- for_each_memblock(memory, reg) {
- unsigned long size = reg->size;
- unsigned long start, end;
-
- start = reg->base;
- end = start + size;
+ for_each_mem_range(i, &start, &end) {
while (start < end) {
unsigned long this_end;
int nid;
@@ -1211,7 +1207,7 @@ static void __init add_node_ranges(void)
this_end = memblock_nid_range(start, end, &nid);

numadbg("Setting memblock NUMA node nid[%d] "
- "start[%lx] end[%lx]\n",
+ "start[%llx] end[%lx]\n",
nid, start, this_end);

memblock_set_node(start, this_end - start,
diff --git a/drivers/bus/mvebu-mbus.c b/drivers/bus/mvebu-mbus.c
index 5b2a11a88951..2519ceede64b 100644
--- a/drivers/bus/mvebu-mbus.c
+++ b/drivers/bus/mvebu-mbus.c
@@ -610,23 +610,23 @@ static unsigned int armada_xp_mbus_win_remap_offset(int win)
static void __init
mvebu_mbus_find_bridge_hole(uint64_t *start, uint64_t *end)
{
- struct memblock_region *r;
- uint64_t s = 0;
+ phys_addr_t reg_start, reg_end;
+ uint64_t i, s = 0;

- for_each_memblock(memory, r) {
+ for_each_mem_range(i, &reg_start, &reg_end) {
/*
* This part of the memory is above 4 GB, so we don't
* care for the MBus bridge hole.
*/
- if (r->base >= 0x100000000ULL)
+ if (reg_start >= 0x100000000ULL)
continue;

/*
* The MBus bridge hole is at the end of the RAM under
* the 4 GB limit.
*/
- if (r->base + r->size > s)
- s = r->base + r->size;
+ if (reg_end > s)
+ s = reg_end;
}

*start = s;
diff --git a/drivers/s390/char/zcore.c b/drivers/s390/char/zcore.c
index 08f812475f5e..484b1ec9a1bc 100644
--- a/drivers/s390/char/zcore.c
+++ b/drivers/s390/char/zcore.c
@@ -148,18 +148,19 @@ static ssize_t zcore_memmap_read(struct file *filp, char __user *buf,

static int zcore_memmap_open(struct inode *inode, struct file *filp)
{
- struct memblock_region *reg;
+ phys_addr_t start, end;
char *buf;
int i = 0;
+ u64 r;

buf = kcalloc(memblock.memory.cnt, CHUNK_INFO_SIZE, GFP_KERNEL);
if (!buf) {
return -ENOMEM;
}
- for_each_memblock(memory, reg) {
+ for_each_mem_range(r, &start, &end) {
sprintf(buf + (i++ * CHUNK_INFO_SIZE), "%016llx %016llx ",
- (unsigned long long) reg->base,
- (unsigned long long) reg->size);
+ (unsigned long long) start,
+ (unsigned long long) (end - start));
}
filp->private_data = buf;
return nonseekable_open(inode, filp);
--
2.26.2

2020-08-02 16:39:38

by Mike Rapoport

Subject: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation

From: Mike Rapoport <[email protected]>

Currently, the initrd image is reserved very early during setup and then it
might be relocated and re-reserved after the initial physical memory
mapping is created. The "late" reservation of memblock verifies that the
mapped memory size exceeds the size of initrd, then checks whether
relocation is required and, if yes, relocates initrd to new memory
allocated from memblock and frees the old location.

The check for memory size is excessive as the memblock allocation will fail
anyway if there is not enough memory. Besides, there is no point in
allocating memory from memblock using memblock_find_in_range() +
memblock_reserve() when memblock_phys_alloc_range() provides the required
functionality.

Remove the redundant check and simplify memblock allocation.
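
Schematically (a simplified sketch, not the exact kernel code), this
replaces the two-step

	addr = memblock_find_in_range(0, limit, size, align);
	if (!addr)
		/* error handling */;
	memblock_reserve(addr, size);

with a single call that both finds and reserves a suitable region:

	addr = memblock_phys_alloc_range(size, align, 0, limit);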

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/x86/kernel/setup.c | 16 +++-------------
1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3767e74c758..d8de4053c5e8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -262,16 +262,12 @@ static void __init relocate_initrd(void)
u64 area_size = PAGE_ALIGN(ramdisk_size);

/* We need to move the initrd down into directly mapped mem */
- relocated_ramdisk = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
- area_size, PAGE_SIZE);
-
+ relocated_ramdisk = memblock_phys_alloc_range(area_size, PAGE_SIZE, 0,
+ PFN_PHYS(max_pfn_mapped));
if (!relocated_ramdisk)
panic("Cannot find place for new RAMDISK of size %lld\n",
ramdisk_size);

- /* Note: this includes all the mem currently occupied by
- the initrd, we rely on that fact to keep the data intact. */
- memblock_reserve(relocated_ramdisk, area_size);
initrd_start = relocated_ramdisk + PAGE_OFFSET;
initrd_end = initrd_start + ramdisk_size;
printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n",
@@ -298,13 +294,13 @@ static void __init early_reserve_initrd(void)

memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
}
+
static void __init reserve_initrd(void)
{
/* Assume only end is not page aligned */
u64 ramdisk_image = get_ramdisk_image();
u64 ramdisk_size = get_ramdisk_size();
u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size);
- u64 mapped_size;

if (!boot_params.hdr.type_of_loader ||
!ramdisk_image || !ramdisk_size)
@@ -312,12 +308,6 @@ static void __init reserve_initrd(void)

initrd_start = 0;

- mapped_size = memblock_mem_size(max_pfn_mapped);
- if (ramdisk_size >= (mapped_size>>1))
- panic("initrd too large to handle, "
- "disabling initrd (%lld needed, %lld available)\n",
- ramdisk_size, mapped_size>>1);
-
printk(KERN_INFO "RAMDISK: [mem %#010llx-%#010llx]\n", ramdisk_image,
ramdisk_end - 1);

--
2.26.2

2020-08-02 16:39:44

by Mike Rapoport

Subject: [PATCH v2 14/17] x86/setup: simplify reserve_crashkernel()

From: Mike Rapoport <[email protected]>

* Replace magic numbers with defines (see the note after this list)
* Replace memblock_find_in_range() + memblock_reserve() with
memblock_phys_alloc_range()
* Stop checking for low memory size in reserve_crashkernel_low(). The
allocation from the limited range will fail anyway if there is not enough
memory, so there is no need for an extra traversal of memblock.memory
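
Note: the defines mentioned in the first item already exist earlier in
arch/x86/kernel/setup.c; on 64-bit they are (values per the existing
setup.c of this kernel version, quoted here for context, not part of this
patch):

	#define CRASH_ALIGN		SZ_16M
	#define CRASH_ADDR_LOW_MAX	SZ_4G
	#define CRASH_ADDR_HIGH_MAX	SZ_64T

so the open-coded 1ULL << 32 limit becomes CRASH_ADDR_LOW_MAX and the
1 << 20 alignment becomes SZ_1M.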

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/x86/kernel/setup.c | 40 ++++++++++++++--------------------------
1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d8de4053c5e8..d7ced6982524 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -419,13 +419,13 @@ static int __init reserve_crashkernel_low(void)
{
#ifdef CONFIG_X86_64
unsigned long long base, low_base = 0, low_size = 0;
- unsigned long total_low_mem;
+ unsigned long low_mem_limit;
int ret;

- total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
+ low_mem_limit = min(memblock_phys_mem_size(), CRASH_ADDR_LOW_MAX);

/* crashkernel=Y,low */
- ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
+ ret = parse_crashkernel_low(boot_command_line, low_mem_limit, &low_size, &base);
if (ret) {
/*
* two parts from kernel/dma/swiotlb.c:
@@ -443,23 +443,17 @@ static int __init reserve_crashkernel_low(void)
return 0;
}

- low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+ low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
if (!low_base) {
pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
(unsigned long)(low_size >> 20));
return -ENOMEM;
}

- ret = memblock_reserve(low_base, low_size);
- if (ret) {
- pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
- return ret;
- }
-
- pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
+ pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (low RAM limit: %ldMB)\n",
(unsigned long)(low_size >> 20),
(unsigned long)(low_base >> 20),
- (unsigned long)(total_low_mem >> 20));
+ (unsigned long)(low_mem_limit >> 20));

crashk_low_res.start = low_base;
crashk_low_res.end = low_base + low_size - 1;
@@ -503,13 +497,13 @@ static void __init reserve_crashkernel(void)
* unless "crashkernel=size[KMG],high" is specified.
*/
if (!high)
- crash_base = memblock_find_in_range(CRASH_ALIGN,
- CRASH_ADDR_LOW_MAX,
- crash_size, CRASH_ALIGN);
+ crash_base = memblock_phys_alloc_range(crash_size,
+ CRASH_ALIGN, CRASH_ALIGN,
+ CRASH_ADDR_LOW_MAX);
if (!crash_base)
- crash_base = memblock_find_in_range(CRASH_ALIGN,
- CRASH_ADDR_HIGH_MAX,
- crash_size, CRASH_ALIGN);
+ crash_base = memblock_phys_alloc_range(crash_size,
+ CRASH_ALIGN, CRASH_ALIGN,
+ CRASH_ADDR_HIGH_MAX);
if (!crash_base) {
pr_info("crashkernel reservation failed - No suitable area found.\n");
return;
@@ -517,19 +511,13 @@ static void __init reserve_crashkernel(void)
} else {
unsigned long long start;

- start = memblock_find_in_range(crash_base,
- crash_base + crash_size,
- crash_size, 1 << 20);
+ start = memblock_phys_alloc_range(crash_size, SZ_1M, crash_base,
+ crash_base + crash_size);
if (start != crash_base) {
pr_info("crashkernel reservation failed - memory is in use.\n");
return;
}
}
- ret = memblock_reserve(crash_base, crash_size);
- if (ret) {
- pr_err("%s: Error reserving crashkernel memblock.\n", __func__);
- return;
- }

if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
memblock_free(crash_base, crash_size);
--
2.26.2

2020-08-02 16:40:18

by Mike Rapoport

Subject: [PATCH v2 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()

From: Mike Rapoport <[email protected]>

There are several occurrences of the following pattern:

for_each_memblock(memory, reg) {
start_pfn = memblock_region_memory_base_pfn(reg);
end_pfn = memblock_region_memory_end_pfn(reg);

/* do something with start_pfn and end_pfn */
}

Rather than iterate over all memblock.memory regions and query their start
and end PFNs each time, use the for_each_mem_pfn_range() iterator to get
simpler and clearer code.
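
With the iterator, the same loop reads (a sketch; passing MAX_NUMNODES as
the node argument means "ranges of all nodes", and the last argument can
optionally receive the node id):

	unsigned long start_pfn, end_pfn;
	int i;

	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
		/* do something with start_pfn and end_pfn */
	}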

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/arm/mm/init.c | 11 ++++-------
arch/arm64/mm/init.c | 11 ++++-------
arch/powerpc/kernel/fadump.c | 11 ++++++-----
arch/powerpc/mm/mem.c | 15 ++++++++-------
arch/powerpc/mm/numa.c | 7 ++-----
arch/s390/mm/page-states.c | 6 ++----
arch/sh/mm/init.c | 9 +++------
mm/memblock.c | 6 ++----
mm/sparse.c | 10 ++++------
9 files changed, 35 insertions(+), 51 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 626af348eb8f..d630573277d1 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -304,16 +304,14 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
*/
static void __init free_unused_memmap(void)
{
- unsigned long start, prev_end = 0;
- struct memblock_region *reg;
+ unsigned long start, end, prev_end = 0;
+ int i;

/*
* This relies on each bank being in address order.
* The banks are sorted previously in bootmem_init().
*/
- for_each_memblock(memory, reg) {
- start = memblock_region_memory_base_pfn(reg);
-
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
#ifdef CONFIG_SPARSEMEM
/*
* Take care not to free memmap entries that don't exist
@@ -341,8 +339,7 @@ static void __init free_unused_memmap(void)
* memmap entries are valid from the bank end aligned to
* MAX_ORDER_NR_PAGES.
*/
- prev_end = ALIGN(memblock_region_memory_end_pfn(reg),
- MAX_ORDER_NR_PAGES);
+ prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
}

#ifdef CONFIG_SPARSEMEM
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..291b5805457d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -473,12 +473,10 @@ static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
*/
static void __init free_unused_memmap(void)
{
- unsigned long start, prev_end = 0;
- struct memblock_region *reg;
-
- for_each_memblock(memory, reg) {
- start = __phys_to_pfn(reg->base);
+ unsigned long start, end, prev_end = 0;
+ int i;

+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
#ifdef CONFIG_SPARSEMEM
/*
* Take care not to free memmap entries that don't exist due
@@ -498,8 +496,7 @@ static void __init free_unused_memmap(void)
* memmap entries are valid from the bank end aligned to
* MAX_ORDER_NR_PAGES.
*/
- prev_end = ALIGN(__phys_to_pfn(reg->base + reg->size),
- MAX_ORDER_NR_PAGES);
+ prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
}

#ifdef CONFIG_SPARSEMEM
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 78ab9a6ee6ac..fc85cbc66839 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1216,14 +1216,15 @@ static void fadump_free_reserved_memory(unsigned long start_pfn,
*/
static void fadump_release_reserved_area(u64 start, u64 end)
{
- u64 tstart, tend, spfn, epfn;
- struct memblock_region *reg;
+ u64 tstart, tend, spfn, epfn, reg_spfn, reg_epfn, i;

spfn = PHYS_PFN(start);
epfn = PHYS_PFN(end);
- for_each_memblock(memory, reg) {
- tstart = max_t(u64, spfn, memblock_region_memory_base_pfn(reg));
- tend = min_t(u64, epfn, memblock_region_memory_end_pfn(reg));
+
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &reg_spfn, &reg_epfn, NULL) {
+ tstart = max_t(u64, spfn, reg_spfn);
+ tend = min_t(u64, epfn, reg_epfn);
+
if (tstart < tend) {
fadump_free_reserved_memory(tstart, tend);

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c2c11eb8dcfc..1364dd532107 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -192,15 +192,16 @@ void __init initmem_init(void)
/* mark pages that don't exist as nosave */
static int __init mark_nonram_nosave(void)
{
- struct memblock_region *reg, *prev = NULL;
+ unsigned long spfn, epfn, prev = 0;
+ int i;

- for_each_memblock(memory, reg) {
- if (prev &&
- memblock_region_memory_end_pfn(prev) < memblock_region_memory_base_pfn(reg))
- register_nosave_region(memblock_region_memory_end_pfn(prev),
- memblock_region_memory_base_pfn(reg));
- prev = reg;
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &spfn, &epfn, NULL) {
+ if (prev && prev < spfn)
+ register_nosave_region(prev, spfn);
+
+ prev = epfn;
}
+
return 0;
}
#else /* CONFIG_NEED_MULTIPLE_NODES */
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 9fcf2d195830..bae2d9edd52c 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -800,17 +800,14 @@ static void __init setup_nonnuma(void)
unsigned long total_ram = memblock_phys_mem_size();
unsigned long start_pfn, end_pfn;
unsigned int nid = 0;
- struct memblock_region *reg;
+ int i;

printk(KERN_DEBUG "Top of RAM: 0x%lx, Total RAM: 0x%lx\n",
top_of_ram, total_ram);
printk(KERN_DEBUG "Memory hole size: %ldMB\n",
(top_of_ram - total_ram) >> 20);

- for_each_memblock(memory, reg) {
- start_pfn = memblock_region_memory_base_pfn(reg);
- end_pfn = memblock_region_memory_end_pfn(reg);
-
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
fake_numa_create_new_node(end_pfn, &nid);
memblock_set_node(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn),
diff --git a/arch/s390/mm/page-states.c b/arch/s390/mm/page-states.c
index fc141893d028..567c69f3069e 100644
--- a/arch/s390/mm/page-states.c
+++ b/arch/s390/mm/page-states.c
@@ -183,9 +183,9 @@ static void mark_kernel_pgd(void)

void __init cmma_init_nodat(void)
{
- struct memblock_region *reg;
struct page *page;
unsigned long start, end, ix;
+ int i;

if (cmma_flag < 2)
return;
@@ -193,9 +193,7 @@ void __init cmma_init_nodat(void)
mark_kernel_pgd();

/* Set all kernel pages not used for page tables to stable/no-dat */
- for_each_memblock(memory, reg) {
- start = memblock_region_memory_base_pfn(reg);
- end = memblock_region_memory_end_pfn(reg);
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
page = pfn_to_page(start);
for (ix = start; ix < end; ix++, page++) {
if (__test_and_clear_bit(PG_arch_1, &page->flags))
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 62b8f03ffc80..586ea500dcc7 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -224,15 +224,12 @@ void __init allocate_pgdat(unsigned int nid)

static void __init do_init_bootmem(void)
{
- struct memblock_region *reg;
+ unsigned long start_pfn, end_pfn;
+ int i;

/* Add active regions with valid PFNs. */
- for_each_memblock(memory, reg) {
- unsigned long start_pfn, end_pfn;
- start_pfn = memblock_region_memory_base_pfn(reg);
- end_pfn = memblock_region_memory_end_pfn(reg);
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL)
__add_active_range(0, start_pfn, end_pfn);
- }

/* All of system RAM sits in node 0 for the non-NUMA case */
allocate_pgdat(0);
diff --git a/mm/memblock.c b/mm/memblock.c
index 824938849f6d..c1a4c8798973 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1659,12 +1659,10 @@ phys_addr_t __init_memblock memblock_reserved_size(void)
phys_addr_t __init memblock_mem_size(unsigned long limit_pfn)
{
unsigned long pages = 0;
- struct memblock_region *r;
unsigned long start_pfn, end_pfn;
+ int i;

- for_each_memblock(memory, r) {
- start_pfn = memblock_region_memory_base_pfn(r);
- end_pfn = memblock_region_memory_end_pfn(r);
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
pages += end_pfn - start_pfn;
diff --git a/mm/sparse.c b/mm/sparse.c
index b2b9a3e34696..c2ba412a3141 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -292,13 +292,11 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
*/
void __init memblocks_present(void)
{
- struct memblock_region *reg;
+ unsigned long start, end;
+ int i, nid;

- for_each_memblock(memory, reg) {
- memory_present(memblock_get_region_node(reg),
- memblock_region_memory_base_pfn(reg),
- memblock_region_memory_end_pfn(reg));
- }
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid)
+ memory_present(nid, start, end);
}

/*
--
2.26.2

2020-08-02 16:40:28

by Mike Rapoport

Subject: [PATCH v2 07/17] microblaze: drop unneeded NUMA and sparsemem initializations

From: Mike Rapoport <[email protected]>

microblaze supports neither NUMA nor SPARSEMEM, so there is no point in
calling memblock_set_node() and sparse_memory_present_with_active_regions()
during microblaze memory initialization.

Remove these calls and the surrounding code.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/microblaze/mm/init.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 521b59ba716c..49e0c241f9b1 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -105,9 +105,8 @@ static void __init paging_init(void)

void __init setup_memory(void)
{
- struct memblock_region *reg;
-
#ifndef CONFIG_MMU
+ struct memblock_region *reg;
u32 kernel_align_start, kernel_align_size;

/* Find main memory where is the kernel */
@@ -161,20 +160,6 @@ void __init setup_memory(void)
pr_info("%s: max_low_pfn: %#lx\n", __func__, max_low_pfn);
pr_info("%s: max_pfn: %#lx\n", __func__, max_pfn);

- /* Add active regions with valid PFNs */
- for_each_memblock(memory, reg) {
- unsigned long start_pfn, end_pfn;
-
- start_pfn = memblock_region_memory_base_pfn(reg);
- end_pfn = memblock_region_memory_end_pfn(reg);
- memblock_set_node(start_pfn << PAGE_SHIFT,
- (end_pfn - start_pfn) << PAGE_SHIFT,
- &memblock.memory, 0);
- }
-
- /* XXX need to clip this if using highmem? */
- sparse_memory_present_with_active_regions(0);
-
paging_init();
}

--
2.26.2

2020-08-02 16:40:43

by Mike Rapoport

Subject: [PATCH v2 10/17] memblock: reduce number of parameters in for_each_mem_range()

From: Mike Rapoport <[email protected]>

Currently, the for_each_mem_range() iterator is the most generic way to
traverse memblock regions. As such, it has 8 parameters and is hardly
convenient for users. Most users choose to utilize one of its wrappers and
the only user outside memblock that actually needs most of the parameters
is the s390 crash dump implementation.

To avoid yet another naming for memblock iterators, rename the existing
for_each_mem_range() to __for_each_mem_range() and add a new
for_each_mem_range() wrapper with only index, start and end parameters.

The new wrapper nicely fits into init_unavailable_mem() and will be used in
upcoming changes to simplify memblock traversals.
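
With the new wrapper, a typical call site shrinks from

	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
			   MEMBLOCK_NONE, &start, &end, NULL)

to

	for_each_mem_range(i, &start, &end)

as the arm64 and page_alloc hunks below show.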

Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
---
.clang-format | 1 +
arch/arm64/kernel/machine_kexec_file.c | 6 ++----
arch/s390/kernel/crash_dump.c | 8 ++++----
include/linux/memblock.h | 18 ++++++++++++++----
mm/page_alloc.c | 3 +--
5 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/.clang-format b/.clang-format
index a0a96088c74f..52ededab25ce 100644
--- a/.clang-format
+++ b/.clang-format
@@ -205,6 +205,7 @@ ForEachMacros:
- 'for_each_memblock_type'
- 'for_each_memcg_cache_index'
- 'for_each_mem_pfn_range'
+ - '__for_each_mem_range'
- 'for_each_mem_range'
- 'for_each_mem_range_rev'
- 'for_each_migratetype_order'
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 361a1143e09e..5b0e67b93cdc 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -215,8 +215,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
phys_addr_t start, end;

nr_ranges = 1; /* for exclusion of crashkernel region */
- for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
- MEMBLOCK_NONE, &start, &end, NULL)
+ for_each_mem_range(i, &start, &end)
nr_ranges++;

cmem = kmalloc(struct_size(cmem, ranges, nr_ranges), GFP_KERNEL);
@@ -225,8 +224,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)

cmem->max_nr_ranges = nr_ranges;
cmem->nr_ranges = 0;
- for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
- MEMBLOCK_NONE, &start, &end, NULL) {
+ for_each_mem_range(i, &start, &end) {
cmem->ranges[cmem->nr_ranges].start = start;
cmem->ranges[cmem->nr_ranges].end = end - 1;
cmem->nr_ranges++;
diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index f96a5857bbfd..e28085c725ff 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -549,8 +549,8 @@ static int get_mem_chunk_cnt(void)
int cnt = 0;
u64 idx;

- for_each_mem_range(idx, &memblock.physmem, &oldmem_type, NUMA_NO_NODE,
- MEMBLOCK_NONE, NULL, NULL, NULL)
+ __for_each_mem_range(idx, &memblock.physmem, &oldmem_type, NUMA_NO_NODE,
+ MEMBLOCK_NONE, NULL, NULL, NULL)
cnt++;
return cnt;
}
@@ -563,8 +563,8 @@ static void loads_init(Elf64_Phdr *phdr, u64 loads_offset)
phys_addr_t start, end;
u64 idx;

- for_each_mem_range(idx, &memblock.physmem, &oldmem_type, NUMA_NO_NODE,
- MEMBLOCK_NONE, &start, &end, NULL) {
+ __for_each_mem_range(idx, &memblock.physmem, &oldmem_type, NUMA_NO_NODE,
+ MEMBLOCK_NONE, &start, &end, NULL) {
phdr->p_filesz = end - start;
phdr->p_type = PT_LOAD;
phdr->p_offset = start;
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index e6a23b3db696..d70c2835e913 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -142,7 +142,7 @@ void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start,
void __memblock_free_late(phys_addr_t base, phys_addr_t size);

/**
- * for_each_mem_range - iterate through memblock areas from type_a and not
+ * __for_each_mem_range - iterate through memblock areas from type_a and not
* included in type_b. Or just type_a if type_b is NULL.
* @i: u64 used as loop variable
* @type_a: ptr to memblock_type to iterate
@@ -153,7 +153,7 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size);
* @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
* @p_nid: ptr to int for nid of the range, can be %NULL
*/
-#define for_each_mem_range(i, type_a, type_b, nid, flags, \
+#define __for_each_mem_range(i, type_a, type_b, nid, flags, \
p_start, p_end, p_nid) \
for (i = 0, __next_mem_range(&i, nid, flags, type_a, type_b, \
p_start, p_end, p_nid); \
@@ -182,6 +182,16 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size);
__next_mem_range_rev(&i, nid, flags, type_a, type_b, \
p_start, p_end, p_nid))

+/**
+ * for_each_mem_range - iterate through memory areas.
+ * @i: u64 used as loop variable
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ */
+#define for_each_mem_range(i, p_start, p_end) \
+ __for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, \
+ MEMBLOCK_NONE, p_start, p_end, NULL)
+
/**
* for_each_reserved_mem_region - iterate over all reserved memblock areas
* @i: u64 used as loop variable
@@ -287,8 +297,8 @@ int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask);
* soon as memblock is initialized.
*/
#define for_each_free_mem_range(i, nid, flags, p_start, p_end, p_nid) \
- for_each_mem_range(i, &memblock.memory, &memblock.reserved, \
- nid, flags, p_start, p_end, p_nid)
+ __for_each_mem_range(i, &memblock.memory, &memblock.reserved, \
+ nid, flags, p_start, p_end, p_nid)

/**
* for_each_free_mem_range_reverse - rev-iterate through free memblock areas
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e028b87ce294..95af111d69d3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6972,8 +6972,7 @@ static void __init init_unavailable_mem(void)
* Loop through unavailable ranges not covered by memblock.memory.
*/
pgcnt = 0;
- for_each_mem_range(i, &memblock.memory, NULL,
- NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, NULL) {
+ for_each_mem_range(i, &start, &end) {
if (next < start)
pgcnt += init_unavailable_range(PFN_DOWN(next),
PFN_UP(start));
--
2.26.2

2020-08-02 16:40:47

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()

From: Mike Rapoport <[email protected]>

Iteration over memblock.reserved with for_each_reserved_mem_region() used
__next_reserved_mem_region(), which implemented a subset of
__next_mem_region().

Use __for_each_mem_range() and, essentially, __next_mem_region() with
appropriate parameters to reduce code duplication.

While at it, rename for_each_reserved_mem_region() to
for_each_reserved_mem_range() for consistency.
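
As a sketch (not part of this patch), a walk over the reserved regions now
goes through the common range iterator, as in free_low_memory_core_early()
below:

	u64 i;
	phys_addr_t start, end;

	/* memblock.reserved is traversed via __next_mem_region() as well */
	for_each_reserved_mem_range(i, &start, &end)
		reserve_bootmem_region(start, end);

Note that should_skip_region() never skips regions of memblock.reserved, so
the nid and flags arguments cannot hide a reserved range. Note also that the
iterator now reports an exclusive end (base + size) where the removed
__next_reserved_mem_region() reported base + size - 1.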

Signed-off-by: Mike Rapoport <[email protected]>
---
.clang-format | 2 +-
arch/arm64/kernel/setup.c | 2 +-
drivers/irqchip/irq-gic-v3-its.c | 2 +-
include/linux/memblock.h | 12 +++------
mm/memblock.c | 46 +++++++-------------------------
5 files changed, 17 insertions(+), 47 deletions(-)

diff --git a/.clang-format b/.clang-format
index 52ededab25ce..e28a849a1c58 100644
--- a/.clang-format
+++ b/.clang-format
@@ -266,7 +266,7 @@ ForEachMacros:
- 'for_each_process_thread'
- 'for_each_property_of_node'
- 'for_each_registered_fb'
- - 'for_each_reserved_mem_region'
+ - 'for_each_reserved_mem_range'
- 'for_each_rtd_codec_dais'
- 'for_each_rtd_codec_dais_rollback'
- 'for_each_rtd_components'
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 93b3844cf442..f3aec7244aab 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -257,7 +257,7 @@ static int __init reserve_memblock_reserved_regions(void)
if (!memblock_is_region_reserved(mem->start, mem_size))
continue;

- for_each_reserved_mem_region(j, &r_start, &r_end) {
+ for_each_reserved_mem_range(j, &r_start, &r_end) {
resource_size_t start, end;

start = max(PFN_PHYS(PFN_DOWN(r_start)), mem->start);
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index beac4caefad9..9971fd8cf6b6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2192,7 +2192,7 @@ static bool gic_check_reserved_range(phys_addr_t addr, unsigned long size)

addr_end = addr + size - 1;

- for_each_reserved_mem_region(i, &start, &end) {
+ for_each_reserved_mem_range(i, &start, &end) {
if (addr >= start && addr_end <= end)
return true;
}
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index ec2fd8f32a19..9e51b3fd4134 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -136,9 +136,6 @@ void __next_mem_range_rev(u64 *idx, int nid, enum memblock_flags flags,
struct memblock_type *type_b, phys_addr_t *out_start,
phys_addr_t *out_end, int *out_nid);

-void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start,
- phys_addr_t *out_end);
-
void __memblock_free_late(phys_addr_t base, phys_addr_t size);

/**
@@ -193,7 +190,7 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size);
MEMBLOCK_NONE, p_start, p_end, NULL)

/**
- * for_each_reserved_mem_region - iterate over all reserved memblock areas
+ * for_each_reserved_mem_range - iterate over all reserved memblock areas
* @i: u64 used as loop variable
* @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
* @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
@@ -201,10 +198,9 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size);
* Walks over reserved areas of memblock. Available as soon as memblock
* is initialized.
*/
-#define for_each_reserved_mem_region(i, p_start, p_end) \
- for (i = 0UL, __next_reserved_mem_region(&i, p_start, p_end); \
- i != (u64)ULLONG_MAX; \
- __next_reserved_mem_region(&i, p_start, p_end))
+#define for_each_reserved_mem_range(i, p_start, p_end) \
+ __for_each_mem_range(i, &memblock.reserved, NULL, NUMA_NO_NODE, \
+ MEMBLOCK_NONE, p_start, p_end, NULL)

static inline bool memblock_is_hotpluggable(struct memblock_region *m)
{
diff --git a/mm/memblock.c b/mm/memblock.c
index 48d614352b25..dadf579f7c53 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -946,42 +946,16 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
return memblock_setclr_flag(base, size, 0, MEMBLOCK_NOMAP);
}

-/**
- * __next_reserved_mem_region - next function for for_each_reserved_region()
- * @idx: pointer to u64 loop variable
- * @out_start: ptr to phys_addr_t for start address of the region, can be %NULL
- * @out_end: ptr to phys_addr_t for end address of the region, can be %NULL
- *
- * Iterate over all reserved memory regions.
- */
-void __init_memblock __next_reserved_mem_region(u64 *idx,
- phys_addr_t *out_start,
- phys_addr_t *out_end)
-{
- struct memblock_type *type = &memblock.reserved;
-
- if (*idx < type->cnt) {
- struct memblock_region *r = &type->regions[*idx];
- phys_addr_t base = r->base;
- phys_addr_t size = r->size;
-
- if (out_start)
- *out_start = base;
- if (out_end)
- *out_end = base + size - 1;
-
- *idx += 1;
- return;
- }
-
- /* signal end of iteration */
- *idx = ULLONG_MAX;
-}
-
-static bool should_skip_region(struct memblock_region *m, int nid, int flags)
+static bool __init_memblock should_skip_region(struct memblock_type *type,
+ struct memblock_region *m,
+ int nid, int flags)
{
int m_nid = memblock_get_region_node(m);

+ /* we never skip regions when iterating memblock.reserved */
+ if (type == &memblock.reserved)
+ return false;
+
/* only memory regions are associated with nodes, check it */
if (nid != NUMA_NO_NODE && nid != m_nid)
return true;
@@ -1048,7 +1022,7 @@ void __init_memblock __next_mem_range(u64 *idx, int nid,
phys_addr_t m_end = m->base + m->size;
int m_nid = memblock_get_region_node(m);

- if (should_skip_region(m, nid, flags))
+ if (should_skip_region(type_a, m, nid, flags))
continue;

if (!type_b) {
@@ -1152,7 +1126,7 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid,
phys_addr_t m_end = m->base + m->size;
int m_nid = memblock_get_region_node(m);

- if (should_skip_region(m, nid, flags))
+ if (should_skip_region(type_a, m, nid, flags))
continue;

if (!type_b) {
@@ -1977,7 +1951,7 @@ static unsigned long __init free_low_memory_core_early(void)

memblock_clear_hotplug(0, -1);

- for_each_reserved_mem_region(i, &start, &end)
+ for_each_reserved_mem_range(i, &start, &end)
reserve_bootmem_region(start, end);

/*
--
2.26.2

2020-08-02 16:41:22

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 15/17] memblock: remove unused memblock_mem_size()

From: Mike Rapoport <[email protected]>

The only user of memblock_mem_size() was the x86 setup code; it is gone now
and the memblock_mem_size() function can be removed.
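
For reference, the former x86 user now derives its limit directly from the
memblock totals (see "x86/setup: simplify reserve_crashkernel()" earlier in
this series):

	low_mem_limit = min(memblock_phys_mem_size(), CRASH_ADDR_LOW_MAX);

so no per-range traversal is needed to size low memory.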

Signed-off-by: Mike Rapoport <[email protected]>
---
include/linux/memblock.h | 1 -
mm/memblock.c | 15 ---------------
2 files changed, 16 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index d70c2835e913..ec2fd8f32a19 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -450,7 +450,6 @@ static inline bool memblock_bottom_up(void)

phys_addr_t memblock_phys_mem_size(void);
phys_addr_t memblock_reserved_size(void);
-phys_addr_t memblock_mem_size(unsigned long limit_pfn);
phys_addr_t memblock_start_of_DRAM(void);
phys_addr_t memblock_end_of_DRAM(void);
void memblock_enforce_memory_limit(phys_addr_t memory_limit);
diff --git a/mm/memblock.c b/mm/memblock.c
index c1a4c8798973..48d614352b25 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1656,21 +1656,6 @@ phys_addr_t __init_memblock memblock_reserved_size(void)
return memblock.reserved.total_size;
}

-phys_addr_t __init memblock_mem_size(unsigned long limit_pfn)
-{
- unsigned long pages = 0;
- unsigned long start_pfn, end_pfn;
- int i;
-
- for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
- start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
- end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
- pages += end_pfn - start_pfn;
- }
-
- return PFN_PHYS(pages);
-}
-
/* lowest address */
phys_addr_t __init_memblock memblock_start_of_DRAM(void)
{
--
2.26.2

2020-08-02 16:43:15

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions

From: Mike Rapoport <[email protected]>

for_each_memblock() is used to iterate over memblock.memory in
a few places that use data from memblock_region rather than the memory
ranges.

Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
to improve encapsulation of memblock internals from its users.
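
As an illustrative sketch (not part of this patch), a region-granular walk
now reads:

	struct memblock_region *r;

	/* iterate over memblock.memory one region at a time */
	for_each_mem_region(r) {
		if (memblock_is_hotpluggable(r))
			continue;
		/* use r->base, r->size, memblock_get_region_node(r), ... */
	}

so callers no longer dereference memblock.memory.regions directly.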

Signed-off-by: Mike Rapoport <[email protected]>
---
.clang-format | 3 ++-
arch/arm64/kernel/setup.c | 2 +-
arch/arm64/mm/numa.c | 2 +-
arch/mips/netlogic/xlp/setup.c | 2 +-
arch/x86/mm/numa.c | 2 +-
include/linux/memblock.h | 19 ++++++++++++++++---
mm/memblock.c | 4 ++--
mm/page_alloc.c | 8 ++++----
8 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/.clang-format b/.clang-format
index e28a849a1c58..cff71d345456 100644
--- a/.clang-format
+++ b/.clang-format
@@ -201,7 +201,7 @@ ForEachMacros:
- 'for_each_matching_node'
- 'for_each_matching_node_and_match'
- 'for_each_member'
- - 'for_each_memblock'
+ - 'for_each_mem_region'
- 'for_each_memblock_type'
- 'for_each_memcg_cache_index'
- 'for_each_mem_pfn_range'
@@ -267,6 +267,7 @@ ForEachMacros:
- 'for_each_property_of_node'
- 'for_each_registered_fb'
- 'for_each_reserved_mem_range'
+ - 'for_each_reserved_mem_region'
- 'for_each_rtd_codec_dais'
- 'for_each_rtd_codec_dais_rollback'
- 'for_each_rtd_components'
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f3aec7244aab..52ea2f1a7184 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -217,7 +217,7 @@ static void __init request_standard_resources(void)
if (!standard_resources)
panic("%s: Failed to allocate %zu bytes\n", __func__, res_size);

- for_each_memblock(memory, region) {
+ for_each_mem_region(region) {
res = &standard_resources[i++];
if (memblock_is_nomap(region)) {
res->name = "reserved";
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index 0cbdbcc885fb..f121e42246a6 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -350,7 +350,7 @@ static int __init numa_register_nodes(void)
struct memblock_region *mblk;

/* Check that valid nid is set to memblks */
- for_each_memblock(memory, mblk) {
+ for_each_mem_region(mblk) {
int mblk_nid = memblock_get_region_node(mblk);

if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) {
diff --git a/arch/mips/netlogic/xlp/setup.c b/arch/mips/netlogic/xlp/setup.c
index 1a0fc5b62ba4..6e3102bcd2f1 100644
--- a/arch/mips/netlogic/xlp/setup.c
+++ b/arch/mips/netlogic/xlp/setup.c
@@ -70,7 +70,7 @@ static void nlm_fixup_mem(void)
const int pref_backup = 512;
struct memblock_region *mem;

- for_each_memblock(memory, mem) {
+ for_each_mem_region(mem) {
memblock_remove(mem->base + mem->size - pref_backup,
pref_backup);
}
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 8ee952038c80..fe6ea18d6923 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -516,7 +516,7 @@ static void __init numa_clear_kernel_node_hotplug(void)
* memory ranges, because quirks such as trim_snb_memory()
* reserve specific pages for Sandy Bridge graphics. ]
*/
- for_each_memblock(reserved, mb_region) {
+ for_each_reserved_mem_region(mb_region) {
int nid = memblock_get_region_node(mb_region);

if (nid != MAX_NUMNODES)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 9e51b3fd4134..a6970e058bd7 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -522,9 +522,22 @@ static inline unsigned long memblock_region_reserved_end_pfn(const struct memblo
return PFN_UP(reg->base + reg->size);
}

-#define for_each_memblock(memblock_type, region) \
- for (region = memblock.memblock_type.regions; \
- region < (memblock.memblock_type.regions + memblock.memblock_type.cnt); \
+/**
+ * for_each_mem_region - iterate over registered memory regions
+ * @region: loop variable
+ */
+#define for_each_mem_region(region) \
+ for (region = memblock.memory.regions; \
+ region < (memblock.memory.regions + memblock.memory.cnt); \
+ region++)
+
+/**
+ * for_each_reserved_mem_region - iterate over reserved memory regions
+ * @region: loop variable
+ */
+#define for_each_reserved_mem_region(region) \
+ for (region = memblock.reserved.regions; \
+ region < (memblock.reserved.regions + memblock.reserved.cnt); \
region++)

extern void *alloc_large_system_hash(const char *tablename,
diff --git a/mm/memblock.c b/mm/memblock.c
index dadf579f7c53..7d30db5c539f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1653,7 +1653,7 @@ static phys_addr_t __init_memblock __find_max_addr(phys_addr_t limit)
* the memory memblock regions, if the @limit exceeds the total size
* of those regions, max_addr will keep original value PHYS_ADDR_MAX
*/
- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
if (limit <= r->size) {
max_addr = r->base + limit;
break;
@@ -1823,7 +1823,7 @@ void __init_memblock memblock_trim_memory(phys_addr_t align)
phys_addr_t start, end, orig_start, orig_end;
struct memblock_region *r;

- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
orig_start = r->base;
orig_end = r->base + r->size;
start = round_up(orig_start, align);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 95af111d69d3..948c7a754cdb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5927,7 +5927,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)

if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
if (*pfn < memblock_region_memory_end_pfn(r))
break;
}
@@ -6528,7 +6528,7 @@ static unsigned long __init zone_absent_pages_in_node(int nid,
unsigned long start_pfn, end_pfn;
struct memblock_region *r;

- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
start_pfn = clamp(memblock_region_memory_base_pfn(r),
zone_start_pfn, zone_end_pfn);
end_pfn = clamp(memblock_region_memory_end_pfn(r),
@@ -7122,7 +7122,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
* options.
*/
if (movable_node_is_enabled()) {
- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
if (!memblock_is_hotpluggable(r))
continue;

@@ -7143,7 +7143,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
if (mirrored_kernelcore) {
bool mem_below_4gb_not_mirrored = false;

- for_each_memblock(memory, r) {
+ for_each_mem_region(r) {
if (memblock_is_mirror(r))
continue;

--
2.26.2

2020-08-02 18:05:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 14/17] x86/setup: simplify reserve_crashkernel()


* Mike Rapoport <[email protected]> wrote:

> From: Mike Rapoport <[email protected]>
>
> * Replace magic numbers with defines
> * Replace memblock_find_in_range() + memblock_reserve() with
> memblock_phys_alloc_range()
> * Stop checking for low memory size in reserve_crashkernel_low(). The
> allocation from the limited range will fail anyway if there is not enough
> memory, so there is no need for an extra traversal of memblock.memory
>
> Signed-off-by: Mike Rapoport <[email protected]>

Assuming that this got or will get tested with a crash kernel:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2020-08-02 18:06:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions


* Mike Rapoport <[email protected]> wrote:

> From: Mike Rapoport <[email protected]>
>
> for_each_memblock() is used to iterate over memblock.memory in
> a few places that use data from memblock_region rather than the memory
> ranges.
>
> Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
> to improve encapsulation of memblock internals from its users.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> .clang-format | 3 ++-
> arch/arm64/kernel/setup.c | 2 +-
> arch/arm64/mm/numa.c | 2 +-
> arch/mips/netlogic/xlp/setup.c | 2 +-
> arch/x86/mm/numa.c | 2 +-
> include/linux/memblock.h | 19 ++++++++++++++++---
> mm/memblock.c | 4 ++--
> mm/page_alloc.c | 8 ++++----
> 8 files changed, 28 insertions(+), 14 deletions(-)

The x86 part:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2020-08-02 18:06:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation


* Mike Rapoport <[email protected]> wrote:

> From: Mike Rapoport <[email protected]>
>
> Currently, initrd image is reserved very early during setup and then it
> might be relocated and re-reserved after the initial physical memory
> mapping is created. The "late" reservation of memblock verifies that mapped
> memory size exceeds the size of initrd, the checks whether the relocation
> required and, if yes, relocates inirtd to a new memory allocated from
> memblock and frees the old location.
>
> The check for memory size is excessive as memblock allocation will anyway
> fail if there is not enough memory. Besides, there is no point to allocate
> memory from memblock using memblock_find_in_range() + memblock_reserve()
> when there exists memblock_phys_alloc_range() with required functionality.
>
> Remove the redundant check and simplify memblock allocation.
>
> Signed-off-by: Mike Rapoport <[email protected]>

Assuming there's no hidden dependency here breaking something:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2020-08-05 03:58:06

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> There are several occurrences of the following pattern:
>
> for_each_memblock(memory, reg) {
> start_pfn = memblock_region_memory_base_pfn(reg);
> end_pfn = memblock_region_memory_end_pfn(reg);
>
> /* do something with start_pfn and end_pfn */
> }
>
> Rather than iterate over all memblock.memory regions and each time query
> for their start and end PFNs, use the for_each_mem_pfn_range() iterator to
> get simpler and clearer code.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> arch/arm/mm/init.c | 11 ++++-------
> arch/arm64/mm/init.c | 11 ++++-------
> arch/powerpc/kernel/fadump.c | 11 ++++++-----
> arch/powerpc/mm/mem.c | 15 ++++++++-------
> arch/powerpc/mm/numa.c | 7 ++-----
> arch/s390/mm/page-states.c | 6 ++----
> arch/sh/mm/init.c | 9 +++------
> mm/memblock.c | 6 ++----
> mm/sparse.c | 10 ++++------
> 9 files changed, 35 insertions(+), 51 deletions(-)
>

Reviewed-by: Baoquan He <[email protected]>

2020-08-05 04:23:19

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Currently, initrd image is reserved very early during setup and then it
> might be relocated and re-reserved after the initial physical memory
> mapping is created. The "late" reservation of memblock verifies that mapped
> memory size exceeds the size of initrd, the checks whether the relocation
~ then?
> required and, if yes, relocates inirtd to a new memory allocated from
> memblock and frees the old location.
>
> The check for memory size is excessive as memblock allocation will anyway
> fail if there is not enough memory. Besides, there is no point to allocate
> memory from memblock using memblock_find_in_range() + memblock_reserve()
> when there exists memblock_phys_alloc_range() with required functionality.
>
> Remove the redundant check and simplify memblock allocation.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> arch/x86/kernel/setup.c | 16 +++-------------
> 1 file changed, 3 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index a3767e74c758..d8de4053c5e8 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -262,16 +262,12 @@ static void __init relocate_initrd(void)
> u64 area_size = PAGE_ALIGN(ramdisk_size);
>
> /* We need to move the initrd down into directly mapped mem */
> - relocated_ramdisk = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
> - area_size, PAGE_SIZE);
> -
> + relocated_ramdisk = memblock_phys_alloc_range(area_size, PAGE_SIZE, 0,
> + PFN_PHYS(max_pfn_mapped));
> if (!relocated_ramdisk)
> panic("Cannot find place for new RAMDISK of size %lld\n",
> ramdisk_size);
>
> - /* Note: this includes all the mem currently occupied by
> - the initrd, we rely on that fact to keep the data intact. */
> - memblock_reserve(relocated_ramdisk, area_size);
> initrd_start = relocated_ramdisk + PAGE_OFFSET;
> initrd_end = initrd_start + ramdisk_size;
> printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n",
> @@ -298,13 +294,13 @@ static void __init early_reserve_initrd(void)
>
> memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
> }
> +
> static void __init reserve_initrd(void)
> {
> /* Assume only end is not page aligned */
> u64 ramdisk_image = get_ramdisk_image();
> u64 ramdisk_size = get_ramdisk_size();
> u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size);
> - u64 mapped_size;
>
> if (!boot_params.hdr.type_of_loader ||
> !ramdisk_image || !ramdisk_size)
> @@ -312,12 +308,6 @@ static void __init reserve_initrd(void)
>
> initrd_start = 0;
>
> - mapped_size = memblock_mem_size(max_pfn_mapped);
> - if (ramdisk_size >= (mapped_size>>1))
> - panic("initrd too large to handle, "
> - "disabling initrd (%lld needed, %lld available)\n",
> - ramdisk_size, mapped_size>>1);

Reviewed-by: Baoquan He <[email protected]>

> -
> printk(KERN_INFO "RAMDISK: [mem %#010llx-%#010llx]\n", ramdisk_image,
> ramdisk_end - 1);
>
> --
> 2.26.2
>

2020-08-05 06:00:37

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation

On Wed, Aug 05, 2020 at 12:20:24PM +0800, Baoquan He wrote:
> On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Currently, initrd image is reserved very early during setup and then it
> > might be relocated and re-reserved after the initial physical memory
> > mapping is created. The "late" reservation of memblock verifies that mapped
> > memory size exceeds the size of initrd, the checks whether the relocation
> ~ then?

Right, thanks!

> > required and, if yes, relocates inirtd to a new memory allocated from
> > memblock and frees the old location.

2020-08-05 06:04:04

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 14/17] x86/setup: simplify reserve_crashkernel()

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> * Replace magic numbers with defines
> * Replace memblock_find_in_range() + memblock_reserve() with
> memblock_phys_alloc_range()
> * Stop checking for low memory size in reserve_crashkernel_low(). The
> allocation from the limited range will fail anyway if there is not enough
> memory, so there is no need for an extra traversal of memblock.memory
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> arch/x86/kernel/setup.c | 40 ++++++++++++++--------------------------
> 1 file changed, 14 insertions(+), 26 deletions(-)

Applied this patch on top of 5.8; crashkernel reservation works well,
and the code change looks good.

Reviewed-by: Baoquan He <[email protected]>

>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index d8de4053c5e8..d7ced6982524 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -419,13 +419,13 @@ static int __init reserve_crashkernel_low(void)
> {
> #ifdef CONFIG_X86_64
> unsigned long long base, low_base = 0, low_size = 0;
> - unsigned long total_low_mem;
> + unsigned long low_mem_limit;
> int ret;
>
> - total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
> + low_mem_limit = min(memblock_phys_mem_size(), CRASH_ADDR_LOW_MAX);
>
> /* crashkernel=Y,low */
> - ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
> + ret = parse_crashkernel_low(boot_command_line, low_mem_limit, &low_size, &base);
> if (ret) {
> /*
> * two parts from kernel/dma/swiotlb.c:
> @@ -443,23 +443,17 @@ static int __init reserve_crashkernel_low(void)
> return 0;
> }
>
> - low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> + low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> if (!low_base) {
> pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
> (unsigned long)(low_size >> 20));
> return -ENOMEM;
> }
>
> - ret = memblock_reserve(low_base, low_size);
> - if (ret) {
> - pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
> - return ret;
> - }
> -
> - pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
> + pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (low RAM limit: %ldMB)\n",
> (unsigned long)(low_size >> 20),
> (unsigned long)(low_base >> 20),
> - (unsigned long)(total_low_mem >> 20));
> + (unsigned long)(low_mem_limit >> 20));
>
> crashk_low_res.start = low_base;
> crashk_low_res.end = low_base + low_size - 1;
> @@ -503,13 +497,13 @@ static void __init reserve_crashkernel(void)
> * unless "crashkernel=size[KMG],high" is specified.
> */
> if (!high)
> - crash_base = memblock_find_in_range(CRASH_ALIGN,
> - CRASH_ADDR_LOW_MAX,
> - crash_size, CRASH_ALIGN);
> + crash_base = memblock_phys_alloc_range(crash_size,
> + CRASH_ALIGN, CRASH_ALIGN,
> + CRASH_ADDR_LOW_MAX);
> if (!crash_base)
> - crash_base = memblock_find_in_range(CRASH_ALIGN,
> - CRASH_ADDR_HIGH_MAX,
> - crash_size, CRASH_ALIGN);
> + crash_base = memblock_phys_alloc_range(crash_size,
> + CRASH_ALIGN, CRASH_ALIGN,
> + CRASH_ADDR_HIGH_MAX);
> if (!crash_base) {
> pr_info("crashkernel reservation failed - No suitable area found.\n");
> return;
> @@ -517,19 +511,13 @@ static void __init reserve_crashkernel(void)
> } else {
> unsigned long long start;
>
> - start = memblock_find_in_range(crash_base,
> - crash_base + crash_size,
> - crash_size, 1 << 20);
> + start = memblock_phys_alloc_range(crash_size, SZ_1M, crash_base,
> + crash_base + crash_size);
> if (start != crash_base) {
> pr_info("crashkernel reservation failed - memory is in use.\n");
> return;
> }
> }
> - ret = memblock_reserve(crash_base, crash_size);
> - if (ret) {
> - pr_err("%s: Error reserving crashkernel memblock.\n", __func__);
> - return;
> - }
>
> if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
> memblock_free(crash_base, crash_size);
> --
> 2.26.2
>

2020-08-05 08:30:36

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 15/17] memblock: remove unused memblock_mem_size()

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> The only user of memblock_mem_size() was the x86 setup code; it is gone now
> and the memblock_mem_size() function can be removed.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> include/linux/memblock.h | 1 -
> mm/memblock.c | 15 ---------------
> 2 files changed, 16 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index d70c2835e913..ec2fd8f32a19 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -450,7 +450,6 @@ static inline bool memblock_bottom_up(void)
>
> phys_addr_t memblock_phys_mem_size(void);
> phys_addr_t memblock_reserved_size(void);
> -phys_addr_t memblock_mem_size(unsigned long limit_pfn);
> phys_addr_t memblock_start_of_DRAM(void);
> phys_addr_t memblock_end_of_DRAM(void);
> void memblock_enforce_memory_limit(phys_addr_t memory_limit);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index c1a4c8798973..48d614352b25 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1656,21 +1656,6 @@ phys_addr_t __init_memblock memblock_reserved_size(void)
> return memblock.reserved.total_size;
> }
>
> -phys_addr_t __init memblock_mem_size(unsigned long limit_pfn)
> -{
> - unsigned long pages = 0;
> - unsigned long start_pfn, end_pfn;
> - int i;
> -
> - for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
> - start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
> - end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
> - pages += end_pfn - start_pfn;
> - }
> -
> - return PFN_PHYS(pages);
> -}

Reviewed-by: Baoquan He <[email protected]>

2020-08-05 09:13:13

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()

On 08/02/20 at 07:36pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Iteration over memblock.reserved with for_each_reserved_mem_region() used
> __next_reserved_mem_region(), which implemented a subset of
> __next_mem_region().
>
> Use __for_each_mem_range() and, essentially, __next_mem_region() with
> appropriate parameters to reduce code duplication.
>
> While at it, rename for_each_reserved_mem_region() to
> for_each_reserved_mem_range() for consistency.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> .clang-format | 2 +-
> arch/arm64/kernel/setup.c | 2 +-
> drivers/irqchip/irq-gic-v3-its.c | 2 +-
> include/linux/memblock.h | 12 +++------
> mm/memblock.c | 46 +++++++-------------------------
> 5 files changed, 17 insertions(+), 47 deletions(-)

Reviewed-by: Baoquan He <[email protected]>

2020-08-05 09:31:52

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions

On 08/02/20 at 07:36pm, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> for_each_memblock() is used to iterate over memblock.memory in
> a few places that use data from memblock_region rather than the memory
> ranges.
>
> Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
> to improve encapsulation of memblock internals from its users.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> .clang-format | 3 ++-
> arch/arm64/kernel/setup.c | 2 +-
> arch/arm64/mm/numa.c | 2 +-
> arch/mips/netlogic/xlp/setup.c | 2 +-
> arch/x86/mm/numa.c | 2 +-
> include/linux/memblock.h | 19 ++++++++++++++++---
> mm/memblock.c | 4 ++--
> mm/page_alloc.c | 8 ++++----
> 8 files changed, 28 insertions(+), 14 deletions(-)
>
...

> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 9e51b3fd4134..a6970e058bd7 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -522,9 +522,22 @@ static inline unsigned long memblock_region_reserved_end_pfn(const struct memblo
> return PFN_UP(reg->base + reg->size);
> }
>
> -#define for_each_memblock(memblock_type, region) \
> - for (region = memblock.memblock_type.regions; \
> - region < (memblock.memblock_type.regions + memblock.memblock_type.cnt); \
> +/**
> + * for_each_mem_region - iterate over registered memory regions
~~~~~~~~~~~~~~~~~

Wonder why 'registered' memory is emphasized.

Other than this confusion, this patch looks good to me.

Reviewed-by: Baoquan He <[email protected]>

2020-08-05 13:28:45

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions

On Sun, Aug 02, 2020 at 07:36:01PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> for_each_memblock() is used to iterate over memblock.memory in
> a few places that use data from memblock_region rather than the memory
> ranges.
>
> Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
> to improve encapsulation of memblock internals from its users.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> arch/mips/netlogic/xlp/setup.c | 2 +-

Acked-by: Thomas Bogendoerfer <[email protected]>

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2020-08-05 16:09:30

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH v2 12/17] arch, drivers: replace for_each_membock() with for_each_mem_range()

On Sun, Aug 02, 2020 at 07:35:56PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> There are several occurrences of the following pattern:
>
> for_each_memblock(memory, reg) {
> start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
> end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
>
> /* do something with start and end */
> }
>
> Using the for_each_mem_range() iterator is more appropriate in such cases
> and allows simpler and cleaner code.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> arch/mips/cavium-octeon/dma-octeon.c | 12 +++---
> arch/mips/kernel/setup.c | 31 +++++++--------

Acked-by: Thomas Bogendoerfer <[email protected]>

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2020-08-05 19:28:26

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()

On Sun, Aug 2, 2020 at 6:40 PM Mike Rapoport <[email protected]> wrote:
>
> .clang-format | 2 +-

The .clang-format bit:

Acked-by: Miguel Ojeda <[email protected]>

Cheers,
Miguel

2020-08-05 19:31:52

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions

On Sun, Aug 2, 2020 at 6:41 PM Mike Rapoport <[email protected]> wrote:
>
> .clang-format | 3 ++-

The .clang-format bit:

Acked-by: Miguel Ojeda <[email protected]>

Cheers,
Miguel