2014-01-24 19:10:39

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 0/3] memblock, x86: fix big numa system booting

Big numa system boot get broken while switch API from bootmem to
memblock_virt.

Revert the offending patch, and also address swiotlb regression.

Thanks

Yinghai


2014-01-24 19:10:51

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 3/3] memblock: Don't align size silent in memblock_virt_alloc()

In original __alloc_memory_core_early() for bootmem wrapper, we do not
align size silently.

We should not do that, as later free with old size will leave some
range not freed.

It's obvious that code is copied from memblock_base_nid(), and that code
is wrong for the same reason.

Also remove that in memblock_alloc_base.

Signed-off-by: Yinghai Lu <[email protected]>

---
mm/memblock.c | 6 ------
1 file changed, 6 deletions(-)

Index: linux-2.6/mm/memblock.c
===================================================================
--- linux-2.6.orig/mm/memblock.c
+++ linux-2.6/mm/memblock.c
@@ -981,9 +981,6 @@ static phys_addr_t __init memblock_alloc
if (!align)
align = SMP_CACHE_BYTES;

- /* align @size to avoid excessive fragmentation on reserved array */
- size = round_up(size, align);
-
found = memblock_find_in_range_node(size, align, 0, max_addr, nid);
if (found && !memblock_reserve(found, size))
return found;
@@ -1077,9 +1074,6 @@ static void * __init memblock_virt_alloc
if (!align)
align = SMP_CACHE_BYTES;

- /* align @size to avoid excessive fragmentation on reserved array */
- size = round_up(size, align);
-
again:
alloc = memblock_find_in_range_node(size, align, min_addr, max_addr,
nid);

2014-01-24 19:11:01

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

The new memblock_virt APIs are used to replaced old bootmem API.

We need to allocate page below 4G for swiotlb.

That should fix regression on Andrew's system that is using swiotlb.

Signed-off-by: Yinghai Lu <[email protected]>
Cc: Russell King <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>

---
arch/arm/kernel/setup.c | 2 +-
include/linux/bootmem.h | 37 +++++++++++++++++++++++++++++++++++++
lib/swiotlb.c | 4 ++--
3 files changed, 40 insertions(+), 3 deletions(-)

Index: linux-2.6/include/linux/bootmem.h
===================================================================
--- linux-2.6.orig/include/linux/bootmem.h
+++ linux-2.6/include/linux/bootmem.h
@@ -175,6 +175,27 @@ static inline void * __init memblock_vir
NUMA_NO_NODE);
}

+#ifndef ARCH_LOW_ADDRESS_LIMIT
+#define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
+#endif
+
+static inline void * __init memblock_virt_alloc_low(
+ phys_addr_t size, phys_addr_t align)
+{
+ return memblock_virt_alloc_try_nid(size, align,
+ BOOTMEM_LOW_LIMIT,
+ ARCH_LOW_ADDRESS_LIMIT,
+ NUMA_NO_NODE);
+}
+static inline void * __init memblock_virt_alloc_low_nopanic(
+ phys_addr_t size, phys_addr_t align)
+{
+ return memblock_virt_alloc_try_nid_nopanic(size, align,
+ BOOTMEM_LOW_LIMIT,
+ ARCH_LOW_ADDRESS_LIMIT,
+ NUMA_NO_NODE);
+}
+
static inline void * __init memblock_virt_alloc_from_nopanic(
phys_addr_t size, phys_addr_t align, phys_addr_t min_addr)
{
@@ -238,6 +259,22 @@ static inline void * __init memblock_vir
return __alloc_bootmem_nopanic(size, align, BOOTMEM_LOW_LIMIT);
}

+static inline void * __init memblock_virt_alloc_low(
+ phys_addr_t size, phys_addr_t align)
+{
+ if (!align)
+ align = SMP_CACHE_BYTES;
+ return __alloc_bootmem_low(size, align, BOOTMEM_LOW_LIMIT);
+}
+
+static inline void * __init memblock_virt_alloc_low_nopanic(
+ phys_addr_t size, phys_addr_t align)
+{
+ if (!align)
+ align = SMP_CACHE_BYTES;
+ return __alloc_bootmem_low_nopanic(size, align, BOOTMEM_LOW_LIMIT);
+}
+
static inline void * __init memblock_virt_alloc_from_nopanic(
phys_addr_t size, phys_addr_t align, phys_addr_t min_addr)
{
Index: linux-2.6/lib/swiotlb.c
===================================================================
--- linux-2.6.orig/lib/swiotlb.c
+++ linux-2.6/lib/swiotlb.c
@@ -172,7 +172,7 @@ int __init swiotlb_init_with_tbl(char *t
/*
* Get the overflow emergency buffer
*/
- v_overflow_buffer = memblock_virt_alloc_nopanic(
+ v_overflow_buffer = memblock_virt_alloc_low_nopanic(
PAGE_ALIGN(io_tlb_overflow),
PAGE_SIZE);
if (!v_overflow_buffer)
@@ -220,7 +220,7 @@ swiotlb_init(int verbose)
bytes = io_tlb_nslabs << IO_TLB_SHIFT;

/* Get IO TLB memory from the low pages */
- vstart = memblock_virt_alloc_nopanic(PAGE_ALIGN(bytes), PAGE_SIZE);
+ vstart = memblock_virt_alloc_low_nopanic(PAGE_ALIGN(bytes), PAGE_SIZE);
if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose))
return;

Index: linux-2.6/arch/arm/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/setup.c
+++ linux-2.6/arch/arm/kernel/setup.c
@@ -717,7 +717,7 @@ static void __init request_standard_reso
kernel_data.end = virt_to_phys(_end - 1);

for_each_memblock(memory, region) {
- res = memblock_virt_alloc(sizeof(*res), 0);
+ res = memblock_virt_alloc_low(sizeof(*res), 0);
res->name = "System RAM";
res->start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;

2014-01-24 19:11:25

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 2/3] x86: Revert wrong memblock current limit setting.

Dave reported big numa system booting is broken.

It turns out
| commit 5b6e529521d35e1bcaa0fe43456d1bbb335cae5d
| Author: Santosh Shilimkar <[email protected]>
| Date: Tue Jan 21 15:50:03 2014 -0800
|
| x86: memblock: set current limit to max low memory address

set limit to low wrongly.

max_low_pfn_mapped is different from max_pfn_mapped.
max_low_pfn_mapped is always under 4G.

That will memblock_alloc_nid all go under 4G.

Revert that offending patch.

Reported-by: Dave Hansen <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/include/asm/page_types.h | 4 ++--
arch/x86/kernel/setup.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/include/asm/page_types.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/page_types.h
+++ linux-2.6/arch/x86/include/asm/page_types.h
@@ -51,9 +51,9 @@ extern int devmem_is_allowed(unsigned lo
extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;

-static inline phys_addr_t get_max_low_mapped(void)
+static inline phys_addr_t get_max_mapped(void)
{
- return (phys_addr_t)max_low_pfn_mapped << PAGE_SHIFT;
+ return (phys_addr_t)max_pfn_mapped << PAGE_SHIFT;
}

bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn);
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -1173,7 +1173,7 @@ void __init setup_arch(char **cmdline_p)

setup_real_mode();

- memblock_set_current_limit(get_max_low_mapped());
+ memblock_set_current_limit(get_max_mapped());
dma_contiguous_reserve(0);

/*

2014-01-24 19:26:15

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Fri, Jan 24, 2014 at 11:11:10AM -0800, Yinghai Lu wrote:
> The new memblock_virt APIs are used to replaced old bootmem API.
>
> We need to allocate page below 4G for swiotlb.
>
> That should fix regression on Andrew's system that is using swiotlb.

Please include the title of the patch that caused the regression.
I presume it is "mm/lib/swiotlb: Use memblock apis for early memory allocations"

Interestingly enough when I asked about it:

https://lkml.org/lkml/2013/11/9/280


>> v_overflow_buffer = memblock_virt_alloc_align_nopanic(
>> + PAGE_ALIGN(io_tlb_overflow),
>> + PAGE_SIZE);
>
> Does this guarantee that the pages will be allocated below 4GB?
>
Yes. The memblock layer still allocates memory from lowmem. As I
mentioned, there is no change in the behavior than what is today
apart from just the interface change.

How did that happend? Was there another patch in the series that altered
such assumption?

>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
>
> ---
> arch/arm/kernel/setup.c | 2 +-
> include/linux/bootmem.h | 37 +++++++++++++++++++++++++++++++++++++
> lib/swiotlb.c | 4 ++--
> 3 files changed, 40 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/include/linux/bootmem.h
> ===================================================================
> --- linux-2.6.orig/include/linux/bootmem.h
> +++ linux-2.6/include/linux/bootmem.h
> @@ -175,6 +175,27 @@ static inline void * __init memblock_vir
> NUMA_NO_NODE);
> }
>
> +#ifndef ARCH_LOW_ADDRESS_LIMIT
> +#define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
> +#endif
> +
> +static inline void * __init memblock_virt_alloc_low(
> + phys_addr_t size, phys_addr_t align)
> +{
> + return memblock_virt_alloc_try_nid(size, align,
> + BOOTMEM_LOW_LIMIT,
> + ARCH_LOW_ADDRESS_LIMIT,
> + NUMA_NO_NODE);
> +}
> +static inline void * __init memblock_virt_alloc_low_nopanic(
> + phys_addr_t size, phys_addr_t align)
> +{
> + return memblock_virt_alloc_try_nid_nopanic(size, align,
> + BOOTMEM_LOW_LIMIT,
> + ARCH_LOW_ADDRESS_LIMIT,
> + NUMA_NO_NODE);
> +}
> +
> static inline void * __init memblock_virt_alloc_from_nopanic(
> phys_addr_t size, phys_addr_t align, phys_addr_t min_addr)
> {
> @@ -238,6 +259,22 @@ static inline void * __init memblock_vir
> return __alloc_bootmem_nopanic(size, align, BOOTMEM_LOW_LIMIT);
> }
>
> +static inline void * __init memblock_virt_alloc_low(
> + phys_addr_t size, phys_addr_t align)
> +{
> + if (!align)
> + align = SMP_CACHE_BYTES;
> + return __alloc_bootmem_low(size, align, BOOTMEM_LOW_LIMIT);
> +}
> +
> +static inline void * __init memblock_virt_alloc_low_nopanic(
> + phys_addr_t size, phys_addr_t align)
> +{
> + if (!align)
> + align = SMP_CACHE_BYTES;
> + return __alloc_bootmem_low_nopanic(size, align, BOOTMEM_LOW_LIMIT);
> +}
> +
> static inline void * __init memblock_virt_alloc_from_nopanic(
> phys_addr_t size, phys_addr_t align, phys_addr_t min_addr)
> {
> Index: linux-2.6/lib/swiotlb.c
> ===================================================================
> --- linux-2.6.orig/lib/swiotlb.c
> +++ linux-2.6/lib/swiotlb.c
> @@ -172,7 +172,7 @@ int __init swiotlb_init_with_tbl(char *t
> /*
> * Get the overflow emergency buffer
> */
> - v_overflow_buffer = memblock_virt_alloc_nopanic(
> + v_overflow_buffer = memblock_virt_alloc_low_nopanic(
> PAGE_ALIGN(io_tlb_overflow),
> PAGE_SIZE);
> if (!v_overflow_buffer)
> @@ -220,7 +220,7 @@ swiotlb_init(int verbose)
> bytes = io_tlb_nslabs << IO_TLB_SHIFT;
>
> /* Get IO TLB memory from the low pages */
> - vstart = memblock_virt_alloc_nopanic(PAGE_ALIGN(bytes), PAGE_SIZE);
> + vstart = memblock_virt_alloc_low_nopanic(PAGE_ALIGN(bytes), PAGE_SIZE);
> if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose))
> return;
>
> Index: linux-2.6/arch/arm/kernel/setup.c
> ===================================================================
> --- linux-2.6.orig/arch/arm/kernel/setup.c
> +++ linux-2.6/arch/arm/kernel/setup.c
> @@ -717,7 +717,7 @@ static void __init request_standard_reso
> kernel_data.end = virt_to_phys(_end - 1);
>
> for_each_memblock(memory, region) {
> - res = memblock_virt_alloc(sizeof(*res), 0);
> + res = memblock_virt_alloc_low(sizeof(*res), 0);
> res->name = "System RAM";
> res->start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
> res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;

2014-01-24 19:30:59

by Santosh Shilimkar

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Friday 24 January 2014 02:25 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 24, 2014 at 11:11:10AM -0800, Yinghai Lu wrote:
>> The new memblock_virt APIs are used to replaced old bootmem API.
>>
>> We need to allocate page below 4G for swiotlb.
>>
>> That should fix regression on Andrew's system that is using swiotlb.
>
> Please include the title of the patch that caused the regression.
> I presume it is "mm/lib/swiotlb: Use memblock apis for early memory allocations"
>
> Interestingly enough when I asked about it:
>
> https://lkml.org/lkml/2013/11/9/280
>
>
> >> v_overflow_buffer = memblock_virt_alloc_align_nopanic(
> >> + PAGE_ALIGN(io_tlb_overflow),
> >> + PAGE_SIZE);
> >
> > Does this guarantee that the pages will be allocated below 4GB?
> >
> Yes. The memblock layer still allocates memory from lowmem. As I
> mentioned, there is no change in the behavior than what is today
> apart from just the interface change.
>
> How did that happend? Was there another patch in the series that altered
> such assumption?
>
Actually it didn't. It was the misunderstanding on my side about the
low_mem_max_addr being under 4GB, which is not always true especially
for 64-bit systems which have no addressing limitations.

Regards,
Santosh

2014-01-24 19:34:02

by Santosh Shilimkar

[permalink] [raw]
Subject: Re: [PATCH 0/3] memblock, x86: fix big numa system booting

Yinghai,

On Friday 24 January 2014 02:11 PM, Yinghai Lu wrote:
> Big numa system boot get broken while switch API from bootmem to
> memblock_virt.
>
> Revert the offending patch, and also address swiotlb regression.
>
Thanks a lot for fixes and help to narrow down and fix these
regressions. For all the patches in the series,

Acked-by: Santosh Shilimkar <[email protected]>

2014-01-24 19:34:35

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Fri, Jan 24, 2014 at 11:25 AM, Konrad Rzeszutek Wilk
<[email protected]> wrote:
> On Fri, Jan 24, 2014 at 11:11:10AM -0800, Yinghai Lu wrote:
>> The new memblock_virt APIs are used to replaced old bootmem API.
>>
>> We need to allocate page below 4G for swiotlb.
>>
>> That should fix regression on Andrew's system that is using swiotlb.
>
> Please include the title of the patch that caused the regression.
> I presume it is "mm/lib/swiotlb: Use memblock apis for early memory allocations"
>
> Interestingly enough when I asked about it:
>
> https://lkml.org/lkml/2013/11/9/280
>
>
> >> v_overflow_buffer = memblock_virt_alloc_align_nopanic(
> >> + PAGE_ALIGN(io_tlb_overflow),
> >> + PAGE_SIZE);
> >
> > Does this guarantee that the pages will be allocated below 4GB?
> >
> Yes. The memblock layer still allocates memory from lowmem. As I
> mentioned, there is no change in the behavior than what is today
> apart from just the interface change.
>
> How did that happend? Was there another patch in the series that altered
> such assumption?
>

Yes, that is one.

He chose to set memblock.current_limit to max_low_mapped. (that is under 4g).

but it broke big numa system as all boot mem is under 4G and system
with lots of memory will need to
have big chunk for vmemmap .

Before that patch, we need to add another API to make sure those
swiotlb under 4G.

Thanks

Yinghai

2014-01-24 20:17:29

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/3] memblock, x86: fix big numa system booting

On Fri, 24 Jan 2014 14:33:19 -0500 Santosh Shilimkar <[email protected]> wrote:

> Yinghai,
>
> On Friday 24 January 2014 02:11 PM, Yinghai Lu wrote:
> > Big numa system boot get broken while switch API from bootmem to
> > memblock_virt.
> >
> > Revert the offending patch, and also address swiotlb regression.
> >
> Thanks a lot for fixes and help to narrow down and fix these
> regressions. For all the patches in the series,
>
> Acked-by: Santosh Shilimkar <[email protected]>

That patchset continues to boot happily on my swiotlb test box.

2014-01-28 08:02:19

by Olof Johansson

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

Hi,

On Fri, Jan 24, 2014 at 11:11 AM, Yinghai Lu <[email protected]> wrote:
> The new memblock_virt APIs are used to replaced old bootmem API.
>
> We need to allocate page below 4G for swiotlb.
>
> That should fix regression on Andrew's system that is using swiotlb.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Konrad Rzeszutek Wilk <[email protected]>

This seems to have been merged by Linus tonight as ad6492b80f, and it
had fallout on ARM systems (boot failures with no console output on
all but 5 of my machine/config combos).

Seems like it didn't have a chance to sit in -next, which is somewhat
understandable given that it's considered a bugfix and it indeed fixed
the bug it was meant to.

i'm out of time to debug this tonight (I noticed the failures as I was
heading to bed and figured I'd at least bisect them), so I wouldn't
mind seeing a revert of the ARM side change of ad6492b80f until it's
been sorted out so we keep bisectabilty intact for the rest of the
kernel.


Thanks,

-Olof

2014-01-28 15:36:44

by Kevin Hilman

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 12:02 AM, Olof Johansson <[email protected]> wrote:
> Hi,
>
> On Fri, Jan 24, 2014 at 11:11 AM, Yinghai Lu <[email protected]> wrote:
>> The new memblock_virt APIs are used to replaced old bootmem API.
>>
>> We need to allocate page below 4G for swiotlb.
>>
>> That should fix regression on Andrew's system that is using swiotlb.
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>> Cc: Russell King <[email protected]>
>> Cc: Konrad Rzeszutek Wilk <[email protected]>
>
> This seems to have been merged by Linus tonight as ad6492b80f, and it
> had fallout on ARM systems (boot failures with no console output on
> all but 5 of my machine/config combos).
>
> Seems like it didn't have a chance to sit in -next, which is somewhat
> understandable given that it's considered a bugfix and it indeed fixed
> the bug it was meant to.
>
> i'm out of time to debug this tonight (I noticed the failures as I was
> heading to bed and figured I'd at least bisect them), so I wouldn't
> mind seeing a revert of the ARM side change of ad6492b80f until it's
> been sorted out so we keep bisectabilty intact for the rest of the
> kernel.

Like Olof, I noticed multiple boot failures on various ARM boards.
I've confirmed that reverting the arch/arm part of this patch makes
them all happily booting again.

Kevin

2014-01-28 17:12:30

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
> On Tue, Jan 28, 2014 at 12:02 AM, Olof Johansson <[email protected]> wrote:
>> Hi,
>>
>> On Fri, Jan 24, 2014 at 11:11 AM, Yinghai Lu <[email protected]> wrote:
>>> The new memblock_virt APIs are used to replaced old bootmem API.
>>>
>>> We need to allocate page below 4G for swiotlb.
>>>
>>> That should fix regression on Andrew's system that is using swiotlb.
>>>
>>> Signed-off-by: Yinghai Lu <[email protected]>
>>> Cc: Russell King <[email protected]>
>>> Cc: Konrad Rzeszutek Wilk <[email protected]>
>>
>> This seems to have been merged by Linus tonight as ad6492b80f, and it
>> had fallout on ARM systems (boot failures with no console output on
>> all but 5 of my machine/config combos).
>>
>> Seems like it didn't have a chance to sit in -next, which is somewhat
>> understandable given that it's considered a bugfix and it indeed fixed
>> the bug it was meant to.
>>
>> i'm out of time to debug this tonight (I noticed the failures as I was
>> heading to bed and figured I'd at least bisect them), so I wouldn't
>> mind seeing a revert of the ARM side change of ad6492b80f until it's
>> been sorted out so we keep bisectabilty intact for the rest of the
>> kernel.
>
> Like Olof, I noticed multiple boot failures on various ARM boards.
> I've confirmed that reverting the arch/arm part of this patch makes
> them all happily booting again.

please try attached patch.

Index: linux-2.6/include/linux/bootmem.h
===================================================================
--- linux-2.6.orig/include/linux/bootmem.h
+++ linux-2.6/include/linux/bootmem.h
@@ -179,6 +179,9 @@ static inline void * __init memblock_vir
NUMA_NO_NODE);
}

+/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
+#include <asm/processor.h>
+
#ifndef ARCH_LOW_ADDRESS_LIMIT
#define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
#endif


Attachments:
fix_arm_low.patch (522.00 B)

2014-01-28 17:18:50

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 09:12:27AM -0800, Yinghai Lu wrote:
> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
> > Like Olof, I noticed multiple boot failures on various ARM boards.
> > I've confirmed that reverting the arch/arm part of this patch makes
> > them all happily booting again.
>
> please try attached patch.

Maybe I'm missing something, but there is no ARCH_LOW_ADDRESS_LIMIT
defined in the ARM header files, so I don't see how adding that
additional include changes anything.

--
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

2014-01-28 17:23:38

by Santosh Shilimkar

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote:
> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
>> > On Tue, Jan 28, 2014 at 12:02 AM, Olof Johansson <[email protected]> wrote:
>>> >> Hi,
>>> >>
>>> >> On Fri, Jan 24, 2014 at 11:11 AM, Yinghai Lu <[email protected]> wrote:
>>>> >>> The new memblock_virt APIs are used to replaced old bootmem API.
>>>> >>>
>>>> >>> We need to allocate page below 4G for swiotlb.
>>>> >>>
>>>> >>> That should fix regression on Andrew's system that is using swiotlb.
>>>> >>>
>>>> >>> Signed-off-by: Yinghai Lu <[email protected]>
>>>> >>> Cc: Russell King <[email protected]>
>>>> >>> Cc: Konrad Rzeszutek Wilk <[email protected]>
>>> >>
>>> >> This seems to have been merged by Linus tonight as ad6492b80f, and it
>>> >> had fallout on ARM systems (boot failures with no console output on
>>> >> all but 5 of my machine/config combos).
>>> >>
>>> >> Seems like it didn't have a chance to sit in -next, which is somewhat
>>> >> understandable given that it's considered a bugfix and it indeed fixed
>>> >> the bug it was meant to.
>>> >>
>>> >> i'm out of time to debug this tonight (I noticed the failures as I was
>>> >> heading to bed and figured I'd at least bisect them), so I wouldn't
>>> >> mind seeing a revert of the ARM side change of ad6492b80f until it's
>>> >> been sorted out so we keep bisectabilty intact for the rest of the
>>> >> kernel.
>> >
>> > Like Olof, I noticed multiple boot failures on various ARM boards.
>> > I've confirmed that reverting the arch/arm part of this patch makes
>> > them all happily booting again.
> please try attached patch.
>
> Index: linux-2.6/include/linux/bootmem.h
> ===================================================================
> --- linux-2.6.orig/include/linux/bootmem.h
> +++ linux-2.6/include/linux/bootmem.h
> @@ -179,6 +179,9 @@ static inline void * __init memblock_vir
> NUMA_NO_NODE);
> }
>
> +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
> +#include <asm/processor.h>
> +
> #ifndef ARCH_LOW_ADDRESS_LIMIT
> #define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
> #endif

This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT.
Sorry i couldn't respond to the thread earlier because of travel and
don't have access to my board to try out the patches.
The issue is mostly because on ARM the allocations needs to be limited lowmem and
that has been ensured using BOOTMEM_ALLOC_ACCESSIBLE in bootmem case and
MEMBLOCK_ALLOC_ACCESSIBLE in memblock case.

Even though ARM change done to use alloc_low() is correct, it has undesired
side effect of allocating memory from highmem because of non-use of
ARCH_LOW_ADDRESS_LIMIT. Setting ARCH_LOW_ADDRESS_LIMIT to MEMBLOCK_ALLOC_ACCESSIBLE
in ARM case also doesn't seem right.

Unless you have better idea, backing out the arch/arm change from the
subject patch seems to be most safe.

Regards,
Ssantosh




2014-01-28 17:24:05

by Santosh Shilimkar

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tuesday 28 January 2014 12:18 PM, Russell King - ARM Linux wrote:
> On Tue, Jan 28, 2014 at 09:12:27AM -0800, Yinghai Lu wrote:
>> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
>>> Like Olof, I noticed multiple boot failures on various ARM boards.
>>> I've confirmed that reverting the arch/arm part of this patch makes
>>> them all happily booting again.
>>
>> please try attached patch.
>
> Maybe I'm missing something, but there is no ARCH_LOW_ADDRESS_LIMIT
> defined in the ARM header files, so I don't see how adding that
> additional include changes anything.
>
Our emails crossed... You are right. It won't help.

Regards,
Santosh

2014-01-28 18:22:50

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote:
> On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote:
> > Index: linux-2.6/include/linux/bootmem.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/bootmem.h
> > +++ linux-2.6/include/linux/bootmem.h
> > @@ -179,6 +179,9 @@ static inline void * __init memblock_vir
> > NUMA_NO_NODE);
> > }
> >
> > +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
> > +#include <asm/processor.h>
> > +
> > #ifndef ARCH_LOW_ADDRESS_LIMIT
> > #define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
> > #endif
>
> This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT.
> Sorry i couldn't respond to the thread earlier because of travel and
> don't have access to my board to try out the patches.

Let's think about this for a moment, shall we...

What does memblock_alloc_virt*() return? It returns a virtual address.

How is that virtual address obtained? ptr = phys_to_virt(alloc);

What is the valid address range for passing into phys_to_virt() ? Only
lowmem addresses.

Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
completely rediculous - and presumably this also fails on x86_32 if it
returns memory up at 4GB.

So... yes, I think reverting the arch/arm part of this patch is the right
solution, whether the rest of it should be reverted is something I can't
comment on.

--
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

2014-01-28 18:36:51

by Santosh Shilimkar

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

+ Gryagorii,
On Tuesday 28 January 2014 01:22 PM, Russell King - ARM Linux wrote:
> On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote:
>> On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote:
>>> Index: linux-2.6/include/linux/bootmem.h
>>> ===================================================================
>>> --- linux-2.6.orig/include/linux/bootmem.h
>>> +++ linux-2.6/include/linux/bootmem.h
>>> @@ -179,6 +179,9 @@ static inline void * __init memblock_vir
>>> NUMA_NO_NODE);
>>> }
>>>
>>> +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
>>> +#include <asm/processor.h>
>>> +
>>> #ifndef ARCH_LOW_ADDRESS_LIMIT
>>> #define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
>>> #endif
>>
>> This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT.
>> Sorry i couldn't respond to the thread earlier because of travel and
>> don't have access to my board to try out the patches.
>
> Let's think about this for a moment, shall we...
>
> What does memblock_alloc_virt*() return? It returns a virtual address.
>
> How is that virtual address obtained? ptr = phys_to_virt(alloc);
>
> What is the valid address range for passing into phys_to_virt() ? Only
> lowmem addresses.
>
> Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
> completely rediculous - and presumably this also fails on x86_32 if it
> returns memory up at 4GB.
>
> So... yes, I think reverting the arch/arm part of this patch is the right
> solution, whether the rest of it should be reverted is something I can't
> comment on.
>
Grygorri mentioned an alternate to update the memblock_find_in_range_node() so
that it takes into account the limit.

Regards,
Santosh

2014-01-28 18:57:40

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 01:36:28PM -0500, Santosh Shilimkar wrote:
> + Gryagorii,
> On Tuesday 28 January 2014 01:22 PM, Russell King - ARM Linux wrote:
> > On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote:
> >> On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote:
> >>> Index: linux-2.6/include/linux/bootmem.h
> >>> ===================================================================
> >>> --- linux-2.6.orig/include/linux/bootmem.h
> >>> +++ linux-2.6/include/linux/bootmem.h
> >>> @@ -179,6 +179,9 @@ static inline void * __init memblock_vir
> >>> NUMA_NO_NODE);
> >>> }
> >>>
> >>> +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
> >>> +#include <asm/processor.h>
> >>> +
> >>> #ifndef ARCH_LOW_ADDRESS_LIMIT
> >>> #define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
> >>> #endif
> >>
> >> This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT.
> >> Sorry i couldn't respond to the thread earlier because of travel and
> >> don't have access to my board to try out the patches.
> >
> > Let's think about this for a moment, shall we...
> >
> > What does memblock_alloc_virt*() return? It returns a virtual address.
> >
> > How is that virtual address obtained? ptr = phys_to_virt(alloc);
> >
> > What is the valid address range for passing into phys_to_virt() ? Only
> > lowmem addresses.
> >
> > Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
> > completely rediculous - and presumably this also fails on x86_32 if it
> > returns memory up at 4GB.
> >
> > So... yes, I think reverting the arch/arm part of this patch is the right
> > solution, whether the rest of it should be reverted is something I can't
> > comment on.
> >
> Grygorri mentioned an alternate to update the memblock_find_in_range_node() so
> that it takes into account the limit.

This patch breaks also Xen and 32-bit guests (see
http://lists.xen.org/archives/html/xen-devel/2014-01/msg02476.html)

Reverting it fixes it.

>
> Regards,
> Santosh

2014-01-28 19:43:05

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 9:18 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Tue, Jan 28, 2014 at 09:12:27AM -0800, Yinghai Lu wrote:
>> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
>> > Like Olof, I noticed multiple boot failures on various ARM boards.
>> > I've confirmed that reverting the arch/arm part of this patch makes
>> > them all happily booting again.
>>
>> please try attached patch.
>
> Maybe I'm missing something, but there is no ARCH_LOW_ADDRESS_LIMIT
> defined in the ARM header files, so I don't see how adding that
> additional include changes anything.

there is one for arm64 and s390.

arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT PHYS_MAS
arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT 0x7fffff

2014-01-28 19:48:04

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 11:43:02AM -0800, Yinghai Lu wrote:
> On Tue, Jan 28, 2014 at 9:18 AM, Russell King - ARM Linux
> <[email protected]> wrote:
> > On Tue, Jan 28, 2014 at 09:12:27AM -0800, Yinghai Lu wrote:
> >> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
> >> > Like Olof, I noticed multiple boot failures on various ARM boards.
> >> > I've confirmed that reverting the arch/arm part of this patch makes
> >> > them all happily booting again.
> >>
> >> please try attached patch.
> >
> > Maybe I'm missing something, but there is no ARCH_LOW_ADDRESS_LIMIT
> > defined in the ARM header files, so I don't see how adding that
> > additional include changes anything.
>
> there is one for arm64 and s390.
>
> arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT PHYS_MAS
> arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT 0x7fffff

Forgive me being difficult, but how exactly does your patch come anywhere
close to sortting out the reported regression by Kevin and Olof, and now
myself which is on ARM?

Your patch adds asm/processor.h to a header file, to allow architectures
to override ARCH_LOW_ADDRESS_LIMIT, but there is /no/ override for ARM,
so your patch has _zero_ effect there - we still end up with this being
0xffffffff, and we still end up with the thing being unable to boot.

Maybe you could describe what this ARCH_LOW_ADDRESS_LIMIT is, and what
it should be set to - I'm less than clear on how to set this given that
we have a multitude of different physical memory layouts - where memory
can start at almost any physical address.

--
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

2014-01-28 19:55:09

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 11:47 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Tue, Jan 28, 2014 at 11:43:02AM -0800, Yinghai Lu wrote:
>> On Tue, Jan 28, 2014 at 9:18 AM, Russell King - ARM Linux
>> <[email protected]> wrote:
>> > On Tue, Jan 28, 2014 at 09:12:27AM -0800, Yinghai Lu wrote:
>> >> On Tue, Jan 28, 2014 at 7:30 AM, Kevin Hilman <[email protected]> wrote:
>> >> > Like Olof, I noticed multiple boot failures on various ARM boards.
>> >> > I've confirmed that reverting the arch/arm part of this patch makes
>> >> > them all happily booting again.
>> >>
>> >> please try attached patch.
>> >
>> > Maybe I'm missing something, but there is no ARCH_LOW_ADDRESS_LIMIT
>> > defined in the ARM header files, so I don't see how adding that
>> > additional include changes anything.
>>
>> there is one for arm64 and s390.
>>
>> arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT PHYS_MAS
>> arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT 0x7fffff
>
> Forgive me being difficult, but how exactly does your patch come anywhere
> close to sortting out the reported regression by Kevin and Olof, and now
> myself which is on ARM?
>
> Your patch adds asm/processor.h to a header file, to allow architectures
> to override ARCH_LOW_ADDRESS_LIMIT, but there is /no/ override for ARM,
> so your patch has _zero_ effect there - we still end up with this being
> 0xffffffff, and we still end up with the thing being unable to boot.
>
> Maybe you could describe what this ARCH_LOW_ADDRESS_LIMIT is, and what
> it should be set to - I'm less than clear on how to set this given that
> we have a multitude of different physical memory layouts - where memory
> can start at almost any physical address.

well, I just want to restore the alloc_bootmem_low by following patch.

commit 9233d2be108f573caa21eb450411bf8fa68cadbb
Author: Santosh Shilimkar <[email protected]>
Date: Tue Jan 21 15:50:47 2014 -0800

arch/arm/kernel/: use memblock apis for early memory allocations

Switch to memblock interfaces for early memory allocator instead of
bootmem allocator. No functional change in beahvior than what it is in
current code from bootmem users points of view.

Archs already converted to NO_BOOTMEM now directly use memblock
interfaces instead of bootmem wrappers build on top of memblock. And
the archs which still uses bootmem, these new apis just fallback to
exiting bootmem APIs.

Signed-off-by: Santosh Shilimkar <[email protected]>
...
@@ -717,7 +717,7 @@ static void __init
request_standard_resources(const struct machine_desc *mdesc)
kernel_data.end = virt_to_phys(_end - 1);

for_each_memblock(memory, region) {
- res = alloc_bootmem_low(sizeof(*res));
+ res = memblock_virt_alloc(sizeof(*res), 0);
res->name = "System RAM";

but looks like memblock_virt_alloc_low is not the same as alloc_bootmem_low yet.

Let's me check further.

Thanks

Yinghai

2014-01-28 20:17:37

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 11:55 AM, Yinghai Lu <[email protected]> wrote:

> but looks like memblock_virt_alloc_low is not the same as alloc_bootmem_low yet.
>

The memblock_virt_alloc missed limit checking.

Please check attached patch.

Thanks

Yinghai


Attachments:
fix_memblock_virt_alloc_low.patch (762.00 B)

2014-01-28 20:17:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 10:22 AM, Russell King - ARM Linux
<[email protected]> wrote:
>
> Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
> completely rediculous - and presumably this also fails on x86_32 if it
> returns memory up at 4GB.

Agreed. That looks broken even on x86-32. The low address limit is not
even *close* to 4GB in general on 32-bit, since you not only have the
TASK_SIZE, you have the kmap and the vmalloc area. On x86-32,
ARCH_LOW_ADDRESS_LIMIT should be MAXMEM, which iirc is somewhere
around 890MB or so. Not 4G.

Linus

2014-01-28 20:18:51

by Grygorii Strashko

[permalink] [raw]
Subject: RE: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

Hi all,

Sorry, for the invalid mail & patch format - have no way to send it properly now.

Suppose there is another way to fix this issue (more generic)
- Correct memblock_virt_allocX() API to limit allocations below memblock.current_limit
(patch attached).

Then the code behavior will become more similar to _alloc_memory_core_early.

Not tested.


Best regards,
- grygorii

________________________________________
From: Konrad Rzeszutek Wilk [[email protected]]
Sent: Tuesday, January 28, 2014 8:56 PM
To: Shilimkar, Santosh
Cc: Russell King - ARM Linux; Yinghai Lu; Kevin Hilman; Olof Johansson; Linus Torvalds; Andrew Morton; Ingo Molnar; H. Peter Anvin; Dave Hansen; [email protected]; Strashko, Grygorii; [email protected]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 01:36:28PM -0500, Santosh Shilimkar wrote:
> + Gryagorii,
> On Tuesday 28 January 2014 01:22 PM, Russell King - ARM Linux wrote:
> > On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote:
> >> On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote:
> >>> Index: linux-2.6/include/linux/bootmem.h
> >>> ===================================================================
> >>> --- linux-2.6.orig/include/linux/bootmem.h
> >>> +++ linux-2.6/include/linux/bootmem.h
> >>> @@ -179,6 +179,9 @@ static inline void * __init memblock_vir
> >>> NUMA_NO_NODE);
> >>> }
> >>>
> >>> +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/
> >>> +#include <asm/processor.h>
> >>> +
> >>> #ifndef ARCH_LOW_ADDRESS_LIMIT
> >>> #define ARCH_LOW_ADDRESS_LIMIT 0xffffffffUL
> >>> #endif
> >>
> >> This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT.
> >> Sorry i couldn't respond to the thread earlier because of travel and
> >> don't have access to my board to try out the patches.
> >
> > Let's think about this for a moment, shall we...
> >
> > What does memblock_alloc_virt*() return? It returns a virtual address.
> >
> > How is that virtual address obtained? ptr = phys_to_virt(alloc);
> >
> > What is the valid address range for passing into phys_to_virt() ? Only
> > lowmem addresses.
> >
> > Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
> > completely rediculous - and presumably this also fails on x86_32 if it
> > returns memory up at 4GB.
> >
> > So... yes, I think reverting the arch/arm part of this patch is the right
> > solution, whether the rest of it should be reverted is something I can't
> > comment on.
> >
> Grygorri mentioned an alternate to update the memblock_find_in_range_node() so
> that it takes into account the limit.

This patch breaks also Xen and 32-bit guests (see
http://lists.xen.org/archives/html/xen-devel/2014-01/msg02476.html)

Reverting it fixes it.

>
> Regards,
> Santosh


Attachments:
0001-mm-memblock-fix-upper-boundary-of-allocating-region.patch (901.00 B)
0001-mm-memblock-fix-upper-boundary-of-allocating-region.patch

2014-01-28 20:20:25

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()

On Tue, Jan 28, 2014 at 12:17 PM, Linus Torvalds
<[email protected]> wrote:
> On Tue, Jan 28, 2014 at 10:22 AM, Russell King - ARM Linux
> <[email protected]> wrote:
>>
>> Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be
>> completely rediculous - and presumably this also fails on x86_32 if it
>> returns memory up at 4GB.
>
> Agreed. That looks broken even on x86-32. The low address limit is not
> even *close* to 4GB in general on 32-bit, since you not only have the
> TASK_SIZE, you have the kmap and the vmalloc area. On x86-32,
> ARCH_LOW_ADDRESS_LIMIT should be MAXMEM, which iirc is somewhere
> around 890MB or so. Not 4G.
>

yeah, Please check the patch that one minute ago.

Subject: [PATCH] memblock: Add limit checking to memblock_virt_alloc

In original bootmem wrapper for memblock, we have limit checking.

Add it to memblock_virt_alloc, to address arm and x86 booting crash.

Signed-off-by: Yinghai Lu <[email protected]>

---
mm/memblock.c | 3 +++
1 file changed, 3 insertions(+)

Index: linux-2.6/mm/memblock.c
===================================================================
--- linux-2.6.orig/mm/memblock.c
+++ linux-2.6/mm/memblock.c
@@ -1077,6 +1077,9 @@ static void * __init memblock_virt_alloc
if (!align)
align = SMP_CACHE_BYTES;

+ if (max_addr > memblock.current_limit)
+ max_addr = memblock.current_limit;
+
again:
alloc = memblock_find_in_range_node(size, align, min_addr, max_addr,
nid);

Thanks

Yinghai