2021-10-13 05:51:59

by Mike Rapoport

Subject: [PATCH] memblock: exclude NOMAP regions from kmemleak

From: Mike Rapoport <[email protected]>

Vladimir Zapolskiy reports:

commit a7259df76702 ("memblock: make memblock_find_in_range method private")
invokes a kernel panic while running kmemleak on OF platforms with nomaped
regions:

Unable to handle kernel paging request at virtual address fff000021e00000
[...]
scan_block+0x64/0x170
scan_gray_list+0xe8/0x17c
kmemleak_scan+0x270/0x514
kmemleak_write+0x34c/0x4ac

Indeed, NOMAP regions don't have linear map entries so an attempt to scan
these areas would fault.

Prevent such faults by excluding NOMAP regions from kmemleak.

Link: https://lore.kernel.org/all/[email protected]
Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
Signed-off-by: Mike Rapoport <[email protected]>
Tested-by: Vladimir Zapolskiy <[email protected]>
---
mm/memblock.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 184dcd2e5d99..5c3503c98b2f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -936,7 +936,12 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
+	int ret = memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
+
+	if (!ret)
+		kmemleak_free_part_phys(base, size);
+
+	return ret;
 }

 /**

base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc
--
2.28.0


2021-10-13 07:50:05

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 13, 2021 at 08:47:56AM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Vladimir Zapolskiy reports:
>
> commit a7259df76702 ("memblock: make memblock_find_in_range method private")
> invokes a kernel panic while running kmemleak on OF platforms with nomaped
> regions:
>
> Unable to handle kernel paging request at virtual address fff000021e00000
> [...]
> scan_block+0x64/0x170
> scan_gray_list+0xe8/0x17c
> kmemleak_scan+0x270/0x514
> kmemleak_write+0x34c/0x4ac
>
> Indeed, NOMAP regions don't have linear map entries so an attempt to scan
> these areas would fault.
>
> Prevent such faults by excluding NOMAP regions from kmemleak.
>
> Link: https://lore.kernel.org/all/[email protected]
> Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Vladimir Zapolskiy <[email protected]>

Acked-by: Catalin Marinas <[email protected]>

2021-10-13 11:37:11

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 13, 2021 at 08:45:40AM +0100, Catalin Marinas wrote:
> On Wed, Oct 13, 2021 at 08:47:56AM +0300, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Vladimir Zapolskiy reports:
> >
> > commit a7259df76702 ("memblock: make memblock_find_in_range method private")
> > invokes a kernel panic while running kmemleak on OF platforms with nomaped
> > regions:
> >
> > Unable to handle kernel paging request at virtual address fff000021e00000
> > [...]
> > scan_block+0x64/0x170
> > scan_gray_list+0xe8/0x17c
> > kmemleak_scan+0x270/0x514
> > kmemleak_write+0x34c/0x4ac
> >
> > Indeed, NOMAP regions don't have linear map entries so an attempt to scan
> > these areas would fault.
> >
> > Prevent such faults by excluding NOMAP regions from kmemleak.
> >
> > Link: https://lore.kernel.org/all/[email protected]
> > Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> > Signed-off-by: Mike Rapoport <[email protected]>
> > Tested-by: Vladimir Zapolskiy <[email protected]>
>
> Acked-by: Catalin Marinas <[email protected]>

Thanks!

I'm going to take it via the memblock tree if that's fine with everybody.

--
Sincerely yours,
Mike.

2021-10-19 04:24:13

by Anshuman Khandual

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak



On 10/13/21 11:17 AM, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Vladimir Zapolskiy reports:
>
> commit a7259df76702 ("memblock: make memblock_find_in_range method private")
> invokes a kernel panic while running kmemleak on OF platforms with nomaped
> regions:
>
> Unable to handle kernel paging request at virtual address fff000021e00000
> [...]
> scan_block+0x64/0x170
> scan_gray_list+0xe8/0x17c
> kmemleak_scan+0x270/0x514
> kmemleak_write+0x34c/0x4ac
>
> Indeed, NOMAP regions don't have linear map entries so an attempt to scan
> these areas would fault.
>
> Prevent such faults by excluding NOMAP regions from kmemleak.
>
> Link: https://lore.kernel.org/all/[email protected]
> Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Vladimir Zapolskiy <[email protected]>
> ---
> mm/memblock.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 184dcd2e5d99..5c3503c98b2f 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -936,7 +936,12 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
> */
> int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
> {
> - return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
> + int ret = memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
> +
> + if (!ret)
> + kmemleak_free_part_phys(base, size);
> +
> + return ret;
> }
>
> /**
>
> base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc
>

Reviewed-by: Anshuman Khandual <[email protected]>

A small nit though.

Just wondering: shouldn't the comment for memblock_mark_nomap() be
updated (or a comment added inside the function) to explain why
kmemleak_free_part_phys() is called, i.e. that a scan of such memory
ranges would fault because they have no linear mapping?

2021-10-19 05:47:59

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

Hi Qian,

On Mon, Oct 18, 2021 at 11:55:40PM -0400, Qian Cai wrote:
>
>
> On 10/13/2021 1:47 AM, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Vladimir Zapolskiy reports:
> >
> > commit a7259df76702 ("memblock: make memblock_find_in_range method private")
> > invokes a kernel panic while running kmemleak on OF platforms with nomaped
> > regions:
> >
> > Unable to handle kernel paging request at virtual address fff000021e00000
> > [...]
> > scan_block+0x64/0x170
> > scan_gray_list+0xe8/0x17c
> > kmemleak_scan+0x270/0x514
> > kmemleak_write+0x34c/0x4ac
> >
> > Indeed, NOMAP regions don't have linear map entries so an attempt to scan
> > these areas would fault.
> >
> > Prevent such faults by excluding NOMAP regions from kmemleak.
> >
> > Link: https://lore.kernel.org/all/[email protected]
> > Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> > Signed-off-by: Mike Rapoport <[email protected]>
> > Tested-by: Vladimir Zapolskiy <[email protected]>
>
> Mike, reverting this commit on the top of today's linux-next fixed the early booting hang
> on an arm64 server with kmemleak. Even with "earlycon", it could only print out those
> lines.
>
> EFI stub: Booting Linux Kernel...
> EFI stub: EFI_RNG_PROTOCOL unavailable
> EFI stub: ERROR: FIRMWARE BUG: kernel image not aligned on 128k boundary
> EFI stub: ERROR: FIRMWARE BUG: Image BSS overlaps adjacent EFI memory region
> EFI stub: Using DTB from configuration table
> EFI stub: Exiting boot services…
>
> I could help to confirm if it hangs right in the early boot somewhere if needed.

The kernel config and a log of a working kernel would help to start with.

> start_kernel()
> setup_arch()
> paging_init()
> map_mem()
> memblock_mark_nomap(

So we have kmemleak_free_part_phys() here.

Catalin, any ideas?

>
> > ---
> > mm/memblock.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 184dcd2e5d99..5c3503c98b2f 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -936,7 +936,12 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
> > */
> > int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
> > {
> > - return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
> > + int ret = memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
> > +
> > + if (!ret)
> > + kmemleak_free_part_phys(base, size);
> > +
> > + return ret;
> > }
> >
> > /**
> >
> > base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc
> >

--
Sincerely yours,
Mike.

2021-10-19 11:39:37

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Tue, Oct 19, 2021 at 08:45:49AM +0300, Mike Rapoport wrote:
> On Mon, Oct 18, 2021 at 11:55:40PM -0400, Qian Cai wrote:
> > On 10/13/2021 1:47 AM, Mike Rapoport wrote:
> > > From: Mike Rapoport <[email protected]>
> > >
> > > Vladimir Zapolskiy reports:
> > >
> > > commit a7259df76702 ("memblock: make memblock_find_in_range method private")
> > > invokes a kernel panic while running kmemleak on OF platforms with nomaped
> > > regions:
> > >
> > > Unable to handle kernel paging request at virtual address fff000021e00000
> > > [...]
> > > scan_block+0x64/0x170
> > > scan_gray_list+0xe8/0x17c
> > > kmemleak_scan+0x270/0x514
> > > kmemleak_write+0x34c/0x4ac
> > >
> > > Indeed, NOMAP regions don't have linear map entries so an attempt to scan
> > > these areas would fault.
> > >
> > > Prevent such faults by excluding NOMAP regions from kmemleak.
> > >
> > > Link: https://lore.kernel.org/all/[email protected]
> > > Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> > > Signed-off-by: Mike Rapoport <[email protected]>
> > > Tested-by: Vladimir Zapolskiy <[email protected]>
> >
> > Mike, reverting this commit on the top of today's linux-next fixed the early booting hang
> > on an arm64 server with kmemleak. Even with "earlycon", it could only print out those
> > lines.
> >
> > EFI stub: Booting Linux Kernel...
> > EFI stub: EFI_RNG_PROTOCOL unavailable
> > EFI stub: ERROR: FIRMWARE BUG: kernel image not aligned on 128k boundary
> > EFI stub: ERROR: FIRMWARE BUG: Image BSS overlaps adjacent EFI memory region
> > EFI stub: Using DTB from configuration table
> > EFI stub: Exiting boot services…
> >
> > I could help to confirm if it hangs right in the early boot somewhere if needed.
>
> The kernel config and a log of working kernel would help to start with.

I don't think there's much in the log other than the EFI stub above.

> > start_kernel()
> > setup_arch()
> > paging_init()
> > map_mem()
> > memblock_mark_nomap(

Is this an actual trace? It would be good to know where exactly it got
stuck.

> So we have kmemleak_free_part_phys() here.

I wonder whether the memblock_mark_nomap() here is too early for
kmemleak. We don't have the linear map created, though it shouldn't be
an issue as the kernel sections are mapped. Also I think
delete_object_part() in kmemleak.c would bail out early as there
shouldn't be any prior memblock_alloc for this range.

--
Catalin

2021-10-19 15:10:38

by Qian Cai

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak



On 10/19/2021 7:37 AM, Catalin Marinas wrote:
>>> I could help to confirm if it hangs right in the early boot somewhere if needed.
>>
>> The kernel config and a log of working kernel would help to start with.

http://lsbug.org/tmp/

>
> I don't think there's much in the log other than the EFI stub above.
>
>>> start_kernel()
>>> setup_arch()
>>> paging_init()
>>> map_mem()
>>> memblock_mark_nomap(
>
> Is this actual trace? It would be good to know where exactly it got
> stuck.

No, I have not confirmed anything yet. It is going to take a while to
figure out the exact location of the hang, since even the early console
was not initialized yet. Any suggestions on how to debug this case?

>
>> So we have kmemleak_free_part_phys() here.
>
> I wonder whether the memblock_mark_nomap() here is too early for
> kmemleak. We don't have the linear map created, though it shouldn't be
> an issue as the kernel sections are mapped. Also I think
> delete_object_part() in kmemleak.c would bail out early as there
> shouldn't be any prior memblock_alloc for this range.
>

2021-10-19 15:54:36

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Tue, Oct 19, 2021 at 11:06:11AM -0400, Qian Cai wrote:
> On 10/19/2021 7:37 AM, Catalin Marinas wrote:
> >>> I could help to confirm if it hangs right in the early boot somewhere if needed.
> >>
> >> The kernel config and a log of working kernel would help to start with.
>
> http://lsbug.org/tmp/

Thanks. I guess the log here is with Mike's patch reverted.

> > I don't think there's much in the log other than the EFI stub above.
> >
> >>> start_kernel()
> >>> setup_arch()
> >>> paging_init()
> >>> map_mem()
> >>> memblock_mark_nomap(
> >
> > Is this actual trace? It would be good to know where exactly it got
> > stuck.
>
> No, I did not confirm anything yet. There is going to take a while to
> figure out the exactly location that hang since even the early console
> was not initialized yet. Any suggestion on how to debug in this case?

Try "earlycon=pl011,mmio32,0x12600000" on the kernel command line
and hopefully we get some early log.

--
Catalin

2021-10-19 18:04:13

by Qian Cai

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak



On 10/19/2021 11:53 AM, Catalin Marinas wrote:
> Thanks. I guess the log here is with the Mike's patch reverted.

Yes.

> Try "earlycon=pl011,mmio32,0x12600000" on the kernel command line
> and hopefully we get some early log.

Thanks for the suggestion, Catalin. I did not realize that a manually
specified "earlycon" starts earlier than a bare "earlycon", because it
does not defer to ACPI to populate the parameters. Anyway,

[ 0.000000][ T0] Booting Linux on physical CPU 0x0000000000 [0x503f0002]
[ 0.000000][ T0] Linux version 5.15.0-rc6-next-20211019+ (root@admin5) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #104 SMP Tue Oct 19 17:36:17 UTC 2021
[ 0.000000][ T0] earlycon: pl11 at MMIO32 0x0000000012600000 (options '')
[ 0.000000][ T0] printk: bootconsole [pl11] enabled
[ 0.000000][ T0] efi: Getting UEFI parameters from /chosen in DT:
[ 0.000000][ T0] efi: System Table : 0x0000009ff7de0018
[ 0.000000][ T0] efi: MemMap Address : 0x0000009fe6dae018
[ 0.000000][ T0] efi: MemMap Size : 0x0000000000000600
[ 0.000000][ T0] efi: MemMap Desc. Size : 0x0000000000000030
[ 0.000000][ T0] efi: MemMap Desc. Version : 0x0000000000000001
[ 0.000000][ T0] efi: EFI v2.70 by American Megatrends
[ 0.000000][ T0] efi: ACPI 2.0=0x9ff5b40000 SMBIOS 3.0=0x9ff686fd98 ESRT=0x9ff1d18298 MEMRESERVE=0x9fe6dacd98
[ 0.000000][ T0] efi: Processing EFI memory map:
[ 0.000000][ T0] efi: 0x000090000000-0x000091ffffff [Conventional| | | | | | | | | | |WB|WT|WC|UC]
[ 0.000000][ T0] efi: 0x000092000000-0x0000928fffff [Runtime Data|RUN| | | | | | | | | |WB|WT|WC|UC]
[ 0.000000][ T0] ------------[ cut here ]------------
[ 0.000000][ T0] kernel BUG at mm/kmemleak.c:1140!
[ 0.000000][ T0] Internal error: Oops - BUG: 0 [#1] SMP
[ 0.000000][ T0] Modules linked in:
[ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc6-next-20211019+ #104
[ 0.000000][ T0] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000][ T0] pc : kmemleak_free_part_phys+0x64/0x8c
[ 0.000000][ T0] lr : kmemleak_free_part_phys+0x38/0x8c
[ 0.000000][ T0] sp : ffff800011eafbc0
[ 0.000000][ T0] x29: ffff800011eafbc0 x28: 1fffff7fffb41c0d x27: fffffbfffda0e068
[ 0.000000][ T0] x26: 0000000092000000 x25: 1ffff000023d5f94 x24: ffff800011ed84d0
[ 0.000000][ T0] x23: ffff800011ed84c0 x22: ffff800011ed83d8 x21: 0000000000900000
[ 0.000000][ T0] x20: ffff800011782000 x19: 0000000092000000 x18: ffff800011ee0730
[ 0.000000][ T0] x17: 0000000000000000 x16: 0000000000000000 x15: 1ffff0000233252c
[ 0.000000][ T0] x14: ffff800019a905a0 x13: 0000000000000001 x12: ffff7000023d5ed7
[ 0.000000][ T0] x11: 1ffff000023d5ed6 x10: ffff7000023d5ed6 x9 : dfff800000000000
[ 0.000000][ T0] x8 : ffff800011eaf6b7 x7 : 0000000000000001 x6 : ffff800011eaf6b0
[ 0.000000][ T0] x5 : 00008ffffdc2a12a x4 : ffff7000023d5ed7 x3 : 1ffff000023dbf99
[ 0.000000][ T0] x2 : 1ffff000022f0463 x1 : 0000000000000000 x0 : ffffffffffffffff
[ 0.000000][ T0] Call trace:
[ 0.000000][ T0] kmemleak_free_part_phys+0x64/0x8c
[ 0.000000][ T0] memblock_mark_nomap+0x5c/0x78
[ 0.000000][ T0] reserve_regions+0x294/0x33c
[ 0.000000][ T0] efi_init+0x2d0/0x490
[ 0.000000][ T0] setup_arch+0x80/0x138
[ 0.000000][ T0] start_kernel+0xa0/0x3ec
[ 0.000000][ T0] __primary_switched+0xc0/0xc8
[ 0.000000][ T0] Code: 34000041 97d526e7 f9418e80 36000040 (d4210000)
[ 0.000000][ T0] random: get_random_bytes called from print_oops_end_marker+0x34/0x80 with crng_init=0
[ 0.000000][ T0] ---[ end trace 0000000000000000 ]---
[ 0.000000][ T0] Kernel panic - not syncing: Oops - BUG: Fatal exception
[ 0.000000][ T0] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---

I could not quite figure out where this BUG() was triggered, and I did
not see anything obvious after checking the DEBUG_VIRTUAL code, but it
did point to the kmemleak_free_part() line. I verified that phys == 0x92000000,

void __ref kmemleak_free_part_phys(phys_addr_t phys, size_t size)
{
	if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn)
		kmemleak_free_part(__va(phys), size);
}

As you can see, the efi=debug output above is truncated. On a working
boot, the whole thing usually looks like this:

[ 0.000000] efi: Processing EFI memory map:
[ 0.000000] efi: 0x000010540000-0x00001054ffff [Memory Mapped I/O |RUN| | | | | | | | | | | ]
[ 0.000000] efi: 0x000090000000-0x00009007ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000090080000-0x000091ebffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000091ec0000-0x000091ffffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000092000000-0x0000928fffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000092900000-0x0000fffb7fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x0000fffb8000-0x0000fffbffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x0000fffc0000-0x0000ffffffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000880000000-0x00088ae4afff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x00088ae4b000-0x00088fffffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000890000000-0x000fffffffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x008800000000-0x009f81089fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009f8108a000-0x009f82dabfff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009f82dac000-0x009fe6dabfff [Loader Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fe6dac000-0x009fe6dacfff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fe6dad000-0x009fe6dadfff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fe6dae000-0x009fe6db2fff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fe6db3000-0x009fe6f7bfff [Loader Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fe6f7c000-0x009ff287cfff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff287d000-0x009ff3293fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff3294000-0x009ff5af0fff [Boot Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff5af1000-0x009ff5b2ffff [Reserved | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff5b30000-0x009ff5b4ffff [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff5b50000-0x009ff5baffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff5bb0000-0x009ff5bbffff [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff5bc0000-0x009ff7deffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff7df0000-0x009ff7e5ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff7e60000-0x009ff7ffffff [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff8000000-0x009ff801efff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff801f000-0x009ff801ffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009ff8020000-0x009fff9fffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fffa00000-0x009fffbfffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fffc00000-0x009fffdbffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fffdc0000-0x009fffdcffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fffdd0000-0x009fffdd4fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x009fffdd5000-0x009fffffffff [Boot Data | | | | | | | | |WB|WT|WC|UC]

2021-10-19 18:34:36

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
>
>
> On 10/19/2021 11:53 AM, Catalin Marinas wrote:
> > Thanks. I guess the log here is with the Mike's patch reverted.
>
> Yes.
>
> > Try "earlycon=pl011,mmio32,0x12600000" on the kernel command line
> > and hopefully we get some early log.
>
> Thanks for the suggestion, Catalin. I did not realize that a
> manually-provided "earlycon" started earlier than just "earlycon"
> and not defer to ACPI to populate parameters. Anyway,
>
> [ 0.000000][ T0] Booting Linux on physical CPU 0x0000000000 [0x503f0002]
> [ 0.000000][ T0] Linux version 5.15.0-rc6-next-20211019+ (root@admin5) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #104 SMP Tue Oct 19 17:36:17 UTC 2021
> [ 0.000000][ T0] earlycon: pl11 at MMIO32 0x0000000012600000 (options '')
> [ 0.000000][ T0] printk: bootconsole [pl11] enabled
> [ 0.000000][ T0] efi: Getting UEFI parameters from /chosen in DT:
> [ 0.000000][ T0] efi: System Table : 0x0000009ff7de0018
> [ 0.000000][ T0] efi: MemMap Address : 0x0000009fe6dae018
> [ 0.000000][ T0] efi: MemMap Size : 0x0000000000000600
> [ 0.000000][ T0] efi: MemMap Desc. Size : 0x0000000000000030
> [ 0.000000][ T0] efi: MemMap Desc. Version : 0x0000000000000001
> [ 0.000000][ T0] efi: EFI v2.70 by American Megatrends
> [ 0.000000][ T0] efi: ACPI 2.0=0x9ff5b40000 SMBIOS 3.0=0x9ff686fd98 ESRT=0x9ff1d18298 MEMRESERVE=0x9fe6dacd98
> [ 0.000000][ T0] efi: Processing EFI memory map:
> [ 0.000000][ T0] efi: 0x000090000000-0x000091ffffff [Conventional| | | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000][ T0] efi: 0x000092000000-0x0000928fffff [Runtime Data|RUN| | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000][ T0] ------------[ cut here ]------------
> [ 0.000000][ T0] kernel BUG at mm/kmemleak.c:1140!
> [ 0.000000][ T0] Internal error: Oops - BUG: 0 [#1] SMP
> [ 0.000000][ T0] Modules linked in:
> [ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc6-next-20211019+ #104
> [ 0.000000][ T0] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 0.000000][ T0] pc : kmemleak_free_part_phys+0x64/0x8c
> [ 0.000000][ T0] lr : kmemleak_free_part_phys+0x38/0x8c
> [ 0.000000][ T0] sp : ffff800011eafbc0
> [ 0.000000][ T0] x29: ffff800011eafbc0 x28: 1fffff7fffb41c0d x27: fffffbfffda0e068
> [ 0.000000][ T0] x26: 0000000092000000 x25: 1ffff000023d5f94 x24: ffff800011ed84d0
> [ 0.000000][ T0] x23: ffff800011ed84c0 x22: ffff800011ed83d8 x21: 0000000000900000
> [ 0.000000][ T0] x20: ffff800011782000 x19: 0000000092000000 x18: ffff800011ee0730
> [ 0.000000][ T0] x17: 0000000000000000 x16: 0000000000000000 x15: 1ffff0000233252c
> [ 0.000000][ T0] x14: ffff800019a905a0 x13: 0000000000000001 x12: ffff7000023d5ed7
> [ 0.000000][ T0] x11: 1ffff000023d5ed6 x10: ffff7000023d5ed6 x9 : dfff800000000000
> [ 0.000000][ T0] x8 : ffff800011eaf6b7 x7 : 0000000000000001 x6 : ffff800011eaf6b0
> [ 0.000000][ T0] x5 : 00008ffffdc2a12a x4 : ffff7000023d5ed7 x3 : 1ffff000023dbf99
> [ 0.000000][ T0] x2 : 1ffff000022f0463 x1 : 0000000000000000 x0 : ffffffffffffffff
> [ 0.000000][ T0] Call trace:
> [ 0.000000][ T0] kmemleak_free_part_phys+0x64/0x8c
> [ 0.000000][ T0] memblock_mark_nomap+0x5c/0x78
> [ 0.000000][ T0] reserve_regions+0x294/0x33c
> [ 0.000000][ T0] efi_init+0x2d0/0x490
> [ 0.000000][ T0] setup_arch+0x80/0x138
> [ 0.000000][ T0] start_kernel+0xa0/0x3ec
> [ 0.000000][ T0] __primary_switched+0xc0/0xc8
> [ 0.000000][ T0] Code: 34000041 97d526e7 f9418e80 36000040 (d4210000)
> [ 0.000000][ T0] random: get_random_bytes called from print_oops_end_marker+0x34/0x80 with crng_init=0
> [ 0.000000][ T0] ---[ end trace 0000000000000000 ]---
> [ 0.000000][ T0] Kernel panic - not syncing: Oops - BUG: Fatal exception
> [ 0.000000][ T0] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
>
> I did not quite figure out where this BUG() was triggered and I did not

This is from here:
arch/arm64/include/asm/memory.h:

#define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })

kmemleak_free_part_phys() calls __va(), which uses PHYS_OFFSET, and all of
this happens before memstart_addr is set, so the VM_BUG_ON() fires.

I'll try to see how this can be untangled...
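
To make the ordering problem concrete, here is a minimal userspace model
of it, not kernel code. It assumes that memstart_addr starts out as -1
(an odd value, consistent with x0 in the register dump above) and that
__va() computes (phys - PHYS_OFFSET) | PAGE_OFFSET. MODEL_PAGE_OFFSET,
the base address passed to model_set_memstart() and all model_* helpers
are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only: the names mirror the arm64 ones, but nothing
 * here is kernel code. MODEL_PAGE_OFFSET is an illustrative value that
 * is not taken from the thread.
 */
#define MODEL_PAGE_OFFSET 0xffff000000000000ULL

/* Assumption: memstart_addr is -1 until the memory layout is known. */
static int64_t memstart_addr = -1;

/* Mirrors the condition tested by VM_BUG_ON() inside PHYS_OFFSET. */
static bool phys_offset_ready(void)
{
	return (memstart_addr & 1) == 0;
}

/*
 * Model of __va(): returns false instead of hitting BUG() when the
 * translation is requested before memstart_addr has been set.
 */
static bool model_phys_to_virt(uint64_t phys, uint64_t *virt)
{
	if (!phys_offset_ready())
		return false;

	*virt = (phys - (uint64_t)memstart_addr) | MODEL_PAGE_OFFSET;
	return true;
}

/* Stand-in for the point in boot where the kernel sets memstart_addr. */
static void model_set_memstart(uint64_t base)
{
	memstart_addr = (int64_t)base;
}
```

With memstart_addr still at -1, model_phys_to_virt(0x92000000, &v)
returns false, which is the case the kernel turns into a BUG. After
model_set_memstart() has been called with an even base address, the same
call succeeds.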

> see anything obviously after checking DEBUG_VIRTUAL code, but it did
> finger to the kmemleak_free_part() line. I verified that phys == 0x92000000d,
>
> void __ref kmemleak_free_part_phys(phys_addr_t phys, size_t size)
> {
> if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn)
> kmemleak_free_part(__va(phys), size);
> }
>
> As you can see the above efi=debug information was truncated. Usually
> on a working boot the whole thing is:
>
> [ 0.000000] efi: Processing EFI memory map:
> [ 0.000000] efi: 0x000010540000-0x00001054ffff [Memory Mapped I/O |RUN| | | | | | | | | | | ]
> [ 0.000000] efi: 0x000090000000-0x00009007ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000090080000-0x000091ebffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000091ec0000-0x000091ffffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000092000000-0x0000928fffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000092900000-0x0000fffb7fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x0000fffb8000-0x0000fffbffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x0000fffc0000-0x0000ffffffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000880000000-0x00088ae4afff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x00088ae4b000-0x00088fffffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x000890000000-0x000fffffffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x008800000000-0x009f81089fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009f8108a000-0x009f82dabfff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009f82dac000-0x009fe6dabfff [Loader Code | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fe6dac000-0x009fe6dacfff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fe6dad000-0x009fe6dadfff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fe6dae000-0x009fe6db2fff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fe6db3000-0x009fe6f7bfff [Loader Code | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fe6f7c000-0x009ff287cfff [Boot Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff287d000-0x009ff3293fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff3294000-0x009ff5af0fff [Boot Code | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff5af1000-0x009ff5b2ffff [Reserved | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff5b30000-0x009ff5b4ffff [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff5b50000-0x009ff5baffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff5bb0000-0x009ff5bbffff [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff5bc0000-0x009ff7deffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff7df0000-0x009ff7e5ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff7e60000-0x009ff7ffffff [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff8000000-0x009ff801efff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff801f000-0x009ff801ffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009ff8020000-0x009fff9fffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fffa00000-0x009fffbfffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fffc00000-0x009fffdbffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fffdc0000-0x009fffdcffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fffdd0000-0x009fffdd4fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
> [ 0.000000] efi: 0x009fffdd5000-0x009fffffffff [Boot Data | | | | | | | | |WB|WT|WC|UC]

--
Sincerely yours,
Mike.

2021-10-20 07:41:10

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> >
> > On 10/19/2021 11:53 AM, Catalin Marinas wrote:
> > > Thanks. I guess the log here is with the Mike's patch reverted.
> >
> > Yes.
> >
> > > Try "earlycon=pl011,mmio32,0x12600000" on the kernel command line
> > > and hopefully we get some early log.
> >
> > Thanks for the suggestion, Catalin. I did not realize that a
> > manually-provided "earlycon" started earlier than just "earlycon"
> > and not defer to ACPI to populate parameters. Anyway,
> >
> > [ 0.000000][ T0] Booting Linux on physical CPU 0x0000000000 [0x503f0002]
> > [ 0.000000][ T0] Linux version 5.15.0-rc6-next-20211019+ (root@admin5) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #104 SMP Tue Oct 19 17:36:17 UTC 2021
> > [ 0.000000][ T0] earlycon: pl11 at MMIO32 0x0000000012600000 (options '')
> > [ 0.000000][ T0] printk: bootconsole [pl11] enabled
> > [ 0.000000][ T0] efi: Getting UEFI parameters from /chosen in DT:
> > [ 0.000000][ T0] efi: System Table : 0x0000009ff7de0018
> > [ 0.000000][ T0] efi: MemMap Address : 0x0000009fe6dae018
> > [ 0.000000][ T0] efi: MemMap Size : 0x0000000000000600
> > [ 0.000000][ T0] efi: MemMap Desc. Size : 0x0000000000000030
> > [ 0.000000][ T0] efi: MemMap Desc. Version : 0x0000000000000001
> > [ 0.000000][ T0] efi: EFI v2.70 by American Megatrends
> > [ 0.000000][ T0] efi: ACPI 2.0=0x9ff5b40000 SMBIOS 3.0=0x9ff686fd98 ESRT=0x9ff1d18298 MEMRESERVE=0x9fe6dacd98
> > [ 0.000000][ T0] efi: Processing EFI memory map:
> > [ 0.000000][ T0] efi: 0x000090000000-0x000091ffffff [Conventional| | | | | | | | | | |WB|WT|WC|UC]
> > [ 0.000000][ T0] efi: 0x000092000000-0x0000928fffff [Runtime Data|RUN| | | | | | | | | |WB|WT|WC|UC]
> > [ 0.000000][ T0] ------------[ cut here ]------------
> > [ 0.000000][ T0] kernel BUG at mm/kmemleak.c:1140!
> > [ 0.000000][ T0] Internal error: Oops - BUG: 0 [#1] SMP
> >
> > I did not quite figure out where this BUG() was triggered and I did not
>
> This is from here:
> arch/arm64/include/asm/memory.h:
>
> #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
>
> kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> happens before memstart_addr is set.
>
> I'll try to see how this can be untangled...

This late in the cycle I can only think of reverting the kmemleak waiver
in memblock_mark_nomap() and putting it into
early_init_dt_alloc_reserved_memory_arch(), which is the only user that
sets MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable"
memory reported by firmware as NOMAP.

Thoughts?

--
Sincerely yours,
Mike.

2021-10-20 08:21:00

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 20, 2021 at 10:38:23AM +0300, Mike Rapoport wrote:
> On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> > On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> > >
> > > I did not quite figure out where this BUG() was triggered and I did not
> >
> > This is from here:
> > arch/arm64/include/asm/memory.h:
> >
> > #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
> >
> > kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> > happens before memstart_addr is set.
> >
> > I'll try to see how this can be untangled...
>
> This late in the cycle I can only think of reverting the kmemleak waiver from
> memblock_mark_nomap() and putting it in
> early_init_dt_alloc_reserved_memory_arch(), which is the only user that sets
> MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable" memory
> reported by firmware as NOMAP.

It makes sense, there aren't many places where nomap is called.

I think arch_reserve_mem_area() called from acpi_table_upgrade() also
follows a memblock allocation. But I'd call kmemleak in
acpi_table_upgrade() directly rather than in the arch back-end.

Regarding which callback, I think kmemleak_ignore_phys() is better
suited here since kmemleak still keeps track of the object but won't
scan it.
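
To illustrate the difference I have in mind, here is a toy model in plain
userspace C. It is not kmemleak code and every toy_* name is made up. It
only assumes the behaviour described above: an ignored object stays in the
table but its contents are skipped, whereas a freed object is no longer
known at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* One tracked allocation in the toy model. */
struct toy_object {
	size_t size;
	bool tracked;	/* still known to the leak detector */
	bool scan;	/* contents are read when scanning */
};

/* Register an allocation: tracked and scanned by default. */
static void toy_alloc(struct toy_object *obj, size_t size)
{
	obj->size = size;
	obj->tracked = true;
	obj->scan = true;
}

/* Models the "ignore" semantics: keep the object, never read it. */
static void toy_ignore(struct toy_object *obj)
{
	obj->scan = false;
}

/* Models freeing the whole range: the object is forgotten entirely. */
static void toy_free(struct toy_object *obj)
{
	obj->tracked = false;
	obj->scan = false;
}

/* Number of bytes a scan pass would dereference for this object. */
static size_t toy_bytes_scanned(const struct toy_object *obj)
{
	return (obj->tracked && obj->scan) ? obj->size : 0;
}
```

In both cases the scan reads nothing from the range; the ignore variant just
keeps the bookkeeping entry around.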

--
Catalin

2021-10-20 08:43:50

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 20, 2021 at 09:18:46AM +0100, Catalin Marinas wrote:
> On Wed, Oct 20, 2021 at 10:38:23AM +0300, Mike Rapoport wrote:
> > On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> > > On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> > > >
> > > > I did not quite figure out where this BUG() was triggered and I did not
> > >
> > > This is from here:
> > > arch/arm64/include/asm/memory.h:
> > >
> > > #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
> > >
> > > kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> > > happens before memstart_addr is set.
> > >
> > > I'll try to see how this can be untangled...
> >
> > This late in the cycle I can only think of reverting the kmemleak waiver from
> > memblock_mark_nomap() and putting it in
> > early_init_dt_alloc_reserved_memory_arch(), which is the only user that sets
> > MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable" memory
> > reported by firmware as NOMAP.
>
> It makes sense, there aren't many places where nomap is called.
>
> I think arch_reserve_mem_area() called from acpi_table_upgrade() also
> follows a memblock allocation. But I'd call kmemleak in
> acpi_table_upgrade() directly rather than in the arch back-end.

Hmm, not sure this is correct for x86. I don't see why it can't track the
memory allocated in acpi_table_upgrade().

> Regarding which callback, I think kmemleak_ignore_phys() is better
> suited here since kmemleak still keeps track of the object but won't
> scan it.

Ok.

--
Sincerely yours,
Mike.

2021-10-20 09:37:44

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 20, 2021 at 11:42:28AM +0300, Mike Rapoport wrote:
> On Wed, Oct 20, 2021 at 09:18:46AM +0100, Catalin Marinas wrote:
> > On Wed, Oct 20, 2021 at 10:38:23AM +0300, Mike Rapoport wrote:
> > > On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> > > > On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> > > > >
> > > > > I did not quite figure out where this BUG() was triggered and I did not
> > > >
> > > > This is from here:
> > > > arch/arm64/include/asm/memory.h:
> > > >
> > > > #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
> > > >
> > > > kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> > > > happens before memstart_addr is set.
> > > >
> > > > I'll try to see how this can be untangled...
> > >
> > > This late in the cycle I can only think of reverting the kmemleak waiver from
> > > memblock_mark_nomap() and putting it in
> > > early_init_dt_alloc_reserved_memory_arch(), which is the only user that sets
> > > MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable" memory
> > > reported by firmware as NOMAP.
> >
> > It makes sense, there aren't many places where nomap is called.
> >
> > I think arch_reserve_mem_area() called from acpi_table_upgrade() also
> > follows a memblock allocation. But I'd call kmemleak in
> > acpi_table_upgrade() directly rather than in the arch back-end.
>
> Hmm, not sure this is correct for x86. I don't see why it can't track the
> memory allocated in acpi_table_upgrade().

Kmemleak still tracks it after an ignore but it won't be scanned. I
don't think this memory contains pointers to virtual addresses.

--
Catalin

2021-10-20 10:14:34

by Catalin Marinas

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 20, 2021 at 10:38:23AM +0300, Mike Rapoport wrote:
> On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> > On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> > >
> > > I did not quite figure out where this BUG() was triggered and I did not
> >
> > This is from here:
> > arch/arm64/include/asm/memory.h:
> >
> > #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
> >
> > kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> > happens before memstart_addr is set.
> >
> > I'll try to see how this can be untangled...
>
> This late in the cycle I can only think of reverting the kmemleak waiver from
> memblock_mark_nomap() and putting it in
> early_init_dt_alloc_reserved_memory_arch(), which is the only user that sets
> MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable" memory
> reported by firmware as NOMAP.

BTW, would something like this work:

diff --git a/mm/memblock.c b/mm/memblock.c
index aa87ff5ae2a4..7e67378a8ddf 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -939,7 +939,7 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
{
int ret = memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);

- if (!ret)
+ if (!ret && memblock_is_region_reserved(base, size))
kmemleak_free_part_phys(base, size);

return ret;

--
Catalin

2021-10-20 10:42:13

by Mike Rapoport

Subject: Re: [PATCH] memblock: exclude NOMAP regions from kmemleak

On Wed, Oct 20, 2021 at 11:13:06AM +0100, Catalin Marinas wrote:
> On Wed, Oct 20, 2021 at 10:38:23AM +0300, Mike Rapoport wrote:
> > On Tue, Oct 19, 2021 at 09:33:11PM +0300, Mike Rapoport wrote:
> > > On Tue, Oct 19, 2021 at 01:59:22PM -0400, Qian Cai wrote:
> > > >
> > > > I did not quite figure out where this BUG() was triggered and I did not
> > >
> > > This is from here:
> > > arch/arm64/include/asm/memory.h:
> > >
> > > #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
> > >
> > > kmemleak_free_part_phys() does __va() which uses PHYS_OFFSET and all this
> > > happens before memstart_addr is set.
> > >
> > > I'll try to see how this can be untangled...
> >
> > This late in the cycle I can only think of reverting the kmemleak waiver from
> > memblock_mark_nomap() and putting it in
> > early_init_dt_alloc_reserved_memory_arch(), which is the only user that sets
> > MEMBLOCK_NOMAP on an allocated chunk rather than marking "unusable" memory
> > reported by firmware as NOMAP.
>
> BTW, would something like this work:
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index aa87ff5ae2a4..7e67378a8ddf 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -939,7 +939,7 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
> {
> int ret = memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
>
> - if (!ret)
> + if (!ret && memblock_is_region_reserved(base, size))
> kmemleak_free_part_phys(base, size);

Apparently it would for the cases we have now.
But it will fail the same way as now if somebody calls memblock_reserve() and
then memblock_mark_nomap() for the same chunk before arm64_memblock_init().

For instance, a slight change of order in efi-init::reserve_regions() will
trigger the same fault... :(
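
To spell out the ordering problem, here is a small userspace model. All
names in it are made up; it only encodes the conditions discussed above, not
the real memblock code.

```c
#include <stdbool.h>

/* State of the early boot sequence in this toy model. */
struct boot_state {
	bool memstart_set;	/* arm64_memblock_init() has already run */
};

enum nomap_result {
	NOMAP_OK_SKIPPED,	/* kmemleak not called, nothing to go wrong */
	NOMAP_OK_FREED,		/* kmemleak called after memstart_addr is valid */
	NOMAP_WOULD_BUG,	/* kmemleak called too early: __va() hits BUG() */
};

/*
 * Models memblock_mark_nomap() with the proposed
 * memblock_is_region_reserved() guard in front of the kmemleak call.
 */
static enum nomap_result model_mark_nomap(const struct boot_state *st,
					  bool region_reserved)
{
	if (!region_reserved)
		return NOMAP_OK_SKIPPED;
	return st->memstart_set ? NOMAP_OK_FREED : NOMAP_WOULD_BUG;
}
```

An unreserved firmware region marked early is skipped, and a reserved chunk
marked after arm64_memblock_init() is fine; those are the cases we have
today. A reserved chunk marked before arm64_memblock_init() still ends up in
NOMAP_WOULD_BUG, which is the ordering I am worried about.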

--
Sincerely yours,
Mike.