2023-01-13 14:56:08

by Borislav Petkov

Subject: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

Hi,

Lemme CC the stackdepot.c commit signers re the below splat.

Lemme know if you need more info.

Thx.

...
[ 0.000000] software IO TLB: area num 4.
[ 0.000000] Memory: 15612528K/16165300K available (12288K kernel code, 116072K rwdata, 5024K rodata, 2412K init, 38220K bss, 552516K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] Stack Depot allocating hash table of 1048576 entries with kvcalloc
[ 0.000000] swapper: vmalloc error: size 8388608, vm_struct allocation failed, mode:0xdc0(GFP_KERNEL|__GFP_ZERO), nodemask=(null)
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-rc3+ #1
[ 0.000000] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
[ 0.000000] Call Trace:
[ 0.000000] <TASK>
[ 0.000000] dump_stack_lvl+0x33/0x46
[ 0.000000] warn_alloc+0xff/0x120
[ 0.000000] ? __get_vm_area_node+0x173/0x180
[ 0.000000] __vmalloc_node_range+0x4d6/0x7d0
[ 0.000000] ? stack_depot_init.cold+0x5e/0x8f
[ 0.000000] ? __kmalloc_large_node+0xde/0x120
[ 0.000000] kvmalloc_node+0xb7/0xd0
[ 0.000000] ? stack_depot_init.cold+0x5e/0x8f
[ 0.000000] stack_depot_init.cold+0x5e/0x8f
[ 0.000000] kmemleak_init+0x17/0x106
[ 0.000000] start_kernel+0x3e8/0x64e
[ 0.000000] secondary_startup_64_no_verify+0xd3/0xdb
[ 0.000000] </TASK>
[ 0.000000] Mem-Info:
[ 0.000000] active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0
slab_reclaimable:0 slab_unreclaimable:25
mapped:0 shmem:0 pagetables:0
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:3903107 free_pcp:0 free_cma:0
[ 0.000000] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:0kB pagetables:0kB sec_pagetables:0kB all_unreclaimable? no
[ 0.000000] Node 0 DMA free:15360kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 0.000000] lowmem_reserve[]: 0 0 0 0
[ 0.000000] Node 0 DMA32 free:2262644kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2534208kB managed:2262644kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 0.000000] lowmem_reserve[]: 0 0 0 0
[ 0.000000] Node 0 Normal free:13334424kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:13615104kB managed:13334780kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 0.000000] lowmem_reserve[]: 0 0 0 0
[ 0.000000] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
[ 0.000000] Node 0 DMA32: 5*4kB (UM) 2*8kB (M) 3*16kB (M) 3*32kB (M) 3*64kB (M) 2*128kB (M) 2*256kB (M) 3*512kB (M) 3*1024kB (M) 2*2048kB (M) 550*4096kB (M) = 2262644kB
[ 0.000000] Node 0 Normal: 6*4kB (UM) 6*8kB (UM) 5*16kB (UM) 0*32kB 0*64kB 2*128kB (UM) 4*256kB (M) 1*512kB (U) 2*1024kB (UM) 1*2048kB (U) 3254*4096kB (M) = 13334424kB
[ 0.000000] 0 total pagecache pages
[ 0.000000] 0 pages in swap cache
[ 0.000000] Free swap = 0kB
[ 0.000000] Total swap = 0kB
[ 0.000000] 4041325 pages RAM
[ 0.000000] 0 pages HighMem/MovableOnly
[ 0.000000] 138129 pages reserved
[ 0.000000] 0 pages hwpoisoned
[ 0.000000] Stack Depot hash table allocation failed, disabling
[ 0.000000] ftrace: allocating 35561 entries in 139 pages
[ 0.000000] ftrace: allocated 139 pages with 4 groups
[ 0.000000] Dynamic Preempt: full



--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
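
The figures in the splat above can be cross-checked with a few lines of Python. This is only a sketch: the 8-byte slot size and 4 KiB page size are assumptions (neither is stated in the thread), and the helper names are made up for illustration. Under those assumptions the requested size matches the entry count exactly, and the buddy free-list line sums to the total the kernel printed.

```python
import re

# Figures copied from the boot log above.
ENTRIES = 1_048_576          # "hash table of 1048576 entries"
REQUESTED_BYTES = 8_388_608  # "vmalloc error: size 8388608"
FREE_PAGES = 3_903_107       # "free:3903107" in Mem-Info
DMA_LINE = ("Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
            "0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB")

# Assumptions, not stated in the thread: 8-byte slots, 4 KiB pages.
SLOT_SIZE = 8
PAGE_SIZE = 4096


def table_bytes(entries, slot_size=SLOT_SIZE):
    """Bytes needed for a table of `entries` slots of `slot_size` bytes."""
    return entries * slot_size


def free_bytes(pages, page_size=PAGE_SIZE):
    """Bytes represented by `pages` free pages."""
    return pages * page_size


def buddy_total_kb(line):
    """Sum the 'count*sizekB' terms of one buddy free-list line."""
    return sum(int(count) * int(size)
               for count, size in re.findall(r"(\d+)\*(\d+)kB", line))


if __name__ == "__main__":
    print("table size matches request:", table_bytes(ENTRIES) == REQUESTED_BYTES)
    print("DMA free list total (kB):", buddy_total_kb(DMA_LINE))
    share = REQUESTED_BYTES / free_bytes(FREE_PAGES)
    print(f"request as a share of free memory: {share:.4%}")
```

If the assumptions hold, the request is a very small fraction of the free memory reported, which fits the "vm_struct allocation failed" wording in the log better than a plain out-of-memory condition would.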


2023-01-13 15:17:05

by Alexander Potapenko

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Fri, Jan 13, 2023 at 3:30 PM Borislav Petkov <[email protected]> wrote:
>
> Hi,
>
> Lemme CC the stackdepot.c commit signers re the below splat.
>
> Lemme know if you need more info.

The kernel commit and config would help.
Is this somehow related to SLUB_TINY? Perhaps stack depot is just
trying to allocate way more memory than allowed?
(I see that Memory: below shows a lot of free pages, but am not sure
how it works with SLUB_TINY).

> Thx.
>
> ...
> [ 0.000000] software IO TLB: area num 4.
> [ 0.000000] Memory: 15612528K/16165300K available (12288K kernel code, 116072K rwdata, 5024K rodata, 2412K init, 3822

2023-01-13 16:28:14

by Borislav Petkov

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Fri, Jan 13, 2023 at 03:57:28PM +0100, Alexander Potapenko wrote:
> The kernel commit

I've pushed my testing branch here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=rc3%2b

"rc3+" is the name.

> and config would help.

attached.

> Is this somehow related to SLUB_TINY?

Don't think so - not present in my .config...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Attachments:
(No filename) (461.00 B)
config-6.2.0-rc3+ (147.20 kB)

2023-01-27 23:17:02

by Borislav Petkov

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

Ok,

was finally able to bisect it. Lemme add everybody to Cc.

Reverting the commit below on top of the latest Linus master fixes the vmalloc error reported here:

https://lore.kernel.org/r/[email protected]

commit 56a61617dd2276cbc56a6c868599716386d70041 (HEAD, refs/bisect/bad)
Author: Zhaoyang Huang <[email protected]>
Date: Thu Oct 27 17:50:24 2022 +0800

mm: use stack_depot for recording kmemleak's backtrace

Using stack_depot to record kmemleak's backtrace which has been
implemented on slub for reducing redundant information.

[[email protected]: fix build - remove now-unused __save_stack_trace()]
[[email protected]: v3]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix v3 layout oddities]
[[email protected]: coding-style cleanups]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zhaoyang Huang <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Cc: ke.wang <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zhaoyang Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-01-28 02:42:05

by Mirsad Todorovac

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On 13. 01. 2023. 15:29, Borislav Petkov wrote:
> Hi,
>
> Lemme CC the stackdepot.c commit signers re the below splat.
>
> Lemme know if you need more info.
>
> Thx.
>
> ...
> [ 0.000000] software IO TLB: area num 4.
> [ 0.000000] Memory: 15612528K/16165300K available (12288K kernel code, 116072K rwdata, 5024K rodata, 2412K init, 38220K bss, 552516K reserved, 0K cma-reserved)
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> [ 0.000000] Stack Depot allocating hash table of 1048576 entries with kvcalloc
> [ 0.000000] swapper: vmalloc error: size 8388608, vm_struct allocation failed, mode:0xdc0(GFP_KERNEL|__GFP_ZERO), nodemask=(null)
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-rc3+ #1
> [ 0.000000] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
> [ 0.000000] Call Trace:
> [ 0.000000] <TASK>
> [ 0.000000] dump_stack_lvl+0x33/0x46
> [ 0.000000] warn_alloc+0xff/0x120
> [ 0.000000] ? __get_vm_area_node+0x173/0x180
> [ 0.000000] __vmalloc_node_range+0x4d6/0x7d0
> [ 0.000000] ? stack_depot_init.cold+0x5e/0x8f
> [ 0.000000] ? __kmalloc_large_node+0xde/0x120
> [ 0.000000] kvmalloc_node+0xb7/0xd0
> [ 0.000000] ? stack_depot_init.cold+0x5e/0x8f
> [ 0.000000] stack_depot_init.cold+0x5e/0x8f
> [ 0.000000] kmemleak_init+0x17/0x106
> [ 0.000000] start_kernel+0x3e8/0x64e
> [ 0.000000] secondary_startup_64_no_verify+0xd3/0xdb
> [ 0.000000] </TASK>
> [ 0.000000] Mem-Info:
> [ 0.000000] active_anon:0 inactive_anon:0 isolated_anon:0
> active_file:0 inactive_file:0 isolated_file:0
> unevictable:0 dirty:0 writeback:0
> slab_reclaimable:0 slab_unreclaimable:25
> mapped:0 shmem:0 pagetables:0
> sec_pagetables:0 bounce:0
> kernel_misc_reclaimable:0
> free:3903107 free_pcp:0 free_cma:0
> [ 0.000000] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:0kB pagetables:0kB sec_pagetables:0kB all_unreclaimable? no
> [ 0.000000] Node 0 DMA free:15360kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 0.000000] lowmem_reserve[]: 0 0 0 0
> [ 0.000000] Node 0 DMA32 free:2262644kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2534208kB managed:2262644kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 0.000000] lowmem_reserve[]: 0 0 0 0
> [ 0.000000] Node 0 Normal free:13334424kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:13615104kB managed:13334780kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 0.000000] lowmem_reserve[]: 0 0 0 0
> [ 0.000000] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
> [ 0.000000] Node 0 DMA32: 5*4kB (UM) 2*8kB (M) 3*16kB (M) 3*32kB (M) 3*64kB (M) 2*128kB (M) 2*256kB (M) 3*512kB (M) 3*1024kB (M) 2*2048kB (M) 550*4096kB (M) = 2262644kB
> [ 0.000000] Node 0 Normal: 6*4kB (UM) 6*8kB (UM) 5*16kB (UM) 0*32kB 0*64kB 2*128kB (UM) 4*256kB (M) 1*512kB (U) 2*1024kB (UM) 1*2048kB (U) 3254*4096kB (M) = 13334424kB
> [ 0.000000] 0 total pagecache pages
> [ 0.000000] 0 pages in swap cache
> [ 0.000000] Free swap = 0kB
> [ 0.000000] Total swap = 0kB
> [ 0.000000] 4041325 pages RAM
> [ 0.000000] 0 pages HighMem/MovableOnly
> [ 0.000000] 138129 pages reserved
> [ 0.000000] 0 pages hwpoisoned
> [ 0.000000] Stack Depot hash table allocation failed, disabling
> [ 0.000000] ftrace: allocating 35561 entries in 139 pages
> [ 0.000000] ftrace: allocated 139 pages with 4 groups
> [ 0.000000] Dynamic Preempt: full

This appears to be a duplicate of the report:
https://lore.kernel.org/linux-mm/[email protected]/raw

Regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


2023-01-28 11:04:00

by Borislav Petkov

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Sat, Jan 28, 2023 at 03:41:50AM +0100, Mirsad Goran Todorovac wrote:
> This appears to be a duplicate of the report:
> https://lore.kernel.org/linux-mm/[email protected]/raw

Yah, looks like

56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")

needs to be reverted.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-01-28 13:56:12

by Thorsten Leemhuis

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On 28.01.23 12:03, Borislav Petkov wrote:
> On Sat, Jan 28, 2023 at 03:41:50AM +0100, Mirsad Goran Todorovac wrote:
>> This appears to be a duplicate of the report:
>> https://lore.kernel.org/linux-mm/[email protected]/raw
>
> Yah, looks like
>
> 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
>
> needs to be reverted.

Unless I'm missing something (which might easily be the case) there is a
patch for that issue in -mm already:

https://lore.kernel.org/all/[email protected]/

Or were two different issues discussed in the thread Mirsad mentioned
above?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2023-01-28 14:26:17

by Borislav Petkov

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Sat, Jan 28, 2023 at 02:55:58PM +0100, Linux kernel regression tracking (Thorsten Leemhuis) wrote:
> On 28.01.23 12:03, Borislav Petkov wrote:
> > On Sat, Jan 28, 2023 at 03:41:50AM +0100, Mirsad Goran Todorovac wrote:
> >> This appears to be a duplicate of the report:
> >> https://lore.kernel.org/linux-mm/[email protected]/raw
> >
> > Yah, looks like
> >
> > 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
> >
> > needs to be reverted.
>
> Unless I'm missing something (which might easily be the case) there is a
> patch for that issue in -mm already:
>
> https://lore.kernel.org/all/[email protected]/
>
> Or were two different issues discussed in the thread Mirsad mentioned
> above?

Probably the same issue. This one fixes the issue on my machine - thanks!

Tested-by: Borislav Petkov (AMD) <[email protected]>

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-01-29 20:52:21

by Andrew Morton

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Sat, 28 Jan 2023 15:26:00 +0100 Borislav Petkov <[email protected]> wrote:

> On Sat, Jan 28, 2023 at 02:55:58PM +0100, Linux kernel regression tracking (Thorsten Leemhuis) wrote:
> > On 28.01.23 12:03, Borislav Petkov wrote:
> > > On Sat, Jan 28, 2023 at 03:41:50AM +0100, Mirsad Goran Todorovac wrote:
> > >> This appears to be a duplicate of the report:
> > >> https://lore.kernel.org/linux-mm/[email protected]/raw
> > >
> > > Yah, looks like
> > >
> > > 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
> > >
> > > needs to be reverted.
> >
> > Unless I'm missing something (which might easily be the case) there is a
> > patch for that issue in -mm already:
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > Or were two different issues discussed in the thread Mirsad mentioned
> > above?
>
> Probably the same issue. This one fixes the issue on my machine - thanks!
>
> Tested-by: Borislav Petkov (AMD) <[email protected]>
>

OK, thanks, I didn't realize this issue was so serious.

I reordered Zhaoyang Huang's series so that "mm: use
stack_depot_early_init for kmemleak" comes ahead of "mm: move
KMEMLEAK's Kconfig items from lib to mm" and I've staged "mm: use
stack_depot_early_init for kmemleak" in the mm-hotfixes branch for
upstream merging in this -rc cycle.


2023-01-29 21:11:47

by Borislav Petkov

Subject: Re: Failure during Stack Depot allocating hash table of 1048576 entries with kvcalloc

On Sun, Jan 29, 2023 at 12:50:14PM -0800, Andrew Morton wrote:
> OK, thanks, I didn't realize this issue was so serious.

Yeah, I wouldn't call it serious - as a precondition, you need KMEMLEAK enabled
which probably shouldn't be run in production. For some reason it was enabled in
that test box's config - that's how I even saw this in the first place...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
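
Since the problem only shows up with KMEMLEAK enabled, one quick way to tell whether a given box is affected is to scan its kernel config. The sketch below is hedged: the sample text is hypothetical (it is not the config attached earlier in the thread), the function name is made up, and CONFIG_DEBUG_KMEMLEAK is assumed to be the option that "KMEMLEAK enabled" refers to.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names set to 'y' or 'm' in .config text."""
    enabled = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value in ("y", "m"):
            enabled.add(name)
    return enabled


# Hypothetical sample, for illustration only.
SAMPLE = """\
CONFIG_DEBUG_KMEMLEAK=y
# CONFIG_SLUB_TINY is not set
CONFIG_STACKDEPOT=y
"""

if __name__ == "__main__":
    opts = enabled_options(SAMPLE)
    for name in ("CONFIG_DEBUG_KMEMLEAK", "CONFIG_SLUB_TINY"):
        print(name, "enabled" if name in opts else "not enabled")
```

To check a real system, read the text from wherever the running kernel's config is stored and pass it to the same function.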