2021-06-17 11:00:45

by Sander Eikelenboom

[permalink] [raw]
Subject: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

L.S.,

I just tried to upgrade and test the Linux kernel, going from the 5.12 series to 5.13-rc6, on my home server with Xen, but ran into some trouble.

Some VMs boot fine (those with more than 256MB of memory assigned), but the smaller (memory-wise) PVH ones crash during kernel boot due to OOM.
Booting VMs with a 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but dom0 has more memory assigned, so that is not unexpected).

The 5.13-rc6-ish kernel is from a pull of today; I tried both with and without the latest akpm patches, but that
makes no difference.

Below are stack traces from a few of the crashing VMs.

Attached is the kernel .config.

Any pointers?

--
Sander



[ 0.986515] Bluetooth: HCI UART protocol Intel registered
[ 0.986714] Bluetooth: HCI UART protocol AG6XX registered
[ 0.986760] usbcore: registered new interface driver bcm203x
[ 0.986812] usbcore: registered new interface driver bpa10x
[ 0.986854] usbcore: registered new interface driver bfusb
[ 0.986907] usbcore: registered new interface driver btusb
[ 0.986955] usbcore: registered new interface driver ath3k
[ 0.986998] hid: raw HID events driver (C) Jiri Kosina
[ 0.987250] usbcore: registered new interface driver usbhid
[ 0.987283] usbhid: USB HID core driver
[ 0.988461] swapper/0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[ 0.988530] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 0.988572] Call Trace:
[ 0.988587] dump_stack+0x76/0x94
[ 0.988609] dump_header+0x45/0x1d4
[ 0.988627] out_of_memory.cold.44+0x39/0x7e
[ 0.988649] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 0.988678] ? idr_alloc_u32+0x8b/0xc0
[ 0.988697] __alloc_pages+0x318/0x330
[ 0.988715] alloc_page_interleave+0xe/0x70
[ 0.988737] allocate_slab+0x28d/0x330
[ 0.988757] ___slab_alloc+0x41e/0x5c0
[ 0.988777] ? bus_add_driver+0x48/0x1c0
[ 0.988797] ? call_usermodehelper_exec+0xed/0x160
[ 0.988822] ? bus_add_driver+0x48/0x1c0
[ 0.988841] __slab_alloc+0x17/0x30
[ 0.988861] kmem_cache_alloc_trace+0x403/0x440
[ 0.988885] ? si3054_driver_init+0x15/0x15
[ 0.988907] ? rdinit_setup+0x27/0x27
[ 0.988925] bus_add_driver+0x48/0x1c0
[ 0.988944] ? si3054_driver_init+0x15/0x15
[ 0.988963] driver_register+0x66/0xb0
[ 0.988984] ? si3054_driver_init+0x15/0x15
[ 0.989003] do_one_initcall+0x3f/0x1c0
[ 0.989024] kernel_init_freeable+0x21a/0x295
[ 0.989049] ? rest_init+0xa4/0xa4
[ 0.989069] kernel_init+0x5/0xfc
[ 0.989086] ret_from_fork+0x22/0x30
[ 0.989105] Mem-Info:
[ 0.989116] active_anon:0 inactive_anon:0 isolated_anon:0
[ 0.989116] active_file:0 inactive_file:0 isolated_file:0
[ 0.989116] unevictable:27090 dirty:0 writeback:0
[ 0.989116] slab_reclaimable:2960 slab_unreclaimable:3021
[ 0.989116] mapped:0 shmem:0 pagetables:3 bounce:0
[ 0.989116] free:783 free_pcp:16 free_cma:0
[ 0.989244] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:108360kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:5776kB pagetables:12kB all_unreclaimable? no
[ 1.178718] Node 0 DMA free:720kB min:148kB low:184kB high:220kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:80kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 1.178780] xvda: xvda1 xvda2
[ 1.178825] lowmem_reserve[]: 0 146 146 146
[ 1.178828] Node 0 DMA32 free:2208kB min:1472kB low:1840kB high:2208kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:108252kB writepending:0kB present:245760kB managed:149632kB mlocked:0kB bounce:0kB free_pcp:44kB local_pcp:32kB free_cma:0kB
[ 1.178833] lowmem_reserve[]: 0 0 0 0
[ 1.178836] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 704kB
[ 1.179045] Node 0 DMA32: 2*4kB (ME) 3*8kB (M) 2*16kB (M) 3*32kB (UM) 1*64kB (M) 2*128kB (UE) 2*256kB (UE) 2*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 2016kB
[ 1.179110] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 1.179152] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1.179200] 27099 total pagecache pages
[ 1.179227] 0 pages in swap cache
[ 1.179244] Swap cache stats: add 0, delete 0, find 0/0
[ 1.179267] Free swap = 0kB
[ 1.179282] Total swap = 0kB
[ 1.179308] 65439 pages RAM
[ 1.179320] 0 pages HighMem/MovableOnly
[ 1.179336] 24191 pages reserved
[ 1.179352] 0 pages cma reserved
[ 1.179382] Tasks state (memory values in pages):
[ 1.179403] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 1.179470] Out of memory and no killable processes...
[ 1.179494] Kernel panic - not syncing: System is deadlocked on memory
[ 1.179521] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 1.179563] Call Trace:
[ 1.179576] dump_stack+0x76/0x94
[ 1.179595] panic+0xfc/0x2c0
[ 1.179613] out_of_memory.cold.44+0x5e/0x7e
[ 1.179637] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 1.179669] ? idr_alloc_u32+0x8b/0xc0
[ 1.179686] __alloc_pages+0x318/0x330
[ 1.179704] alloc_page_interleave+0xe/0x70
[ 1.179722] allocate_slab+0x28d/0x330
[ 1.179739] ___slab_alloc+0x41e/0x5c0
[ 1.179755] ? bus_add_driver+0x48/0x1c0
[ 1.179773] ? call_usermodehelper_exec+0xed/0x160
[ 1.179794] ? bus_add_driver+0x48/0x1c0
[ 1.179823] __slab_alloc+0x17/0x30
[ 1.179838] kmem_cache_alloc_trace+0x403/0x440
[ 1.179858] ? si3054_driver_init+0x15/0x15
[ 1.179874] ? rdinit_setup+0x27/0x27
[ 1.179891] bus_add_driver+0x48/0x1c0
[ 1.179915] ? si3054_driver_init+0x15/0x15
[ 1.179950] driver_register+0x66/0xb0
[ 1.179968] ? si3054_driver_init+0x15/0x15
[ 1.179986] do_one_initcall+0x3f/0x1c0
[ 1.180005] kernel_init_freeable+0x21a/0x295
[ 1.180027] ? rest_init+0xa4/0xa4
[ 1.180045] kernel_init+0x5/0xfc
[ 1.180063] ret_from_fork+0x22/0x30
[ 1.180185] Kernel Offset: disabled



And another VM:

[ 1.034613] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 1.034633] Bluetooth: BNEP filters: protocol multicast
[ 1.034653] Bluetooth: BNEP socket layer initialized
[ 1.034671] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[ 1.034696] Bluetooth: HIDP socket layer initialized
[ 1.034716] 8021q: 802.1Q VLAN Support v1.8
[ 1.034747] Key type dns_resolver registered
[ 1.034768] Key type ceph registered
[ 1.034865] libceph: loaded (mon/osd proto 15/24)
[ 1.035030] IPI shorthand broadcast: enabled
[ 1.035060] sched_clock: Marking stable (833426729, 200986153)->(1207636155, -173223273)
[ 1.035143] registered taskstats version 1
[ 1.035161] Loading compiled-in X.509 certificates
[ 1.035350] Btrfs loaded, crc32c=crc32c-generic, zoned=no
[ 1.167932] kworker/u2:0 invoked oom-killer: gfp_mask=0x100cc2(GFP_HIGHUSER), order=0, oom_score_adj=0
[ 1.167989] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 1.168026] Workqueue: events_unbound async_run_entry_fn
[ 1.168051] Call Trace:
[ 1.168064] dump_stack+0x76/0x94
[ 1.168082] dump_header+0x45/0x1d4
[ 1.168098] out_of_memory.cold.44+0x39/0x7e
[ 1.168119] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 1.168145] ? __mod_memcg_lruvec_state+0x1d/0x100
[ 1.168167] __alloc_pages+0x318/0x330
[ 1.168183] pagecache_get_page+0x24b/0x400
[ 1.168199] grab_cache_page_write_begin+0x17/0x30
[ 1.168220] simple_write_begin+0x1e/0x1e0
[ 1.168237] generic_perform_write+0xef/0x1b0
[ 1.168257] __generic_file_write_iter+0x140/0x1b0
[ 1.168279] ? write_buffer+0x32/0x32
[ 1.168296] generic_file_write_iter+0x58/0xa0
[ 1.168316] __kernel_write+0x146/0x2c0
[ 1.168333] kernel_write+0x51/0xf0
[ 1.168350] xwrite+0x2c/0x5f
[ 1.168366] ? initrd_load+0x268/0x268
[ 1.168382] do_copy+0xc7/0x109
[ 1.168397] ? initrd_load+0x19e/0x268
[ 1.168412] ? do_name+0x11a/0x269
[ 1.168427] write_buffer+0x22/0x32
[ 1.168443] flush_buffer+0x2f/0x86
[ 1.168458] __gunzip+0x26e/0x315
[ 1.168474] ? bunzip2+0x397/0x397
[ 1.168490] ? initrd_load+0x268/0x268
[ 1.168505] gunzip+0xe/0x11
[ 1.168520] ? initrd_load+0x268/0x268
[ 1.168537] unpack_to_rootfs+0x159/0x28f
[ 1.168554] ? initrd_load+0x268/0x268
[ 1.168571] do_populate_rootfs+0x6c/0x160
[ 1.168588] async_run_entry_fn+0x1b/0xa0
[ 1.168603] process_one_work+0x1f6/0x390
[ 1.168620] worker_thread+0x28/0x3d0
[ 1.168638] ? process_one_work+0x390/0x390
[ 1.168654] kthread+0x111/0x130
[ 1.168671] ? kthread_park+0x80/0x80
[ 1.168686] ret_from_fork+0x22/0x30
[ 1.168705] Mem-Info:
[ 1.168716] active_anon:0 inactive_anon:0 isolated_anon:0
[ 1.168716] active_file:0 inactive_file:0 isolated_file:0
[ 1.168716] unevictable:28085 dirty:0 writeback:0
[ 1.168716] slab_reclaimable:2883 slab_unreclaimable:4055
[ 1.168716] mapped:0 shmem:0 pagetables:3 bounce:0
[ 1.168716] free:550 free_pcp:9 free_cma:0
[ 1.168818] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:112340kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:6256kB pagetables:12kB all_unreclaimable? yes
[ 1.168916] Node 0 DMA free:728kB min:148kB low:184kB high:220kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:14472kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 1.169010] lowmem_reserve[]: 0 146 146 146
[ 1.169028] Node 0 DMA32 free:1472kB min:1472kB low:1840kB high:2208kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:97860kB writepending:0kB present:245760kB managed:149852kB mlocked:0kB bounce:0kB free_pcp:36kB local_pcp:36kB free_cma:0kB
[ 1.309647] lowmem_reserve[]: 0 0 0 0
[ 1.309668] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 728kB
[ 1.309720] Node 0 DMA32: 0*4kB 4*8kB (UM) 2*16kB (M) 4*32kB (UME) 2*64kB (ME) 1*128kB (U) 2*256kB (UE) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 1472kB
[ 1.309777] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 1.309810] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1.309843] 28095 total pagecache pages
[ 1.309858] 0 pages in swap cache
[ 1.309874] Swap cache stats: add 0, delete 0, find 0/0
[ 1.309894] Free swap = 0kB
[ 1.309908] Total swap = 0kB
[ 1.309922] 65439 pages RAM
[ 1.309932] 0 pages HighMem/MovableOnly
[ 1.309947] 24136 pages reserved
[ 1.309961] 0 pages cma reserved
[ 1.309975] Tasks state (memory values in pages):
[ 1.309993] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 1.310035] Out of memory and no killable processes...
[ 1.310054] Kernel panic - not syncing: System is deadlocked on memory
[ 1.310077] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 1.310112] Workqueue: events_unbound async_run_entry_fn
[ 1.310133] Call Trace:
[ 1.310144] dump_stack+0x76/0x94
[ 1.310159] panic+0xfc/0x2c0
[ 1.310176] out_of_memory.cold.44+0x5e/0x7e
[ 1.310195] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 1.310220] ? __mod_memcg_lruvec_state+0x1d/0x100
[ 1.310240] __alloc_pages+0x318/0x330
[ 1.310256] pagecache_get_page+0x24b/0x400
[ 1.310273] grab_cache_page_write_begin+0x17/0x30
[ 1.310293] simple_write_begin+0x1e/0x1e0
[ 1.310309] generic_perform_write+0xef/0x1b0
[ 1.310329] __generic_file_write_iter+0x140/0x1b0
[ 1.310349] ? write_buffer+0x32/0x32
[ 1.310364] generic_file_write_iter+0x58/0xa0
[ 1.310384] __kernel_write+0x146/0x2c0
[ 1.310400] kernel_write+0x51/0xf0
[ 1.310415] xwrite+0x2c/0x5f
[ 1.310430] ? initrd_load+0x268/0x268
[ 1.310446] do_copy+0xc7/0x109
[ 1.310461] ? initrd_load+0x19e/0x268
[ 1.310476] ? do_name+0x11a/0x269
[ 1.310491] write_buffer+0x22/0x32
[ 1.310507] flush_buffer+0x2f/0x86
[ 1.310522] __gunzip+0x26e/0x315
[ 1.310538] ? bunzip2+0x397/0x397
[ 1.310554] ? initrd_load+0x268/0x268
[ 1.310569] gunzip+0xe/0x11
[ 1.310584] ? initrd_load+0x268/0x268
[ 1.310600] unpack_to_rootfs+0x159/0x28f
[ 1.310616] ? initrd_load+0x268/0x268
[ 1.310632] do_populate_rootfs+0x6c/0x160
[ 1.310647] async_run_entry_fn+0x1b/0xa0
[ 1.310663] process_one_work+0x1f6/0x390
[ 1.310679] worker_thread+0x28/0x3d0
[ 1.510226] ? process_one_work+0x390/0x390
[ 1.510243] kthread+0x111/0x130
[ 1.510259] ? kthread_park+0x80/0x80
[ 1.510275] ret_from_fork+0x22/0x30
[ 1.510336] Kernel Offset: disabled


And another one:

[ 0.772775] IPVS: ipvs loaded.
[ 0.773014] NET: Registered protocol family 10
[ 0.777428] Segment Routing with IPv6
[ 0.777541] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 0.777687] NET: Registered protocol family 17
[ 1.114018] Bridge firewalling registered
[ 1.117325] Bluetooth: RFCOMM TTY layer initialized
[ 1.117372] Bluetooth: RFCOMM socket layer initialized
[ 1.117402] Bluetooth: RFCOMM ver 1.11
[ 1.117421] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 1.117448] Bluetooth: BNEP filters: protocol multicast
[ 1.117471] Bluetooth: BNEP socket layer initialized
[ 1.117493] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[ 1.117527] Bluetooth: HIDP socket layer initialized
[ 1.117550] 8021q: 802.1Q VLAN Support v1.8
[ 1.117581] Key type dns_resolver registered
[ 1.117605] Key type ceph registered
[ 1.117725] libceph: loaded (mon/osd proto 15/24)
[ 1.117931] IPI shorthand broadcast: enabled
[ 1.117973] sched_clock: Marking stable (915120961, 202168371)->(1461715317, -344425985)
[ 1.118063] registered taskstats version 1
[ 1.118083] Loading compiled-in X.509 certificates
[ 1.118305] Btrfs loaded, crc32c=crc32c-generic, zoned=no
[ 1.187460] kworker/u2:0 invoked oom-killer: gfp_mask=0x100cc2(GFP_HIGHUSER), order=0, oom_score_adj=0
[ 1.187513] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 1.187555] Workqueue: events_unbound async_run_entry_fn
[ 1.187582] Call Trace:
[ 1.187597] dump_stack+0x76/0x94
[ 1.187617] dump_header+0x45/0x1d4
[ 1.187637] out_of_memory.cold.44+0x39/0x7e
[ 1.187660] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 1.187692] ? __mod_memcg_lruvec_state+0x1d/0x100
[ 1.187718] __alloc_pages+0x318/0x330
[ 1.187737] pagecache_get_page+0x24b/0x400
[ 1.187756] grab_cache_page_write_begin+0x17/0x30
[ 1.187779] simple_write_begin+0x1e/0x1e0
[ 1.187798] generic_perform_write+0xef/0x1b0
[ 1.187822] __generic_file_write_iter+0x140/0x1b0
[ 1.187848] ? write_buffer+0x32/0x32
[ 1.187866] generic_file_write_iter+0x58/0xa0
[ 1.187890] __kernel_write+0x146/0x2c0
[ 1.187908] kernel_write+0x51/0xf0
[ 1.187926] xwrite+0x2c/0x5f
[ 1.187946] ? initrd_load+0x268/0x268
[ 1.187964] do_copy+0xc7/0x109
[ 1.187982] ? initrd_load+0x19e/0x268
[ 1.187999] ? do_name+0x11a/0x269
[ 1.188017] write_buffer+0x22/0x32
[ 1.188034] flush_buffer+0x2f/0x86
[ 1.188052] __gunzip+0x26e/0x315
[ 1.188071] ? bunzip2+0x397/0x397
[ 1.188090] ? initrd_load+0x268/0x268
[ 1.188107] gunzip+0xe/0x11
[ 1.188125] ? initrd_load+0x268/0x268
[ 1.188143] unpack_to_rootfs+0x159/0x28f
[ 1.188161] ? initrd_load+0x268/0x268
[ 1.188178] do_populate_rootfs+0x6c/0x160
[ 1.188197] async_run_entry_fn+0x1b/0xa0
[ 1.188214] process_one_work+0x1f6/0x390
[ 1.188234] worker_thread+0x28/0x3d0
[ 1.188253] ? process_one_work+0x390/0x390
[ 1.188271] kthread+0x111/0x130
[ 1.188293] ? kthread_park+0x80/0x80
[ 1.188312] ret_from_fork+0x22/0x30
[ 1.188335] Mem-Info:
[ 1.188347] active_anon:0 inactive_anon:0 isolated_anon:0
[ 1.188347] active_file:0 inactive_file:0 isolated_file:0
[ 1.188347] unevictable:28130 dirty:0 writeback:0
[ 1.188347] slab_reclaimable:2879 slab_unreclaimable:4027
[ 1.188347] mapped:0 shmem:0 pagetables:3 bounce:0
[ 1.188347] free:548 free_pcp:8 free_cma:0
[ 1.188465] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:112520kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:6224kB pagetables:12kB all_unreclaimable? yes
[ 1.188578] Node 0 DMA free:732kB min:148kB low:184kB high:220kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:10428kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 1.188688] lowmem_reserve[]: 0 146 146 146
[ 1.188711] Node 0 DMA32 free:1460kB min:1472kB low:1840kB high:2208kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:102080kB writepending:0kB present:245760kB managed:149852kB mlocked:0kB bounce:0kB free_pcp:32kB local_pcp:32kB free_cma:0kB
[ 1.188819] lowmem_reserve[]: 0 0 0 0
[ 1.188838] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 732kB
[ 1.188899] Node 0 DMA32: 1*4kB (E) 4*8kB (UM) 3*16kB (ME) 3*32kB (ME) 2*64kB (UM) 1*128kB (U) 2*256kB (UE) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 1460kB
[ 1.388548] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 1.388596] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1.388634] 28140 total pagecache pages
[ 1.388651] 0 pages in swap cache
[ 1.388667] Swap cache stats: add 0, delete 0, find 0/0
[ 1.388689] Free swap = 0kB
[ 1.388705] Total swap = 0kB
[ 1.388721] 65439 pages RAM
[ 1.388733] 0 pages HighMem/MovableOnly
[ 1.388749] 24136 pages reserved
[ 1.388764] 0 pages cma reserved
[ 1.388781] Tasks state (memory values in pages):
[ 1.388803] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 1.388871] Out of memory and no killable processes...
[ 1.388896] Kernel panic - not syncing: System is deadlocked on memory
[ 1.388924] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.13.0-rc6-20210617-doflr-mac80211debug+ #1
[ 1.388966] Workqueue: events_unbound async_run_entry_fn
[ 1.388994] Call Trace:
[ 1.389007] dump_stack+0x76/0x94
[ 1.389032] panic+0xfc/0x2c0
[ 1.389054] out_of_memory.cold.44+0x5e/0x7e
[ 1.389078] __alloc_pages_slowpath.constprop.112+0xb9e/0xc80
[ 1.389108] ? __mod_memcg_lruvec_state+0x1d/0x100
[ 1.389131] __alloc_pages+0x318/0x330
[ 1.389150] pagecache_get_page+0x24b/0x400
[ 1.389169] grab_cache_page_write_begin+0x17/0x30
[ 1.389192] simple_write_begin+0x1e/0x1e0
[ 1.389210] generic_perform_write+0xef/0x1b0
[ 1.389232] __generic_file_write_iter+0x140/0x1b0
[ 1.389256] ? write_buffer+0x32/0x32
[ 1.389274] generic_file_write_iter+0x58/0xa0
[ 1.389299] __kernel_write+0x146/0x2c0
[ 1.389318] kernel_write+0x51/0xf0
[ 1.389335] xwrite+0x2c/0x5f
[ 1.389353] ? initrd_load+0x268/0x268
[ 1.389370] do_copy+0xc7/0x109
[ 1.389388] ? initrd_load+0x19e/0x268
[ 1.389404] ? do_name+0x11a/0x269
[ 1.389421] write_buffer+0x22/0x32
[ 1.389450] flush_buffer+0x2f/0x86
[ 1.389465] __gunzip+0x26e/0x315
[ 1.389482] ? bunzip2+0x397/0x397
[ 1.389498] ? initrd_load+0x268/0x268
[ 1.389513] gunzip+0xe/0x11
[ 1.389544] ? initrd_load+0x268/0x268
[ 1.389562] unpack_to_rootfs+0x159/0x28f
[ 1.389581] ? initrd_load+0x268/0x268
[ 1.389599] do_populate_rootfs+0x6c/0x160
[ 1.389618] async_run_entry_fn+0x1b/0xa0
[ 1.389636] process_one_work+0x1f6/0x390
[ 1.389657] worker_thread+0x28/0x3d0
[ 1.389676] ? process_one_work+0x390/0x390
[ 1.389698] kthread+0x111/0x130
[ 1.389716] ? kthread_park+0x80/0x80
[ 1.389733] ret_from_fork+0x22/0x30
[ 1.389803] Kernel Offset: disabled


Attachments:
config-5.13.0-rc6-20210617-doflr-mac80211debug+ (138.53 kB)

2021-06-17 16:16:08

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 17/06/2021 17.01, Linus Torvalds wrote:
> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom <[email protected]> wrote:
>>
>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>>
>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller (memory wise) PVH ones crash during kernel boot due to OOM.
>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but it has more memory assigned, so that is not unexpected).
>
> Adding Rasmus to the cc, because this looks kind of like the async
> rootfs population thing that caused some other oom issues too.

Yes, that looks like the same issue.

> Rasmus? Original report here:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> I do find it odd that we'd be running out of memory so early..

Indeed. It would be nice to know if these also reproduce with
initramfs_async=0 on the command line.

But what is even more curious is that in the other report
(https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
it seemed to trigger with _more_ memory - though I may be misreading
what Oliver was telling me:

> please be noted that we use 'vmalloc=512M' for both parent and this commit.
> since it's ok on parent but oom on this commit, we want to send this report
> to show the potential problem of the commit on some cases.
>
> we also tested by changing to use 'vmalloc=128M', it will succeed.

Those tests were done in a VM with 16G memory, and then he also wrote

> we also tried to follow exactly above steps to test on
> some local machine (8G memory), but cannot reproduce.

Are there some special rules for what memory pools PID1 versus the
kworker threads can dip into?


Side note: I also had a ppc64 report with different symptoms (the
initramfs was corrupted), but that turned out to also reproduce with
e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
thread is here:
https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=MnUvqRFaN7W1i6ahQ@mail.gmail.com/

Rasmus


2021-06-17 18:25:53

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom <[email protected]> wrote:
>
> I just tried to upgrade and test the linux kernel going from the 5.12 kernel series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>
> Some VM's boot fine (with more than 256MB memory assigned), but the smaller (memory wise) PVH ones crash during kernel boot due to OOM.
> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but it has more memory assigned, so that is not unexpected).

Adding Rasmus to the cc, because this looks kind of like the async
rootfs population thing that caused some other oom issues too.

Rasmus? Original report here:

https://lore.kernel.org/lkml/[email protected]/

I do find it odd that we'd be running out of memory so early..

Linus

2021-06-17 21:29:03

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 17/06/2021 20:02, Sander Eikelenboom wrote:
> On 17/06/2021 17:37, Rasmus Villemoes wrote:
>> On 17/06/2021 17.01, Linus Torvalds wrote:
>>> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom <[email protected]> wrote:
>>>>
>>>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>>>>
>>>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller (memory wise) PVH ones crash during kernel boot due to OOM.
>>>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but it has more memory assigned, so that is not unexpected).
>>>
>>> Adding Rasmus to the cc, because this looks kind of like the async
>>> rootfs population thing that caused some other oom issues too.
>>
>> Yes, that looks like the same issue.
>>
>>> Rasmus? Original report here:
>>>
>>> https://lore.kernel.org/lkml/[email protected]/
>>>
>>> I do find it odd that we'd be running out of memory so early..
>>
>> Indeed. It would be nice to know if these also reproduce with
>> initramfs_async=0 on the command line.
>>
>> But what is even more curious is that in the other report
>> (https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
>> it seemed to trigger with _more_ memory - though I may be misreading
>> what Oliver was telling me:
>>
>>> please be noted that we use 'vmalloc=512M' for both parent and this commit.
>>> since it's ok on parent but oom on this commit, we want to send this report
>>> to show the potential problem of the commit on some cases.
>>>
>>> we also tested by changing to use 'vmalloc=128M', it will succeed.
>>
>> Those tests were done in a VM with 16G memory, and then he also wrote
>>
>>> we also tried to follow exactly above steps to test on
>>> some local machine (8G memory), but cannot reproduce.
>>
>> Are there some special rules for what memory pools PID1 versus the
>> kworker threads can dip into?
>>
>>
>> Side note: I also had a ppc64 report with different symptoms (the
>> initramfs was corrupted), but that turned out to also reproduce with
>> e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
>> thread is here:
>> https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=MnUvqRFaN7W1i6ahQ@mail.gmail.com/
>>
>> Rasmus
>>
>
> I chose to first finish the bisection attempt; not so surprisingly, it ends up with:
> e7cb072eb988e46295512617c39d004f9e1c26f8 is the first bad commit
>
> So at least that link is confirmed.
>
> I also checked out booting with "initramfs_async=0" and now the guest boots with the 5.13-rc6-ish kernel which fails without that.
>
> --
> Sander
>

CC'ed Juergen.

Juergen, do you know how the direct kernel boot works, and whether that could interfere
with this commit?

After reading the last part of the commit message of e7cb072eb98, namely:

Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.

But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.

It dawned on me that I'm using the "direct kernel boot" functionality, which lets you boot a guest
where the kernel and initramfs get copied in from dom0. That works great, but perhaps it
pokes around in the filesystem in the way the last part of the commit message warns about?

(I think the feature is called "direct kernel boot"; what I mean is using, for example, the:
kernel = '/boot/vmlinuz-5.13.0-rc6-20210617-doflr-mac80211debug+'
ramdisk = '/boot/initrd.img-5.13.0-rc6-20210617-doflr-mac80211debug+'
cmdline = 'root=UUID=2f757320-caca-4215-868d-73a4aacf12aa ro nomodeset xen_blkfront.max_ring_page_order=1 console=hvc0 earlyprintk=xen initramfs_async=0'

options in the Xen guest config file to boot the (in this case PVH) guest.
)

--
Sander



2021-06-17 23:31:29

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 17/06/2021 17:37, Rasmus Villemoes wrote:
> On 17/06/2021 17.01, Linus Torvalds wrote:
>> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom <[email protected]> wrote:
>>>
>>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>>>
>>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller (memory wise) PVH ones crash during kernel boot due to OOM.
>>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but it has more memory assigned, so that is not unexpected).
>>
>> Adding Rasmus to the cc, because this looks kind of like the async
>> rootfs population thing that caused some other oom issues too.
>
> Yes, that looks like the same issue.
>
>> Rasmus? Original report here:
>>
>> https://lore.kernel.org/lkml/[email protected]/
>>
>> I do find it odd that we'd be running out of memory so early..
>
> Indeed. It would be nice to know if these also reproduce with
> initramfs_async=0 on the command line.
>
> But what is even more curious is that in the other report
> (https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
> it seemed to trigger with _more_ memory - though I may be misreading
> what Oliver was telling me:
>
>> please be noted that we use 'vmalloc=512M' for both parent and this commit.
>> since it's ok on parent but oom on this commit, we want to send this report
>> to show the potential problem of the commit on some cases.
>>
>> we also tested by changing to use 'vmalloc=128M', it will succeed.
>
> Those tests were done in a VM with 16G memory, and then he also wrote
>
>> we also tried to follow exactly above steps to test on
>> some local machine (8G memory), but cannot reproduce.
>
> Are there some special rules for what memory pools PID1 versus the
> kworker threads can dip into?
>
>
> Side note: I also had a ppc64 report with different symptoms (the
> initramfs was corrupted), but that turned out to also reproduce with
> e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
> thread is here:
> https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=MnUvqRFaN7W1i6ahQ@mail.gmail.com/
>
> Rasmus
>

I chose to finish the bisection attempt first; not so surprisingly, it ends up with:
e7cb072eb988e46295512617c39d004f9e1c26f8 is the first bad commit

So at least that link is confirmed.

I also tried booting with "initramfs_async=0", and with it the guest boots the 5.13-rc6-ish kernel that fails without it.
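For reference, the bisection workflow can be sketched like this. The toy repo below merely stands in for the kernel tree: one commit drops a marker file that plays the role of the regression, and `git bisect run` finds it automatically (in the real run, each step of course meant building the kernel and boot-testing the guest, not checking for a file):

```shell
#!/bin/sh
# Toy illustration of an automated bisection: build a throwaway repo in
# which one commit introduces a marker file (standing in for the OOM
# regression), then let `git bisect run` locate the first bad commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
G="git -c user.email=test@example.com -c user.name=test"
$G commit -q --allow-empty -m "known-good base"
first=$(git rev-parse HEAD)
$G commit -q --allow-empty -m "unrelated good change"
: > oom-marker                     # the "regression"
git add oom-marker
$G commit -q -m "commit introducing the regression"
expected=$(git rev-parse HEAD)
$G commit -q --allow-empty -m "later commit, also bad"
# Mark the tip bad and the base good, then bisect automatically:
# the test script exits 0 (good) when the marker is absent, 1 (bad)
# when it is present.
git bisect start HEAD "$first" >/dev/null 2>&1
git bisect run sh -c '! test -e oom-marker' >/dev/null 2>&1
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
[ "$found" = "$expected" ] && echo "bisect found the first bad commit"
```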

--
Sander

2021-06-18 03:22:36

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 17/06/2021 21:39, Sander Eikelenboom wrote:
> On 17/06/2021 20:02, Sander Eikelenboom wrote:
>> On 17/06/2021 17:37, Rasmus Villemoes wrote:
>>> On 17/06/2021 17.01, Linus Torvalds wrote:
>>>> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom <[email protected]> wrote:
>>>>>
>>>>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>>>>>
>>>>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller (memory wise) PVH ones crash during kernel boot due to OOM.
>>>>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is running 5.13-rc6 (but it has more memory assigned, so that is not unexpected).
>>>>
>>>> Adding Rasmus to the cc, because this looks kind of like the async
>>>> rootfs population thing that caused some other oom issues too.
>>>
>>> Yes, that looks like the same issue.
>>>
>>>> Rasmus? Original report here:
>>>>
>>>> https://lore.kernel.org/lkml/[email protected]/
>>>>
>>>> I do find it odd that we'd be running out of memory so early..
>>>
>>> Indeed. It would be nice to know if these also reproduce with
>>> initramfs_async=0 on the command line.
>>>
>>> But what is even more curious is that in the other report
>>> (https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
>>> it seemed to trigger with _more_ memory - though I may be misreading
>>> what Oliver was telling me:
>>>
>>>> please be noted that we use 'vmalloc=512M' for both parent and this
>>> commit.
>>>> since it's ok on parent but oom on this commit, we want to send this
>>> report
>>>> to show the potential problem of the commit on some cases.
>>>>
>>>> we also tested by changing to use 'vmalloc=128M', it will succeed.
>>>
>>> Those tests were done in a VM with 16G memory, and then he also wrote
>>>
>>>> we also tried to follow exactly above steps to test on
>>>> some local machine (8G memory), but cannot reproduce.
>>>
>>> Are there some special rules for what memory pools PID1 versus the
>>> kworker threads can dip into?
>>>
>>>
>>> Side note: I also had a ppc64 report with different symptoms (the
>>> initramfs was corrupted), but that turned out to also reproduce with
>>> e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
>>> thread is here:
>>> https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=MnUvqRFaN7W1i6ahQ@mail.gmail.com/
>>>
>>> Rasmus
>>>
>>
>> I chose to finish the bisection attempt first; not so surprisingly, it ends up with:
>> e7cb072eb988e46295512617c39d004f9e1c26f8 is the first bad commit
>>
>> So at least that link is confirmed.
>>
>> I also tried booting with "initramfs_async=0", and with it the guest boots the 5.13-rc6-ish kernel that fails without it.
>>
>> --
>> Sander
>>
>
> CC'ed Juergen.
>
> Juergen, do you know how the direct kernel boot works, and whether that could
> interfere with this commit?
>
> After reading the last part of the commit message of e7cb072eb98, namely:
>
> Should one of the initcalls done after rootfs_initcall time (i.e., device_
> and late_ initcalls) need something from the initramfs (say, a kernel
> module or a firmware blob), it will simply wait for the initramfs
> unpacking to be done before proceeding, which should in theory make this
> completely safe.
>
> But if some driver pokes around in the filesystem directly and not via one
> of the official kernel interfaces (i.e. request_firmware*(),
> call_usermodehelper*) that theory may not hold - also, I certainly might
> have missed a spot when sprinkling wait_for_initramfs(). So there is an
> escape hatch in the form of an initramfs_async= command line parameter.
>
> It dawned on me that I'm using the "direct kernel boot" functionality, which lets you
> boot a guest where the kernel and initramfs are copied in from dom0. That works great,
> but perhaps it pokes around as the last part of the commit message warns about?
>
> (I think the feature was called "direct kernel boot"; what I mean is using, for example:
> kernel = '/boot/vmlinuz-5.13.0-rc6-20210617-doflr-mac80211debug+'
> ramdisk = '/boot/initrd.img-5.13.0-rc6-20210617-doflr-mac80211debug+'
> cmdline = 'root=UUID=2f757320-caca-4215-868d-73a4aacf12aa ro nomodeset xen_blkfront.max_ring_page_order=1 console=hvc0 earlyprintk=xen initramfs_async=0'
>
> options in the xen guest config file to boot the (in this case PVH) guest.
> )
>
> --
> Sander
>

OK, I've done some experimentation, and it seems that with 256M assigned to the VM it was almost at the edge of OOM with the 5.12 kernel as well, in the config I am using.
With v5.12 it boots when I assign 240M, but not with 230M. With 5.13 the tipping point seems to be somewhere between 265M and 270M, so my config was already quite close to the edge.
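Narrowing down such a tipping point can be done mechanically with a binary search over the memory size. A minimal sketch, where `try_boot` is a hypothetical stand-in for "create the guest with the given memory size and check that it comes up" (here it is faked with the observed ~240M threshold for 5.12, purely to exercise the search):

```shell
#!/bin/sh
# Binary-search the minimum memory (in MiB) at which the guest boots.
# try_boot is a placeholder for an actual boot test (e.g. xl create
# with memory=$1, wait, probe the guest); the hard-coded 240M threshold
# only mimics the observed 5.12 behaviour for illustration.
try_boot() {
    [ "$1" -ge 240 ]
}
lo=128   # known not to boot
hi=512   # known to boot
while [ $((hi - lo)) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    if try_boot "$mid"; then hi=$mid; else lo=$mid; fi
done
echo "tipping point: ${hi}M"
```

With the fake threshold above this prints "tipping point: 240M"; with a real boot test it would converge in about log2(range) boots instead of one boot per candidate size.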

The "direct kernel boot" feature I'm using just seems somewhat memory hungry, but using another compression algorithm for the kernel and initramfs already helped in my case.

So sorry for the noise, clearly user-error.

--
Sander



2021-06-21 17:07:43

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 18/06/2021 03.06, Sander Eikelenboom wrote:
> On 17/06/2021 21:39, Sander Eikelenboom wrote:

>
> OK, I've done some experimentation, and it seems that with 256M assigned
> to the VM it was almost at the edge of OOM with the 5.12 kernel as well,
> in the config I am using.
> With v5.12 it boots when I assign 240M, but not with 230M. With 5.13 the
> tipping point seems to be somewhere between 265M and 270M, so my config
> was already quite close to the edge.
>
> The "direct kernel boot" feature I'm using just seems somewhat memory
> hungry, but using another compression algorithm for the kernel and
> initramfs already helped in my case.
>
> So sorry for the noise, clearly user-error.

Hm, perhaps, but I'm still a bit nervous about that report from Oliver
Sang/kernel test robot, which was for a VM equipped with 16G of memory.
But despite quite a few attempts, I haven't been able to reproduce that
locally, so unfortunately I have no idea what's going on.

Rasmus

2021-06-21 21:38:51

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

On 21/06/2021 18:54, Rasmus Villemoes wrote:
> On 18/06/2021 03.06, Sander Eikelenboom wrote:
>> On 17/06/2021 21:39, Sander Eikelenboom wrote:
>
>>
>> OK, I've done some experimentation, and it seems that with 256M assigned
>> to the VM it was almost at the edge of OOM with the 5.12 kernel as well,
>> in the config I am using.
>> With v5.12 it boots when I assign 240M, but not with 230M. With 5.13 the
>> tipping point seems to be somewhere between 265M and 270M, so my config
>> was already quite close to the edge.
>>
>> The "direct kernel boot" feature I'm using just seems somewhat memory
>> hungry, but using another compression algorithm for the kernel and
>> initramfs already helped in my case.
>>
>> So sorry for the noise, clearly user-error.
>
> Hm, perhaps, but I'm still a bit nervous about that report from Oliver
> Sang/kernel test robot, which was for a VM equipped with 16G of memory.
> But despite quite a few attempts, I haven't been able to reproduce that
> locally, so unfortunately I have no idea what's going on.
>
> Rasmus
>

Hmm, I just tried to switch all VM's to a 5.13-rc7 kernel.
Some now work since I reduced the size, but some still fail.

The difference seems to be the number of vcpu's I assign to the VM's.

The ones with 1 vcpu now boot with 256MB assigned (that is what I tested before),
but the ones with 2 vcpu's assigned don't, and still OOM
on the same kernel and initramfs that I pass in from the host.

Could that box from the test-robot have a massive number of cpu cores,
and could it somehow be related to that?
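One way to test that hypothesis would be to sweep the boot threshold per vcpu count. A sketch with a hypothetical `try_boot` standing in for the real guest creation; the 16M-of-extra-memory-per-vcpu threshold inside it is pure assumption, only there to exercise the loop:

```shell
#!/bin/sh
# Find the lowest memory size (in MiB, stepping by 8M) at which a guest
# boots, for each vcpu count. try_boot is a placeholder for an actual
# boot test (e.g. xl create with memory=$1 vcpus=$2 and probing the
# guest); the fake threshold assumes 16M extra per additional vcpu.
try_boot() {
    [ "$1" -ge $(( 256 + ($2 - 1) * 16 )) ]
}
for v in 1 2 4; do
    m=128
    while ! try_boot "$m" "$v"; do
        m=$(( m + 8 ))
    done
    echo "vcpus=$v boots at ${m}M"
done
```

If the per-vcpu overhead is real, the reported threshold should climb with the vcpu count, which would match the 1-vcpu-boots / 2-vcpus-OOMs observation above.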

--
Sander