Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: e7cb072eb988e46295512617c39d004f9e1c26f8 ("init/initramfs.c: do unpacking asynchronously")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: locktorture
version:
with the following parameters:
runtime: 300s
test: cpuhotplug
test-description: This torture test consists of creating a number of kernel threads which acquire the lock and hold it for a specific amount of time, thus simulating different critical region behaviors.
test-url: https://www.kernel.org/doc/Documentation/locking/locktorture.txt
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
Please note that we use 'vmalloc=512M' for both the parent and this commit.
Since the parent is fine but this commit OOMs, we are sending this report
to point out a potential problem with the commit in some cases.
We also tested with 'vmalloc=128M'; with that setting the boot succeeds.
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
[ 4.443950] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 4.716374] ACPI: _SB_.LNKC: Enabled at IRQ 11
[ 5.081518] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
[ 5.082999] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[ 5.085275] VFIO - User Level meta-driver version: 0.3
[ 8.029204] kworker/u4:0 invoked oom-killer: gfp_mask=0x100cc0(GFP_USER), order=0, oom_score_adj=0
[ 8.031021] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
[ 8.031988] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 8.031988] Workqueue: events_unbound async_run_entry_fn
[ 8.031988] Call Trace:
[ 8.031988] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
[ 8.031988] dump_header (kbuild/src/consumer/mm/oom_kill.c:463)
[ 8.031988] ? lock_release (kbuild/src/consumer/kernel/locking/lockdep.c:5190 kbuild/src/consumer/kernel/locking/lockdep.c:5532)
[ 8.031988] ? out_of_memory (kbuild/src/consumer/include/linux/rcupdate.h:710 kbuild/src/consumer/mm/oom_kill.c:379 kbuild/src/consumer/mm/oom_kill.c:1102 kbuild/src/consumer/mm/oom_kill.c:1048)
[ 8.031988] out_of_memory.cold (kbuild/src/consumer/mm/oom_kill.c:1106 kbuild/src/consumer/mm/oom_kill.c:1048)
To reproduce:
# build kernel
cd linux
cp config-5.12.0-11533-ge7cb072eb988 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
---
0DAY/LKP+ Test Infrastructure
Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Thanks,
Oliver Sang
On 07/06/2021 16.44, kernel test robot wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: e7cb072eb988e46295512617c39d004f9e1c26f8 ("init/initramfs.c: do unpacking asynchronously")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>
> in testcase: locktorture
> version:
> with following parameters:
>
> runtime: 300s
> test: cpuhotplug
>
> test-description: This torture test consists of creating a number of kernel threads which acquire the lock and hold it for specific amount of time, thus simulating different critical region behaviors.
> test-url: https://www.kernel.org/doc/Documentation/locking/locktorture.txt
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> please be noted that we use 'vmalloc=512M' for both parent and this commit.
> since it's ok on parent but oom on this commit, we want to send this report
> to show the potential problem of the commit on some cases.
>
> we also tested by changing to use 'vmalloc=128M', it will succeed.
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
> [ 4.443950] e1000: Copyright (c) 1999-2006 Intel Corporation.
> [ 4.716374] ACPI: _SB_.LNKC: Enabled at IRQ 11
> [ 5.081518] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
> [ 5.082999] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
> [ 5.085275] VFIO - User Level meta-driver version: 0.3
> [ 8.029204] kworker/u4:0 invoked oom-killer: gfp_mask=0x100cc0(GFP_USER), order=0, oom_score_adj=0
> [ 8.031021] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
> [ 8.031988] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 8.031988] Workqueue: events_unbound async_run_entry_fn
> [ 8.031988] Call Trace:
> [ 8.031988] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
> [ 8.031988] dump_header (kbuild/src/consumer/mm/oom_kill.c:463)
> [ 8.031988] ? lock_release (kbuild/src/consumer/kernel/locking/lockdep.c:5190 kbuild/src/consumer/kernel/locking/lockdep.c:5532)
> [ 8.031988] ? out_of_memory (kbuild/src/consumer/include/linux/rcupdate.h:710 kbuild/src/consumer/mm/oom_kill.c:379 kbuild/src/consumer/mm/oom_kill.c:1102 kbuild/src/consumer/mm/oom_kill.c:1048)
> [ 8.031988] out_of_memory.cold (kbuild/src/consumer/mm/oom_kill.c:1106 kbuild/src/consumer/mm/oom_kill.c:1048)
>
>
> To reproduce:
>
> # build kernel
> cd linux
> cp config-5.12.0-11533-ge7cb072eb988 .config
> make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
> make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> cd <mod-install-dir>
> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
So I got this far...
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
Is there some way to reproduce which doesn't require adding an lkp user?
Also, I don't have 16G to give to a virtual machine. I tried running the
bzImage with that modules.cgz under qemu with some naive parameters just
to get some output [1], but other than failing because there's no rootfs
to mount (as expected), I only managed to make it fail when providing
too little memory (the .cgz is around 70M, decompressed about 200M -
giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
but I can't make the decompression fail when passing either vmalloc=128M
or vmalloc=512M or no vmalloc= at all.
As an extra data point, what happens if you add initramfs_async=0 to the
command line?
[1] qemu-system-x86_64 -kernel arch/i386/boot/bzImage -initrd
../../tmp/header-install/modules.cgz -append "console=ttyAMA0
console=ttyS0 vmalloc=512M" -serial stdio -smp 2 -m 1G
Rasmus
Hi Rasmus,
On Tue, Jun 08, 2021 at 09:42:58AM +0200, Rasmus Villemoes wrote:
> On 07/06/2021 16.44, kernel test robot wrote:
> >
> >
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: e7cb072eb988e46295512617c39d004f9e1c26f8 ("init/initramfs.c: do unpacking asynchronously")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> >
> > in testcase: locktorture
> > version:
> > with following parameters:
> >
> > runtime: 300s
> > test: cpuhotplug
> >
> > test-description: This torture test consists of creating a number of kernel threads which acquire the lock and hold it for specific amount of time, thus simulating different critical region behaviors.
> > test-url: https://www.kernel.org/doc/Documentation/locking/locktorture.txt
> >
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> > please be noted that we use 'vmalloc=512M' for both parent and this commit.
> > since it's ok on parent but oom on this commit, we want to send this report
> > to show the potential problem of the commit on some cases.
> >
> > we also tested by changing to use 'vmalloc=128M', it will succeed.
> >
> >
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <[email protected]>
> >
> >
> > [ 4.443950] e1000: Copyright (c) 1999-2006 Intel Corporation.
> > [ 4.716374] ACPI: _SB_.LNKC: Enabled at IRQ 11
> > [ 5.081518] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
> > [ 5.082999] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
> > [ 5.085275] VFIO - User Level meta-driver version: 0.3
> > [ 8.029204] kworker/u4:0 invoked oom-killer: gfp_mask=0x100cc0(GFP_USER), order=0, oom_score_adj=0
> > [ 8.031021] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
> > [ 8.031988] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> > [ 8.031988] Workqueue: events_unbound async_run_entry_fn
> > [ 8.031988] Call Trace:
> > [ 8.031988] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
> > [ 8.031988] dump_header (kbuild/src/consumer/mm/oom_kill.c:463)
> > [ 8.031988] ? lock_release (kbuild/src/consumer/kernel/locking/lockdep.c:5190 kbuild/src/consumer/kernel/locking/lockdep.c:5532)
> > [ 8.031988] ? out_of_memory (kbuild/src/consumer/include/linux/rcupdate.h:710 kbuild/src/consumer/mm/oom_kill.c:379 kbuild/src/consumer/mm/oom_kill.c:1102 kbuild/src/consumer/mm/oom_kill.c:1048)
> > [ 8.031988] out_of_memory.cold (kbuild/src/consumer/mm/oom_kill.c:1106 kbuild/src/consumer/mm/oom_kill.c:1048)
> >
> >
> > To reproduce:
> >
> > # build kernel
> > cd linux
> > cp config-5.12.0-11533-ge7cb072eb988 .config
> > make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
> > make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> > cd <mod-install-dir>
> > find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>
> So I got this far...
>
> > git clone https://github.com/intel/lkp-tests.git
> > cd lkp-tests
> > bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>
> Is there some way to reproduce which doesn't require adding an lkp user?
There is no need to run the test under an 'lkp' account; lkp-tests just creates a .lkp folder
under the home directory. Is that what you mean? Normally we run 'qemu' as 'root'.
> Also, I don't have 16G to give to a virtual machine. I tried running the
> bzImage with that modules.cgz under qemu with some naive parameters just
> to get some output [1], but other than failing because there's no rootfs
> to mount (as expected), I only managed to make it fail when providing
> too little memory (the .cgz is around 70M, decompressed about 200M -
> giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
> but I can't make the decompression fail when passing either vmalloc=128M
> or vmalloc=512M or no vmalloc= at all.
Sorry about this. We also tried following exactly the steps above on a
local machine (8G memory), but could not reproduce the issue. We are analyzing
what is different in our automation run in the test cluster, which reproduces
the issue consistently, and will update you when we have findings.
>
> As an extra data point, what happens if you add initramfs_async=0 to the
> command line?
Yes, we tested this before sending out the report. The issue is gone
if initramfs_async=0 is added.
>
> [1] qemu-system-x86_64 -kernel arch/i386/boot/bzImage -initrd
> ../../tmp/header-install/modules.cgz -append "console=ttyAMA0
> console=ttyS0 vmalloc=512M" -serial stdio -smp 2 -m 1G
>
> Rasmus
On 11/06/2021 10.48, Oliver Sang wrote:
> hi Rasmus,
>
> On Tue, Jun 08, 2021 at 09:42:58AM +0200, Rasmus Villemoes wrote:
>> On 07/06/2021 16.44, kernel test robot wrote:
>>>
>> Also, I don't have 16G to give to a virtual machine. I tried running the
>> bzImage with that modules.cgz under qemu with some naive parameters just
>> to get some output [1], but other than failing because there's no rootfs
>> to mount (as expected), I only managed to make it fail when providing
>> too little memory (the .cgz is around 70M, decompressed about 200M -
>> giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
>> but I can't make the decompression fail when passing either vmalloc=128M
>> or vmalloc=512M or no vmalloc= at all.
>
> sorry about this. we also tried to follow exactly above steps to test on
> some local machine (8G memory), but cannot reproduce. we are analyzing
> what's the diference in our automaion run in test cluster, which reproduced
> the issue consistently. will update you when we have findings.
OK. It's really odd that providing the VM with _more_ memory makes it
fail (other than the obvious failure in the other direction when there's
simply not enough memory for the unpacked initramfs itself). But
unfortunately that also sounds like I won't be able to reproduce with
the HW I have.
>> As an extra data point, what happens if you add initramfs_async=0 to the
>> command line?
>
> yes, we tested this before sending out the report. the issue gone
> if initramfs_async=0 is added.
Hm. Sounds like some initcall after rootfs_initcall time must
allocate/hog a lot of memory, perhaps with some heuristic depending on
how much is available.
Can you try with initcall_debug=1? I think that should produce a lot of
output, hopefully that would make it possible to see which initcalls
have been done just prior to (or while) the initramfs unpacking hits ENOMEM.
Thanks,
Rasmus
Hi Rasmus,
On Fri, Jun 11, 2021 at 11:20:18AM +0200, Rasmus Villemoes wrote:
> On 11/06/2021 10.48, Oliver Sang wrote:
> > hi Rasmus,
> >
> > On Tue, Jun 08, 2021 at 09:42:58AM +0200, Rasmus Villemoes wrote:
> >> On 07/06/2021 16.44, kernel test robot wrote:
> >>>
>
>
> >> Also, I don't have 16G to give to a virtual machine. I tried running the
> >> bzImage with that modules.cgz under qemu with some naive parameters just
> >> to get some output [1], but other than failing because there's no rootfs
> >> to mount (as expected), I only managed to make it fail when providing
> >> too little memory (the .cgz is around 70M, decompressed about 200M -
> >> giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
> >> but I can't make the decompression fail when passing either vmalloc=128M
> >> or vmalloc=512M or no vmalloc= at all.
> >
> > sorry about this. we also tried to follow exactly above steps to test on
> > some local machine (8G memory), but cannot reproduce. we are analyzing
> > what's the diference in our automaion run in test cluster, which reproduced
> > the issue consistently. will update you when we have findings.
>
> OK. It's really odd that providing the VM with _more_ memory makes it
> fail (other then the obvious failure in the other direction when there's
> simply not enough memory for the unpacked initramfs itself). But
> unfortunately that also sounds like I won't be able to reproduce with
> the HW I have.
>
> >> As an extra data point, what happens if you add initramfs_async=0 to the
> >> command line?
> >
> > yes, we tested this before sending out the report. the issue gone
> > if initramfs_async=0 is added.
>
> Hm. Sounds like some initcall after rootfs_initcall time must
> allocate/hog a lot of memory, perhaps with some heuristic depending on
> how much is available.
>
> Can you try with initcall_debug=1? I think that should produce a lot of
> output, hopefully that would make it possible to see which initcalls
> have been done just prior to (or while) the initramfs unpacking hits ENOMEM.
Thanks a lot for the guidance!
Across several jobs with initcall_debug=1, the OOM consistently happens after
"calling panel_init_module".
Below is an example; the detailed dmesg.xz is attached:
[ 15.120470] calling img_ascii_lcd_driver_init+0x0/0x16 @ 1
[ 15.122305] initcall img_ascii_lcd_driver_init+0x0/0x16 returned 0 after 128 usecs
[ 15.124660] calling panel_init_module+0x0/0x1f3 @ 1
[ 19.941826] kworker/u4:0 invoked oom-killer: gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0, oom_score_adj=0
[ 19.967679] CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
[ 19.968362] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 19.968362] Workqueue: events_unbound async_run_entry_fn
[ 19.968362] Call Trace:
[ 19.968362] dump_stack+0x69/0x85
[ 19.968362] dump_header+0x57/0x3ff
[ 19.968362] ? lock_release+0xa8/0x2b0
[ 19.968362] ? out_of_memory+0x310/0x4d0
[ 19.968362] out_of_memory.cold+0x2b/0x6f
[ 19.968362] __alloc_pages_slowpath+0xa3a/0xbb0
[ 19.968362] __alloc_pages+0x2a6/0x320
[ 19.968362] allocate_slab+0x242/0x300
[ 19.968362] ___slab_alloc+0x1e3/0x740
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] __slab_alloc+0x27/0x60
[ 19.968362] ? __slab_alloc+0x27/0x60
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] kmem_cache_alloc+0x24d/0x290
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] alloc_inode+0x77/0x80
[ 19.968362] new_inode_pseudo+0xf/0x50
[ 19.968362] new_inode+0x17/0x30
[ 19.968362] ramfs_get_inode+0x35/0x150
[ 19.968362] ? _raw_spin_unlock_irqrestore+0x2d/0x60
[ 19.968362] ? trace_hardirqs_on+0x48/0xf0
[ 19.968362] ? _raw_spin_unlock_irqrestore+0x38/0x60
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] ramfs_mknod+0x36/0xa0
[ 19.968362] ? ramfs_mknod+0x36/0xa0
[ 19.968362] ? trace_preempt_on+0x36/0xd0
[ 19.968362] ? d_add+0xe4/0x1c0
[ 19.968362] ? ramfs_mkdir+0x50/0x50
[ 19.968362] ramfs_create+0x20/0x30
[ 19.968362] ? ramfs_create+0x20/0x30
[ 19.968362] path_openat+0x594/0xef0
[ 19.968362] do_filp_open+0x84/0xf0
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] file_open_name+0xaf/0x1b0
[ 19.968362] filp_open+0x27/0x50
[ 19.968362] do_name+0xbc/0x27c
[ 19.968362] write_buffer+0x22/0x31
[ 19.968362] flush_buffer+0x26/0x74
[ 19.968362] __gunzip+0x204/0x28a
[ 19.968362] ? decompress_method+0x3b/0x3b
[ 19.968362] ? __gunzip+0x28a/0x28a
[ 19.968362] gunzip+0x16/0x18
[ 19.968362] ? write_buffer+0x31/0x31
[ 19.968362] ? initrd_load+0x23d/0x23d
[ 19.968362] unpack_to_rootfs+0x15d/0x273
[ 19.968362] ? write_buffer+0x31/0x31
[ 19.968362] ? initrd_load+0x23d/0x23d
[ 19.968362] do_populate_rootfs+0x5f/0x13f
[ 19.968362] async_run_entry_fn+0x24/0xb0
[ 19.968362] process_one_work+0x1ee/0x540
[ 19.968362] worker_thread+0x15e/0x3c0
[ 19.968362] kthread+0x109/0x140
[ 19.968362] ? process_one_work+0x540/0x540
[ 19.968362] ? kthread_insert_work_sanity_check+0x60/0x60
[ 19.968362] ret_from_fork+0x19/0x30
[ 20.217543] Mem-Info:
[ 20.218400] active_anon:0 inactive_anon:0 isolated_anon:0
[ 20.218400] active_file:0 inactive_file:0 isolated_file:0
[ 20.218400] unevictable:85038 dirty:0 writeback:0
[ 20.218400] slab_reclaimable:7910 slab_unreclaimable:1402
[ 20.218400] mapped:0 shmem:0 pagetables:4 bounce:0
[ 20.218400] free:3987860 free_pcp:117 free_cma:0
[ 20.226846] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:340152kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:464kB pagetables:16kB all_unreclaimable? yes
[ 20.233380] DMA free:572kB min:2296kB low:2356kB high:2416kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4664kB writepending:0kB present:15992kB managed:15916kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 20.240158] lowmem_reserve[]: 0 48 15946 15946
[ 20.241518] Normal free:2816kB min:2816kB low:3008kB high:3200kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:10008kB writepending:0kB present:481272kB managed:49344kB mlocked:0kB bounce:0kB free_pcp:28kB local_pcp:16kB free_cma:0kB
[ 20.248350] lowmem_reserve[]: 0 0 127183 127183
[ 20.249784] HighMem free:15948052kB min:512kB low:64120kB high:127728kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:325496kB writepending:0kB present:16279432kB managed:16279432kB mlocked:0kB bounce:0kB free_pcp:440kB local_pcp:208kB free_cma:0kB
[ 20.257038] lowmem_reserve[]: 0 0 0 0
[ 20.258272] DMA: 0*4kB 0*8kB 1*16kB (E) 1*32kB (E) 0*64kB 0*128kB 0*256kB 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 560kB
[ 20.261437] Normal: 12*4kB (UME) 4*8kB (UM) 3*16kB (UM) 2*32kB (UE) 1*64kB (U) 2*128kB (UM) 1*256kB (M) 2*512kB (ME) 1*1024kB (U) 0*2048kB 0*4096kB = 2816kB
[ 20.265294] HighMem: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 0*64kB 2*128kB (UM) 2*256kB (UM) 1*512kB (M) 1*1024kB (M) 2*2048kB (M) 3892*4096kB (M) = 15948052kB
[ 20.269081] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 20.271583] 85050 total pagecache pages
[ 20.272767] 4194174 pages RAM
[ 20.273839] 4069858 pages HighMem/MovableOnly
[ 20.275209] 108001 pages reserved
[ 20.276264] Tasks state (memory values in pages):
[ 20.277590] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 20.280179] Out of memory and no killable processes...
[ 20.281709] Kernel panic - not syncing: System is deadlocked on memory
>
> Thanks,
> Rasmus
On 11/06/2021 17.29, Oliver Sang wrote:
> hi Rasmus,
>
> On Fri, Jun 11, 2021 at 11:20:18AM +0200, Rasmus Villemoes wrote:
>> OK. It's really odd that providing the VM with _more_ memory makes it
>> fail (other then the obvious failure in the other direction when there's
>> simply not enough memory for the unpacked initramfs itself). But
>> unfortunately that also sounds like I won't be able to reproduce with
>> the HW I have.
>>
>>>> As an extra data point, what happens if you add initramfs_async=0 to the
>>>> command line?
>>>
>>> yes, we tested this before sending out the report. the issue gone
>>> if initramfs_async=0 is added.
>>
>> Hm. Sounds like some initcall after rootfs_initcall time must
>> allocate/hog a lot of memory, perhaps with some heuristic depending on
>> how much is available.
>>
>> Can you try with initcall_debug=1? I think that should produce a lot of
>> output, hopefully that would make it possible to see which initcalls
>> have been done just prior to (or while) the initramfs unpacking hits ENOMEM.
>
> thanks a lot for guidance!
>
> by several jobs with initcall_debug=1, the oom happens consistently after
> "calling panel_init_module"
>
> below is an example, and detail dmesg.xz attached:
Interesting. Well, I'm not sure panel_init_module has anything to do
with it; there is an almost five-second gap until the OOM hits. So my
_guess_ is that we're just hitting
panel_init_module ->
parport_register_driver (macro wrapper for __parport_register_driver) ->
get_lowlevel_driver ->
request_module ->
...
wait_for_initramfs
which is why there's no more output from PID1. But just in case, can you
try disabling CONFIG_PARPORT_PANEL and see if you still hit the OOM?
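For reference, the mechanism I'm talking about is roughly the following - a
simplified sketch from memory, not the exact source (the initramfs_domain
name and the exact signatures are just how I'm writing it here):

#include <linux/async.h>
#include <linux/init.h>

static async_cookie_t initramfs_cookie;
static ASYNC_DOMAIN_EXCLUSIVE(initramfs_domain);

static void __init do_populate_rootfs(void *data, async_cookie_t cookie)
{
	/* unpack_to_rootfs() runs here, on an events_unbound worker -
	 * the kworker/u4:0 that invokes the oom-killer in your trace. */
}

static int __init populate_rootfs(void)
{
	/* Only schedule the unpacking, don't wait for it. */
	initramfs_cookie = async_schedule_domain(do_populate_rootfs,
						 NULL, &initramfs_domain);
	return 0;
}
rootfs_initcall(populate_rootfs);

void wait_for_initramfs(void)
{
	/* Anything that needs the rootfs (request_module() and friends)
	 * synchronizes on the cookie, so PID 1 sits here quietly while
	 * the unpack races whatever the later initcalls kicked off. */
	async_synchronize_cookie_domain(initramfs_cookie + 1, &initramfs_domain);
}

That is also why initramfs_async=0 makes the problem go away: the unpacking
then completes before any of the later initcalls get to run.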
It's not clear where all the memory goes. Is there any way to print some
memory statistics after each initcall?
You said that it was consistent, so perhaps something like this ugly
hack could be used to "bisect" to see which other initcall(s) we have a
bad interaction with:
diff --git a/init/initramfs.c b/init/initramfs.c
index af27abc59643..2386057f1faa 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -732,3 +732,19 @@ static int __init populate_rootfs(void)
return 0;
}
rootfs_initcall(populate_rootfs);
+
+static unsigned count;
+static int __init hack_setup(char *str)
+{
+ count = simple_strtoul(str, NULL, 0);
+ return 1;
+}
+__setup("hack=", hack_setup);
+
+void hack(void)
+{
+ if (!initramfs_cookie || !count)
+ return;
+ if (!--count)
+ wait_for_initramfs();
+}
diff --git a/init/main.c b/init/main.c
index e9c42a183e33..a96367080011 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1236,6 +1236,7 @@ static inline void do_trace_initcall_finish(initcall_t fn, int ret)
}
#endif /* !TRACEPOINTS_ENABLED */
+extern void hack(void);
int __init_or_module do_one_initcall(initcall_t fn)
{
int count = preempt_count();
@@ -1248,6 +1249,7 @@ int __init_or_module do_one_initcall(initcall_t fn)
do_trace_initcall_start(fn);
ret = fn();
do_trace_initcall_finish(fn, ret);
+ hack();
msgbuf[0] = 0;
(compile-tested only). By my count, there are 322 initcalls from
populate_rootfs() to panel_init_module() inclusive. Booting with hack=1
should be equivalent to initramfs_async=0 and hence succeed, and
hack=322 should make you see the above behaviour.
Rasmus
On 11/06/2021 10.48, Oliver Sang wrote:
>>> To reproduce:
>>>
>>> # build kernel
>>> cd linux
>>> cp config-5.12.0-11533-ge7cb072eb988 .config
>>> make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
>>> make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
>>> cd <mod-install-dir>
>>> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
Hm, when I do this with that exact commit, config, command lines and
Ubuntu's idea of what gcc-9 is, I get a modules.cgz of 73MB,
decompressing to 215MB. But in the two dmesg captures you've sent, there
are lines
[ 0.016519] RAMDISK: [mem 0x74c12000-0x7fffffff]
[ 0.023424] RAMDISK: [mem 0x74c16000-0x7fffffff]
corresponding to 189MB compressed. I'm wondering why there is such a huge
difference.
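(For reference, that figure is just the size of the quoted RAMDISK range:
0x7fffffff - 0x74c12000 + 1 = 0x0b3ee000 bytes, i.e. roughly 189MB, about
2.5 times the 73MB I get here.)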
Artificially inflating my modules.cgz to around the same size ("cp -r
lib/modules lib/modules2; cp -r lib/modules lib/modules3;") doesn't help
reproduce, unfortunately.
Rasmus