A kernel BUG at mm/vmalloc.c:3089 was hit on an x86_64 kernel built with
KASAN while running the LTP cgroup_fj_stress_memory_4_4_none test.
The same BUG was also seen on arm64 and i386 devices and on qemu.
metadata:
git branch: master
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
git commit: de2e69cfe54a8f2ed4b75f09d3110c514f45d38e
git describe: next-20200721
make_kernelversion: 5.8.0-rc6
kernel-config:
https://builds.tuxbuild.com/zU-I3LEfC1AaKQ59Er60ZQ/kernel.config
crash log:
[ 1421.080221] ------------[ cut here ]------------
[ 1421.084874] kernel BUG at mm/vmalloc.c:3089!
[ 1421.090356] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 1421.096009] CPU: 1 PID: 19100 Comm: kworker/1:1 Not tainted 5.8.0-rc6-next-20200721 #1
[ 1421.103933] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017
[ 1421.111418] Workqueue: events pcpu_balance_workfn
[ 1421.116138] RIP: 0010:free_vm_area+0x2d/0x30
[ 1421.120413] Code: e5 41 54 49 89 fc 48 83 c7 08 e8 9e 5e 04 00 49 8b 7c 24 08 e8 74 f8 ff ff 49 39 c4 75 0c 4c 89 e7 e8 97 d2 03 00 41 5c 5d c3 <0f> 0b 90 48 b8 00 00 00 00 00 fc ff df 55 48 89 e5 41 56 49 89 fe
[ 1421.139154] RSP: 0018:ffff88840142fc80 EFLAGS: 00010282
[ 1421.144381] RAX: 0000000000000000 RBX: ffff88841b843738 RCX: ffffffff86ca1d78
[ 1421.151515] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8883bfacd630
[ 1421.158647] RBP: ffff88840142fc88 R08: 0000000000000001 R09: ffffed1080285f7e
[ 1421.165780] R10: 0000000000000003 R11: ffffed1080285f7d R12: ffff888409e89880
[ 1421.172913] R13: ffff88841b843730 R14: 0000000000000080 R15: 0000000000000080
[ 1421.180045] FS: 0000000000000000(0000) GS:ffff88841fa80000(0000) knlGS:0000000000000000
[ 1421.188132] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1421.193876] CR2: 00007f1230b41080 CR3: 000000025d40e002 CR4: 00000000003706e0
[ 1421.201008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1421.208132] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1421.215255] Call Trace:
[ 1421.217703] pcpu_free_vm_areas+0x30/0x44
[ 1421.221714] pcpu_balance_workfn+0x7bd/0x8f0
[ 1421.225987] ? pcpu_create_chunk+0x2f0/0x2f0
[ 1421.230261] ? read_word_at_a_time+0x12/0x20
[ 1421.234531] ? strscpy+0xc1/0x190
[ 1421.237842] process_one_work+0x474/0x7b0
[ 1421.241856] worker_thread+0x7b/0x6a0
[ 1421.245521] ? wake_up_process+0x10/0x20
[ 1421.249448] ? process_one_work+0x7b0/0x7b0
[ 1421.253635] kthread+0x1aa/0x200
[ 1421.256867] ? kthread_create_on_node+0xd0/0xd0
[ 1421.261400] ret_from_fork+0x22/0x30
[ 1421.264978] Modules linked in: x86_pkg_temp_thermal
[ 1421.269869] ---[ end trace 6352cf97284f07da ]---
[ 1421.274955] RIP: 0010:free_vm_area+0x2d/0x30
[ 1421.281026] Code: e5 41 54 49 89 fc 48 83 c7 08 e8 9e 5e 04 00 49 8b 7c 24 08 e8 74 f8 ff ff 49 39 c4 75 0c 4c 89 e7 e8 97 d2 03 00 41 5c 5d c3 <0f> 0b 90 48 b8 00 00 00 00 00 fc ff df 55 48 89 e5 41 56 49 89 fe
[ 1421.300553] RSP: 0018:ffff88840142fc80 EFLAGS: 00010282
[ 1421.307051] RAX: 0000000000000000 RBX: ffff88841b843738 RCX: ffffffff86ca1d78
[ 1421.314184] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8883bfacd630
[ 1421.321317] RBP: ffff88840142fc88 R08: 0000000000000001 R09: ffffed1080285f7e
[ 1421.328477] R10: 0000000000000003 R11: ffffed1080285f7d R12: ffff888409e89880
[ 1421.335639] R13: ffff88841b843730 R14: 0000000000000080 R15: 0000000000000080
[ 1421.342777] FS: 0000000000000000(0000) GS:ffff88841fa80000(0000) knlGS:0000000000000000
[ 1421.350870] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1421.356643] CR2: 00007f1230b41080 CR3: 000000025d40e002 CR4: 00000000003706e0
[ 1421.363811] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1421.370951] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Full test log:
https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200721/testrun/2972982/suite/linux-log-parser/test/check-kernel-bug-1594684/log
--
Linaro LKFT
https://lkft.linaro.org
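
For triage it can help to reduce an oops like the one above to the BUG
location plus the reliable call-trace frames. On x86, frames prefixed with
"?" are addresses found on the stack that the unwinder could not verify, so
the sketch below skips them. This is only a rough illustration: the function
name summarize() and the regular expressions are assumptions made for this
example, not part of any LKFT or kernel tooling, and the sample is a subset
of the log lines from the report.

```python
import re

# A subset of the crash log from the report.
OOPS = """\
[ 1421.084874] kernel BUG at mm/vmalloc.c:3089!
[ 1421.116138] RIP: 0010:free_vm_area+0x2d/0x30
[ 1421.215255] Call Trace:
[ 1421.217703] pcpu_free_vm_areas+0x30/0x44
[ 1421.221714] pcpu_balance_workfn+0x7bd/0x8f0
[ 1421.225987] ? pcpu_create_chunk+0x2f0/0x2f0
[ 1421.237842] process_one_work+0x474/0x7b0
[ 1421.264978] Modules linked in: x86_pkg_temp_thermal
"""

STAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")          # "[ 1421.084874] " prefix
BUG = re.compile(r"kernel BUG at (\S+):(\d+)!")      # file and line of the BUG
FRAME = re.compile(r"^(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$")


def summarize(log):
    """Return ((file, line) or None, [reliable frame symbols])."""
    bug = None
    frames = []
    in_trace = False
    for raw in log.splitlines():
        line = STAMP.sub("", raw).strip()
        match = BUG.search(line)
        if match:
            bug = (match.group(1), int(match.group(2)))
            continue
        if line == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            frame = FRAME.match(line)
            if frame is None:
                in_trace = False          # first non-frame line ends the trace
            elif frame.group(1) is None:  # keep only frames without "?"
                frames.append(frame.group(2))
    return bug, frames


if __name__ == "__main__":
    print(summarize(OOPS))
```

Run against the full log, the reliable frames show the path from the
workqueue worker through pcpu_balance_workfn and pcpu_free_vm_areas into
free_vm_area, where the BUG fires.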
Adding Roman Gushchin to Cc, he touched that code recently.
Naresh, if nobody has any immediate ideas, you could double-check by
reverting these commits:
e0b8d00b7561 mm: memcg/percpu: per-memcg percpu memory statistics
99411af13595 mm/percpu: fix 'defined but not used' warning
9398ce6306b6 mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix
54116d471779 mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix
ec518e090843 mm: memcg/percpu: account percpu memory to memory cgroups
9bc897d18dc3 percpu: return number of released bytes from pcpu_free_area()
Arnd
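
The revert check suggested above could be scripted roughly as follows. This
is a sketch under assumptions: it assumes a checkout of next-20200721 (the
abbreviated commit IDs are only meaningful in that linux-next tree), and it
assumes the list is in newest-first order, as "git log" prints it, so that
reverting in listed order undoes the series cleanly. The helper names
revert_commands() and run() are made up for this example.

```python
import subprocess

# Commit IDs exactly as listed in the mail, assumed newest first.
COMMITS = [
    "e0b8d00b7561",  # mm: memcg/percpu: per-memcg percpu memory statistics
    "99411af13595",  # mm/percpu: fix 'defined but not used' warning
    "9398ce6306b6",  # ...-account-percpu-memory-to-memory-cgroups-fix-fix
    "54116d471779",  # ...-account-percpu-memory-to-memory-cgroups-fix
    "ec518e090843",  # mm: memcg/percpu: account percpu memory to memory cgroups
    "9bc897d18dc3",  # percpu: return number of released bytes from pcpu_free_area()
]


def revert_commands(commits):
    """Build one 'git revert --no-edit <commit>' argv per commit, in order."""
    return [["git", "revert", "--no-edit", commit] for commit in commits]


def run(commands, cwd):
    """Execute the commands in the given work tree, stopping on first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Print only; call run(revert_commands(COMMITS), "/path/to/linux-next")
    # to actually apply the reverts (the path is a placeholder).
    for argv in revert_commands(COMMITS):
        print(" ".join(argv))
```

Building the command list separately from running it makes it easy to
inspect what would be reverted before touching the tree.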
On Wed, Jul 22, 2020 at 10:12 AM Naresh Kamboju
<[email protected]> wrote:
>
> Kernel BUG at mm/vmalloc.c:3089! on x86_64 Kasan configured kernel reported
> this while testing LTP cgroup_fj_stress_memory_4_4_none test cases.
>
> Also found on arm64 and i386 devices and qemu.
>
On Wed, Jul 22, 2020 at 1:55 AM Arnd Bergmann <[email protected]> wrote:
>
> Adding Roman Gushchin to Cc, he touched that code recently.
>
> Naresh, if nobody has any immediate ideas, you could double-check by
> reverting these commits:
>
> e0b8d00b7561 mm: memcg/percpu: per-memcg percpu memory statistics
> 99411af13595 mm/percpu: fix 'defined but not used' warning
> 9398ce6306b6 mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix
> 54116d471779 mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix
> ec518e090843 mm: memcg/percpu: account percpu memory to memory cgroups
> 9bc897d18dc3 percpu: return number of released bytes from pcpu_free_area()
>
> Arnd
>
I think syzbot has bisected this issue to the suspect patch.
https://lore.kernel.org/lkml/[email protected]/