Hello everybody,
I see this on a git server with lots of ssh logins. It happens every few
hours to days. No idea how to reproduce, guess it's a race condition?
general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP NOPTI
CPU: 3 PID: 2213757 Comm: kworker/3:2 Not tainted 5.10.18-1-lts #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/28/2020
Workqueue: cgroup_destroy css_release_work_fn
RIP: 0010:css_release_work_fn+0x3c/0x200
Code: 54 55 53 48 89 fb 48 8b 6f a0 4c 8b 67 98 48 c7 c7 80 d3 ed b4 e8 d4 83 91 00 48 8b 43 c0 48 8b 53 b8 83 4b ec 04 48 89 42 08 <48> 89 10 4c 89 6b c0 48 85 ed 0f 84 ab 00 00 00 48 8b 53 d8 48 8d
RSP: 0018:ffffb75f4098fe78 EFLAGS: 00010206
RAX: dead000000000122 RBX: ffff9b112c157068 RCX: ffff9b117ddab5a0
RDX: ffff9b110e5c2020 RSI: 807fffffffffffff RDI: ffffffffb4edd380
RBP: ffffffffb4ff6120 R08: ffff9b11012e7eb0 R09: ffff9b11234f5c74
R10: 0000000000000018 R11: 0000000000000018 R12: ffff9b110c01a000
R13: dead000000000122 R14: ffff9b117ddb2600 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9b117dd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0ff5098620 CR3: 000000000e740000 CR4: 00000000003506e0
Call Trace:
process_one_work+0x1df/0x370
worker_thread+0x50/0x400
? process_one_work+0x370/0x370
kthread+0x11b/0x140
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
Modules linked in: vsock_loopback vmw_vsock_virtio_transport_common nf_tables vmw_vsock_vmci_transport vsock libcrc32c nfnetlink vmwgfx amd_energy joydev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mousedev aesni_intel crypto_simd vmw_balloon ttm cryptd glue_helper rapl drm_kms_helper psmouse pcspkr vmxnet3 cec syscopyarea sysfillrect sysimgblt intel_agp vmw_vmci fb_sys_fops i2c_piix4 intel_gtt mac_hid drm sg fuse agpgart ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_mod sr_mod cdrom ata_generic pata_acpi crc32c_intel serio_raw vmw_pvscsi ata_piix
---[ end trace e3405678b69341c6 ]---
RIP: 0010:css_release_work_fn+0x3c/0x200
Code: 54 55 53 48 89 fb 48 8b 6f a0 4c 8b 67 98 48 c7 c7 80 d3 ed b4 e8 d4 83 91 00 48 8b 43 c0 48 8b 53 b8 83 4b ec 04 48 89 42 08 <48> 89 10 4c 89 6b c0 48 85 ed 0f 84 ab 00 00 00 48 8b 53 d8 48 8d
RSP: 0018:ffffb75f4098fe78 EFLAGS: 00010206
RAX: dead000000000122 RBX: ffff9b112c157068 RCX: ffff9b117ddab5a0
RDX: ffff9b110e5c2020 RSI: 807fffffffffffff RDI: ffffffffb4edd380
RBP: ffffffffb4ff6120 R08: ffff9b11012e7eb0 R09: ffff9b11234f5c74
R10: 0000000000000018 R11: 0000000000000018 R12: ffff9b110c01a000
R13: dead000000000122 R14: ffff9b117ddb2600 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9b117dd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0ff5098620 CR3: 000000000e740000 CR4: 00000000003506e0
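As an aside on the faulting address: 0xdead000000000122 (held in both RAX and R13 above) matches the value of LIST_POISON2 on x86_64, assuming the default CONFIG_ILLEGAL_POINTER_VALUE of 0xdead000000000000. The sketch below is only a userspace illustration of that arithmetic. The constants are copied by hand from include/linux/poison.h, and the names classify_poison() and is_canonical48() are made up for this example.

```c
#include <stdint.h>

/* Assumed values, mirroring include/linux/poison.h on x86_64 with
 * CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1 (0x100ULL + POISON_POINTER_DELTA)
#define LIST_POISON2 (0x122ULL + POISON_POINTER_DELTA)

/* Hypothetical result type for this example only. */
enum poison_kind { NOT_POISON, POISON_LIST1, POISON_LIST2, POISON_OTHER };

/* Classify an address taken from an oops register dump. */
enum poison_kind classify_poison(uint64_t addr)
{
    if (addr == LIST_POISON1)
        return POISON_LIST1;
    if (addr == LIST_POISON2)
        return POISON_LIST2;
    /* Anything else in the 0xdead... range is some other poison value. */
    if ((addr & 0xffff000000000000ULL) == POISON_POINTER_DELTA)
        return POISON_OTHER;
    return NOT_POISON;
}

/* With 48-bit virtual addresses, bits 63..47 must be all zeros or all
 * ones for the address to be canonical. Returns 1 if canonical. */
int is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the 17 uppermost bits */
    return top == 0 || top == 0x1ffffULL;
}
```

For the address from the trace, classify_poison() yields POISON_LIST2 and is_canonical48() yields 0, which is consistent with the "non-canonical address" wording of the fault. RBX (ffff9b112c157068) is canonical by the same check.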
--
main(a){char*c=/* Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/* Best regards my address: */=0;b=c[a++];)
putchar(b-1/(/* Chris cc -ox -xc - && ./x */b/42*2-3)*42);}
Hello everybody,
Christian Hesse <[email protected]> on Tue, 2021/03/02 09:34:
> I see this on a git server with lots of ssh logins. It happens every few
> hours to days. No idea how to reproduce, guess it's a race condition?
>
> general protection fault, probably for non-canonical address
> 0xdead000000000122: 0000 [#1] SMP NOPTI CPU: 3 PID: 2213757 Comm:
> kworker/3:2 Not tainted 5.10.18-1-lts #1
I've seen more crashes with 5.10.23-1-lts and 5.11.6-arch1-1. Looks like
5.11.2-arch1-1 is stable for now, but I did not test everything in between.
Does anybody have an idea which commit could have caused this?
Christian Hesse <[email protected]> on Mon, 2021/03/15 14:10:
> Hello everybody,
>
> Christian Hesse <[email protected]> on Tue, 2021/03/02 09:34:
> > I see this on a git server with lots of ssh logins. It happens every few
> > hours to days. No idea how to reproduce, guess it's a race condition?
> >
> > general protection fault, probably for non-canonical address
> > 0xdead000000000122: 0000 [#1] SMP NOPTI CPU: 3 PID: 2213757 Comm:
> > kworker/3:2 Not tainted 5.10.18-1-lts #1
>
> I've seen more crashes with 5.10.23-1-lts and 5.11.6-arch1-1. Looks like
> 5.11.2-arch1-1 is stable for now, but I did not test everything in between.
I had several more crashes, the latest with 5.10.29-1-lts. So... does anybody
have an idea which commit could have caused this?
Hillf Danton <[email protected]> on Mon, 2021/04/12 16:05:
> Looks like double free or use after free based on 0xdead.
> If possible, would you try the mainline with KASAN enabled, given the fear
> that few guys can find time for 5.10 this week?
Currently running 5.11.13 with KASAN enabled for about a week. Either this
has been fixed lately or I am suffering a race that does not happen with
KASAN enabled.
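One scenario that would leave LIST_POISON2 in a prev pointer is a second list_del_rcu() on an entry that was already removed, since list_del_rcu() poisons only prev and leaves next intact. Whether that is what happens here is a guess, not something the trace proves. A minimal userspace sketch of the idea follows, assuming a 64-bit build. The helper list_del_rcu_checked() is hypothetical and reports the bad store instead of faulting.

```c
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Assumed x86_64 value of LIST_POISON2 (0x122 + 0xdead000000000000). */
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* Make an empty circular list. */
void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Hypothetical stand-in for list_del_rcu(): unlink the entry and poison
 * only prev. Returns -1 instead of storing through a poisoned prev,
 * which is the store that would fault in the kernel. */
int list_del_rcu_checked(struct list_head *entry)
{
    if (entry->prev == LIST_POISON2)
        return -1;                   /* entry was already deleted */
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next; /* faulting store if prev is poison */
    entry->prev = LIST_POISON2;      /* next stays valid for RCU readers */
    return 0;
}
```

Deleting the same entry twice makes the second call return -1. In the kernel the equivalent store would go through prev, so the fault address would be the poison value itself while next still points at valid memory.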
Christian Hesse <[email protected]> on Mon, 2021/04/19 12:55:
> Hillf Danton <[email protected]> on Mon, 2021/04/12 16:05:
> > Looks like double free or use after free based on 0xdead.
> > If possible, would you try the mainline with KASAN enabled, given the fear
> > that few guys can find time for 5.10 this week?
>
> Currently running 5.11.13 with KASAN enabled for about a week. Either this
> has been fixed lately or I am suffering a race that does not happen with
> KASAN enabled.
I have been running 5.11.16 for about a week now, and it looks stable.
I still saw a crash with a recent 5.10.x, though... Looks like I have to build
that with KASAN as well.