Hi everyone,
While running the XFS fuzz test suite (which launches ~50 VMs, enough
to eat nearly all the DRAM on the system) on a VM host that I'd
recently upgraded to 5.14.1, I noticed the following crash in dmesg on
the host:
BUG: kernel NULL pointer dereference, address: 0000000000000068
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 6 PID: 4173897 Comm: CPU 3/KVM Tainted: G W 5.14.1-67-server #67.3
RIP: 0010:internal_get_user_pages_fast+0x621/0x9d0
Code: f7 c2 00 00 01 00 0f 85 94 fb ff ff 48 8b 4c 24 18 48 8d bc 24 8c 00 00 00 8b b4 24 8c 00 00 00 e8 b4 cd ff ff e9 76 fb ff ff <48> 81 7a 68 80 08 04 bc 0f 85 21 ff ff ff 8b 54 24 68 48 89 c7 be
RSP: 0018:ffffaa90087679b0 EFLAGS: 00010046
RAX: ffffe3f37905b900 RBX: 00007f2dd561e000 RCX: ffffe3f37905b934
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe3f37905b900
RBP: 00007f2dd561f000 R08: ffffe3f37905b900 R09: 000000049b4b6000
R10: ffffaa9008767b17 R11: 0000000000000000 R12: 8000000e416e4067
R13: ffff9dc39b4b60f0 R14: 000000ffffffffff R15: ffffe3f37905b900
FS: 00007f2e07fff700(0000) GS:ffff9dcf3f980000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 00000004c5898003 CR4: 00000000001726e0
Call Trace:
get_user_pages_fast_only+0x13/0x20
hva_to_pfn+0xa9/0x3e0
? check_preempt_wakeup+0xec/0x230
try_async_pf+0xa1/0x270
? try_to_wake_up+0x1f0/0x580
? generic_exec_single+0x50/0xa0
direct_page_fault+0x113/0xad0
kvm_mmu_page_fault+0x69/0x680
? __schedule+0x301/0x13f0
? enqueue_hrtimer+0x2f/0x80
? vmx_sync_pir_to_irr+0x73/0x100
? vmx_set_hv_timer+0x31/0x100
vmx_handle_exit+0xe1/0x5d0
kvm_arch_vcpu_ioctl_run+0xd81/0x1c70
? kvm_vcpu_ioctl+0xe8/0x670
kvm_vcpu_ioctl+0x267/0x670
? kvm_on_user_return+0x7e/0x80
? fire_user_return_notifiers+0x38/0x50
__x64_sys_ioctl+0x83/0xa0
do_syscall_64+0x56/0x80
? do_syscall_64+0x63/0x80
? syscall_exit_to_user_mode+0x1d/0x40
? do_syscall_64+0x63/0x80
? do_syscall_64+0x63/0x80
? asm_exc_page_fault+0x5/0x20
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f2e2018c50b
Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007f2e07ffe5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f2e2018c50b
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001e
RBP: 0000556cd92161e0 R08: 0000556cd823b1d0 R09: 00000000000000ff
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 0000556cd88a8080 R14: 0000000000000001 R15: 0000000000000000
Modules linked in: vhost_net vhost vhost_iotlb tap nfsv4 nfs xt_REDIRECT md5 xt_CHECKSUM xt_MASQUERADE ip6table_mangle ip6table_nat iptable_mangle ebtable_filter ebtables tun iptable_nat nf_nat joydev af_packet bonding bridge stp llc ip_set_hash_ip ip_set_hash_net xt_set tcp_diag udp_diag raw_diag inet_diag ip_set_hash_mac ip_set nfnetlink binfmt_misc nls_iso8859_1 nls_cp437 vfat fat bfq ipmi_ssif intel_rapl_msr at24 regmap_i2c intel_rapl_common iosf_mbi wmi ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nfsd nf_reject_ipv4 xt_comment xt_limit xt_addrtype xt_tcpudp auth_rpcgss xt_conntrack nf_conntrack nfs_acl lockd nf_defrag_ipv6 nf_defrag_ipv4 grace ip6table_filter ip6_tables sunrpc iptable_filter ip_tables x_tables uas usb_storage megaraid_sas
CR2: 0000000000000068
---[ end trace 09ba7735db5e61a6 ]---
RIP: 0010:internal_get_user_pages_fast+0x621/0x9d0
Code: f7 c2 00 00 01 00 0f 85 94 fb ff ff 48 8b 4c 24 18 48 8d bc 24 8c 00 00 00 8b b4 24 8c 00 00 00 e8 b4 cd ff ff e9 76 fb ff ff <48> 81 7a 68 80 08 04 bc 0f 85 21 ff ff ff 8b 54 24 68 48 89 c7 be
RSP: 0018:ffffaa90087679b0 EFLAGS: 00010046
RAX: ffffe3f37905b900 RBX: 00007f2dd561e000 RCX: ffffe3f37905b934
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe3f37905b900
RBP: 00007f2dd561f000 R08: ffffe3f37905b900 R09: 000000049b4b6000
R10: ffffaa9008767b17 R11: 0000000000000000 R12: 8000000e416e4067
R13: ffff9dc39b4b60f0 R14: 000000ffffffffff R15: ffffe3f37905b900
FS: 00007f2e07fff700(0000) GS:ffff9dcf3f980000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 00000004c5898003 CR4: 00000000001726e0
I also noticed that a number of the VMs seemed to be totally livelocked
on "memset_erms" and the only thing I could do was terminate them all.
I'll dig into this more tomorrow, but on the off chance this rings a
bell for anyone, is this a known error?
--Darrick
I've noticed this as well on 5.14.1 and 5.14.3. On 5.14.1, I was using
the system (my desktop), and the VMs started to die followed by the host
system slowly grinding to a halt.
On 5.14.3, it died overnight. 5.13.x doesn't seem to be affected from
what I've seen.
The crash happens after a number of days (or a week+) with only 3 VMs
running and desktop usage.
BUG: kernel NULL pointer dereference, address: 0000000000000068
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 3 PID: 1185197 Comm: CPU 7/KVM Tainted: G E 5.14.3 #26
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
AORUS ELITE WIFI, BIOS F35 07/08/2021
RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
68 a0 a3 63 82 0f 85 14 fe ff ff 44 89 f2 be 01 00 00 00
RSP: 0018:ffffa701c4d5bb40 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffcc156c899980 RCX: ffffcc156c8999b4
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffcc156c899980
RBP: 00007f1e2aec1000 R08: 0000000000000000 R09: ffffcc156c899980
R10: 0000000000000000 R11: 000000000000000c R12: ffff9176c47bd600
R13: 000000ffffffffff R14: 0000000000080005 R15: 8000000b22666867
FS: 00007f1d4a5fc700(0000) GS:ffff91817eac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 000000074f36a000 CR4: 0000000000350ee0
Call Trace:
get_user_pages_fast_only+0x13/0x20
__direct_pte_prefetch+0x12d/0x240 [kvm]
? mmu_set_spte+0x335/0x4d0 [kvm]
? kvm_mmu_max_mapping_level+0xda/0xf0 [kvm]
direct_page_fault+0x850/0xab0 [kvm]
? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
kvm_check_async_pf_completion+0x9a/0x110 [kvm]
kvm_arch_vcpu_ioctl_run+0x1667/0x16a0 [kvm]
kvm_vcpu_ioctl+0x267/0x650 [kvm]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f1e5a033cc7
Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d 99 91 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f1d4a5fb5c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f1e5a033cc7
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001c
RBP: 0000564ae48beaa0 R08: 0000564ae30405b8 R09: 00000000000000ff
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 0000564ae348b020 R14: 0000000000000001 R15: 0000000000000000
Modules linked in: md4(E) nls_utf8(E) cifs(E) dns_resolver(E) fscache(E)
netfs(E) libdes(E) ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E)
jfs(E) xfs(E) cpuid(E) ses(E) enclosure(E) scsi_transport_sas(E)
udp_diag(E) tcp_diag(E) inet_diag(E) dm_mod(E) vhost_net(E) tun(E)
vhost(E) vhost_iotlb(E) macvtap(E) macvlan(E) tap(E) xt_addrtype(E)
xt_nat(E) wireguard(E) libchacha20poly1305(E) chacha_x86_64(E)
poly1305_x86_64(E) libblake2s(E) blake2s_x86_64(E) curve25519_x86_64(E)
libcurve25519_generic(E) libchacha(E) libblake2s_generic(E)
ip6_udp_tunnel(E) udp_tunnel(E) rfcomm(E) snd_seq_dummy(E)
snd_hrtimer(E) snd_seq(E) ip6t_REJECT(E) nf_reject_ipv6(E)
xt_multiport(E) xt_cgroup(E) xt_mark(E) xt_owner(E) xt_CHECKSUM(E)
cdc_acm(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E)
nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E)
nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) cmac(E)
algif_hash(E) algif_skcipher(E) af_alg(E) nft_counter(E) nf_tables(E)
bridge(E) stp(E) llc(E) nfnetlink(E) bnep(E) binfmt_misc(E)
intel_rapl_msr(E) intel_rapl_common(E) edac_mce_amd(E) btusb(E) btrtl(E)
snd_hda_codec_realtek(E) btbcm(E) btintel(E) snd_hda_codec_generic(E)
ledtrig_audio(E) bluetooth(E) snd_hda_codec_hdmi(E) snd_hda_intel(E)
snd_intel_dspcfg(E) snd_intel_sdw_acpi(E) uvcvideo(E)
jitterentropy_rng(E) snd_hda_codec(E) snd_usb_audio(E)
videobuf2_vmalloc(E) videobuf2_memops(E) snd_usbmidi_lib(E)
snd_hda_core(E) iwlmvm(E) snd_rawmidi(E) sha512_ssse3(E) kvm_amd(E)
videobuf2_v4l2(E) snd_hwdep(E) snd_seq_device(E) sha512_generic(E)
nls_ascii(E) mac80211(E) videobuf2_common(E) snd_pcm(E) nls_cp437(E)
libarc4(E) drbg(E) snd_timer(E) ansi_cprng(E) kvm(E) videodev(E) vfat(E)
irqbypass(E) iwlwifi(E) sp5100_tco(E) ecdh_generic(E) ccp(E) fat(E)
mc(E) joydev(E) rapl(E) watchdog(E) ecc(E) k10temp(E) wmi_bmof(E)
efi_pstore(E) pcspkr(E) snd(E) rng_core(E) soundcore(E) sg(E)
cfg80211(E) rfkill(E) evdev(E) acpi_cpufreq(E) msr(E) parport_pc(E)
ppdev(E) lp(E) parport(E) fuse(E) configfs(E) sunrpc(E) efivarfs(E)
ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E)
btrfs(E) blake2b_generic(E) zstd_compress(E) raid10(E) raid1(E) raid0(E)
multipath(E) linear(E) raid456(E) async_raid6_recov(E) async_memcpy(E)
async_pq(E) async_xor(E) async_tx(E) xor(E) hid_logitech_hidpp(E)
raid6_pq(E) libcrc32c(E) crc32c_generic(E) md_mod(E) sr_mod(E) cdrom(E)
sd_mod(E) hid_logitech_dj(E) hid_generic(E) usbhid(E) hid(E) uas(E)
usb_storage(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E)
amdgpu(E) gpu_sched(E) drm_ttm_helper(E) nvme(E) ttm(E) nvme_core(E)
aesni_intel(E) t10_pi(E) igb(E) ahci(E) libaes(E) drm_kms_helper(E)
libahci(E) crc_t10dif(E) xhci_pci(E) dca(E) crypto_simd(E) cec(E) ptp(E)
crct10dif_generic(E) cryptd(E) libata(E) xhci_hcd(E) pps_core(E)
crct10dif_pclmul(E) i2c_piix4(E) drm(E) scsi_mod(E) usbcore(E)
i2c_algo_bit(E) crct10dif_common(E) wmi(E) button(E)
CR2: 0000000000000068
---[ end trace 1b0e733016be1d2c ]---
RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
68 a0 a3 63 82 0f 85 14 fe ff ff 44 89 f2 be 01 00 00 00
RSP: 0018:ffffa701c4d5bb40 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffcc156c899980 RCX: ffffcc156c8999b4
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffcc156c899980
RBP: 00007f1e2aec1000 R08: 0000000000000000 R09: ffffcc156c899980
R10: 0000000000000000 R11: 000000000000000c R12: ffff9176c47bd600
R13: 000000ffffffffff R14: 0000000000080005 R15: 8000000b22666867
FS: 00007f1d4a5fc700(0000) GS:ffff91817eac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 000000074f36a000 CR4: 0000000000350ee0
watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [CPU 0/KVM:1185190]
These messages were retrieved via `journalctl --boot=-1`.
After this error, there are fairly continuous "soft lockup" messages and
stacks in journalctl, and the system had to be hard-booted to get an
interactive environment back. Screens would not wake, and SSH was not
accessible. Even REISUB didn't seem to work.
Just in case it's useful: I'm running a Ryzen 9 3900X, 64GB of RAM, a 970
Evo Plus for / (ext4), RAID6 (4x8TB SATA drives) for /home (ext4), and a
reference AMD Radeon 6700XT. The VMs are running off of the 970 Evo Plus.
I'm now running 5.14.6 after the latest boot; if I run into this again,
I'll follow up.
Hello,
I got this crash again on 5.14.7 in the early morning of the 27th.
Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
minutes.
I've rolled back to 5.13.19 for now, since this bug seems to affect my
system every one to a few days. Please let me know if there's any
additional useful information I can provide.
The VMs that I'm running include a Gitlab server + runner, Matrix
Synapse server, and a Minecraft server.
The Minecraft server seems to keep 1 core maxed out fairly regularly;
otherwise they all have fairly low CPU load. The Minecraft server keeps
memory usage maxed out; the other two VMs seem to use ~50-75% of RAM as
reported by the Virtual Machine Manager UI and maybe a few % of CPU time
at any given point.
--------------------
BUG: kernel NULL pointer dereference, address: 0000000000000068
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G E 5.14.7 #32
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
AORUS ELITE WIFI, BIOS F35 07/08/2021
RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
68 a0 a3 >
RSP: 0018:ffffb31845c43b40 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffe8499635d580 RCX: ffffe8499635d5b4
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffe8499635d580
RBP: 00007f230d9c2000 R08: 0000000000000000 R09: ffffe8499635d580
R10: 0000000000000000 R11: 000000000000000c R12: ffff8f54d1bbbe00
R13: 000000ffffffffff R14: 0000000000080005 R15: 800000058d756867
FS: 00007f22795fa640(0000) GS:ffff8f617ef40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 000000022a4a0000 CR4: 0000000000350ee0
Call Trace:
get_user_pages_fast_only+0x13/0x20
__direct_pte_prefetch+0x12d/0x240 [kvm]
? mmu_set_spte+0x335/0x4d0 [kvm]
? kvm_mmu_max_mapping_level+0xf0/0x100 [kvm]
direct_page_fault+0x850/0xab0 [kvm]
? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
kvm_check_async_pf_completion+0x9a/0x110 [kvm]
kvm_arch_vcpu_ioctl_run+0x1667/0x16a0 [kvm]
kvm_vcpu_ioctl+0x267/0x650 [kvm]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f238d1be957
Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0
5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff >
RSP: 002b:00007f22795f9528 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f238d1be957
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001c
RBP: 000055e40b2f9870 R08: 000055e40aa305b8 R09: 000000000000002c
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 000055e40ae7b020 R14: 0000000000000000 R15: 0000000000000000
Modules linked in: md4(E) nls_utf8(E) cifs(E) dns_resolver(E) fscache(E)
netfs(E) libdes(E) ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) msdos(E)
jfs(E) xf>
binfmt_misc(E) intel_rapl_msr(E) intel_rapl_common(E) btusb(E) btrtl(E)
edac_mce_amd(E) btbcm(E) btintel(E) kvm_amd(E) uvcvideo(E) bluetooth(E)
videobu>
sunrpc(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E)
crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) zstd_compress(E)
raid10(E) ra>
CR2: 0000000000000068
---[ end trace ce417e1d9ee841db ]---
RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
68 a0 a3 >
RSP: 0018:ffffb31845c43b40 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffe8499635d580 RCX: ffffe8499635d5b4
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffe8499635d580
RBP: 00007f230d9c2000 R08: 0000000000000000 R09: ffffe8499635d580
R10: 0000000000000000 R11: 000000000000000c R12: ffff8f54d1bbbe00
R13: 000000ffffffffff R14: 0000000000080005 R15: 800000058d756867
FS: 00007f22795fa640(0000) GS:ffff8f617ef40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 000000022a4a0000 CR4: 0000000000350ee0
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 9-....: (5233 ticks this GP) idle=5d2/1/0x4000000000000000
softirq=8121851/8121851 fqs=2619
(t=5250 jiffies g=17211713 q=5567)
Sending NMI from CPU 9 to CPUs 7:
NMI backtrace for cpu 7
CPU: 7 PID: 8492 Comm: CPU 5/KVM Tainted: G D E 5.14.7 #32
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
AORUS ELITE WIFI, BIOS F35 07/08/2021
RIP: 0010:native_queued_spin_lock_slowpath+0x19c/0x1d0
Code: c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 d7 02 00
48 03 04 f5 80 2a 19 b6 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08
85 c0 74 >
RSP: 0018:ffffb31845be3cb0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffb31847e45000 RCX: 0000000000200000
RDX: ffff8f617ebed700 RSI: 000000000000000c RDI: ffffb31847e45004
RBP: ffffb31847e45004 R08: 0000000000200000 R09: ffffb31845be3d0c
R10: 8000000000000000 R11: 000000000000000c R12: 0000000000000000
R13: 0000000111bd11c0 R14: 0000000000000000 R15: ffff8f55d0e60000
FS: 00007f227a5fc640(0000) GS:ffff8f617ebc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f87754dd9a8 CR3: 000000022a4a0000 CR4: 0000000000350ee0
Call Trace:
queued_write_lock_slowpath+0x73/0x80
direct_page_fault+0x639/0xab0 [kvm]
? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
kvm_check_async_pf_completion+0x9a/0x110 [kvm]
kvm_arch_vcpu_ioctl_run+0x1667/0x16a0 [kvm]
kvm_vcpu_ioctl+0x267/0x650 [kvm]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f238d1be957
Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0
5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff >
RSP: 002b:00007f227a5fb528 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f238d1be957
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001a
RBP: 000055e40b2de890 R08: 000055e40aa305b8 R09: c000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 000055e40ae7b020 R14: 0000000000000001 R15: 0000000000000000
NMI backtrace for cpu 9
CPU: 9 PID: 8493 Comm: CPU 6/KVM Tainted: G D E 5.14.7 #32
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
AORUS ELITE WIFI, BIOS F35 07/08/2021
Call Trace:
<IRQ>
dump_stack_lvl+0x46/0x5a
nmi_cpu_backtrace.cold+0x32/0x69
? lapic_can_unplug_cpu+0x80/0x80
nmi_trigger_cpumask_backtrace+0xd7/0xe0
rcu_dump_cpu_stacks+0xc1/0xef
rcu_sched_clock_irq.cold+0xc7/0x1e9
update_process_times+0x8c/0xc0
tick_sched_handle+0x22/0x60
tick_sched_timer+0x7a/0xd0
? tick_do_update_jiffies64.part.0+0xa0/0xa0
__hrtimer_run_queues+0x12a/0x270
hrtimer_interrupt+0x110/0x2c0
__sysvec_apic_timer_interrupt+0x5c/0xd0
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:queued_write_lock_slowpath+0x56/0x80
Code: 0d 48 89 ef c6 07 00 0f 1f 40 00 5b 5d c3 f0 81 0b 00 01 00 00 ba
ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 <8b> 03 3d
00 01 00 >
RSP: 0018:ffffb31845c1bc28 EFLAGS: 00000206
RAX: 00000000000001ff RBX: ffffb31847e45000 RCX: 0000000000000100
RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffffb31847e45000
RBP: ffffb31847e45004 R08: 0000000000000007 R09: ffffb31845c1bc7c
R10: 8000000000000000 R11: 000fffffffe00000 R12: 0000000000000000
R13: 0000000111bc0010 R14: 0000000000000000 R15: ffff8f55d0e62290
direct_page_fault+0x639/0xab0 [kvm]
<snip>
Thanks,
Stephen
On Tue, Sep 28, 2021, Stephen wrote:
> Hello,
>
> I got this crash again on 5.14.7 in the early morning of the 27th.
> Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
> minutes.
...
> BUG: kernel NULL pointer dereference, address: 0000000000000068
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G E 5.14.7 #32
> Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
> AORUS ELITE WIFI, BIOS F35 07/08/2021
> RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
> Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
> 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
> 68 a0 a3 >
I haven't reproduced the crash, but the code signature (CMP against an absolute
address) is quite distinct, and is consistent across all three crashes. I'm pretty
sure the issue is that page_is_secretmem() doesn't check for a null page->mapping,
e.g. if the page is truncated, which IIUC can happen in parallel since gup() doesn't
hold the lock.
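For anyone who wants to check that signature by hand: the bytes at the
faulting RIP (right after the <48> marker in the Code: lines) decode as a
64-bit CMP of a sign-extended 32-bit immediate against memory at
base register + 0x68. Below is a minimal userspace sketch that decodes only
that one instruction form. The names (struct cmp_insn, decode_cmp,
fault_address) are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: fields of one "REX.W 81 /7, disp8, imm32" insn. */
struct cmp_insn {
	unsigned base_reg;	/* ModRM.rm: 0 = RAX, 2 = RDX, ... */
	int8_t disp;		/* 8-bit displacement added to the base */
	uint64_t imm;		/* imm32, sign-extended to 64 bits */
};

/*
 * Decode only the single instruction form seen in the oopses:
 *   48 81 /7 disp8 imm32  ->  cmpq $imm32, disp8(%base)
 * Returns 0 on success, -1 for anything else.
 */
int decode_cmp(const uint8_t *b, size_t len, struct cmp_insn *out)
{
	uint8_t modrm;
	uint32_t imm32;

	if (len < 8 || b[0] != 0x48 || b[1] != 0x81)
		return -1;

	modrm = b[2];
	if ((modrm >> 6) != 1)		/* mod must be 01: [reg + disp8] */
		return -1;
	if (((modrm >> 3) & 7) != 7)	/* opcode extension /7 selects CMP */
		return -1;
	if ((modrm & 7) == 4)		/* rm == 100 means a SIB byte follows */
		return -1;

	out->base_reg = modrm & 7;
	out->disp = (int8_t)b[3];

	/* imm32 is stored little-endian. */
	imm32 = (uint32_t)b[4] | (uint32_t)b[5] << 8 |
		(uint32_t)b[6] << 16 | (uint32_t)b[7] << 24;

	/* Sign-extend to 64 bits, as the CPU does for this opcode. */
	out->imm = imm32;
	if (imm32 & 0x80000000u)
		out->imm |= 0xffffffff00000000ull;
	return 0;
}

/* Address the CMP reads from, given the base register value in the dump. */
uint64_t fault_address(const struct cmp_insn *insn, uint64_t base_value)
{
	return base_value + (uint64_t)(int64_t)insn->disp;
}
```

Feeding it "48 81 78 68 a0 a3 63 82" (the 5.14.3 and 5.14.7 reports) gives
base register RAX, displacement 0x68 and immediate 0xffffffff8263a3a0;
"48 81 7a 68 80 08 04 bc" (the 5.14.1 report) gives base register RDX and
immediate 0xffffffffbc040880. In each dump that base register holds 0, so
the read lands at 0x68, which matches CR2. That is what one would expect
from mapping->a_ops with a NULL mapping, assuming a_ops sits at offset 0x68
in these builds.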
I think this should fix the problems?
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 21c3771e6a56..988528b5da43 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
mapping = (struct address_space *)
((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
- if (mapping != page->mapping)
+ if (!mapping || mapping != page->mapping)
return false;
return mapping->a_ops == &secretmem_aops;
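To see what the extra !mapping test changes, here is a small userspace
model of the patched check. It is only a sketch under stated assumptions:
the structs are stripped-down stand-ins for the kernel types, the
PAGE_MAPPING_FLAGS value of 0x3 is assumed rather than taken from the
headers, and page_is_secretmem_model and other_aops are hypothetical names.
It is single-threaded, so it does not reproduce the race with truncation,
only the resulting NULL mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types; only the fields the check touches. */
struct address_space_operations { int unused; };
struct address_space { const struct address_space_operations *a_ops; };
struct page { struct address_space *mapping; };

/* Assumed value: the low pointer bits used as tags on page->mapping. */
#define PAGE_MAPPING_FLAGS 0x3UL

/* Two distinct ops tables so the final comparison has something to tell apart. */
const struct address_space_operations secretmem_aops = { 0 };
const struct address_space_operations other_aops = { 0 };

/* Userspace model of page_is_secretmem() with the proposed NULL check. */
bool page_is_secretmem_model(const struct page *page)
{
	struct address_space *mapping;

	/* Strip the tag bits; a tagged pointer is not a plain file mapping. */
	mapping = (struct address_space *)
		((uintptr_t)page->mapping & ~PAGE_MAPPING_FLAGS);

	/*
	 * Without the !mapping test, a page whose mapping was already
	 * cleared would fall through to the a_ops load below and read
	 * from a small offset off NULL.
	 */
	if (!mapping || mapping != page->mapping)
		return false;

	return mapping->a_ops == &secretmem_aops;
}
```

With this model, a page whose mapping is NULL is reported as not
secretmem instead of being dereferenced, while pages with a real mapping
behave as before.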
On Wed, Sep 29, 2021 at 03:21:09PM +0000, Sean Christopherson wrote:
> On Tue, Sep 28, 2021, Stephen wrote:
> > Hello,
> >
> > I got this crash again on 5.14.7 in the early morning of the 27th.
> > Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
> > minutes.
>
> ...
>
> > BUG: kernel NULL pointer dereference, address: 0000000000000068
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0
> > Oops: 0000 [#1] SMP NOPTI
> > CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G E 5.14.7 #32
> > Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
> > AORUS ELITE WIFI, BIOS F35 07/08/2021
> > RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
> > Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
> > 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
> > 68 a0 a3 >
>
> I haven't reproduced the crash, but the code signature (CMP against an absolute
> address) is quite distinct, and is consistent across all three crashes. I'm pretty
> sure the issue is that page_is_secretmem() doesn't check for a null page->mapping,
> e.g. if the page is truncated, which IIUC can happen in parallel since gup() doesn't
> hold the lock.
>
> I think this should fix the problems?
>
> diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> index 21c3771e6a56..988528b5da43 100644
> --- a/include/linux/secretmem.h
> +++ b/include/linux/secretmem.h
> @@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
> mapping = (struct address_space *)
> ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
>
> - if (mapping != page->mapping)
> + if (!mapping || mapping != page->mapping)
I'll roll this out on my vm host and try to re-run the mass fuzztest
overnight, though IT claims they're going to kill power to the whole
datacenter until Monday(!)...
--D
> return false;
>
> return mapping->a_ops == &secretmem_aops;
> I think this should fix the problems?
>
> diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> index 21c3771e6a56..988528b5da43 100644
> --- a/include/linux/secretmem.h
> +++ b/include/linux/secretmem.h
> @@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
> mapping = (struct address_space *)
> ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
>
> - if (mapping != page->mapping)
> + if (!mapping || mapping != page->mapping)
> return false;
>
> return mapping->a_ops == &secretmem_aops;
I have validated that my system was stable after several days on
v5.13.19. I'm now booted into a v5.14.8 kernel with this patch, and I'll
try to report back if I see a crash; or in roughly a week if the system
seems to have stabilized.
Thanks,
Stephen
On Thu, Sep 30, 2021 at 10:59:57AM -0700, Darrick J. Wong wrote:
> On Wed, Sep 29, 2021 at 03:21:09PM +0000, Sean Christopherson wrote:
> > On Tue, Sep 28, 2021, Stephen wrote:
> > > Hello,
> > >
> > > I got this crash again on 5.14.7 in the early morning of the 27th.
> > > Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
> > > minutes.
> >
> > ...
> >
> > > BUG: kernel NULL pointer dereference, address: 0000000000000068
> > > #PF: supervisor read access in kernel mode
> > > #PF: error_code(0x0000) - not-present page
> > > PGD 0 P4D 0
> > > Oops: 0000 [#1] SMP NOPTI
> > > CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G E 5.14.7 #32
> > > Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
> > > AORUS ELITE WIFI, BIOS F35 07/08/2021
> > > RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
> > > Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
> > > 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
> > > 68 a0 a3 >
> >
> > I haven't reproduced the crash, but the code signature (CMP against an absolute
> > address) is quite distinct, and is consistent across all three crashes. I'm pretty
> > sure the issue is that page_is_secretmem() doesn't check for a null page->mapping,
> > e.g. if the page is truncated, which IIUC can happen in parallel since gup() doesn't
> > hold the lock.
> >
> > I think this should fix the problems?
> >
> > diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> > index 21c3771e6a56..988528b5da43 100644
> > --- a/include/linux/secretmem.h
> > +++ b/include/linux/secretmem.h
> > @@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
> > mapping = (struct address_space *)
> > ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
> >
> > - if (mapping != page->mapping)
> > + if (!mapping || mapping != page->mapping)
>
> I'll roll this out on my vm host and try to re-run the mass fuzztest
> overnight, though IT claims they're going to kill power to the whole
> datacenter until Monday(!)...
...which they did, 30 minutes after I sent this email. :(
I'll hopefully be able to report back to the list in a day or two.
--D
>
> --D
>
> > return false;
> >
> > return mapping->a_ops == &secretmem_aops;
On Mon, Oct 04, 2021 at 09:54:32AM -0700, Darrick J. Wong wrote:
> On Thu, Sep 30, 2021 at 10:59:57AM -0700, Darrick J. Wong wrote:
> > On Wed, Sep 29, 2021 at 03:21:09PM +0000, Sean Christopherson wrote:
> > > On Tue, Sep 28, 2021, Stephen wrote:
> > > > Hello,
> > > >
> > > > I got this crash again on 5.14.7 in the early morning of the 27th.
> > > > Things hung up shortly after I'd gone to bed. Uptime was 1 day 9 hours 9
> > > > minutes.
> > >
> > > ...
> > >
> > > > BUG: kernel NULL pointer dereference, address: 0000000000000068
> > > > #PF: supervisor read access in kernel mode
> > > > #PF: error_code(0x0000) - not-present page
> > > > PGD 0 P4D 0
> > > > Oops: 0000 [#1] SMP NOPTI
> > > > CPU: 21 PID: 8494 Comm: CPU 7/KVM Tainted: G E 5.14.7 #32
> > > > Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE WIFI/X570
> > > > AORUS ELITE WIFI, BIOS F35 07/08/2021
> > > > RIP: 0010:internal_get_user_pages_fast+0x738/0xda0
> > > > Code: 84 24 a0 00 00 00 65 48 2b 04 25 28 00 00 00 0f 85 54 06 00 00 48
> > > > 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 78
> > > > 68 a0 a3 >
> > >
> > > I haven't reproduced the crash, but the code signature (CMP against an absolute
> > > address) is quite distinct, and is consistent across all three crashes. I'm pretty
> > > sure the issue is that page_is_secretmem() doesn't check for a null page->mapping,
> > > e.g. if the page is truncated, which IIUC can happen in parallel since gup() doesn't
> > > hold the lock.
> > >
> > > I think this should fix the problems?
> > >
> > > diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> > > index 21c3771e6a56..988528b5da43 100644
> > > --- a/include/linux/secretmem.h
> > > +++ b/include/linux/secretmem.h
> > > @@ -23,7 +23,7 @@ static inline bool page_is_secretmem(struct page *page)
> > > mapping = (struct address_space *)
> > > ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
> > >
> > > - if (mapping != page->mapping)
> > > + if (!mapping || mapping != page->mapping)
> >
> > I'll roll this out on my vm host and try to re-run the mass fuzztest
> > overnight, though IT claims they're going to kill power to the whole
> > datacenter until Monday(!)...
>
> ...which they did, 30 minutes after I sent this email. :(
>
> I'll hopefully be able to report back to the list in a day or two.
Looks like everything went smoothly with the mass fuzz fstesting.
I'll let you know if I see any further failures, but for now:
Tested-by: Darrick J. Wong <[email protected]>
--D
> --D
>
> >
> > --D
> >
> > > return false;
> > >
> > > return mapping->a_ops == &secretmem_aops;
> I'll try to report back if I see a crash; or in roughly a week if the
> system seems to have stabilized.
Just wanted to provide a follow-up here and say that I've run on both
v5.14.8 and v5.14.9 with this patch and everything seems to be good; no
further crashes or problems.
Thank you,
Stephen
Hi,
On Sat, Oct 09, 2021 at 12:00:39PM -0700, Stephen wrote:
> > I'll try to report back if I see a crash; or in roughly a week if the
> > system seems to have stabilized.
>
> Just wanted to provide a follow-up here and say that I've run on both
> v5.14.8 and v5.14.9 with this patch and everything seems to be good; no
> further crashes or problems.
In Debian we got a report as well related to this issue (cf.
https://bugs.debian.org/996175). Do you know whether the patch fell
through the cracks?
Regards,
Salvatore
On 13/10/21 21:00, Salvatore Bonaccorso wrote:
> Hi,
>
> On Sat, Oct 09, 2021 at 12:00:39PM -0700, Stephen wrote:
>>> I'll try to report back if I see a crash; or in roughly a week if the
>>> system seems to have stabilized.
>>
>> Just wanted to provide a follow-up here and say that I've run on both
>> v5.14.8 and v5.14.9 with this patch and everything seems to be good; no
>> further crashes or problems.
>
> In Debian we got a report as well related to this issue (cf.
> https://bugs.debian.org/996175). Do you know whether the patch fell
> through the cracks?
Yeah, it's not a KVM patch so the mm maintainers didn't see it. I'll
handle it tomorrow.
Paolo
On Wed, Oct 13, 2021, Paolo Bonzini wrote:
> On 13/10/21 21:00, Salvatore Bonaccorso wrote:
> > Hi,
> >
> > On Sat, Oct 09, 2021 at 12:00:39PM -0700, Stephen wrote:
> > > > I'll try to report back if I see a crash; or in roughly a week if the
> > > > system seems to have stabilized.
> > >
> > > Just wanted to provide a follow-up here and say that I've run on both
> > > v5.14.8 and v5.14.9 with this patch and everything seems to be good; no
> > > further crashes or problems.
> >
> > In Debian we got a report as well related to this issue (cf.
> > https://bugs.debian.org/996175). Do you know whether the patch fell
> > through the cracks?
>
> Yeah, it's not a KVM patch so the mm maintainers didn't see it. I'll handle
> it tomorrow.
It's queued in the -mm tree.
https://lore.kernel.org/mm-commits/20211010224759.Ny1hd1WiD%[email protected]/
Hi,
On Wed, Oct 13, 2021 at 07:29:20PM +0000, Sean Christopherson wrote:
> On Wed, Oct 13, 2021, Paolo Bonzini wrote:
> > On 13/10/21 21:00, Salvatore Bonaccorso wrote:
> > > Hi,
> > >
> > > On Sat, Oct 09, 2021 at 12:00:39PM -0700, Stephen wrote:
> > > > > I'll try to report back if I see a crash; or in roughly a week if the
> > > > > system seems to have stabilized.
> > > >
> > > > Just wanted to provide a follow-up here and say that I've run on both
> > > > v5.14.8 and v5.14.9 with this patch and everything seems to be good; no
> > > > further crashes or problems.
> > >
> > > In Debian we got a report as well related to this issue (cf.
> > > https://bugs.debian.org/996175). Do you know whether the patch fell
> > > through the cracks?
> >
> > Yeah, it's not a KVM patch so the mm maintainers didn't see it. I'll handle
> > it tomorrow.
>
> It's queued in the -mm tree.
>
> https://lore.kernel.org/mm-commits/20211010224759.Ny1hd1WiD%[email protected]/
Sean and Paolo, thank you, I missed the above.
Regards,
Salvatore