2024-05-18 12:53:43

by Chris Rankin

Subject: [BUG] Linux 6.8.10 NPE

Hi,

I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
dmesg log. I have no idea what triggered it, especially since I
have not even mounted any NFS filesystems?!

Cheers,
Chris

[ 9114.607417] BUG: kernel NULL pointer dereference, address: 0000000000000068
[ 9114.613082] #PF: supervisor read access in kernel mode
[ 9114.616929] #PF: error_code(0x0000) - not-present page
[ 9114.620775] PGD 0 P4D 0
[ 9114.622013] Oops: 0000 [#16] PREEMPT SMP PTI
[ 9114.624987] CPU: 2 PID: 16501 Comm: sadc Tainted: G D I 6.8.10 #1
[ 9114.630993] Hardware name: Gigabyte Technology Co., Ltd. EX58-UD3R/EX58-UD3R, BIOS FB 05/04/2009
[ 9114.638561] RIP: 0010:nfsd_show+0x39/0x18e [nfsd]
[ 9114.642026] Code: fb 48 83 ec 10 48 8b 47 70 8b 2d 34 9b 03 00 48 8b 80 b0 00 00 00 4c 8b a0 60 02 00 00 e8 99 84 f2 df 49 8b 84 24 f8 0b 00 00 <48> 8b 2c e8 e8 6d c0 f2 df 48 8d bd 00 03 00 00 e8 8f ff ff ff 48
[ 9114.659472] RSP: 0018:ffffc9000b1afcf8 EFLAGS: 00010202
[ 9114.663405] RAX: 0000000000000000 RBX: ffff88810cd5de80 RCX: 0000000000001000
[ 9114.669239] RDX: ffff88813f45b900 RSI: 0000000000000001 RDI: ffff88810cd5de80
[ 9114.675069] RBP: 000000000000000d R08: 0000000000400cc0 R09: 00000000ffffffff
[ 9114.680905] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa11e51e0
[ 9114.686737] R13: ffffc9000b1afef8 R14: ffff88810cd5de80 R15: ffffffff81a2a520
[ 9114.692586] FS: 00007f3637e49740(0000) GS:ffff888343c80000(0000) knlGS:0000000000000000
[ 9114.699370] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9114.703809] CR2: 0000000000000068 CR3: 000000027a18c000 CR4: 00000000000006f0
[ 9114.709642] Call Trace:
[ 9114.710798] <TASK>
[ 9114.711602] ? __die_body+0x1a/0x5c
[ 9114.713802] ? page_fault_oops+0x321/0x36e
[ 9114.716636] ? exc_page_fault+0x105/0x117
[ 9114.719348] ? asm_exc_page_fault+0x22/0x30
[ 9114.722237] ? nfsd_show+0x39/0x18e [nfsd]
[ 9114.725103] ? nfsd_show+0x31/0x18e [nfsd]
[ 9114.727954] seq_read_iter+0x171/0x353
[ 9114.730410] seq_read+0xe0/0x108
[ 9114.732350] ? startup_64+0x1/0x60
[ 9114.734458] proc_reg_read+0x8c/0xa7
[ 9114.736746] vfs_read+0xa6/0x1bf
[ 9114.738685] ? __do_sys_newfstat+0x34/0x5c
[ 9114.741486] ksys_read+0x74/0xc0
[ 9114.743418] do_syscall_64+0x6c/0xdc
[ 9114.745695] entry_SYSCALL_64_after_hwframe+0x60/0x68
[ 9114.749449] RIP: 0033:0x7f3638039cc1
[ 9114.751752] Code: 00 48 8b 15 59 81 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 b0 aa 01 00 f3 0f 1e fa 80 3d 85 03 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec
[ 9114.769198] RSP: 002b:00007ffec523c288 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 9114.775499] RAX: ffffffffffffffda RBX: 000055619e0982e0 RCX: 00007f3638039cc1
[ 9114.781332] RDX: 0000000000000400 RSI: 000055619e0a3b70 RDI: 0000000000000004
[ 9114.787166] RBP: 00007ffec523c2c0 R08: 0000000000000001 R09: 0000000000000000
[ 9114.792997] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3638111050
[ 9114.798830] R13: 00007f3638110f00 R14: 0000000000000000 R15: 000055619e0982e0
[ 9114.804668] </TASK>
[ 9114.805559] Modules linked in: udf usb_storage snd_seq_dummy
rpcrdma rdma_cm iw_cm ib_cm ib_core nf_nat_ftp nf_conntrack_ftp
cfg80211 af_packet nf_conntrack_netbios_ns nf_conntrack_broadcast
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat
nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle
ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw
iptable_security ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter ip_tables x_tables it87 hwmon_vid bnep binfmt_misc
snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi
snd_hda_intel uvcvideo btusb uvc btintel btbcm videobuf2_vmalloc
snd_intel_dspcfg videobuf2_memops bluetooth videobuf2_v4l2
snd_hda_codec videodev snd_usb_audio intel_powerclamp coretemp
snd_virtuoso snd_hda_core kvm_intel snd_oxygen_lib videobuf2_common
ecdh_generic snd_usbmidi_lib snd_mpu401_uart snd_hwdep mc snd_rawmidi
[ 9114.805644] input_leds joydev led_class snd_seq rfkill kvm ecc
snd_seq_device gpio_ich pktcdvd snd_pcm iTCO_wdt r8169 irqbypass
i2c_i801 snd_hrtimer realtek snd_timer mdio_devres intel_cstate snd
libphy acpi_cpufreq i2c_smbus pcspkr intel_uncore psmouse lpc_ich
i7core_edac mxm_wmi soundcore tiny_power_button button nfsd
auth_rpcgss nfs_acl lockd grace sunrpc dm_mod loop fuse dax configfs
nfnetlink zram zsmalloc ext4 crc32c_generic crc16 mbcache jbd2 amdgpu
video amdxcp i2c_algo_bit mfd_core drm_ttm_helper ttm hid_microsoft
drm_exec gpu_sched drm_suballoc_helper drm_buddy drm_display_helper
sr_mod usbhid sd_mod cdrom drm_kms_helper ahci pata_jmicron libahci
drm uhci_hcd libata ehci_pci ehci_hcd xhci_pci firewire_ohci xhci_hcd
firewire_core scsi_mod crc32c_intel sha512_ssse3 usbcore
drm_panel_orientation_quirks sha256_ssse3 cec serio_raw sha1_ssse3
rc_core crc_itu_t bsg usb_common scsi_common wmi msr [last unloaded:
sg]
[ 9114.974386] CR2: 0000000000000068
[ 9114.976469] ---[ end trace 0000000000000000 ]---
[ 9114.979859] RIP: 0010:nfsd_show+0x39/0x18e [nfsd]
[ 9114.983413] Code: fb 48 83 ec 10 48 8b 47 70 8b 2d 34 9b 03 00 48 8b 80 b0 00 00 00 4c 8b a0 60 02 00 00 e8 99 84 f2 df 49 8b 84 24 f8 0b 00 00 <48> 8b 2c e8 e8 6d c0 f2 df 48 8d bd 00 03 00 00 e8 8f ff ff ff 48
[ 9115.000909] RSP: 0018:ffffc90002edfcf8 EFLAGS: 00010202
[ 9115.004850] RAX: 0000000000000000 RBX: ffff88813fee9780 RCX: 0000000000001000
[ 9115.010725] RDX: ffff88810b474740 RSI: 0000000000000001 RDI: ffff88813fee9780
[ 9115.016621] RBP: 000000000000000d R08: 0000000000400cc0 R09: 00000000ffffffff
[ 9115.022507] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa11e51e0
[ 9115.028414] R13: ffffc90002edfef8 R14: ffff88813fee9780 R15: ffffffff81a2a520
[ 9115.034338] FS: 00007f3637e49740(0000) GS:ffff888343c00000(0000) knlGS:0000000000000000
[ 9115.041191] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9115.045739] CR2: 00007f7171a17ef0 CR3: 000000027a18c000 CR4: 00000000000006f0
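
As a rough cross-check of the register dump above, here is a minimal Python sketch. It assumes that the marked bytes `<48> 8b 2c e8` decode as `mov (%rax,%rbp,8),%rbp`; that is a reading of the opcode bytes, not something stated in the report. The helper names `parse_rip` and `effective_address` are made up for illustration.

```python
import re

# Values copied from the oops above.
RIP_LINE = "RIP: 0010:nfsd_show+0x39/0x18e [nfsd]"
REGS = {"RAX": 0x0000000000000000, "RBP": 0x000000000000000D}
CR2 = 0x0000000000000068


def parse_rip(line):
    """Split an oops RIP line into symbol, offset, function size, and module."""
    pattern = r"RIP: [0-9a-f]{4}:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a kernel RIP line")
    symbol, offset, size, module = match.groups()
    return symbol, int(offset, 16), int(size, 16), module


def effective_address(base, index, scale, disp=0):
    """x86 base + index * scale + displacement, truncated to 64 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


symbol, offset, size, module = parse_rip(RIP_LINE)

# Assumed decode of the faulting instruction: base = RAX, index = RBP, scale = 8.
fault = effective_address(REGS["RAX"], REGS["RBP"], 8)

print(f"{module}:{symbol} at byte {offset} of {size}, fault address {fault:#x}")
print("matches CR2" if fault == CR2 else "does not match CR2")
```

With RAX = 0 and RBP = 0xd the computed address equals the reported CR2, which would be consistent with a NULL base pointer being indexed at element 13 of an array of 8-byte entries.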


2024-05-19 11:29:34

by Paul Grandperrin

Subject: Re: [BUG] Linux 6.8.10 NPE

> I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
> dmesg log. I have no idea what triggered it, especially since I
> have not even mounted any NFS filesystems?!

Hi all,
I have the exact same bug. I'm using the NixOS kernel but as soon as it
was updated to 6.8.10 my server has gone into a crash-reboot loop.

The server is hosting an NFS daemon and it crashes about 10 seconds
after the tty login prompt is displayed.

Downgrading to 6.8.9 fixes the issue.

Regards,
Paul Grandperrin



2024-05-22 15:11:46

by Greg Kroah-Hartman

Subject: Re: [BUG] Linux 6.8.10 NPE

On Sun, May 19, 2024 at 01:28:58PM +0200, Paul Grandperrin wrote:
> > I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
> > dmesg log. I have no idea what triggered it, especially since I
> > have not even mounted any NFS filesystems?!
>
> Hi all,
> I have the exact same bug. I'm using the NixOS kernel but as soon as it was
> updated to 6.8.10 my server has gone into a crash-reboot loop.
>
> The server is hosting an NFS daemon and it crashes about 10 seconds after
> the tty login prompt is displayed.
>
> Downgrading to 6.8.9 fixes the issue.

Any chance you all can use 'git bisect' to find the offending commit?

thanks,

greg k-h

2024-05-22 15:12:18

by Greg Kroah-Hartman

Subject: Re: [BUG] Linux 6.8.10 NPE

On Sun, May 19, 2024 at 01:28:58PM +0200, Paul Grandperrin wrote:
> > I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
> > dmesg log. I have no idea what triggered it, especially since I
> > have not even mounted any NFS filesystems?!
>
> Hi all,
> I have the exact same bug. I'm using the NixOS kernel but as soon as it was
> updated to 6.8.10 my server has gone into a crash-reboot loop.
>
> The server is hosting an NFS daemon and it crashes about 10 seconds after
> the tty login prompt is displayed.
>
> Downgrading to 6.8.9 fixes the issue.

Any chance you all can use 'git bisect' to track down the offending
commit?

thanks,

greg k-h
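
For anyone unfamiliar with what 'git bisect' does, the search can be sketched as a binary search over an ordered list of commits. This is a simplified illustration only: it assumes a linear history, and the commit names, the count of 300 commits, and the culprit position are invented, not taken from the real 6.8.9..6.8.10 range. The function name `bisect_first_bad` is likewise hypothetical.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in an ordered list.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_builds_tested).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # each iteration stands for one build-boot-test cycle
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history: "c0" stands in for v6.8.9, "c300" for v6.8.10.
commits = [f"c{i}" for i in range(301)]
culprit_index = 137  # arbitrary position chosen for the demonstration


def crashes(commit):
    """Stand-in for 'boot this kernel and see whether the oops appears'."""
    return int(commit[1:]) >= culprit_index


first_bad, tests = bisect_first_bad(commits, crashes)
print(f"first bad commit: {first_bad}, found after {tests} tests")
```

The point of the sketch is that the number of kernels to build and boot grows with the logarithm of the commit count, so even a few hundred commits need fewer than ten test cycles.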

2024-05-24 11:28:24

by Thorsten Leemhuis

Subject: Re: [BUG] Linux 6.8.10 NPE

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 22.05.24 17:12, Greg KH wrote:
> On Sun, May 19, 2024 at 01:28:58PM +0200, Paul Grandperrin wrote:
>>> I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
>>> dmesg log. I have no idea what triggered it, especially since I
>>> have not even mounted any NFS filesystems?!
>>
>> Hi all,
>> I have the exact same bug. I'm using the NixOS kernel but as soon as it was
>> updated to 6.8.10 my server has gone into a crash-reboot loop.
>>
>> The server is hosting an NFS daemon and it crashes about 10 seconds after
>> the tty login prompt is displayed.
>>
>> Downgrading to 6.8.9 fixes the issue.
>
> Any chance you all can use 'git bisect' to track down the offending
> commit?

Paul, any progress on this?

BTW, there is also a report about an NFS-related general protection
fault, see this thread:

https://lore.kernel.org/all/CAK8fFZ7rbh5o9XG1D5KAPSRyES-8W8AphxsLJXOWUFZK49i8fA@mail.gmail.com/

It was bisected to 4b14885411f74b ("nfsd: make all of the nfsd stats
per-network namespace") [v6.9-rc1, v6.8.10 (abf5fb593c90d3)]

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.