Greetings,
FYI, we noticed the following commit (built with gcc-11):
commit: 8de21cdda7041f63e56d336fec2012a6d1606eea ("SQUASH: nfsd: ensure we fill in pre-op-attrs in nfsd4_create_file")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git nfsd-deleg-race
in testcase: fsmark
version: fsmark-x86_64-698ee57-1_20220517
with the following parameters:
iterations: 1x
nr_threads: 1t
disk: 1HDD
fs: ext4
fs2: nfsv4
filesize: 4K
test_size: 40M
sync_method: fsyncBeforeClose
nr_files_per_directory: 1fpd
cpufreq_governor: performance
ucode: 0xd000331
test-description: fsmark is a file system benchmark that tests synchronous write workloads, for example a mail server workload.
test-url: https://sourceforge.net/projects/fsmark/
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
[ 52.105062][ T3881] ------------[ cut here ]------------
[ 52.110625][ T3881] kernel BUG at fs/nfsd/xdr4.h:752!
[ 52.115923][ T3881] invalid opcode: 0000 [#1] SMP NOPTI
[ 52.121380][ T3881] CPU: 26 PID: 3881 Comm: nfsd Tainted: G S 5.19.0-rc5-00050-g8de21cdda704 #1
[ 52.131597][ T3881] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
[ 52.143520][ T3881] RIP: 0010:do_open_lookup (fs/nfsd/xdr4.h:752 fs/nfsd/nfs4proc.c:505) nfsd
[ 52.149690][ T3881] Code: 8b 75 00 e9 0a ff ff ff 8b 16 48 8b bb 08 01 00 00 48 83 c6 04 89 57 58 48 83 c7 5c e8 14 91 b9 c1 48 8b 75 00 e9 f7 fe ff ff <0f> 0b b8 00 00 27 18 e9 7d fd ff ff 66 66 2e 0f 1f 84 00 00 00 00
All code
========
0: 8b 75 00 mov 0x0(%rbp),%esi
3: e9 0a ff ff ff jmpq 0xffffffffffffff12
8: 8b 16 mov (%rsi),%edx
a: 48 8b bb 08 01 00 00 mov 0x108(%rbx),%rdi
11: 48 83 c6 04 add $0x4,%rsi
15: 89 57 58 mov %edx,0x58(%rdi)
18: 48 83 c7 5c add $0x5c,%rdi
1c: e8 14 91 b9 c1 callq 0xffffffffc1b99135
21: 48 8b 75 00 mov 0x0(%rbp),%rsi
25: e9 f7 fe ff ff jmpq 0xffffffffffffff21
2a:* 0f 0b ud2 <-- trapping instruction
2c: b8 00 00 27 18 mov $0x18270000,%eax
31: e9 7d fd ff ff jmpq 0xfffffffffffffdb3
36: 66 data16
37: 66 data16
38: 2e cs
39: 0f .byte 0xf
3a: 1f (bad)
3b: 84 00 test %al,(%rax)
3d: 00 00 add %al,(%rax)
...
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: b8 00 00 27 18 mov $0x18270000,%eax
7: e9 7d fd ff ff jmpq 0xfffffffffffffd89
c: 66 data16
d: 66 data16
e: 2e cs
f: 0f .byte 0xf
10: 1f (bad)
11: 84 00 test %al,(%rax)
13: 00 00 add %al,(%rax)
...
[ 52.169619][ T3881] RSP: 0018:ffa000000c4cbda0 EFLAGS: 00010246
[ 52.175783][ T3881] RAX: 0000000000000000 RBX: ff110001d7b98900 RCX: 0000000000000000
[ 52.183857][ T3881] RDX: ff1100018a9bce8c RSI: ff1100018a9bce00 RDI: ff110001107a16a0
[ 52.191935][ T3881] RBP: ffa000000c4cbde0 R08: ff11000130c6d140 R09: ff110001b7ffba68
[ 52.200012][ T3881] R10: ff110001a8f08800 R11: 0000000001ccd554 R12: ff110001d7ba0030
[ 52.208083][ T3881] R13: ff110001e5070000 R14: ff1100014dd58000 R15: ffffffff837353c0
[ 52.216149][ T3881] FS: 0000000000000000(0000) GS:ff1100103f880000(0000) knlGS:0000000000000000
[ 52.225175][ T3881] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 52.231863][ T3881] CR2: 00005652554130a8 CR3: 000000207ec0a001 CR4: 0000000000771ee0
[ 52.239936][ T3881] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 52.248011][ T3881] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 52.256088][ T3881] PKRU: 55555554
[ 52.259740][ T3881] Call Trace:
[ 52.263127][ T3881] <TASK>
[ 52.266162][ T3881] nfsd4_open (fs/nfsd/nfs4proc.c:628) nfsd
[ 52.271202][ T3881] nfsd4_proc_compound (fs/nfsd/nfs4proc.c:2764) nfsd
[ 52.277017][ T3881] nfsd_dispatch (fs/nfsd/nfssvc.c:1056) nfsd
[ 52.282310][ T3881] svc_process_common (net/sunrpc/svc.c:1339)
[ 52.287414][ T3881] ? nfsd_svc (fs/nfsd/nfssvc.c:1028) nfsd
[ 52.292439][ T3881] ? nfsd_shutdown_threads (fs/nfsd/nfssvc.c:932) nfsd
[ 52.298418][ T3881] svc_process (net/sunrpc/svc.c:1470)
[ 52.302828][ T3881] nfsd (fs/nfsd/nfssvc.c:979) nfsd
[ 52.307246][ T3881] kthread (kernel/kthread.c:376)
[ 52.311309][ T3881] ? kthread_complete_and_exit (kernel/kthread.c:331)
[ 52.317021][ T3881] ret_from_fork (arch/x86/entry/entry_64.S:302)
[ 52.321514][ T3881] </TASK>
[ 52.324614][ T3881] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss dm_mod xfs libcrc32c sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg intel_rapl_msr intel_rapl_common ipmi_ssif ast drm_vram_helper drm_ttm_helper x86_pkg_temp_thermal intel_powerclamp ttm ahci coretemp drm_kms_helper crct10dif_pclmul crc32_pclmul libahci syscopyarea crc32c_intel ghash_clmulni_intel rapl intel_cstate acpi_ipmi sysfillrect mei_me sysimgblt ipmi_si intel_uncore ioatdma libata joydev mei fb_sys_fops intel_pch_thermal ipmi_devintf dca wmi ipmi_msghandler acpi_pad acpi_power_meter drm fuse ip_tables
[ 52.380018][ T3881] ---[ end trace 0000000000000000 ]---
[ 52.394666][ T3881] RIP: 0010:do_open_lookup (fs/nfsd/xdr4.h:752 fs/nfsd/nfs4proc.c:505) nfsd
[ 52.400815][ T3881] Code: 8b 75 00 e9 0a ff ff ff 8b 16 48 8b bb 08 01 00 00 48 83 c6 04 89 57 58 48 83 c7 5c e8 14 91 b9 c1 48 8b 75 00 e9 f7 fe ff ff <0f> 0b b8 00 00 27 18 e9 7d fd ff ff 66 66 2e 0f 1f 84 00 00 00 00
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached to this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
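When triaging the oops above, it can help to confirm which byte in the "Code:" line is the trapping one. In that line the byte at the faulting RIP is wrapped in angle brackets (<0f> here), and its position matches the "2a:*" row of the decoded listing. The following is a minimal Python sketch of that lookup, not the tool used to produce this report; the helper name trapping_offset is hypothetical, and it assumes the line has the "Code: xx xx <xx> xx" shape shown in this report.

```python
import re
from typing import Tuple


def trapping_offset(code_line: str) -> Tuple[int, bytes]:
    """Return (offset of the <xx>-marked byte, all code bytes) for an oops 'Code:' line."""
    # Keep only the text after "Code:" so timestamp and task-ID fields are ignored.
    _, sep, hex_part = code_line.partition("Code:")
    if not sep:
        raise ValueError("not a 'Code:' line")

    offset = None
    raw = bytearray()
    for index, token in enumerate(hex_part.split()):
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if marked:
            # The bracketed byte is the one at the faulting instruction pointer.
            offset = index
            token = marked.group(1)
        raw.append(int(token, 16))

    if offset is None:
        raise ValueError("no <xx> marker found")
    return offset, bytes(raw)


# The 'Code:' line copied from the report above.
line = (
    "[ 52.149690][ T3881] Code: 8b 75 00 e9 0a ff ff ff 8b 16 48 8b bb 08 01 00 00 "
    "48 83 c6 04 89 57 58 48 83 c7 5c e8 14 91 b9 c1 48 8b 75 00 e9 f7 fe ff ff "
    "<0f> 0b b8 00 00 27 18 e9 7d fd ff ff 66 66 2e 0f 1f 84 00 00 00 00"
)

offset, raw = trapping_offset(line)
print(hex(offset), raw[offset:offset + 2].hex())  # prints "0x2a 0f0b"
```

The two bytes 0f 0b at that offset are the ud2 instruction shown as the trapping instruction in the listing, which is consistent with the "kernel BUG at fs/nfsd/xdr4.h:752!" message.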
--
0-DAY CI Kernel Test Service
https://01.org/lkp