2022-05-02 08:39:57

by kernel test robot

[permalink] [raw]
Subject: [net] 02dcea6024: INFO:task_blocked_for_more_than#seconds


(please be noted we reported
"[net] 02dcea6024: netperf.Throughput_tps -9.1% regression"
on https://lore.kernel.org/all/20220429062049.GA8992@xsang-OptiPlex-9020/
yesterday, but now we found a func issue on fsmark tests.
FYI)

Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 02dcea6024d411cda23bd77e5e604f94a32c332d ("net: allow sk_prot->release_cb() without sock lock held")
https://github.com/ammarfaizi2/linux-block axboe/linux-block/sock-nolock.2

in testcase: fsmark
version: fsmark-x86_64-698ee57-1_20220410
with following parameters:

iterations: 1x
nr_threads: 1t
disk: 1HDD
fs: xfs
fs2: nfsv4
filesize: 4K
test_size: 50M
sync_method: fsyncBeforeClose
nr_files_per_directory: 1fpd
cpufreq_governor: performance
ucode: 0xd000331

test-description: The fsmark is a file system benchmark to test synchronous write workloads, for example, mail servers workload.
test-url: https://sourceforge.net/projects/fsmark/


on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 991.054690][ T722] INFO: task nfsd:4136 blocked for more than 491 seconds.
[ 991.061875][ T722] Not tainted 5.18.0-rc4-00072-g02dcea6024d4 #2
[ 991.068688][ T722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 991.077381][ T722] task:nfsd state:D stack: 0 pid: 4136 ppid: 2 flags:0x00004000
[ 991.086602][ T722] Call Trace:
[ 991.089925][ T722] <TASK>
[ 991.092899][ T722] __schedule (kbuild/src/x86_64/kernel/sched/core.c:5073 kbuild/src/x86_64/kernel/sched/core.c:6388)
[ 991.097265][ T722] schedule (kbuild/src/x86_64/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/x86_64/kernel/sched/core.c:6461 (discriminator 1))
[ 991.101278][ T722] __lock_sock (kbuild/src/x86_64/include/linux/spinlock.h:354 kbuild/src/x86_64/net/core/sock.c:2771)
[ 991.105553][ T722] ? prepare_to_wait_exclusive (kbuild/src/x86_64/kernel/sched/wait.c:414)
[ 991.111222][ T722] lock_sock_nested (kbuild/src/x86_64/net/core/sock.c:3317)
[ 991.115944][ T722] tcp_recvmsg (kbuild/src/x86_64/net/ipv4/tcp.c:2574)
[ 991.120310][ T722] inet6_recvmsg (kbuild/src/x86_64/net/ipv6/af_inet6.c:674 (discriminator 8))
[ 991.124854][ T722] svc_tcp_read_marker+0xa6/0x180
[ 991.130949][ T722] svc_tcp_recvfrom (kbuild/src/x86_64/net/sunrpc/svcsock.c:994)
[ 991.135746][ T722] ? del_timer_sync (kbuild/src/x86_64/kernel/time/timer.c:1384)
[ 991.140456][ T722] ? schedule_timeout (kbuild/src/x86_64/kernel/time/timer.c:1890)
[ 991.145422][ T722] svc_handle_xprt (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:846)
[ 991.150212][ T722] svc_recv (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:884)
[ 991.154311][ T722] ? nfsd_shutdown_threads (kbuild/src/x86_64/fs/nfsd/nfssvc.c:932) nfsd
[ 991.160251][ T722] nfsd (kbuild/src/x86_64/fs/nfsd/nfssvc.c:973 (discriminator 1)) nfsd
[ 991.164629][ T722] kthread (kbuild/src/x86_64/kernel/kthread.c:376)
[ 991.168651][ T722] ? kthread_complete_and_exit (kbuild/src/x86_64/kernel/kthread.c:331)
[ 991.174316][ T722] ret_from_fork (kbuild/src/x86_64/arch/x86/entry/entry_64.S:298)
[ 991.178782][ T722] </TASK>
[ 1482.574672][ T722] INFO: task nfsd:4136 blocked for more than 983 seconds.
[ 1482.581894][ T722] Not tainted 5.18.0-rc4-00072-g02dcea6024d4 #2
[ 1482.588737][ T722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1482.597456][ T722] task:nfsd state:D stack: 0 pid: 4136 ppid: 2 flags:0x00004000
[ 1482.606714][ T722] Call Trace:
[ 1482.610068][ T722] <TASK>
[ 1482.613066][ T722] __schedule (kbuild/src/x86_64/kernel/sched/core.c:5073 kbuild/src/x86_64/kernel/sched/core.c:6388)
[ 1482.617455][ T722] schedule (kbuild/src/x86_64/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/x86_64/kernel/sched/core.c:6461 (discriminator 1))
[ 1482.621488][ T722] __lock_sock (kbuild/src/x86_64/include/linux/spinlock.h:354 kbuild/src/x86_64/net/core/sock.c:2771)
[ 1482.625796][ T722] ? prepare_to_wait_exclusive (kbuild/src/x86_64/kernel/sched/wait.c:414)
[ 1482.631482][ T722] lock_sock_nested (kbuild/src/x86_64/net/core/sock.c:3317)
[ 1482.636203][ T722] tcp_recvmsg (kbuild/src/x86_64/net/ipv4/tcp.c:2574)
[ 1482.640590][ T722] inet6_recvmsg (kbuild/src/x86_64/net/ipv6/af_inet6.c:674 (discriminator 8))
[ 1482.645145][ T722] svc_tcp_read_marker+0xa6/0x180
[ 1482.651256][ T722] svc_tcp_recvfrom (kbuild/src/x86_64/net/sunrpc/svcsock.c:994)
[ 1482.656065][ T722] ? del_timer_sync (kbuild/src/x86_64/kernel/time/timer.c:1384)
[ 1482.660796][ T722] ? schedule_timeout (kbuild/src/x86_64/kernel/time/timer.c:1890)
[ 1482.665790][ T722] svc_handle_xprt (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:846)
[ 1482.670602][ T722] svc_recv (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:884)
[ 1482.674717][ T722] ? nfsd_shutdown_threads (kbuild/src/x86_64/fs/nfsd/nfssvc.c:932) nfsd
[ 1482.680676][ T722] nfsd (kbuild/src/x86_64/fs/nfsd/nfssvc.c:973 (discriminator 1)) nfsd
[ 1482.685049][ T722] kthread (kbuild/src/x86_64/kernel/kthread.c:376)
[ 1482.689067][ T722] ? kthread_complete_and_exit (kbuild/src/x86_64/kernel/kthread.c:331)
[ 1482.694735][ T722] ret_from_fork (kbuild/src/x86_64/arch/x86/entry/entry_64.S:298)
[ 1482.699180][ T722] </TASK>


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (5.95 kB)
config-5.18.0-rc4-00072-g02dcea6024d4 (165.11 kB)
job-script (8.64 kB)
dmesg.xz (34.48 kB)
job.yaml (5.80 kB)
reproduce (912.00 B)
Download all attachments

2022-05-02 20:20:53

by Jens Axboe

[permalink] [raw]
Subject: Re: [net] 02dcea6024: INFO:task_blocked_for_more_than#seconds

On 4/30/22 8:49 AM, kernel test robot wrote:
>
> (please be noted we reported
> "[net] 02dcea6024: netperf.Throughput_tps -9.1% regression"
> on https://lore.kernel.org/all/20220429062049.GA8992@xsang-OptiPlex-9020/
> yesterday, but now we found a func issue on fsmark tests.
> FYI)

Please just stop testing that branch, it's been abandoned.

--
Jens Axboe