Kernel: Linux freedom 5.1.0-rc4 #1 SMP Sun Apr 7 18:50:31 MDT 2019
x86_64 x86_64 x86_64 GNU/Linux
Sending large files to NAS box via NFS failed (It looks like files are
getting transferred, but NFS
gets stuck/won't finish.
Note:
1. NFS works ok with kernel 5.0.0
2. To test NFS, (using the5.1.0-rc4 kernel), I connected to my NAS box
and moved a small log file
(less that 20kB) and it moved from computer to NAS without problem.
I then deleted the small log file from the NAS box, again without
problem.
(connected via NFS).
I then copied a database archive (24929 bytes) from NAS box to computer,
again without fail. But today I moved 3 large tarball/archives from
computer to NAS box (TX: cum: 8.50GB), and it seems
to have moved OK, but NFS does not want to complete the transfer.
I can't get a directory listing from where I started the copy, and
the process queue shows:
root 6176 0.0 0.0 20824 2504 pts/5 D+ 10:28 0:11 cp
site.Apr-05-2019.tgz site.Apr-12-2019.tgz site.Mar-29-2019.tgz
nas/website/BACKUP/2019.04.12
DMESG:
perf: interrupt took too long (2694 > 2500), lowering
kernel.perf_event_max_sample_rate to 74250
[ 1644.507579] perf: interrupt took too long (3369 > 3367), lowering
kernel.perf_event_max_sample_rate to 59250
[ 2349.205399] FS-Cache: Loaded
[ 2349.348842] FS-Cache: Netfs 'nfs' registered for caching
[ 2349.652305] NFS: Registering the id_resolver key type
[ 2349.652316] Key type id_resolver registered
[ 2349.652317] Key type id_legacy registered
[ 3595.680685] nfs: server 192.168.20.114 not responding, still trying
[ 3595.680692] nfs: server 192.168.20.114 not responding, still trying
[ 3595.680695] nfs: server 192.168.20.114 not responding, still trying
[ 3626.400412] INFO: task cp:6176 blocked for more than 120 seconds.
[ 3626.400418] Tainted: G I 5.1.0-rc4 #1
[ 3626.400419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3626.400422] cp D 0 6176 5890 0x00000000
[ 3626.400425] Call Trace:
[ 3626.400434] ? __schedule+0x2a1/0x870
[ 3626.400436] schedule+0x28/0x70
[ 3626.400440] io_schedule+0x12/0x40
[ 3626.400443] wait_on_page_bit_common+0x10a/0x280
[ 3626.400447] ? file_check_and_advance_wb_err+0xc0/0xc0
[ 3626.400450] __filemap_fdatawait_range+0xd8/0x140
[ 3626.400453] filemap_write_and_wait_range+0x3b/0x70
[ 3626.400471] nfs_file_fsync+0x44/0x1f0 [nfs]
[ 3626.400474] filp_close+0x2a/0x70
[ 3626.400475] __x64_sys_close+0x1e/0x50
[ 3626.400477] do_syscall_64+0x4f/0x100
[ 3626.400479] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3626.400481] RIP: 0033:0x7f6be90958d4
[ 3626.400485] Code: Bad RIP value.
[ 3626.400486] RSP: 002b:00007ffe07ec4398 EFLAGS: 00000246 ORIG_RAX:
0000000000000003
[ 3626.400487] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
00007f6be90958d4
[ 3626.400488] RDX: 0000000000020000 RSI: 000055b82b141000 RDI:
0000000000000004
[ 3626.400488] RBP: 00007ffe07ec4770 R08: 0000000000020000 R09:
0000000000000001
[ 3626.400489] R10: 0000000000020000 R11: 0000000000000246 R12:
00007ffe07ec4970
[ 3626.400490] R13: 0000000000000000 R14: 00007ffe07ec4900 R15:
00007ffe07ec52db
[ 3747.231331] INFO: task cp:6176 blocked for more than 241 seconds.
[ 3747.231337] Tainted: G I 5.1.0-rc4 #1
[ 3747.231339] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
According to NTOP, it looks like data was transferred:
TX: cum: 8.50GB peak: 208b rates: 0b 0b 18b
My Hardware:
04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E
Gigabit Ethernet Controller (rev 12)
Subsystem: ASUSTeK Computer Inc. Motherboard
Flags: bus master, fast devsel, latency 0, IRQ 26
Memory at fb9fc000 (64-bit, non-prefetchable) [size=16K]
I/O ports at b800 [size=256]
Expansion ROM at fb9c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: sky2
Kernel modules: sky2
06:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E
Gigabit Ethernet Controller (rev 12)
Subsystem: ASUSTeK Computer Inc. Motherboard
Flags: bus master, fast devsel, latency 0, IRQ 25
Memory at fbbfc000 (64-bit, non-prefetchable) [size=16K]
I/O ports at d800 [size=256]
Expansion ROM at fbbc0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: sky2
Kernel modules: sky2
Drivers:
lsmod | grep nfs yields:
nfsv3 53248 1
nfsv4 692224 0
nfs 315392 3 nfsv4,nfsv3
fscache 380928 2 nfsv4,nfs
nfsd 413696 13
auth_rpcgss 102400 2 nfsd,rpcsec_gss_krb5
nfs_acl 16384 2 nfsd,nfsv3
lockd 106496 3 nfsd,nfsv3,nfs
grace 16384 2 nfsd,lockd
sunrpc 438272 28
nfsd,nfsv4,auth_rpcgss,lockd,nfsv3,rpcsec_gss_krb5,nfs_acl,nfs
I can ftp to the NAS box and see that two of the three files completed:-
rw-r--r-- 1 0 0 5027646357 Apr 12 10:31
site.Apr-05-2019.tgz
-rw-r--r-- 1 0 0 5152017050 Apr 12 10:34
site.Apr-12-2019.tgz
Sorry for the long post, but I wanted to be complete. Perhaps its a
corner case. I'm fairly certain the
NAS machine is using the ext4 filesystem and I don't think its a
filesystem size limit.
Thanks in advance, please email me directly as I am not on the list,
Bob