Hey,
I’m running into an issue where writes via mmap to a FUSE file system are extremely slow.
The mmap test program I’m using is attached.
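In case the attachment gets dropped along the way, here’s roughly what the program does (a simplified sketch with error handling stripped; the attached version is the authoritative one):

/* Simplified sketch of the attached mmap test program (no error handling). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }

    int srcfd = open(argv[1], O_RDONLY);
    struct stat st;
    fstat(srcfd, &st);
    /* Map the source file read-only. */
    void *src = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, srcfd, 0);

    int dstfd = open(argv[2], O_RDWR | O_CREAT | O_TRUNC, 0644);
    /* Grow the destination to the source size so the shared mapping is valid. */
    ftruncate(dstfd, st.st_size);
    void *dst = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, dstfd, 0);

    printf("Mapped src: %p and dst: %p\n", src, dst);

    /* Copy through the mappings; the kernel writes back the dirtied pages. */
    memcpy(dst, src, st.st_size);
    printf("memcpy done\n");

    munmap(dst, st.st_size);
    munmap(src, st.st_size);
    close(dstfd);
    close(srcfd);
    return 0;
}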
The symptom is that the program usually completes in 0.x seconds, but
sometimes takes minutes! E.g.:
% dd if=/dev/urandom of=/tmp/was bs=1M count=99
% ./fusexmp_fh /tmp/mnt
% time ~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1
Mapped src: 0x10000 and dst: 0x21b8b000
memcpy done
~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1 0.06s user 0.20s system 48% cpu 0.544 total
% time ~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1
Mapped src: 0x10000 and dst: 0x471fb000
memcpy done
~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1 0.05s user 0.22s system 0% cpu 2:03.39 total
This affects both an in-house FUSE file system and the fusexmp_fh
example from FUSE 2.9.7 (the same version our in-house FS uses).
While this is happening, the machine is otherwise idle. E.g. dstat shows:
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read writ| recv send| in out | int csw
1 0 98 1 0| 0 0 | 19k 23k| 0 0 | 14k 27k
1 0 98 1 0| 0 0 | 33k 53k| 0 0 | 14k 29k
0 0 98 1 0| 0 176k| 27k 26k| 0 0 | 13k 25k
[…]
Meanwhile, copying the same file with cp(1) is fast (about 1 second);
it’s only mmap-based writing that’s slow.
This is with Linux 5.2.17, but the problem has apparently been going on for years.
I haven’t quite figured out the pattern regarding which machines are
affected. One wild guess is that it might be related to the amount of
RAM: the machine on which I can most frequently reproduce the issue
has 192GB of RAM, whereas I haven’t been able to reproduce it on my
workstation with 64GB.
Any ideas what I could check to further narrow down this issue?
Thanks,