Hello,
I'm trying to set up an NFS server that exports more than 3500 ZFS
datasets. Only NFSv3 is needed.
The OS had a 4.4.172 kernel and nfs-utils 1.30. With these versions,
when a client tried to mount an exported dataset, rpc.mountd spiked to
100% CPU for several minutes, the kernel reported a bug with a trace,
and the client's mount never finished.
I have built a 4.19.56 kernel, libevent 2.1.10, util-linux 2.34 (for
libblkid), and nfs-utils 2.4.1. With this setup rpc.mountd still spikes
to 100% CPU, but at least the mount finishes, although it takes about
5 minutes.
nfs-utils was configured with:
./configure --disable-tirpc --disable-nfsv4 --disable-nfsv41 \
    --disable-gss --disable-ipv6
Running strace on the new rpc.mountd, it appears that all the time is
spent reading the mtab.
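For anyone wanting to reproduce the measurement, here is a minimal sketch of how the open calls in a saved strace log could be tallied per path. It assumes the log was captured with something like `strace -f -o mountd.trace -p <pid of rpc.mountd>`; the file name, the helper `count_opens`, and the sample lines are illustrative only and are not part of nfs-utils.

```python
import re
from collections import Counter

# Matches lines such as:  open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 10
# An optional leading PID column (as written by strace -f) is tolerated.
OPEN_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(open|openat)\((?:AT_FDCWD, )?"([^"]+)"'
)

def count_opens(lines):
    """Return a Counter mapping each opened path to its number of open calls."""
    counts = Counter()
    for line in lines:
        match = OPEN_RE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# Illustrative sample lines, not real output from the server described here.
sample = [
    'open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 10',
    'read(10, "", 4096) = 0',
    '1234  open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 10',
    'openat(AT_FDCWD, "/etc/exports", O_RDONLY) = 11',
]

for path, n in count_opens(sample).most_common():
    print(n, path)
```

To use it on a real capture, pass an open file object instead of `sample`, for example `count_opens(open("mountd.trace"))`.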
Below is some of the output from:
/sbin/rpc.mountd --foreground --debug all
rpc.mountd: nfsd_fh: found 0x6173d50 path /
rpc.mountd: auth_unix_ip: inbuf 'nfsd 10.222.33.24'
rpc.mountd: auth_unix_ip: client 0x1d5bbb0 '10.222.33.0/24'
rpc.mountd: auth_unix_ip: inbuf 'nfsd 10.222.33.254'
rpc.mountd: auth_unix_ip: client 0x1d5bbb0 '10.222.33.0/24'
rpc.mountd: nfsd_export: inbuf '10.222.33.0/24 /nfsexport'
rpc.mountd: nfsd_export: found 0x6174260 path /nfsexport
rpc.mountd: nfsd_fh: inbuf '10.222.33.0/24 7 \x43000a00000000001ce354a654a34fd4a09f9b59f6aebb11'
rpc.mountd: nfsd_fh: found 0x6174270 path /nfsexport
rpc.mountd: nfsd_export: inbuf '10.222.33.0/24 /nfsexport/home'
rpc.mountd: nfsd_export: found 0x4cf8bc0 path /nfsexport/home
rpc.mountd: Received NULL request from 10.222.33.254
rpc.mountd: Received NULL request from 10.222.33.254
rpc.mountd: Received MNT3(/nfsexport/home/timmy) request from 10.222.33.254
rpc.mountd: authenticated mount request from 10.222.33.254:694 for /nfsexport/home/timmy (/nfsexport/home/timmy)
rpc.mountd: nfsd_fh: inbuf '10.222.33.0/24 6 \x947e3e1400c9c79b0000000000000000'
rpc.mountd: nfsd_fh: found 0x4e54390 path /nfsexport/home/timmy
As you can see, it searches for /, then /nfsexport, then /nfsexport/home,
and finally /nfsexport/home/timmy.
However, when ZFS populates the mtab, the top level of the datasets
(/nfsexport) ends up at the bottom of the mtab, about 3500 lines down.
The next level is also at the bottom. So getmntent has to read
through the mtab stream several times. In fact:
open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 10
is called 50000 times during this one mount attempt.
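To illustrate why the ordering matters, here is a rough cost model. It is not the rpc.mountd code: it simply assumes that each path lookup reopens the table and scans it from the top until the mount point is found. The dataset names (`user0`, `user1`, ...) and the helper functions are hypothetical.

```python
def build_mtab(n_datasets):
    """Hypothetical mtab ordering: child datasets first, parents at the bottom."""
    entries = ["/"]
    entries += [f"/nfsexport/home/user{i}" for i in range(n_datasets)]
    entries += ["/nfsexport/home", "/nfsexport"]
    return entries

def prefixes(path):
    """Return /, then each successively longer prefix of an absolute path."""
    parts = path.strip("/").split("/")
    return ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]

def lines_read_for(path, mtab):
    """Lines read by one fresh top-to-bottom scan that stops at the first match."""
    for lineno, mount_point in enumerate(mtab, start=1):
        if mount_point == path:
            return lineno
    return len(mtab)  # not found: the whole table was read

def lines_read_for_mount(path, mtab):
    """Total lines read when every prefix of path is looked up with a fresh scan."""
    return sum(lines_read_for(p, mtab) for p in prefixes(path))

mtab = build_mtab(3500)
print(prefixes("/nfsexport/home/user0"))
print(lines_read_for_mount("/nfsexport/home/user0", mtab))
```

In this model the two parent lookups each read almost the whole table, so even a single pass over the prefixes costs roughly twice the table length. As a rough upper bound, 50000 opens of a file of about 3500 lines would be about 175 million lines read if every scan ran to the end; the trace alone does not show how far each scan actually went.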
Does anyone have any suggestions that might help?
More extensive output can be made available.
Thanks,
John
On Fri, Jul 05, 2019 at 04:59:02PM -0600, John Bartoszewski wrote:
> I'm trying to setup a nfs server with over 3500+ zfs datasets
> being exported. Only NFSv3 is needed.
>
> The OS had a 4.4.172 kernel and nfs-utils 1.30. With these versions
> when a client tried to mount an exported dataset, rpc.mountd spiked to
> 100% for several minutes, the kernel produced a bug and a trace
> output, and the client never finished.
>
> I have built a 4.19.56 kernel, libevent 2.1.10, util-linux 2.34 (for
> libblkid), and nfs-utils 2.4.1. With this setup rpc.mountd does spike
> to 100%, but at least a mount finishes, but it takes about 5 minutes.
Have you experimented with the --num-threads option? If so, did it
help?
> nfs-utils was configured with:
> ./configure --disable-tirpc --disable-nfsv4 --disable-nfsv41
> --disable-gss --disable-ipv6
>
> Stracing the new rpc.mountd it appears all the time is spent reading
> the mtab.
>
> Below is some of the output from:
> /sbin/rpc.mountd --foreground --debug all
>
> rpc.mountd: nfsd_fh: found 0x6173d50 path /
> rpc.mountd: auth_unix_ip: inbuf 'nfsd 10.222.33.24'
> rpc.mountd: auth_unix_ip: client 0x1d5bbb0 '10.222.33.0/24'
> rpc.mountd: auth_unix_ip: inbuf 'nfsd 10.222.33.254'
> rpc.mountd: auth_unix_ip: client 0x1d5bbb0 '10.222.33.0/24'
> rpc.mountd: nfsd_export: inbuf '10.222.33.0/24 /nfsexport'
> rpc.mountd: nfsd_export: found 0x6174260 path /nfsexport
> rpc.mountd: nfsd_fh: inbuf '10.222.33.0/24 7
> \x43000a00000000001ce354a654a34fd4a09f9b59f6aebb11'
> rpc.mountd: nfsd_fh: found 0x6174270 path /nfsexport
> rpc.mountd: nfsd_export: inbuf '10.222.33.0/24 /nfsexport/home'
> rpc.mountd: nfsd_export: found 0x4cf8bc0 path /nfsexport/home
> rpc.mountd: Received NULL request from 10.222.33.254
> rpc.mountd: Received NULL request from 10.222.33.254
> rpc.mountd: Received MNT3(/nfsexport/home/timmy) request from 10.222.33.254
> rpc.mountd: authenticated mount request from 10.222.33.254:694 for
> /nfsexport/home/timmy (/nfsexport/home/timmy)
> rpc.mountd: nfsd_fh: inbuf '10.222.33.0/24 6 \x947e3e1400c9c79b0000000000000000'
> rpc.mountd: nfsd_fh: found 0x4e54390 path /nfsexport/home/timmy
>
> As you can see it searches for /, then /nfsexport, then /nfsexport/home,
> and finally /nfsexport/home/timmy
>
> But when zfs populates the mtab, the top level of the datasets
> ( /nfsexport ) is at the bottom of the mtab, 3500 lines down.
> The next level is also at the bottom. So getmntent has to
> read the mtab stream through several times. Actually:
> open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 10
> is called 50000 times during this one mount attempt.
I haven't looked at the v3 mountd code in a while. I guess the next
step would be to figure out what the rest of the call stack is--who's
calling getmntent and why?
--b.