2016-09-01 11:22:31

by James Harvey

[permalink] [raw]
Subject: nfs-server (v4.x only) takes extra 6 minutes to start without rpcbind

Up to date Arch. Kernel 4.7.2 (-1 Arch). nfs-utils 1.3.4 (-1 Arch).
No firewall running on this system.

Arch doesn't have any distribution-specific patches - it uses the
sourceforge source -
https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/nfs-utils


I started writing that an nfs-server configured for v4.x only would
never start if there was no rpcbind, BUT wound up finding out it does
start, it just takes ~ 6 minutes when with rpcbind it's "instant" /
less than 1 second.

"This" has been occurring since I started with NFS v4.x using
nfs-utils 1.3.2 & 1.3.3. By "this", I mean it looking to me like
nfs-server wouldn't start without rpcbind. It was probably just
starting delayed and I was impatient before 6 minutes.


I've found several posts from people saying nfs-server won't start
without rpcbind on v4.x only, so others have definitely ran into this,
just never gave it 6 minutes to finish, or maybe theirs never
finished.

Looks like there was a similar issue patched in 12/2013. Same issue
of rpcbind being needed for v4.x, but without rpcbind the problem was
a writing fd to kernel failure from rpc.nfsd, which appears to me to
have been fatal, and I'm not getting.
http://www.spinics.net/lists/linux-nfs/msg41053.html

This MAY answer the question - it's only visible for RedHat
subscribers, which I'm not - and I have no idea if on RHEL what
versions he has and if it includes the 12/2013 patch I mentioned -
https://access.redhat.com/solutions/902013



/run/sysconfig/nfs-utils is:
==========
RPCNFSDARGS="-N 2 -N 3 8"
RPCMOUNTDARGS="-N 2 -N 3"
STATDARGS=""
SMNOTIFYARGS=""
RPCIDMAPDARGS=""
RPCGSSDARGS=""
RPCSVCGSSDARGS=""
BLKMAPDARGS=""
GSS_USE_PROXY="yes"
==========

/proc/fs/nfsd/versions is: "-2 -3 +4 +4.1 +4.2"
/proc/fs/nfsd/threads is: "8"

Running defaults (just adding "-N 2 -N 3" to RPCNFSDARGS and
RPCMOUNTDOPTS in /etc/sysconfig/nfs), the NFS server starts fine like
this, with these enabled:
* proc-fs-nfsd.mount
* var-lib-nfs-rpc_pipefs.mount
* gssproxy.service
* nfs-blkmap.service
* nfs-config.service
* nfs-idmapd.service
* nfs-mountd.service
* nfs-server.service
* rpc-statd.service
* rpcbind.service
* rpcbind.socket

nfs-server starts right away:
==========
Sep 01 05:05:14 terra systemd[1]: Starting Preprocess NFS configuration...
Sep 01 05:05:14 terra systemd[1]: Started Preprocess NFS configuration.
==========



I'm running NFS v4.x only, and I'd like to be able to disable
everything not needed.

I can mask gssproxy.service, nfs-blkmap.service, and
rpc-statd.service, and the NFS server still starts instantly with just
these:
* proc-fs-nfsd.mount
* var-lib-nfs-rpc_pipefs.mount
* nfs-config.service
* nfs-idmapd.service
* nfs-mountd.service
* nfs-server.service
* rpcbind.service
* rpcbind.socket


Since I'm running NFS v4.x only, it's my understanding I should also
be able to mask rpcbind.service and rpcbind.socket and just run with
these:
* proc-fs-nfsd.mount
* var-lib-nfs-rpc_pipefs.mount
* nfs-config.service
* nfs-idmapd.service
* nfs-mountd.service
* nfs-server.service

But without rpcbind.service and rpcbind.socket, I get the 6 minute delay

# systemctl start nfs-config

# systemctl start proc-fs-nfsd.mount
[ 45.907749] RPC: Registered named UNIX socket transport module.
[ 45.907755] RPC: Registered udp transport module.
[ 45.907757] RPC: Registered tcp transport module.
[ 45.907759] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 45.943152] Installing knfsd (copyright (C) 1996 [email protected]).

# rpcdebug -m nfsd -s all

# systemctl start var-lib-nfs-rpc_pipefs.mount

# rpc.mountd --foreground --no-udp -N 2 -N 3
rpc.mountd: Version 1.3.3 starting

# rpc.idmapd -f -vvv
rpc.idmapd: Skipping configuration file "/etc/idmapd.conf": No such
file or directory
rpc.idmapd: libnfsidmap: Unable to determine the NFSv4 domain; Using
'localdomain' as the NFSv4 domain which means UIDs will be mapped to
the 'Nobody-User' user defined in /etc/idmapd.conf

rpc.idmapd: libnfsidmap: using (default) domain: localdomain
rpc.idmapd: libnfsidmap: Realms list: 'LOCALDOMAIN'
rpc.idmapd: libnfsidmap: loaded plugin
/usr/lib/libnfsidmap/nsswitch.so for method nsswitch

rpc.idmapd: Expiration time is 0 seconds.
rpc.idmapd: Opened /proc/net/rpc/nfs4.nametoid/channel
rpc.idmapd: Opened /proc/net/rpc/nfs4.idtoname/channel
rpc.idmapd: Unable to open '/proc/sys/fs/nfs/idmap_cache_timeout' to
set client cache expiration time to 0 seconds

{in journalctl}

Sep 01 06:37:19 terra rpc.idmapd[2342]: libnfsidmap: Unable to
determine the NFSv4 domain; Using 'localdomain' as the NFSv4 domain
which means UIDs will be mapped to the 'Nobody-User' user defined in
/etc/idmapd.conf
Sep 01 06:37:19 terra rpc.idmapd[2342]: libnfsidmap: using (default)
domain: localdomain
Sep 01 06:37:19 terra rpc.idmapd[2342]: libnfsidmap: Realms list: 'LOCALDOMAIN'
Sep 01 06:37:19 terra rpc.idmapd[2342]: libnfsidmap: loaded plugin
/usr/lib/libnfsidmap/nsswitch.so for method nsswitch
Sep 01 06:37:19 terra rpc.idmapd[2342]: Opened
/proc/net/rpc/nfs4.nametoid/channel
Sep 01 06:37:19 terra rpc.idmapd[2342]: Opened
/proc/net/rpc/nfs4.idtoname/channel
Sep 01 06:37:19 terra rpc.idmapd[2342]: Unable to open
'/proc/sys/fs/nfs/idmap_cache_timeout' to set client cache expiration
time to 0 seconds

# exportfs -r

# rpc.nfsd --debug --no-udp -N 2 -N 3 8
rpc.nfsd: knfsd is currently down
rpc.nfsd: Writing version string to kernel: -2 -3 +4
rpc.nfsd: Created AF_INET TCP socket.

{about 2 minutes later}
[ 560.800037] set_max_drc nfsd_drc_max_mem 3948544

{then about 97 seconds later}
rpc.nfsd: Created AF_INET6 TCP socket.
[ 657.120279] nfsd: creating service
[ 657.120286] nfsd: allocating 32 readahead buffers.
[ 657.120376] nfsd4_umh_cltrack_upcall: cmd: init
[ 657.120381] nfsd4_umh_cltrack_upcall: arg: (null)
[ 657.120383] nfsd4_umh_cltrack_upcall: env0:
NFSDCLTRACK_GRACE_START=1472726583
[ 657.120385] nfsd4_umh_cltrack_upcall: env1: (null)
[ 657.151076] NFSD: end of grace period
[ 657.151086] nfsd4_umh_cltrack_upcall: cmd: gracedone
[ 657.151088] nfsd4_umh_cltrack_upcall: arg: 1472726583
[ 657.151090] nfsd4_umh_cltrack_upcall: env0:
NFSDCLTRACK_LEGACY_TOPDIR=/var/lib/nfs/v4recovery
[ 657.151092] nfsd4_umh_cltrack_upcall: env1: (null)
[ 657.153448] nfsd4_umh_cltrack_upcall: /sbin/nfsdcltrack return value: 0
[ 657.153885] nfsd4_umh_cltrack_upcall: /sbin/nfsdcltrack return value: 0
[ 657.153890] NFSD: starting 90-second grace period (net ffffffff818b8a40)
[ 747.359945] NFSD: laundromat service - starting
[ 747.359953] NFSD: laundromat_main - sleeping for 90 seconds
[ 837.599965] NFSD: laundromat service - starting
[ 837.599973] NFSD: laundromat_main - sleeping for 90 seconds

If I reboot and perform the same process, except running "strace
rpc.nfsd --no-udp -N 2 -N 3 8" then I see:

==========
execve("/usr/bin/rpc.nfsd", ["rpc.nfsd", "--no-udp", "-N", "2", "-N",
"3", "8"], [/* 33 vars */]) = 0
brk(NULL) = 0x1609000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=69980, ...}) = 0
mmap(NULL, 69980, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8548f20000
close(3) = 0
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\3\2\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1951744, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f8548f1e000
mmap(NULL, 3791152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f8548972000
mprotect(0x7f8548b07000, 2093056, PROT_NONE) = 0
mmap(0x7f8548d06000, 24576, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x194000) = 0x7f8548d06000
mmap(0x7f8548d0c000, 14640, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8548d0c000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f8548f1f440) = 0
mprotect(0x7f8548d06000, 16384, PROT_READ) = 0
mprotect(0x604000, 4096, PROT_READ) = 0
mprotect(0x7f8548f32000, 4096, PROT_READ) = 0
munmap(0x7f8548f20000, 69980) = 0
brk(NULL) = 0x1609000
brk(0x162a000) = 0x162a000
getpid() = 2156
rt_sigaction(SIGUSR1, {0x402b10, [USR1], SA_RESTORER|SA_RESTART,
0x7f85489a50b0}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, {0x402b10, [USR2], SA_RESTORER|SA_RESTART,
0x7f85489a50b0}, {SIG_DFL, [], 0}, 8) = 0
chdir("/var/lib/nfs") = 0
stat("/proc/fs/nfsd/threads", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
open("/proc/fs/nfsd/portlist", O_RDONLY) = 3
read(3,

{{{ about 153 seconds later - yes the read line is split by this time
as shown }}}

"tcp 2049\n", 128) = 9
close(3) = 0
open("/dev/null", O_RDWR) = 3
dup2(3, 0) = 0
dup2(3, 1) = 1
dup2(3, 2) = 2
open("/proc/self/fd", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFDIR|0500, st_size=0, ...}) = 0
getdents(4, /* 7 entries */, 32768) = 168
close(3) = 0
getdents(4, /* 0 entries */, 32768) = 0
close(4) = 0
open("/proc/fs/nfsd/threads", O_WRONLY) = 3
write(3, "8\n", 2

{{{ about 193 seconds later }}}

) = 2
close(3) = 0
exit_group(0) = ?
+++ exited with 0 +++
==========

Now that it's completed, I can cat /proc/fs/nfsd/{portlist,threads} instantly.

But, if I stop nfsd and try restarting it, I get the delay again.


2016-09-02 00:50:35

by James Harvey

[permalink] [raw]
Subject: Re: nfs-server (v4.x only) takes extra 6 minutes to start without rpcbind

On Thu, Sep 1, 2016 at 7:22 AM, james harvey <[email protected]> wrote:
> Up to date Arch. Kernel 4.7.2 (-1 Arch). nfs-utils 1.3.4 (-1 Arch).
> No firewall running on this system.
>
> I started writing that an nfs-server configured for v4.x only would
> never start if there was no rpcbind, BUT wound up finding out it does
> start, it just takes ~ 6 minutes when with rpcbind it's "instant" /
> less than 1 second.

Replicated on Fedora 24, kernel 4.6.7 (-300.fc24.x86_64), nfs-utils
1.3.4 (-1.rc2.fc24), and rpcbind 0.2.3 (-11.rc1.fc24).

{{{ fresh network iso install, using minimal install selection in
install GUI }}}
# yum install nfs-utils
# systemctl enable nfs-server
# vi /etc/sysconfig/nfs
{{{ RPCNFSDARGS="-N 2 -N 3" \ RPCMOUNTDOPTS="-N 2 -N 3" }}}
{{{ I've tried -N options with and without spaces, as I've seen it both ways }}}
# systemctl reboot

Well, Fedora (as Arch does) starts nfs-server for v4.x just fine with
rpcbind. Fedora fails to start nfs-idmapd with "/usr/sbin/rpc.idmapd:
symbol lookup error: /lib64/libnfsidmap.so.0: undefined symbol:
__dn_expand". nfs-idmapd starts fine on Arch.

Starting to started is "instant" (less than 1 second.)


But, moving on:
# systemctl mask rpcbind.service rpcbind.socket
# systemctl reboot

# systemctl status nfs-server
Sep 01 20:29:00 fedora systemd[1]: Starting NFS server and services...
Sep 01 20:35:22 fedora systemd[1]: Started NFS server and services


Then to strace:
# systemctl disable nfs-server
# systemctl reboot
# systemctl start nfs-config.service
{{{ proc-fs-nfsd.mount and var-lib-nfsd-rpc_pipefs.mount are already active }}}
# systemctl start nfs-idmapd
{{{ still fails with undefined symbol __dn_expand }}}

# rpc.mountd --foreground -N 2 -N 3

{{{ another tty }}}
# strace rpc.nfsd --debug -N 2 -N 3 8

Gives the attached strace output, which shows the same two really long
delays when reading or writing /proc/fs/nfsd/portlist.


Attachments:
strace.rpc.nfsd (11.16 kB)

2016-09-03 00:38:25

by Steve Dickson

[permalink] [raw]
Subject: Re: nfs-server (v4.x only) takes extra 6 minutes to start without rpcbind

Hello,

On 09/01/2016 08:50 PM, james harvey wrote:
> On Thu, Sep 1, 2016 at 7:22 AM, james harvey <[email protected]> wrote:
>> Up to date Arch. Kernel 4.7.2 (-1 Arch). nfs-utils 1.3.4 (-1 Arch).
>> No firewall running on this system.
>>
>> I started writing that an nfs-server configured for v4.x only would
>> never start if there was no rpcbind, BUT wound up finding out it does
>> start, it just takes ~ 6 minutes when with rpcbind it's "instant" /
>> less than 1 second.
>
> Replicated on Fedora 24, kernel 4.6.7 (-300.fc24.x86_64), nfs-utils
> 1.3.4 (-1.rc2.fc24), and rpcbind 0.2.3 (-11.rc1.fc24).
>
> {{{ fresh network iso install, using minimal install selection in
> install GUI }}}
> # yum install nfs-utils
> # systemctl enable nfs-server
> # vi /etc/sysconfig/nfs
> {{{ RPCNFSDARGS="-N 2 -N 3" \ RPCMOUNTDOPTS="-N 2 -N 3" }}}
> {{{ I've tried -N options with and without spaces, as I've seen it both ways }}}
> # systemctl reboot
>
> Well, Fedora (as Arch does) starts nfs-server for v4.x just fine with
> rpcbind. Fedora fails to start nfs-idmapd with "/usr/sbin/rpc.idmapd:
> symbol lookup error: /lib64/libnfsidmap.so.0: undefined symbol:
> __dn_expand". nfs-idmapd starts fine on Arch.
This has been fixed by https://bodhi.fedoraproject.org/updates/FEDORA-2016-625bd99661

>
> Starting to started is "instant" (less than 1 second.)
>
>
> But, moving on:
> # systemctl mask rpcbind.service rpcbind.socket
> # systemctl reboot
There has been problems with systemd starting the listening
socket... I wonder if this is yea another one...

>
> # systemctl status nfs-server
> Sep 01 20:29:00 fedora systemd[1]: Starting NFS server and services...
> Sep 01 20:35:22 fedora systemd[1]: Started NFS server and services
>
>
> Then to strace:
> # systemctl disable nfs-server
> # systemctl reboot
> # systemctl start nfs-config.service
> {{{ proc-fs-nfsd.mount and var-lib-nfsd-rpc_pipefs.mount are already active }}}
> # systemctl start nfs-idmapd
> {{{ still fails with undefined symbol __dn_expand }}}
Again this has been fixed...

>
> # rpc.mountd --foreground -N 2 -N 3
>
> {{{ another tty }}}
> # strace rpc.nfsd --debug -N 2 -N 3 8
>
> Gives the attached strace output, which shows the same two really long
> delays when reading or writing /proc/fs/nfsd/portlist.
>
I don't see the attachment...

steved.

2016-09-03 02:55:21

by James Harvey

[permalink] [raw]
Subject: Re: nfs-server (v4.x only) takes extra 6 minutes to start without rpcbind

On Fri, Sep 2, 2016 at 8:38 PM, Steve Dickson <[email protected]> wrote:
> On 09/01/2016 08:50 PM, james harvey wrote:
>> On Thu, Sep 1, 2016 at 7:22 AM, james harvey <[email protected]> wrote:
>>> Up to date Arch. Kernel 4.7.2 (-1 Arch). nfs-utils 1.3.4 (-1 Arch).
>>> No firewall running on this system.
>>>
>>> I started writing that an nfs-server configured for v4.x only would
>>> never start if there was no rpcbind, BUT wound up finding out it does
>>> start, it just takes ~ 6 minutes when with rpcbind it's "instant" /
>>> less than 1 second.
>>
>> Replicated on Fedora 24, kernel 4.6.7 (-300.fc24.x86_64), nfs-utils
>> 1.3.4 (-1.rc2.fc24), and rpcbind 0.2.3 (-11.rc1.fc24).
>>
>> {{{ fresh network iso install, using minimal install selection in
>> install GUI }}}
>> # yum install nfs-utils
>> # systemctl enable nfs-server
>> # vi /etc/sysconfig/nfs
>> {{{ RPCNFSDARGS="-N 2 -N 3" \ RPCMOUNTDOPTS="-N 2 -N 3" }}}
>> {{{ I've tried -N options with and without spaces, as I've seen it both ways }}}
>> # systemctl reboot
>>
>> Well, Fedora (as Arch does) starts nfs-server for v4.x just fine with
>> rpcbind. Fedora fails to start nfs-idmapd with "/usr/sbin/rpc.idmapd:
>> symbol lookup error: /lib64/libnfsidmap.so.0: undefined symbol:
>> __dn_expand". nfs-idmapd starts fine on Arch.
> This has been fixed by https://bodhi.fedoraproject.org/updates/FEDORA-2016-625bd99661
>
>>
>> Starting to started is "instant" (less than 1 second.)
>>
>>
>> But, moving on:
>> # systemctl mask rpcbind.service rpcbind.socket
>> # systemctl reboot
> There has been problems with systemd starting the listening
> socket... I wonder if this is yea another one...

FWIW, after nfs-server eventually starts, systemctl still shows
rpcbind.socket (and rpcbind.service) as masked, inactive, and dead.

Also /var/run/rpcbind.sock doesn't exist.

I disabled all related services, and rebooted with systemd debugging.
(Disabled them so they could be manually started with boot noise off
on its own.)

dmesg (with all the systemd debugging output) only mentions rpcbind
along with saying "Cannot add dependency job, ignoring: Unit
rpcbind.socket is masked." Not looking to me like it's trying to
start it.

I'm not familiar with how the proc fs is handled internally. IF this
were in the sys fs, which I am more familiar with, I would be thinking
whichever kernel module provided the show and store functions for
.../nfsd/portlist is taking a really long time to execute when it
can't connect to rpcbind.

This didn't wind up helping here, but turning on systemd debugging
showed me rpc-statd.service was trying to be started (as
nfs-server.service Wants it.) Although masking rpc-statd.service
certainly prevents it from starting, it doesn't eliminate the long
delay.

The relevant part of the 6MB dmesg is:

==========
[ 98.158892] systemd[1]: Mounted NFSD configuration filesystem.
[ 98.161917] systemd[1]: Mounted RPC Pipe File System.
[ 98.172261] systemd[1]: Starting Preprocess NFS configuration...
[ 98.184153] systemd[1]: Started Preprocess NFS configuration.
[ 98.185463] systemd[1]: Started NFS Mount Daemon.
[ 98.187190] systemd[1]: Starting NFSv4 ID-name mapping service...
[ 98.190856] systemd[1]: nfs-idmapd.service: Control process exited,
code=exited status=127
[ 98.191561] systemd[1]: Failed to start NFSv4 ID-name mapping service.
[ 98.192050] systemd[1]: nfs-idmapd.service: Unit entered failed state.
[ 98.192813] systemd[1]: nfs-idmapd.service: Failed with result 'exit-code'.
[ 393.992248] NFSD: starting 90-second grace period (net ffffffff81cfb7c0)
[ 393.996640] systemd[1]: Received SIGCHLD from PID 788 (rpc.nfsd).
[ 393.997043] systemd[1]: Child 788 (rpc.nfsd) died (code=exited,
status=0/SUCCESS)
[ 393.997095] systemd[1]: nfs-server.service: Child 788 belongs to
nfs-server.service
[ 393.997110] systemd[1]: nfs-server.service: Main process exited,
code=exited, status=0/SUCCESS
[ 393.997444] systemd[1]: nfs-server.service: Changed start -> exited
[ 393.997454] systemd[1]: nfs-server.service: Job
nfs-server.service/start finished, result=done
[ 393.997466] systemd[1]: Started NFS server and services.
[ 394.009311] systemd[1]: Starting Notify NFS peers of a restart...
[ 394.031645] systemd[1]: Started Notify NFS peers of a restart.
==========



>> # systemctl status nfs-server
>> Sep 01 20:29:00 fedora systemd[1]: Starting NFS server and services...
>> Sep 01 20:35:22 fedora systemd[1]: Started NFS server and services
>>
>>
>> Then to strace:
>> # systemctl disable nfs-server
>> # systemctl reboot
>> # systemctl start nfs-config.service
>> {{{ proc-fs-nfsd.mount and var-lib-nfsd-rpc_pipefs.mount are already active }}}
>> # systemctl start nfs-idmapd
>> {{{ still fails with undefined symbol __dn_expand }}}
> Again this has been fixed...
>
>>
>> # rpc.mountd --foreground -N 2 -N 3
>>
>> {{{ another tty }}}
>> # strace rpc.nfsd --debug -N 2 -N 3 8
>>
>> Gives the attached strace output, which shows the same two really long
>> delays when reading or writing /proc/fs/nfsd/portlist.
>>
> I don't see the attachment...
>
> steved.

Strange, wonder if the list removed it - I do see it on my sent message.

Anyways, I just uploaded it here: http://pastebin.ca/3709748

The delays happen at lines 145-149, which print as one line, I just
broke it apart showing the characters that show on the line before and
after the delay.

Same at lines 166-170.

2016-10-12 06:43:50

by NeilBrown

[permalink] [raw]
Subject: Re: nfs-server (v4.x only) takes extra 6 minutes to start without rpcbind

On Sat, Sep 03 2016, james harvey wrote:

>
> FWIW, after nfs-server eventually starts, systemctl still shows
> rpcbind.socket (and rpcbind.service) as masked, inactive, and dead.

I think this is a kernel bug. You can fix it by reverting

Commit: 4b0ab51db32e ("SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose")

See also http://www.spinics.net/lists/linux-nfs/msg59264.html and
follow-ups.


>>>
>>> # rpc.mountd --foreground -N 2 -N 3
>>>
>>> {{{ another tty }}}
>>> # strace rpc.nfsd --debug -N 2 -N 3 8
>>>
>>> Gives the attached strace output, which shows the same two really long
>>> delays when reading or writing /proc/fs/nfsd/portlist.
>>>
>> I don't see the attachment...
>>
>> steved.
>
> Strange, wonder if the list removed it - I do see it on my sent message.
>
> Anyways, I just uploaded it here: http://pastebin.ca/3709748
>
> The delays happen at lines 145-149, which print as one line, I just
> broke it apart showing the characters that show on the line before and
> after the delay.

The delay at 'read' is waiting for nfsd_mutex. Probably something else
was trying to start up nfsd and hit the 3 minute wait while writing the
threads. Maybe you interrupted that or backgrounded it or something.

>
> Same at lines 166-170.

This one is the real problem. Starting nfsd tries to register with
rpcbind (even for v4-only) and (due to the bug) it takes 3 minutes to
timeout.

NeilBrown

> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Attachments:
signature.asc (800.00 B)