2023-02-07 11:15:15

by Florian Möller

Subject: Reoccurring 5 second delays during NFS calls

Hi all,

we are currently in the process of migrating our file server infrastructure to
NFS. In our test environments, the following problem has now occurred several
times in certain situations:

A previously very fast NFS file operation suddenly takes 5 seconds longer - per
file. This leads to applications running very slowly and severely delayed file
operations.

Here are the details:

NFS server:
OS: Ubuntu 22.04.1, all patches installed
Kernel: Ubuntu Mainline, Versions
6.1.7-060107-generic_6.1.7-060107.202301181200
6.1.8-060108_6.1.8-060108.202301240742
6.1.9-060109_6.1.9-060109.202302010835
Security options: all Kerberos security options are affected
(The bug does not seem to occur without Kerberos security.)
Output of exportfs -v:
/export
gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
/export
gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)


Client 1:
OS: Arch Linux, all patches installed
Kernel: 6.1.9-arch1-2, 6.1.9-arch1-1
Mount-Line: servername:/ on /nfs type nfs4
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=XX.XX.XX.XX,local_lock=none,addr=YY.YY.YY.YY,_netdev)
krb5: 1.20.1-1
libevent: 2.1.12-4
nfs-utils: 2.6.2-1
util-linux: 2.38.1-1


Client 2:
OS: openSuSE 15.4, all patches installed
Kernel: 5.14.21-150400.24.41-default
Mount-Line: servername:/ on /nfs type nfs4
(rw,relatime,sync,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,softerr,softreval,noac,noresvport,proto=tcp,timeo=1,retrans=2,sec=krb5,clientaddr=XX.XX.XX.XX,lookupcache=none,local_lock=none,addr=YY.YY.YY.YY)

libgssapi3: 7.8.0-bp154.2.4.1
libevent: 2.1.8-2.23
nfs-client: 2.1.1-150100.10.27.1
util-linux: 2.37.2-140400.8.14.1


The error occurs, for example, if a file is touched twice:

touch testfile && echo "done" && touch testfile && echo "and again"

However, touching a large number of files (about 10000) with (pairwise)
different filenames works fast.


Here is another example that triggers the error:

1st step: create many files (shell code in Z-shell syntax):

for i in {1..10000}; do
echo "test" > $i.txt
done

This is fast.

2nd step:

for i in {1..10000}; do
echo $i
cat $i.txt
done

This takes 5 seconds per cat(1) call.
After unmounting and mounting, the 2nd step also runs quickly at first. But
after executing the 2nd step several times in a row, the error occurs again
(quite soon, after the 2nd or 3rd execution).
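
For reference, the remount-and-repeat procedure as a rough script (paths are
placeholders; /nfs is the mount point, /nfs/testdir a hypothetical directory
holding the files from step 1):

umount /nfs && mount /nfs
cd /nfs/testdir
# repeat step 2; the delay shows up again after the 2nd or 3rd run
for run in {1..4}; do
    time (for i in {1..10000}; do cat $i.txt > /dev/null; done)
done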

We were not able to reproduce the error without a Kerberos security type set.


Attached are a log from the server and from the client. In both cases

rpcdebug -m nfs -s all
rpcdebug -m nfsd -s all
rpcdebug -m rpc -s all

was set.
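
(The rpcdebug output lands in the kernel log; something along these lines
collects it and clears the debug flags again afterwards:

dmesg > client-log.txt
rpcdebug -m nfs -c all
rpcdebug -m nfsd -c all
rpcdebug -m rpc -c all
)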


Best regards,
Florian Möller



--
Dr. Florian Möller
Universität Würzburg
Institut für Mathematik
Emil-Fischer-Straße 30
97074 Würzburg, Germany
Tel. +49 931 3185596


Attachments:
client-log.txt (20.79 kB)
server-log.txt (11.42 kB)

2023-02-07 15:22:05

by Jeffrey Layton

Subject: Re: Reoccurring 5 second delays during NFS calls

On Tue, 2023-02-07 at 11:58 +0100, Florian Möller wrote:
> Hi all,
>
> we are currently in the process of migrating our file server infrastructure to
> NFS. In our test environments, the following problem has now occurred several
> times in certain situations:
>
> A previously very fast NFS file operation suddenly takes 5 seconds longer - per
> file. This leads to applications running very slowly and severely delayed file
> operations.
>
> Here are the details:
>
> NFS server:
> OS: Ubuntu 22.04.1, all patches installed
> Kernel: Ubuntu Mainline, Versions
> 6.1.7-060107-generic_6.1.7-060107.202301181200
> 6.1.8-060108_6.1.8-060108.202301240742
> 6.1.9-060109_6.1.9-060109.202302010835
> Security options: all Kerberos security options are affected
> (The bug does not seem to occur without Kerberos security.)
> Output of exportfs -v:
> /export
> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export
> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>


I see you're using the -o async export option. Note that you may end up
with corrupt data on a server reboot (see exports(5) manpage).

Assuming you're aware of this and want to keep that anyway, then the
patch I just posted to the mailing list may help you, if the stalls are
coming during CLOSE operations:

https://lore.kernel.org/linux-nfs/[email protected]/T/#mcb88f091263d07d8b9c13e6cc5ce0a0413d3f761

>
> Client 1:
> OS: Arch Linux, all patches installed
> Kernel: 6.1.9-arch1-2, 6.1.9-arch1-1
> Mount-Line: servername:/ on /nfs type nfs4
> (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=XX.XX.XX.XX,local_lock=none,addr=YY.YY.YY.YY,_netdev)
> krb5: 1.20.1-1
> libevent: 2.1.12-4
> nfs-utils: 2.6.2-1
> util-linux: 2.38.1-1
>
>
> Client 2:
> OS: openSuSE 15.4, all patches installed
> Kernel: 5.14.21-150400.24.41-default
> Mount-Line: servername:/ on /nfs type nfs4
> (rw,relatime,sync,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,softerr,softreval,noac,noresvport,proto=tcp,timeo=1,retrans=2,sec=krb5,clientaddr=XX.XX.XX.XX,lookupcache=none,local_lock=none,addr=YY.YY.YY.YY)
>
> libgssapi3: 7.8.0-bp154.2.4.1
> libevent: 2.1.8-2.23
> nfs-client: 2.1.1-150100.10.27.1
> util-linux: 2.37.2-140400.8.14.1
>
>
> The error occurs for example if a file is touched twice:
>
> touch testfile && echo "done" && touch testfile && echo "and again"
>
> However, touching a large number of files (about 10000) with (pairwise)
> different filenames works fast.
>
>
> Here is another example that triggers the error:
>
> 1st step: create many files (shell code in Z-shell syntax):
>
> for i in {1..10000}; do
> echo "test" > $i.txt
> done
>
> This is fast.
>
> 2nd step:
>
> for i in {1..10000}; do
> echo $i
> cat $i.txt
> done
>
> This takes 5 seconds per cat(1) call.
> After unmounting and mounting, the 2nd step also runs quickly at first. But
> after executing the 2nd step several times in a row, the error occurs again
> (quite soon, after the 2nd or 3rd execution).
>
> We were not able to reproduce the error without a Kerberos security type set.
>
>
> Attached are a log from the server and from the client. In both cases
>
> rpcdebug -m nfs -s all
> rpcdebug -m nfsd -s all
> rpcdebug -m rpc -s all
>
> was set.
>
>
> Best regards,
> Florian Möller
>
>
>

--
Jeff Layton <[email protected]>

2023-02-21 13:34:14

by Florian Möller

Subject: Re: Reoccurring 5 second delays during NFS calls

Hi all,

unfortunately the patch did not help, neither with -o async nor with -o sync.
We did some more tests (hence the delayed reply):

We used a qemu image housing both the NFS server and the client and did some
kernel debugging.
OS: Ubuntu 22.04.1
Kernel: 5.15.78
Mount line: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,
namlen=255,hard,proto=tcp,timeo=600,retrans=2,
sec=krb5p,clientaddr=10.0.0.1,
local_lock=none,addr=10.0.0.1

We touched a file and then touched the file again. This triggers the delay reliably.

We set breakpoints on all functions starting with nfs4 and on
"update_open_stateid". The attached file "1sttouch.log" contains a gdb log of
the first touch.
"2ndtouch.log" shows the gdb output of the second touch. The delay occurs in
line 116 in update_open_stateid.

We then deleted all breakpoints and set a sole breakpoint on
update_open_stateid. We touched the file again and used only the "next" command
of gdb. The gdb output is in "2ndtouch-next.log", the delay occurs in line 8.
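
In gdb terms, the breakpoint setup was essentially (a sketch, with gdb
attached to the qemu guest kernel):

(gdb) rbreak ^nfs4
(gdb) break update_open_stateid
(gdb) continue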

Please let us know if you need more information or if you want us to perform
further tests.

Best regards,
Florian Möller

On 07.02.23 at 16:21, Jeff Layton wrote:
> On Tue, 2023-02-07 at 11:58 +0100, Florian Möller wrote:
>> Hi all,
>>
>> we are currently in the process of migrating our file server infrastructure to
>> NFS. In our test environments, the following problem has now occurred several
>> times in certain situations:
>>
>> A previously very fast NFS file operation suddenly takes 5 seconds longer - per
>> file. This leads to applications running very slowly and severely delayed file
>> operations.
>>
>> Here are the details:
>>
>> NFS server:
>> OS: Ubuntu 22.04.1, all patches installed
>> Kernel: Ubuntu Mainline, Versions
>> 6.1.7-060107-generic_6.1.7-060107.202301181200
>> 6.1.8-060108_6.1.8-060108.202301240742
>> 6.1.9-060109_6.1.9-060109.202302010835
>> Security options: all Kerberos security options are affected
>> (The bug does not seem to occur without Kerberos security.)
>> Output of exportfs -v:
>> /export
>> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>> /export
>> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>
>
>
> I see you're using the -o async export option. Note that you may end up
> with corrupt data on a server reboot (see exports(5) manpage).
>
> Assuming you're aware of this and want to keep that anyway, then the
> patch I just posted to the mailing list may help you, if the stalls are
> coming during CLOSE operations:
>
> https://lore.kernel.org/linux-nfs/[email protected]/T/#mcb88f091263d07d8b9c13e6cc5ce0a0413d3f761
>
>>
>> Client 1:
>> OS: Arch Linux, all patches installed
>> Kernel: 6.1.9-arch1-2, 6.1.9-arch1-1
>> Mount-Line: servername:/ on /nfs type nfs4
>> (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=XX.XX.XX.XX,local_lock=none,addr=YY.YY.YY.YY,_netdev)
>> krb5: 1.20.1-1
>> libevent: 2.1.12-4
>> nfs-utils: 2.6.2-1
>> util-linux: 2.38.1-1
>>
>>
>> Client 2:
>> OS: openSuSE 15.4, all patches installed
>> Kernel: 5.14.21-150400.24.41-default
>> Mount-Line: servername:/ on /nfs type nfs4
>> (rw,relatime,sync,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,softerr,softreval,noac,noresvport,proto=tcp,timeo=1,retrans=2,sec=krb5,clientaddr=XX.XX.XX.XX,lookupcache=none,local_lock=none,addr=YY.YY.YY.YY)
>>
>> libgssapi3: 7.8.0-bp154.2.4.1
>> libevent: 2.1.8-2.23
>> nfs-client: 2.1.1-150100.10.27.1
>> util-linux: 2.37.2-140400.8.14.1
>>
>>
>> The error occurs for example if a file is touched twice:
>>
>> touch testfile && echo "done" && touch testfile && echo "and again"
>>
>> However, touching a large number of files (about 10000) with (pairwise)
>> different filenames works fast.
>>
>>
>> Here is another example that triggers the error:
>>
>> 1st step: create many files (shell code in Z-shell syntax):
>>
>> for i in {1..10000}; do
>> echo "test" > $i.txt
>> done
>>
>> This is fast.
>>
>> 2nd step:
>>
>> for i in {1..10000}; do
>> echo $i
>> cat $i.txt
>> done
>>
>> This takes 5 seconds per cat(1) call.
>> After unmounting and mounting, the 2nd step also runs quickly at first. But
>> after executing the 2nd step several times in a row, the error occurs again
>> (quite soon, after the 2nd or 3rd execution).
>>
>> We were not able to reproduce the error without a Kerberos security type set.
>>
>>
>> Attached are a log from the server and from the client. In both cases
>>
>> rpcdebug -m nfs -s all
>> rpcdebug -m nfsd -s all
>> rpcdebug -m rpc -s all
>>
>> was set.
>>
>>
>> Best regards,
>> Florian Möller
>>
>>
>>
>

--
Dr. Florian Möller
Universität Würzburg
Institut für Mathematik
Emil-Fischer-Straße 30
97074 Würzburg, Germany
Tel. +49 931 3185596


Attachments:
1sttouch.log (7.75 kB)
2ndtouch.log (8.30 kB)
2ndtouch-next.log (1.17 kB)

2023-02-21 14:14:34

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 21 Feb 2023, at 8:33, Florian Möller wrote:

> Hi all,
>
> unfortunately the patch did not help, neither using -o async nor using -o sync. We did some more tests (which is the reason for the delay of this reply):
>
> We used a qemu image housing both the NFS server and the client and did some kernel debugging.
> OS: Ubuntu 22.04.1
> Kernel: 5.15.78
> Mount line: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,
> namlen=255,hard,proto=tcp,timeo=600,retrans=2,
> sec=krb5p,clientaddr=10.0.0.1,
> local_lock=none,addr=10.0.0.1
>
> We touched a file and then touched the file again. This triggers the delay reliably.
>
> We set breakpoints on all functions starting with nfs4 and on "update_open_stateid". The attached file "1sttouch.log" contains a gdb log of the first touch.
> "2ndtouch.log" shows the gdb output of the second touch. The delay occurs in line 116 in update_open_stateid.
>
> We then deleted all breakpoints and set a sole breakpoint on update_open_stateid. We touched the file again and used only the "next" command of gdb. The gdb output is in "2ndtouch-next.log", the delay occurs in line 8.
>
> Please let us know if you need more information or if you want us to perform further tests.
>
> Best regards,
> Florian Möller

Hi Florian,

The 5-second value and the location of the delay make me suspect something
is wrong with the open stateid sequence processing.

The client introduces 5-second delays in order to correctly order stateid
updates from the server. Usually this happens because there are multiple
processes sending OPEN/CLOSE calls and the server processes them out of
order, or the client receives the responses out of order.

It would be helpful to have a network capture of the problem, along with the
matching output from these tracepoints on the client:

nfs4:nfs4_open_stateid_update
nfs4:nfs4_open_stateid_update_wait
nfs4:nfs4_close_stateid_update_wait
sunrpc:xs_stream_read_request
sunrpc:rpc_request
sunrpc:rpc_task_end

And these tracepoints on the server:

nfsd:nfsd_preprocess
sunrpc:svc_process

I'm interested in seeing how the client is processing the sequence numbers
of the open stateid, or if perhaps there's a delegation in play.

LMK if you need help with the tracepoints -- you can simply append those
lines into /sys/kernel/debug/tracing/set_event, then reproduce the problem.
The output of those tracepoints will be in /sys/kernel/debug/tracing/trace.
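
Something like this should do it on the client (same idea on the server with
the nfsd/sunrpc events):

for t in nfs4:nfs4_open_stateid_update nfs4:nfs4_open_stateid_update_wait \
         nfs4:nfs4_close_stateid_update_wait sunrpc:xs_stream_read_request \
         sunrpc:rpc_request sunrpc:rpc_task_end; do
    echo $t >> /sys/kernel/debug/tracing/set_event
done
# reproduce the problem, then:
cat /sys/kernel/debug/tracing/trace > nfs-trace.txt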

Ben


2023-02-21 16:52:25

by Florian Möller

Subject: Re: Reoccurring 5 second delays during NFS calls

Hi Benjamin,

here are the trace and a listing of the corresponding network packets. If the
listing is not detailed enough, I can send you a full packet dump tomorrow.

The command I used was

touch test.txt && sleep 2 && touch test.txt

test.txt did not exist previously. So you have an example of a touch without and
with delay.

Best regards,
Florian

On 21.02.23 at 15:13, Benjamin Coddington wrote:
> On 21 Feb 2023, at 8:33, Florian Möller wrote:
>
>> Hi all,
>>
>> unfortunately the patch did not help, neither using -o async nor using -o sync. We did some more tests (which is the reason for the delay of this reply):
>>
>> We used a qemu image housing both the NFS server and the client and did some kernel debugging.
>> OS: Ubuntu 22.04.1
>> Kernel: 5.15.78
>> Mount line: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,
>> namlen=255,hard,proto=tcp,timeo=600,retrans=2,
>> sec=krb5p,clientaddr=10.0.0.1,
>> local_lock=none,addr=10.0.0.1
>>
>> We touched a file and then touched the file again. This triggers the delay reliably.
>>
>> We set breakpoints on all functions starting with nfs4 and on "update_open_stateid". The attached file "1sttouch.log" contains a gdb log of the first touch.
>> "2ndtouch.log" shows the gdb output of the second touch. The delay occurs in line 116 in update_open_stateid.
>>
>> We then deleted all breakpoints and set a sole breakpoint on update_open_stateid. We touched the file again and used only the "next" command of gdb. The gdb output is in "2ndtouch-next.log", the delay occurs in line 8.
>>
>> Please let us know if you need more information or if you want us to perform further tests.
>>
>> Best regards,
>> Florian Möller
>
> Hi Florian,
>
> The 5-second value and the location of the delay make me suspect something
> is wrong with the open stateid sequence processing.
>
> The client introduces 5-second delays in order to correctly order stateid
> updates from the server. Usually this happens because there are multiple
> processes sending OPEN/CLOSE calls and the server processes them out of
> order, or the client receives the responses out of order.
>
> It would be helpful to have a network capture of the problem, along with the
> matching output from these tracepoints on the client:
>
> nfs4:nfs4_open_stateid_update
> nfs4:nfs4_open_stateid_update_wait
> nfs4:nfs4_close_stateid_update_wait
> sunrpc:xs_stream_read_request
> sunrpc:rpc_request
> sunrpc:rpc_task_end
>
> And these tracepoints on the server:
>
> nfsd:nfsd_preprocess
> sunrpc:svc_process
>
> I'm interested in seeing how the client is processing the sequence numbers
> of the open stateid, or if perhaps there's a delegation in play.
>
> LMK if you need help with the tracepoints -- you can simply append those
> lines into /sys/kernel/debug/tracing/set_event, then reproduce the problem.
> The output of those tracepoints will be in /sys/kernel/debug/tracing/trace.
>
> Ben

--
Dr. Florian Möller
Universität Würzburg
Institut für Mathematik
Emil-Fischer-Straße 30
97074 Würzburg, Germany
Tel. +49 931 3185596


Attachments:
nfs-network.txt (2.27 kB)
nfs-trace.txt (4.89 kB)

2023-02-21 18:59:53

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 21 Feb 2023, at 11:52, Florian Möller wrote:

> Hi Benjamin,
>
> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>
> The command I used was
>
> touch test.txt && sleep 2 && touch test.txt
>
> test.txt did not exist previously. So you have an example of a touch without and with delay.

Thanks! These are great - I can see from them that the client is indeed
waiting in the stateid update mechanism because the server has returned
NFS4ERR_STALE to the client's first CLOSE.

That is unusual. The server is signaling that the open file's stateid is old,
so I am interested to see if the first CLOSE is sent with the stateid's
sequence that was returned from the server. I could probably see this if I
had the server-side tracepoint data.

But from what's here I cannot see the details to figure this out because
you've only sent the summary of the packet capture.. can you send along the
raw capture?

I usually use something like:

tshark -w /tmp/pcap -i enps0 -P host server

.. then the file /tmp/pcap has the raw capture. We can use that to look at
the sequence numbers for the stateid as sent from client to server.
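
To read it back, something like

tshark -r /tmp/pcap -Y nfs -V

prints the full NFS decode, including each stateid and its seqid.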

However, if you use krb5p, we won't be able to see these details; only krb5
or krb5i expose them. If the problem only reproduces with krb5p, we have to
rely on the tracepoint output on the client and server.

Ben


2023-02-22 11:55:20

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22 Feb 2023, at 3:19, Florian Möller wrote:

> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>
>>> Hi Benjamin,
>>>
>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>
>>> The command I used was
>>>
>>> touch test.txt && sleep 2 && touch test.txt
>>>
>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>
>> Thanks! These are great - I can see from them that the client is indeed
>> waiting in the stateid update mechanism because the server has returned
>> NFS4ERR_STALE to the client's first CLOSE.
>>
>> That is unusual. The server is signaling that the open file's stateid is old,
>> so I am interested to see if the first CLOSE is sent with the stateid's
>> sequence that was returned from the server. I could probably see this if I
>> had the server-side tracepoint data.
>
> Hi Benjamin,
>
> the server-side tracepoints
>
> nfsd:nfsd_preprocess
> sunrpc:svc_process
>
> were enabled. It seems they did not produce any output.
>
> What I did now was:
> - enable all nfsd tracepoints,
> - enable all nfs4 tracepoints,
> - enable all sunrpc tracepoints.
>
> The command I used was
>
> touch somefile && sleep 2 && touch somefile.
>
> Then I unmounted the NFS share - this also causes a delay.
>
> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.


In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
see:

nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70


.. so nfsd's exp_find_key() is having trouble and returns -ENOENT. Does
this look familiar to anyone?

I am not as familiar with how the server operates here, so my next step
would be to start inserting trace_printk's into the kernel source to figure
out what's going wrong in there. However, we can also use the function
graph tracer to see where the kernel is going. That would look like:

echo exp_find_key > /sys/kernel/tracing/set_graph_function
echo 7 > /sys/kernel/debug/tracing/max_graph_depth
echo function_graph > /sys/kernel/debug/tracing/current_tracer
> /sys/kernel/debug/tracing/trace

.. reproduce

cat /sys/kernel/debug/tracing/trace
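
.. and to switch the tracer back off when done:

echo nop > /sys/kernel/debug/tracing/current_tracer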

Hopefully someone with more knfsd/sunrpc experience recognizes this.. but it
makes a lot more sense now that krb5 is part of the problem.

Ben


2023-02-22 12:23:46

by Jeffrey Layton

Subject: Re: Reoccurring 5 second delays during NFS calls

On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>
> > On 21.02.23 at 19:58, Benjamin Coddington wrote:
> > > On 21 Feb 2023, at 11:52, Florian Möller wrote:
> > >
> > > > Hi Benjamin,
> > > >
> > > > here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
> > > >
> > > > The command I used was
> > > >
> > > > touch test.txt && sleep 2 && touch test.txt
> > > >
> > > > test.txt did not exist previously. So you have an example of a touch without and with delay.
> > >
> > > Thanks! These are great - I can see from them that the client is indeed
> > > waiting in the stateid update mechanism because the server has returned
> > > NFS4ERR_STALE to the client's first CLOSE.
> > >
> > > That is unusual. The server is signaling that the open file's stateid is old,
> > > so I am interested to see if the first CLOSE is sent with the stateid's
> > > sequence that was returned from the server. I could probably see this if I
> > > had the server-side tracepoint data.
> >
> > Hi Benjamin,
> >
> > the server-side tracepoints
> >
> > nfsd:nfsd_preprocess
> > sunrpc:svc_process
> >
> > were enabled. It seems they did not produce any output.
> >
> > What I did now was:
> > - enable all nfsd tracepoints,
> > - enable all nfs4 tracepoints,
> > - enable all sunrpc tracepoints.
> >
> > The command I used was
> >
> > touch somefile && sleep 2 && touch somefile.
> >
> > Then I unmounted the NFS share - this also causes a delay.
> >
> > I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>
>
> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
> see:
>
> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>

This just means that the kernel called into the "cache" infrastructure
to find an export entry, and there wasn't one.


Looking back at the original email here, I'd say this is expected since
the export wasn't set up for krb5i.

Output of exportfs -v:
/export
gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
/export
gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)


>
> .. so nfsd's exp_find_key() is having trouble and returns -ENOENT. Does
> this look familiar to anyone?
>
> I am not as familiar with how the server operates here, so my next step
> would be to start inserting trace_printk's into the kernel source to figure
> out what's going wrong in there. However, we can also use the function
> graph tracer to see where the kernel is going. That would look like:
>
> echo exp_find_key > /sys/kernel/tracing/set_graph_function
> echo 7 > /sys/kernel/debug/tracing/max_graph_depth
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> > /sys/kernel/debug/tracing/trace
>
> .. reproduce
>
> cat /sys/kernel/debug/tracing/trace
>
> Hopefully someone with more knfsd/sunrpc experience recognizes this.. but it
> makes a lot more sense now that krb5 is part of the problem.
>
> Ben
>

--
Jeff Layton <[email protected]>

2023-02-22 12:46:56

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22 Feb 2023, at 7:22, Jeff Layton wrote:

> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>
>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>
>>>>> Hi Benjamin,
>>>>>
>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>>>
>>>>> The command I used was
>>>>>
>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>
>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>>>
>>>> Thanks! These are great - I can see from them that the client is indeed
>>>> waiting in the stateid update mechanism because the server has returned
>>>> NFS4ERR_STALE to the client's first CLOSE.
>>>>
>>>> That is unusual. The server is signaling that the open file's stateid is old,
>>>> so I am interested to see if the first CLOSE is sent with the stateid's
>>>> sequence that was returned from the server. I could probably see this if I
>>>> had the server-side tracepoint data.
>>>
>>> Hi Benjamin,
>>>
>>> the server-side tracepoints
>>>
>>> nfsd:nfsd_preprocess
>>> sunrpc:svc_process
>>>
>>> were enabled. It seems they did not produce any output.
>>>
>>> What I did now was:
>>> - enable all nfsd tracepoints,
>>> - enable all nfs4 tracepoints,
>>> - enable all sunrpc tracepoints.
>>>
>>> The command I used was
>>>
>>> touch somefile && sleep 2 && touch somefile.
>>>
>>> Then I unmounted the NFS share - this also causes a delay.
>>>
>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>>
>>
>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
>> see:
>>
>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>
>
> This just means that the kernel called into the "cache" infrastructure
> to find an export entry, and there wasn't one.
>
>
> Looking back at the original email here, I'd say this is expected since
> the export wasn't set up for krb5i.
>
> Output of exportfs -v:
> /export
> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export
> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)

Jeff and I were chatting on #linux-nfs, and it's back to looking like a
client issue.. as he's pointed out that the client starts out with krb5,
then in Frame 8 (the CLOSE that receives STALE) switches to krb5i, then back
to krb5 in Frame 10..

.. this should be reproducible. Let me dust off my old KDC.

Ben



2023-02-22 12:48:18

by Florian Möller

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22.02.23 at 13:22, Jeff Layton wrote:
> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>
>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>
>>>>> Hi Benjamin,
>>>>>
>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>>>
>>>>> The command I used was
>>>>>
>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>
>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>>>
>>>> Thanks! These are great - I can see from them that the client is indeed
>>>> waiting in the stateid update mechanism because the server has returned
>>>> NFS4ERR_STALE to the client's first CLOSE.
>>>>
>>>> That is unusual. The server is signaling that the open file's stateid is old,
>>>> so I am interested to see if the first CLOSE is sent with the stateid's
>>>> sequence that was returned from the server. I could probably see this if I
>>>> had the server-side tracepoint data.
>>>
>>> Hi Benjamin,
>>>
>>> the server-side tracepoints
>>>
>>> nfsd:nfsd_preprocess
>>> sunrpc:svc_process
>>>
>>> were enabled. It seems they did not produce any output.
>>>
>>> What I did now was:
>>> - enable all nfsd tracepoints,
>>> - enable all nfs4 tracepoints,
>>> - enable all sunrpc tracepoints.
>>>
>>> The command I used was
>>>
>>> touch somefile && sleep 2 && touch somefile.
>>>
>>> Then I unmounted the NFS share - this also causes a delay.
>>>
>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>>
>>
>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
>> see:
>>
>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>
>
> This just means that the kernel called into the "cache" infrastructure
> to find an export entry, and there wasn't one.
>
>
> Looking back at the original email here, I'd say this is expected since
> the export wasn't set up for krb5i.
>
> Output of exportfs -v:
> /export
> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export
> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)

I changed the export definitions to

/export
gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
/export
gss/krb5i(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
/export
gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)

such that all three Kerberos security types are exported. With this setup the
delay is gone. Without the krb5i export the delay occurs again.
Mounting the NFS share with different Kerberos security types does not change
this behaviour: without the krb5i export there is a delay.

Best,
Florian



>
>
>>
>> .. so nfsd's exp_find_key() is having trouble and returns -ENOENT. Does
>> this look familiar to anyone?
>>
>> I am not as familiar with how the server operates here, so my next step
>> would be to start inserting trace_printk's into the kernel source to figure
>> out what's going wrong in there. However, we can also use the function
>> graph tracer to see where the kernel is going. That would look like:
>>
>> echo exp_find_key > /sys/kernel/tracing/set_graph_function
>> echo 7 > /sys/kernel/debug/tracing/max_graph_depth
>> echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>> /sys/kernel/debug/tracing/trace
>>
>> .. reproduce
>>
>> cat /sys/kernel/debug/tracing/trace
>>
>> Hopefully someone with more knfsd/sunrpc experience recognizes this.. but it
>> makes a lot more sense now that krb5 is part of the problem.
>>
>> Ben
>>
>

--
Dr. Florian Möller
Universität Würzburg
Institut für Mathematik
Emil-Fischer-Straße 30
97074 Würzburg, Germany
Tel. +49 931 3185596


2023-02-22 19:44:53

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22 Feb 2023, at 7:48, Florian Möller wrote:

> On 22.02.23 at 13:22, Jeff Layton wrote:
>> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>>
>>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>>
>>>>>> Hi Benjamin,
>>>>>>
>>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>>>>
>>>>>> The command I used was
>>>>>>
>>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>>
>>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>>>>
>>>>> Thanks! These are great - I can see from them that the client is indeed
>>>>> waiting in the stateid update mechanism because the server has returned
>>>>> NFS4ERR_STALE to the client's first CLOSE.
>>>>>
>>>>> That is unusual. The server is signaling that the open file's stateid is old,
>>>>> so I am interested to see if the first CLOSE is sent with the stateid's
>>>>> sequence that was returned from the server. I could probably see this if I
>>>>> had the server-side tracepoint data.
>>>>
>>>> Hi Benjamin,
>>>>
>>>> the server-side tracepoints
>>>>
>>>> nfsd:nfsd_preprocess
>>>> sunrpc:svc_process
>>>>
>>>> were enabled. It seems they did not produce any output.
>>>>
>>>> What I did now was:
>>>> - enable all nfsd tracepoints,
>>>> - enable all nfs4 tracepoints,
>>>> - enable all sunrpc tracepoints.
>>>>
>>>> The command I used was
>>>>
>>>> touch somefile && sleep 2 && touch somefile.
>>>>
>>>> Then I unmounted the NFS share - this also causes a delay.
>>>>
>>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>>>
>>>
>>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
>>> see:
>>>
>>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>>
>>
>> This just means that the kernel called into the "cache" infrastructure
>> to find an export entry, and there wasn't one.
>>
>>
>> Looking back at the original email here, I'd say this is expected since
>> the export wasn't set up for krb5i.
>>
>> Output of exportfs -v:
>> /export
>> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>> /export
>> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>
> I changed the export definitions to
>
> /export gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export gss/krb5i(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>
> such that all three kerberos security types are exported. With this setup the delay is gone. Without the krb5i export the delay occurs again.
> Mounting the NFS share with different kerberos security types does not change this behaviour - without the krb5i export there is a delay.

After getting the ol' KDC back in shape and trying to reproduce this, I
noticed you're using a client specifier that's been deprecated (the
gss/krb5), along with sec=sys in the same line. That's a little weird, not
sure what the results of that might be.

I can't even get any mount to work with that:

[root@fedora ~]# exportfs -v
/exports gss/krb5(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
/exports gss/krb5p(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
[root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
mount.nfs: mounting fedora.fh.bcodding.com:/exports failed, reason given by server: No such file or directory

However, if I use the "sec=" method, everything seems to work fine:

[root@fedora ~]# exportfs -v
/exports 10.0.1.0/24(sync,wdelay,hide,no_subtree_check,nordirplus,sec=sys:krb5:krb5p,rw,insecure,no_root_squash,no_all_squash)
[root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
[root@fedora ~]# touch /mnt/fedora/krbfoo
[root@fedora ~]# touch /mnt/fedora/krbfoo
[root@fedora ~]# umount /mnt/fedora
[root@fedora ~]# mount -t nfs -ov4.1,sec=krb5p fedora.fh.bcodding.com:/exports /mnt/fedora
[root@fedora ~]# touch /mnt/fedora/krbfoo
[root@fedora ~]# touch /mnt/fedora/krbfoo
[root@fedora ~]# umount /mnt/fedora

.. so I can't seem to reproduce this because I can't even mount. Did you
perhaps have a previous mount with that client with a different security
flavor?

I think your NFS client ended up having a machine credential with krb5i, but
then the export with that flavor was removed. When the client tried to do a
CLOSE, it picked the machine cred, which caused the server to fail in the
lookup of the export, which caused the client to think its stateid was
wrong.

Ben


2023-02-22 20:14:51

by Rick Macklem

Subject: Re: Reoccurring 5 second delays during NFS calls

On Wed, Feb 22, 2023 at 11:44 AM Benjamin Coddington
<[email protected]> wrote:
>
>
>
> On 22 Feb 2023, at 7:48, Florian Möller wrote:
>
> > On 22.02.23 at 13:22, Jeff Layton wrote:
> >> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
> >>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
> >>>
> >>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
> >>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
> >>>>>
> >>>>>> Hi Benjamin,
> >>>>>>
> >>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
> >>>>>>
> >>>>>> The command I used was
> >>>>>>
> >>>>>> touch test.txt && sleep 2 && touch test.txt
> >>>>>>
> >>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
> >>>>>
> >>>>> Thanks! These are great - I can see from them that the client is indeed
> >>>>> waiting in the stateid update mechanism because the server has returned
> >>>>> NFS4ERR_STALE to the client's first CLOSE.
> >>>>>
> >>>>> That is unusual. The server is signaling that the open file's stateid is old,
> >>>>> so I am interested to see if the first CLOSE is sent with the stateid's
> >>>>> sequence that was returned from the server. I could probably see this if I
> >>>>> had the server-side tracepoint data.
> >>>>
> >>>> Hi Benjamin,
> >>>>
> >>>> the server-side tracepoints
> >>>>
> >>>> nfsd:nfsd_preprocess
> >>>> sunrpc:svc_process
> >>>>
> >>>> were enabled. It seems they did not produce any output.
> >>>>
> >>>> What I did now was:
> >>>> - enable all nfsd tracepoints,
> >>>> - enable all nfs4 tracepoints,
> >>>> - enable all sunrpc tracepoints.
> >>>>
> >>>> The command I used was
> >>>>
> >>>> touch somefile && sleep 2 && touch somefile.
> >>>>
> >>>> Then I unmounted the NFS share - this also causes a delay.
> >>>>
> >>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
> >>>
> >>>
> >>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
> >>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
> >>> see:
> >>>
> >>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
> >>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
> >>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
> >>>
> >>
> >> This just means that the kernel called into the "cache" infrastructure
> >> to find an export entry, and there wasn't one.
> >>
> >>
> >> Looking back at the original email here, I'd say this is expected since
> >> the export wasn't set up for krb5i.
> >>
> >> Output of exportfs -v:
> >> /export
> >> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> >> /export
> >> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> >
> > I changed the export definitions to
> >
> > /export gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> > /export gss/krb5i(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> > /export gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> >
> > such that all three kerberos security types are exported. With this setup the delay is gone. Without the krb5i export the delay occurs again.
> > Mounting the NFS share with different kerberos security types does not change this behaviour - without the krb5i export there is a delay.
>
> After getting the ol' KDC back in shape and trying to reproduce this, I
> noticed you're using a client specifier that's been deprecated (the
> gss/krb5), along with sec=sys in the same line. That's a little weird, not
> sure what the results of that might be.
>
> I can't even get any mount to work with that:
>
> [root@fedora ~]# exportfs -v
> /exports gss/krb5(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
> /exports gss/krb5p(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
> mount.nfs: mounting fedora.fh.bcodding.com:/exports failed, reason given by server: No such file or directory
>
> However, if I use the "sec=" method, everything seems to work fine:
>
> [root@fedora ~]# exportfs -v
> /exports 10.0.1.0/24(sync,wdelay,hide,no_subtree_check,nordirplus,sec=sys:krb5:krb5p,rw,insecure,no_root_squash,no_all_squash)
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# umount /mnt/fedora
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5p fedora.fh.bcodding.com:/exports /mnt/fedora
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# umount /mnt/fedora
>
> .. so I can't seem to reproduce this because I can't even mount. Did you
> perhaps have a previous mount with that client with a different security
> flavor?
>
> I think your NFS client ended up having a machine credential with krb5i, but
> then the export with that flavor was removed. When the client tried to do a
> CLOSE, it picked the machine cred, which caused the server to fail in the
> lookup of the export, which caused the client to think its stateid was
> wrong.
>
Just a shot in the dark, but is the client deciding to use SP4_MACH_CRED?
If so (and the server allows it by replying NFS_OK to ExchangeID), then the
client must use krb5i or krb5p for the state maintenance operations.
Easy to check. Just look at the ExchangeID done at mount time with wireshark.
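
Something like this would pull it out of a capture (a sketch; assuming
wireshark's dissector exposes the NFSv4 op number as nfs.opcode, with
EXCHANGE_ID being op 42):

tshark -r /tmp/pcap -V -Y "nfs.opcode == 42"

Then check whether state_protect in the reply says SP4_MACH_CRED.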

rick

> Ben
>

2023-02-23 08:28:03

by Florian Möller

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22.02.23 at 20:43, Benjamin Coddington wrote:
> On 22 Feb 2023, at 7:48, Florian Möller wrote:
>
>> On 22.02.23 at 13:22, Jeff Layton wrote:
>>> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>>>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>>>
>>>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>>>
>>>>>>> Hi Benjamin,
>>>>>>>
>>>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>>>>>
>>>>>>> The command I used was
>>>>>>>
>>>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>>>
>>>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>>>>>
>>>>>> Thanks! These are great - I can see from them that the client is indeed
>>>>>> waiting in the stateid update mechanism because the server has returned
>>>>>> NFS4ERR_STALE to the client's first CLOSE.
>>>>>>
>>>>>> That is unusual. The server is signaling that the open file's stateid is old,
>>>>>> so I am interested to see if the first CLOSE is sent with the stateid's
>>>>>> sequence that was returned from the server. I could probably see this if I
>>>>>> had the server-side tracepoint data.
>>>>>
>>>>> Hi Benjamin,
>>>>>
>>>>> the server-side tracepoints
>>>>>
>>>>> nfsd:nfsd_preprocess
>>>>> sunrpc:svc_process
>>>>>
>>>>> were enabled. It seems they did not produce any output.
>>>>>
>>>>> What I did now was:
>>>>> - enable all nfsd tracepoints,
>>>>> - enable all nfs4 tracepoints,
>>>>> - enable all sunrpc tracepoints.
>>>>>
>>>>> The command I used was
>>>>>
>>>>> touch somefile && sleep 2 && touch somefile.
>>>>>
>>>>> Then I unmounted the NFS share - this also causes a delay.
>>>>>
>>>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>>>>
>>>>
>>>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>>>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
>>>> see:
>>>>
>>>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>>>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>>>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>>>
>>>
>>> This just means that the kernel called into the "cache" infrastructure
>>> to find an export entry, and there wasn't one.
>>>
>>>
>>> Looking back at the original email here, I'd say this is expected since
>>> the export wasn't set up for krb5i.
>>>
>>> Output of exportfs -v:
>>> /export
>>> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>> /export
>>> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>
>> I changed the export definitions to
>>
>> /export gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>> /export gss/krb5i(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>> /export gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>
>> such that all three kerberos security types are exported. With this setup the delay is gone. Without the krb5i export the delay occurs again.
>> Mounting the NFS share with different kerberos security types does not change this behaviour - without the krb5i export there is a delay.
>
> After getting the ol' KDC back in shape and trying to reproduce this, I
> noticed you're using a client specifier that's been deprecated (the
> gss/krb5), along with sec=sys in the same line. That's a little weird, not
> sure what the results of that might be.

Our original /etc/exports entries were:

/export gss/krb5p(rw,crossmnt,async,fsid=0)
/export gss/krb5(rw,crossmnt,async,fsid=0)

These were based on a template shipped with Ubuntu (file
/usr/share/nfs-kernel-server/conffiles/etc.exports from package
nfs-kernel-server) that uses the deprecated gss/krb5 syntax.
We did not explicitly add any sec= option.

>
> I can't even get any mount to work with that:
>
> [root@fedora ~]# exportfs -v
> /exports gss/krb5(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
> /exports gss/krb5p(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
> mount.nfs: mounting fedora.fh.bcodding.com:/exports failed, reason given by server: No such file or directory

For us, mounting was never really a problem; we tried several distros, each
with several kernels. Except for the delay, everything worked fine; no errors
or deprecation warnings were given.

> However, if I use the "sec=" method, everything seems to work fine:
>
> [root@fedora ~]# exportfs -v
> /exports 10.0.1.0/24(sync,wdelay,hide,no_subtree_check,nordirplus,sec=sys:krb5:krb5p,rw,insecure,no_root_squash,no_all_squash)
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# umount /mnt/fedora
> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5p fedora.fh.bcodding.com:/exports /mnt/fedora
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# touch /mnt/fedora/krbfoo
> [root@fedora ~]# umount /mnt/fedora
>
> .. so I can't seem to reproduce this because I can't even mount. Did you
> perhaps have a previous mount with that client with a different security
> flavor?

I can confirm that the delay doesn't occur with

/export *(rw,crossmnt,async,fsid=0,sec=krb5:krb5p)

exportfs -v now gives

/export
<world>(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=krb5:krb5p,rw,secure,root_squash,no_all_squash)

> I think your NFS client ended up having a machine credential with krb5i, but
> then the export with that flavor was removed. When the client tried to do a
> CLOSE, it picked the machine cred, which caused the server to fail in the
> lookup of the export, which caused the client to think its stateid was
> wrong.

Thank you for your efforts! Things are working now.

Perhaps someone could rewrite the "RPCSEC_GSS security" section of exports(5).
It starts with

You may use the special strings "gss/krb5", "gss/krb5i", or "gss/krb5p" to
restrict access to clients using rpcsec_gss security.
However, this syntax is deprecated; on linux kernels since 2.6.23, you should
instead use the "sec=" export option

After all this hassle, the first sentence (together with the Ubuntu template) is
a bit misleading.

Best,
Florian

--
Dr. Florian Möller
Universität Würzburg
Institut für Mathematik
Emil-Fischer-Straße 30
97074 Würzburg, Germany
Tel. +49 931 3185596


2023-02-23 11:08:11

by Benjamin Coddington

Subject: Re: Reoccurring 5 second delays during NFS calls

On 22 Feb 2023, at 15:14, Rick Macklem wrote:

> On Wed, Feb 22, 2023 at 11:44 AM Benjamin Coddington
> <[email protected]> wrote:
>>
>>
>>
>> On 22 Feb 2023, at 7:48, Florian Möller wrote:
>>
>>> On 22.02.23 at 13:22, Jeff Layton wrote:
>>>> On Wed, 2023-02-22 at 06:54 -0500, Benjamin Coddington wrote:
>>>>> On 22 Feb 2023, at 3:19, Florian Möller wrote:
>>>>>
>>>>>> On 21.02.23 at 19:58, Benjamin Coddington wrote:
>>>>>>> On 21 Feb 2023, at 11:52, Florian Möller wrote:
>>>>>>>
>>>>>>>> Hi Benjamin,
>>>>>>>>
>>>>>>>> here are the trace and a listing of the corresponding network packets. If the listing is not detailed enough, I can send you a full packet dump tomorrow.
>>>>>>>>
>>>>>>>> The command I used was
>>>>>>>>
>>>>>>>> touch test.txt && sleep 2 && touch test.txt
>>>>>>>>
>>>>>>>> test.txt did not exist previously. So you have an example of a touch without and with delay.
>>>>>>>
>>>>>>> Thanks! These are great - I can see from them that the client is indeed
>>>>>>> waiting in the stateid update mechanism because the server has returned
>>>>>>> NFS4ERR_STALE to the client's first CLOSE.
>>>>>>>
>>>>>>> That is unusual. The server is signaling that the open file's stateid is old,
>>>>>>> so I am interested to see if the first CLOSE is sent with the stateid's
>>>>>>> sequence that was returned from the server. I could probably see this if I
>>>>>>> had the server-side tracepoint data.
>>>>>>
>>>>>> Hi Benjamin,
>>>>>>
>>>>>> the server-side tracepoints
>>>>>>
>>>>>> nfsd:nfsd_preprocess
>>>>>> sunrpc:svc_process
>>>>>>
>>>>>> were enabled. It seems they did not produce any output.
>>>>>>
>>>>>> What I did now was:
>>>>>> - enable all nfsd tracepoints,
>>>>>> - enable all nfs4 tracepoints,
>>>>>> - enable all sunrpc tracepoints.
>>>>>>
>>>>>> The command I used was
>>>>>>
>>>>>> touch somefile && sleep 2 && touch somefile.
>>>>>>
>>>>>> Then I unmounted the NFS share - this also causes a delay.
>>>>>>
>>>>>> I changed the security type to krb5 and captured trace and network output for a version 4.0 and a version 4.2 mount. The delay does not occur when using version 4.0.
>>>>>
>>>>>
>>>>> In frame 9 of nfs-v4.2-krb5.pcap, the server responds to PUTFH with
>>>>> NFS4ERR_STALE, so nothing to do with the open stateid sequencing. I also
>>>>> see:
>>>>>
>>>>> nfsd-1693 [000] ..... 1951.353889: nfsd_exp_find_key: fsid=1::{0x0,0xe5fcf0,0xffffc900,0x811e87a3,0xffffffff,0xe5fd00} domain=gss/krb5i status=-2
>>>>> nfsd-1693 [000] ..... 1951.353889: nfsd_set_fh_dentry_badexport: xid=0xe1511810 fh_hash=0x3f9e713a status=-2
>>>>> nfsd-1693 [000] ..... 1951.353890: nfsd_compound_status: op=2/4 OP_PUTFH status=70
>>>>>
>>>>
>>>> This just means that the kernel called into the "cache" infrastructure
>>>> to find an export entry, and there wasn't one.
>>>>
>>>>
>>>> Looking back at the original email here, I'd say this is expected since
>>>> the export wasn't set up for krb5i.
>>>>
>>>> Output of exportfs -v:
>>>> /export
>>>> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>>> /export
>>>> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>>
>>> I changed the export definitions to
>>>
>>> /export gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>> /export gss/krb5i(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>> /export gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>>>
>>> such that all three kerberos security types are exported. With this setup the delay is gone. Without the krb5i export the delay occurs again.
>>> Mounting the NFS share with different kerberos security types does not change this behaviour - without the krb5i export there is a delay.
>>
>> After getting the ol' KDC back in shape and trying to reproduce this, I
>> noticed you're using a client specifier that's been deprecated (the
>> gss/krb5), along with sec=sys in the same line. That's a little weird, not
>> sure what the results of that might be.
>>
>> I can't even get any mount to work with that:
>>
>> [root@fedora ~]# exportfs -v
>> /exports gss/krb5(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
>> /exports gss/krb5p(sync,wdelay,hide,no_subtree_check,nordirplus,fsid=0,sec=sys,rw,insecure,no_root_squash,no_all_squash)
>> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
>> mount.nfs: mounting fedora.fh.bcodding.com:/exports failed, reason given by server: No such file or directory
>>
>> However, if I use the "sec=" method, everything seems to work fine:
>>
>> [root@fedora ~]# exportfs -v
>> /exports 10.0.1.0/24(sync,wdelay,hide,no_subtree_check,nordirplus,sec=sys:krb5:krb5p,rw,insecure,no_root_squash,no_all_squash)
>> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5 fedora.fh.bcodding.com:/exports /mnt/fedora
>> [root@fedora ~]# touch /mnt/fedora/krbfoo
>> [root@fedora ~]# touch /mnt/fedora/krbfoo
>> [root@fedora ~]# umount /mnt/fedora
>> [root@fedora ~]# mount -t nfs -ov4.1,sec=krb5p fedora.fh.bcodding.com:/exports /mnt/fedora
>> [root@fedora ~]# touch /mnt/fedora/krbfoo
>> [root@fedora ~]# touch /mnt/fedora/krbfoo
>> [root@fedora ~]# umount /mnt/fedora
>>
>> .. so I can't seem to reproduce this because I can't even mount. Did you
>> perhaps have a previous mount with that client with a different security
>> flavor?
>>
>> I think your NFS client ended up having a machine credential with krb5i, but
>> then the export with that flavor was removed. When the client tried to do a
>> CLOSE, it picked the machine cred, which caused the server to fail in the
>> lookup of the export, which caused the client to think its stateid was
>> wrong.
>>
> Just a shot in the dark, but is the client deciding to use SP4_MACH_CRED?
> If so (and the server allows it by replying NFS_OK to ExchangeID), then the
> client must use krb5i or krb5p for the state maintenance operations.
> Easy to check. Just look at the ExchangeID done at mount time with wireshark.

Yep, that's what's going on.

And it seems that using the gss/krb5(options) export style causes
exp_find_key() to look for an auth_domain of gss/krb5i, which doesn't exist.
There's a bug in here somewhere.

Ben