LinuxLists.cc - NFS (or RPC) batching of calls over TCP results in 'unmatched' replies?

2014-05-20 13:15:00

Subject: NFS (or RPC) batching of calls over TCP results in 'unmatched' replies?

Hello list.

I wonder if anyone here could confirm my suspicions (and lack of
understanding) re. batched RPC calls? Especially in the context of NFS
reads.

--[ Background;
Initially I was led to believe that for every dispatched RPC call (say an
NFS read) a reply message from the server was expected - so that the
client can use the XID to match it to the call.
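
As a rough illustration of the matching described above (not code from this thread): every RPC message begins with a 32-bit XID followed by a 32-bit message type, 0 for a call and 1 for a reply. The sketch below assumes the messages have already been extracted from the capture as byte strings; `rpc_header` and `match_by_xid` are hypothetical helper names, and a real tool would also key on the connection, since XIDs are only unique per client.

```python
import struct

CALL, REPLY = 0, 1  # RPC msg_type values


def rpc_header(msg: bytes):
    """Return (xid, msg_type) from the first 8 bytes of an RPC message."""
    xid, msg_type = struct.unpack(">II", msg[:8])
    return xid, msg_type


def match_by_xid(messages):
    """Pair replies with outstanding calls that carry the same XID.

    Returns (matched, unanswered_calls, orphan_replies).
    """
    outstanding = set()
    matched = 0
    orphans = 0
    for msg in messages:
        xid, msg_type = rpc_header(msg)
        if msg_type == CALL:
            outstanding.add(xid)
        elif msg_type == REPLY:
            if xid in outstanding:
                outstanding.discard(xid)
                matched += 1
            else:
                orphans += 1  # reply seen without its call
    return matched, len(outstanding), orphans


# Two calls, one reply: one pair matched, one call still unanswered.
msgs = [
    struct.pack(">II", 0x1001, CALL),
    struct.pack(">II", 0x1002, CALL),
    struct.pack(">II", 0x1001, REPLY),
]
result = match_by_xid(msgs)
```

A large count of unanswered calls in such a tally is what the figures below amount to.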

However, when capturing traffic using libpcap (tcpdump, wireshark, gulp
and in-house software all exhibit this behaviour) during peak workloads
(100MB/s read - 1600 64k NFS read calls per second), later analysis
shows that the number of identified RPC replies is significantly less
than the number of calls. This is despite libpcap reporting no packet
loss at either the NIC or kernel level. For example, using the tshark
display filter rpc.msgtyp==0 and rpc.msgtyp==1 respectively;

Calls: 250364
Replies: 83419

Calls: 165770
Replies: 48488

Calls: 171208
Replies: 14861

This is against 3 different NFS servers (the last being Linux knfsd).
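
To put those counts in proportion, a few lines of arithmetic give the fraction of calls that had an identified reply in each capture. The numbers are the ones quoted above; `reply_ratio` is just an illustrative helper name.

```python
# (calls, replies) for the three captures listed above
captures = [(250364, 83419), (165770, 48488), (171208, 14861)]


def reply_ratio(calls: int, replies: int) -> float:
    """Fraction of calls for which a reply was identified."""
    return replies / calls


ratios = [reply_ratio(calls, replies) for calls, replies in captures]

for (calls, replies), ratio in zip(captures, ratios):
    print(f"{calls} calls, {replies} replies: {ratio:.1%} answered")
```

Roughly a third of the calls are answered in the first capture and under a tenth in the last.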

This was, to me, unexpected and so I read more and came across the
rather tiny amount of information on RPC batching in the RFCs (section
7.4.1 - a small paragraph). The IBM and Oracle websites also have some
info but not much. I have found nothing that enables me to identify
batched requests (or rather the replies for a batch) in the protocol
itself. Wireshark seems unable to do this also.

I realise this is a bit off-topic but I was hoping someone might be able
to point me in the right direction? Am I barking up the wrong tree? Is
this behaviour expected - am I unable to match calls to replies when
batching is used? The IBM website suggests;

'Batching assumes the following:

Each remote procedure call in the pipeline requires no response from the
server, and the server does not send a response message.'

This sounds a bit awkward! It then goes on to suggest that;

'The remote procedure call's time out must be 0'

For batched calls. But there is no call timeout in the RPC protocol
right? Again, this must be up to the client implementation I suspect?

Any help appreciated.

Regards,

Jim

PS. I may well join the Linux RPC mailing list and direct this question
at them instead.

--
Senior Software Engineer
Systems Development
Framestore

2014-05-20 16:55:42

by Jim Vanns

Subject: Re: NFS (or RPC) batching of calls over TCP results in 'unmatched' replies?

<snip>

> > My suspicion is that tshark is missing some replies. Either the packet
> > loss counters are wrong, or tshark is failing to identify RPC replies
> > that start in the middle of a tcp segment? I seem to recall seeing that
> > before.
>
> Yes - I had wondered about that and I will modify my own code to verify
> if this is the case.
>
> I *assumed* that tcpdump or wireshark already did this! I guess it just
> expects application protocols to be aligned to the TCP boundary.

I have (rather rushed to be honest) written some code that scans the
entire TCP segment captured for a record fragment marker and I still see
the same results - very few replies for a great many calls.
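
For anyone wanting to try the same thing, here is a minimal sketch of that kind of scan. It is not the code I actually ran, and the heuristic is an assumption: it looks at every byte offset for a 4-byte record mark (top bit = last fragment, low 31 bits = fragment length) followed by an XID, a message type of 1 (reply) and a reply_stat of 0 or 1. It only recognises single-fragment records, and `find_reply_candidates` and `max_len` are made-up names.

```python
import struct

LAST_FRAGMENT = 0x80000000  # top bit of the record mark
REPLY = 1                   # RPC msg_type for a reply


def find_reply_candidates(segment: bytes, max_len: int = 1 << 20):
    """Return offsets in a TCP payload that look like the start of an RPC reply.

    Heuristic only: bulk READ data can contain byte patterns that pass
    the same checks, so hits are candidates, not certainties.
    """
    hits = []
    for off in range(0, len(segment) - 15):
        mark, _xid, msg_type, reply_stat = struct.unpack_from(">IIII", segment, off)
        length = mark & 0x7FFFFFFF
        if (mark & LAST_FRAGMENT
                and 0 < length <= max_len
                and msg_type == REPLY
                and reply_stat in (0, 1)):
            hits.append(off)
    return hits


# A segment that begins with the tail of an earlier message and then
# carries a new reply starting 10 bytes in.
body = struct.pack(">III", 0x2A, REPLY, 0) + b"\x00" * 12
segment = b"\xab" * 10 + struct.pack(">I", LAST_FRAGMENT | len(body)) + body
offsets = find_reply_candidates(segment)
```

Scanning each segment in isolation still misses any reply whose record mark or header straddles two segments.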

Still at a loss (pun intended ;)

Jim

--
Software Engineer
Systems Development
Framestore

2014-05-20 15:26:57

by J. Bruce Fields

Subject: Re: NFS (or RPC) batching of calls over TCP results in 'unmatched' replies?

On Tue, May 20, 2014 at 01:29:26PM +0100, Jim Vanns wrote:
> Hello list.
>
> I wonder if anyone here could confirm my suspicions (and lack of
> understanding) re. batched RPC calls? Especially in the context of NFS
> reads.
>
> --[ Background;
> Initially I was led to believe that for every dispatched RPC call (say an
> NFS read) a reply message from the server was expected - so that the
> client can use the XID to match it to the call.
>
> However, when capturing traffic using libpcap (tcpdump, wireshark, gulp
> and in-house software all exhibit this behaviour) during peak workloads
> (100MB/s read - 1600 64k NFS read calls per second), later analysis
> shows that the number of identified RPC replies is significantly less
> than the number of calls.

Either capture is missing packets, or the analysis is missing rpc
replies, or the server is slow to reply and the client is retrying
aggressively.
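
One way to test the third possibility is to look for repeated XIDs among the calls, since a retransmitted call reuses the XID of the original. A minimal sketch, assuming the call XIDs for one client connection have already been pulled out of the capture into a list (`retransmitted_xids` is a hypothetical name):

```python
from collections import Counter


def retransmitted_xids(call_xids):
    """Map each XID seen in more than one call to the number of times it was sent."""
    counts = Counter(call_xids)
    return {xid: n for xid, n in counts.items() if n > 1}


# XID 0x10 was sent three times, the other two once each.
calls = [0x10, 0x11, 0x10, 0x12, 0x10]
dupes = retransmitted_xids(calls)
unique_calls = len(set(calls))
```

If the number of unique call XIDs is close to the reply count, retries explain the gap; if it is close to the raw call count, they do not.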

> This is despite libpcap reporting no packet
> loss at either the NIC or kernel level. For example, using the tshark
> display filter rpc.msgtyp==0 and rpc.msgtyp==1 respectively;
>
> Calls: 250364
> Replies: 83419
>
> Calls: 165770
> Replies: 48488
>
> Calls: 171208
> Replies: 14861
>
> This is against 3 different NFS servers (the last being Linux knfsd).
>
> This was, to me, unexpected and so I read more and came across the
> rather tiny amount of information on RPC batching in the RFCs (section
> 7.4.1 - a small paragraph).

Section 7.4.1 of rfc 1831?

I'd never noticed that before....

In any case, I think that's just describing something that an RPC-based
protocol *could* do. NFS doesn't do this--in particular, NFS read calls
require replies.

My suspicion is that tshark is missing some replies. Either the packet
loss counters are wrong, or tshark is failing to identify RPC replies
that start in the middle of a tcp segment? I seem to recall seeing that
before.
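
To make the mid-segment case concrete, here is a small sketch (an illustration, not something taken from tshark). RPC over TCP prefixes each record fragment with a 4-byte mark: the top bit flags the last fragment and the low 31 bits give the fragment length. Counting messages reliably means following those lengths through the reassembled byte stream rather than looking at the start of each segment. The function name is hypothetical, and it assumes in-order payloads for one direction of one connection with nothing missing, and that the first fragment of each record holds at least the 8-byte XID and message type.

```python
import struct

CALL, REPLY = 0, 1  # RPC msg_type values


def count_rpc_messages(segments):
    """Count RPC calls and replies in one direction of a TCP stream.

    `segments` is an iterable of in-order TCP payloads (bytes).
    """
    stream = b"".join(segments)
    counts = {CALL: 0, REPLY: 0}
    pos = 0
    start_of_record = True
    while pos + 4 <= len(stream):
        (mark,) = struct.unpack_from(">I", stream, pos)
        last = bool(mark & 0x80000000)
        length = mark & 0x7FFFFFFF
        body = pos + 4
        if start_of_record and body + 8 <= len(stream):
            _xid, msg_type = struct.unpack_from(">II", stream, body)
            if msg_type in counts:
                counts[msg_type] += 1
        # The next fragment starts a new record only if this one was the last.
        start_of_record = last
        pos = body + length
    return counts


def make_reply(xid: int) -> bytes:
    """Build a single-fragment record holding a dummy 32-byte reply."""
    body = struct.pack(">III", xid, REPLY, 0) + b"\x00" * 20
    return struct.pack(">I", 0x80000000 | len(body)) + body


# Two replies back to back; the second begins at stream offset 36,
# which falls in the middle of the second segment.
stream = make_reply(1) + make_reply(2)
segments = [stream[:30], stream[30:50], stream[50:]]
counts = count_rpc_messages(segments)
```

A parser that only inspects the first bytes of each segment would count one reply here; walking the record marks finds both.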

--b.

2014-05-20 16:34:26

by Jim Vanns

Subject: Re: NFS (or RPC) batching of calls over TCP results in 'unmatched' replies?

Hi - thanks for your reply.

<snip>

> Either capture is missing packets, or the analysis is missing rpc
> replies, or the server is slow to reply and the client is retrying
> aggressively.

That would seem the obvious thing and of course was my first port of
call. It certainly is suspicious and generally not what I would have
expected.

That said, to reiterate, libpcap (pcap_stats()) tells me neither the
kernel nor the NIC dropped any packets. However, I need to be sure the
NIC actually reports this info to libpcap as it isn't always supported.

What I can say is that I have performed the capture both on the server
(knfsd) and the client and they both agree that there are significantly
fewer replies identified than calls!

<snip>

> Section 7.4.1 of rfc 1831?
>
> I'd never noticed that before....
>
> In any case, I think that's just describing something that an RPC-based
> protocol *could* do. NFS doesn't do this--in particular, NFS read calls
> require replies.

Right. OK. This makes much more sense than not getting replies to calls!

> My suspicion is that tshark is missing some replies. Either the packet
> loss counters are wrong, or tshark is failing to identify RPC replies
> that start in the middle of a tcp segment? I seem to recall seeing that
> before.

Yes - I had wondered about that and I will modify my own code to verify
if this is the case.

I *assumed* that tcpdump or wireshark already did this! I guess it just
expects application protocols to be aligned to the TCP boundary.

Thanks for your help so far.

Jim

--
Software Engineer
Systems Development
Framestore