by NeilBrown

[permalink] [raw]

Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.

On Thu, May 30 2019, Tom Talpey wrote:

> On 5/30/2019 1:20 PM, Olga Kornievskaia wrote:
>> On Thu, May 30, 2019 at 1:05 PM Tom Talpey <[email protected]> wrote:
>>>
>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>> I've also re-arrange the patches a bit, merged two, and remove the
>>>> restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>>> these restrictions were not needed, I can see no need.
>>>
>>> I believe the need is for the correctness of retries. Because NFSv2,
>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>> duplicate request caches are important (although often imperfect).
>>> These caches use client XID's, source ports and addresses, sometimes
>>> in addition to other methods, to detect retry. Existing clients are
>>> careful to reconnect with the same source port, to ensure this. And
>>> existing servers won't change.
>>
>> Retries are already bound to the same connection so there shouldn't be
>> an issue of a retransmission coming from a different source port.
>
> So, there's no path redundancy? If any connection is lost and can't
> be reestablished, the requests on that connection will time out?

Path redundancy happens lower down in the stack. Presumably a bonding
driver will divert flows to a working path when one path fails.
NFS doesn't see paths at all. It just sees TCP connections - each with
the same source and destination address. How these are associated, from
time to time, with different hardware is completely transparent to NFS.

>
> I think a common configuration will be two NICs and two network paths,
> a so-called shotgun. Admins will be quite frustrated to discover it
> gives no additional robustness, and perhaps even less.
>
> Why not simply restrict this to the fully-correct, fully-functional
> NFSv4.1+ scenario, and not try to paper over the shortcomings?

Because I cannot see any shortcomings in using it for v3 or v4.0.

Also, there are situations where NFSv3 is a measurably better choice
than NFSv4.1. Al least it seems to allow a quicker failover for HA.
But that is really a topic for another day.

NeilBrown

>
> Tom.
>
>>
>>> Multiple connections will result in multiple source ports, and possibly
>>> multiple source addresses, meaning retried client requests may be
>>> accepted as new, rather than having any chance of being recognized as
>>> retries.
>>>
>>> NFSv4.1+ don't have this issue, but removing the restrictions would
>>> seem to break the downlevel mounts.
>>>
>>> Tom.
>>>
>>
>>

Attachments:

signature.asc (847.00 B)

2019-05-30 22:56:51

On Thu, May 30 2019, Rick Macklem wrote:

> Olga Kornievskaia wrote:
>>On Thu, May 30, 2019 at 1:05 PM Tom Talpey <[email protected]> wrote:
>>>
>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>> > I've also re-arrange the patches a bit, merged two, and remove the
>>> > restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>> > these restrictions were not needed, I can see no need.
>>>
>>> I believe the need is for the correctness of retries. Because NFSv2,
>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>> duplicate request caches are important (although often imperfect).
>>> These caches use client XID's, source ports and addresses, sometimes
>>> in addition to other methods, to detect retry. Existing clients are
>>> careful to reconnect with the same source port, to ensure this. And
>>> existing servers won't change.
>>
>>Retries are already bound to the same connection so there shouldn't be
>>an issue of a retransmission coming from a different source port.
> I don't think the above is correct for NFSv4.0 (it may very well be true for NFSv3).

It is correct for the Linux implementation of NFS, though the term
"xprt" is more accurate than "connection".

A "task" is bound it a specific "xprt" which, in the case of tcp, has a
fixed source port. If the TCP connection breaks, a new one is created
with the same addresses and ports, and this new connection serves the
same xprt.

> Here's what RFC7530 Sec. 3.1.1 says:
> 3.1.1. Client Retransmission Behavior
>
> When processing an NFSv4 request received over a reliable transport
> such as TCP, the NFSv4 server MUST NOT silently drop the request,
> except if the established transport connection has been broken.
> Given such a contract between NFSv4 clients and servers, clients MUST
> NOT retry a request unless one or both of the following are true:
>
> o The transport connection has been broken
>
> o The procedure being retried is the NULL procedure
>
> If the transport connection is broken, the retry needs to be done on a new TCP
> connection, does it not? (I'm assuming you are referring to a retry of an RPC here.)
> (My interpretation of "broken" is "can't be fixed, so the client must use a different
> TCP connection.)

Yes, a new connection. But the Linux client makes sure to use the same
source port.

>
> Also, NFSv4.0 cannot use Sun RPC over UDP, whereas some DRCs only
> work for UDP traffic. (The FreeBSD server does have DRC support for TCP, but
> the algorithm is very different than what is used for UDP, due to the long delay
> before a retried RPC request is received. This can result in significant server
> overheads, so some sites choose to disable the DRC for TCP traffic or tune it
> in such a way as it becomes almost useless.)
> The FreeBSD DRC code for NFS over TCP expects the retry to be from a different
> port# (due to a new connection re: the above) for NFSv4.0. For NFSv3, my best
> recollection is that it doesn't care what the source port# is. (It basically uses a
> hash on the RPC request excluding TCP/IP header to recognize possible
> duplicates.)

Interesting .... hopefully the hash is sufficiently strong.
I think it is best to assume same source port, but there is no formal
standard.

Thanks,
NeilBrown

>
> I don't know what other NFS servers choose to do w.r.t. the DRC for NFS over TCP,
> however for some reason I thought that the Linux knfsd only used a DRC for UDP?
> (Someone please clarify this.)
>
> rick
>
>> Multiple connections will result in multiple source ports, and possibly
>> multiple source addresses, meaning retried client requests may be
>> accepted as new, rather than having any chance of being recognized as
>> retries.
>>
>> NFSv4.1+ don't have this issue, but removing the restrictions would
>> seem to break the downlevel mounts.
>>
>> Tom.
>>

Attachments:

signature.asc (847.00 B)

2019-05-31 01:45:53

by Tom Talpey

[permalink] [raw]

Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.

On 5/30/2019 2:41 PM, Olga Kornievskaia wrote:
> On Thu, May 30, 2019 at 1:41 PM Tom Talpey <[email protected]> wrote:
>>
>> On 5/30/2019 1:20 PM, Olga Kornievskaia wrote:
>>> On Thu, May 30, 2019 at 1:05 PM Tom Talpey <[email protected]> wrote:
>>>>
>>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>>> I've also re-arrange the patches a bit, merged two, and remove the
>>>>> restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>>>> these restrictions were not needed, I can see no need.
>>>>
>>>> I believe the need is for the correctness of retries. Because NFSv2,
>>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>>> duplicate request caches are important (although often imperfect).
>>>> These caches use client XID's, source ports and addresses, sometimes
>>>> in addition to other methods, to detect retry. Existing clients are
>>>> careful to reconnect with the same source port, to ensure this. And
>>>> existing servers won't change.
>>>
>>> Retries are already bound to the same connection so there shouldn't be
>>> an issue of a retransmission coming from a different source port.
>>
>> So, there's no path redundancy? If any connection is lost and can't
>> be reestablished, the requests on that connection will time out?
>
> For v3 and v4.0 in the current code base with a single connection,
> when it goes down, you are out of luck. When we have multiple
> connections and would like the benefit of using them but not
> sacrifices replay cache correctness, it's a small price to restrict
> the re-transmissions and suffer the consequence of not being able to
> do an operation during network issues.

I agree that the corruption resulting from a blown cache lookup would
be bad. But I'm also saying that users will be frustrated when random
operations time out, even when new ones work. Also, I think it may
lead to application issues.

>> I think a common configuration will be two NICs and two network paths,
>
> Are you talking about session trunking here?

No, not necessarily. Certianly not when doing what you propose
over NFSv3.

> Why do you think two NICs would be a common configuration. I have
> performance numbers that demonstrate performance improvement for a
> single NIC case. I would say a single NIC with a high speed networks
> (25/40G) would be a common configuration.

They're both common! And sure, it's good for a single NIC because of
RSS (receive side scaling). The multiple connections spread interrupts
over several cores. The same as would happen with multiple NICs.

>
>> a so-called shotgun. Admins will be quite frustrated to discover it
>> gives no additional robustness, and perhaps even less.
>>
>> Why not simply restrict this to the fully-correct, fully-functional
>> NFSv4.1+ scenario, and not try to paper over the shortcomings?
>
> I think mainly because customers are still using v3 but want to
> improve performance. I'd love for everybody to switch to 4.1 but
> that's not happening.

Yeah, you and me both. But trying to "fix" NFSv3 with this is not
going to move the world forward, and I predict will cost many woeful
days ahead when it fails to work transparently.

Tom.

>>>> Multiple connections will result in multiple source ports, and possibly
>>>> multiple source addresses, meaning retried client requests may be
>>>> accepted as new, rather than having any chance of being recognized as
>>>> retries.
>>>>
>>>> NFSv4.1+ don't have this issue, but removing the restrictions would
>>>> seem to break the downlevel mounts.
>>>>
>>>> Tom.
>>>>
>>>
>>>
>
>

2019-05-31 01:48:28

by Tom Talpey

[permalink] [raw]

Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.

On 5/30/2019 6:38 PM, NeilBrown wrote:
> On Thu, May 30 2019, Tom Talpey wrote:
>
>> On 5/30/2019 1:20 PM, Olga Kornievskaia wrote:
>>> On Thu, May 30, 2019 at 1:05 PM Tom Talpey <[email protected]> wrote:
>>>>
>>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>>> I've also re-arrange the patches a bit, merged two, and remove the
>>>>> restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>>>> these restrictions were not needed, I can see no need.
>>>>
>>>> I believe the need is for the correctness of retries. Because NFSv2,
>>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>>> duplicate request caches are important (although often imperfect).
>>>> These caches use client XID's, source ports and addresses, sometimes
>>>> in addition to other methods, to detect retry. Existing clients are
>>>> careful to reconnect with the same source port, to ensure this. And
>>>> existing servers won't change.
>>>
>>> Retries are already bound to the same connection so there shouldn't be
>>> an issue of a retransmission coming from a different source port.
>>
>> So, there's no path redundancy? If any connection is lost and can't
>> be reestablished, the requests on that connection will time out?
>
> Path redundancy happens lower down in the stack. Presumably a bonding
> driver will divert flows to a working path when one path fails.
> NFS doesn't see paths at all. It just sees TCP connections - each with
> the same source and destination address. How these are associated, from
> time to time, with different hardware is completely transparent to NFS.

But, you don't propose to constrain this to bonded connections. So
NFS will create connections on whatever collection of NICs which are
locally, and if these aren't bonded, well, the issues become visible.

BTW, RDMA NICs are never bonded.

Tom.

>
>>
>> I think a common configuration will be two NICs and two network paths,
>> a so-called shotgun. Admins will be quite frustrated to discover it
>> gives no additional robustness, and perhaps even less.
>>
>> Why not simply restrict this to the fully-correct, fully-functional
>> NFSv4.1+ scenario, and not try to paper over the shortcomings?
>
> Because I cannot see any shortcomings in using it for v3 or v4.0.
>
> Also, there are situations where NFSv3 is a measurably better choice
> than NFSv4.1. Al least it seems to allow a quicker failover for HA.
> But that is really a topic for another day.
>
> NeilBrown
>
>>
>> Tom.
>>
>>>
>>>> Multiple connections will result in multiple source ports, and possibly
>>>> multiple source addresses, meaning retried client requests may be
>>>> accepted as new, rather than having any chance of being recognized as
>>>> retries.
>>>>
>>>> NFSv4.1+ don't have this issue, but removing the restrictions would
>>>> seem to break the downlevel mounts.
>>>>
>>>> Tom.
>>>>
>>>
>>>

2019-05-31 02:20:18

by Rick Macklem

[permalink] [raw]

Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.

NeilBrown wrote:
>On Thu, May 30 2019, Rick Macklem wrote:
>
>> Olga Kornievskaia wrote:
>>>On Thu, May 30, 2019 at 1:05 PM Tom Talpey <[email protected]> wrote:
>>>>
>>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>> > I've also re-arrange the patches a bit, merged two, and remove the
>>>> > restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>>> > these restrictions were not needed, I can see no need.
>>>>
>>>> I believe the need is for the correctness of retries. Because NFSv2,
>>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>>> duplicate request caches are important (although often imperfect).
>>>> These caches use client XID's, source ports and addresses, sometimes
>>>> in addition to other methods, to detect retry. Existing clients are
>>>> careful to reconnect with the same source port, to ensure this. And
>>>> existing servers won't change.
>>>
>>>Retries are already bound to the same connection so there shouldn't be
>>>an issue of a retransmission coming from a different source port.
>> I don't think the above is correct for NFSv4.0 (it may very well be true for NFSv3).
>
>It is correct for the Linux implementation of NFS, though the term
>"xprt" is more accurate than "connection".
>
>A "task" is bound it a specific "xprt" which, in the case of tcp, has a
>fixed source port. If the TCP connection breaks, a new one is created
>with the same addresses and ports, and this new connection serves the
>same xprt.
Ok, that's interesting. The FreeBSD client side krpc uses "xprt"s too
(I assume they came from some old Sun open sources for RPC)
but it just creates a new socket and binds it to any port# available.
When this happens in the FreeBSD client, the old connection is sometimes still
sitting around in some FIN_WAIT state. My TCP is pretty minimal, but I didn't
think you could safely create a new connection using the same port#s at that point,
or at least the old BSD TCP stack code won't allow it.

Anyhow, the FreeBSD client doesn't use same source port# for the new connection.

>> Here's what RFC7530 Sec. 3.1.1 says:
>> 3.1.1. Client Retransmission Behavior
>>
>> When processing an NFSv4 request received over a reliable transport
>> such as TCP, the NFSv4 server MUST NOT silently drop the request,
>> except if the established transport connection has been broken.
>> Given such a contract between NFSv4 clients and servers, clients MUST
>> NOT retry a request unless one or both of the following are true:
>>
>> o The transport connection has been broken
>>
>> o The procedure being retried is the NULL procedure
>>
>> If the transport connection is broken, the retry needs to be done on a new TCP
>> connection, does it not? (I'm assuming you are referring to a retry of an RPC here.)
>> (My interpretation of "broken" is "can't be fixed, so the client must use a different
>> TCP connection.)
>
>Yes, a new connection. But the Linux client makes sure to use the same
>source port.
Ok. I guess my DRC code that expects "different source port#" for NFSv4.0 is
broken. It will result in a DRC miss, which isn't great, but is always possible for
any DRC design. (Not nearly as bad as a false hit.)

>>
>> Also, NFSv4.0 cannot use Sun RPC over UDP, whereas some DRCs only
>> work for UDP traffic. (The FreeBSD server does have DRC support for TCP, but
>> the algorithm is very different than what is used for UDP, due to the long delay
>> before a retried RPC request is received. This can result in significant server
>> overheads, so some sites choose to disable the DRC for TCP traffic or tune it
>> in such a way as it becomes almost useless.)
>> The FreeBSD DRC code for NFS over TCP expects the retry to be from a different
>> port# (due to a new connection re: the above) for NFSv4.0. For NFSv3, my best
>> recollection is that it doesn't care what the source port# is. (It basically uses a
>> hash on the RPC request excluding TCP/IP header to recognize possible
>> duplicates.)
>
>Interesting .... hopefully the hash is sufficiently strong.
It doesn't just use the hash (it still expects same xid, etc), it just doesn't use the TCP
source port#.

To be honest, when I played with this many years ago, unless the size of the DRC
is very large and entries persist in the cache for a long time, they always fall out
of the cache before the retry happens over TCP. At least for the cases I tried back
then, where the RPC retry timeout for TCP was pretty large.
(Sites that use FreeBSD servers under heavy load usually find the DRC grows too
large and tune it down until it no longer would work for TCP anyhow.)

My position is that this all got fixed by sessions and if someone uses NFSv4.0 instead
of NFSv4.1, they may just have to live with the limitations of no "exactly once"
semantics. (Personally, NFSv4.0 should just be deprecated. I know people still have good uses for NFSv3, but I have trouble believing NFSv4.0 is preferred over NFSv4.1,
although Bruce did note a case where there was a performance difference.)

>I think it is best to assume same source port, but there is no formal
>standard.
I'd say you can't assume "same port#" or "different port#', since there is no standard.
But I would agree that "assuming same port#" will just result in false misses for
clients that don't use the same port#.

rick

>Thanks,
>NeilBrown
>
>
>
>> I don't know what other NFS servers choose to do w.r.t. the DRC for NFS over TCP,
>> however for some reason I thought that the Linux knfsd only used a DRC for UDP?
>> (Someone please clarify this.)
>>
>> rick
>>
>>> Multiple connections will result in multiple source ports, and possibly
>>> multiple source addresses, meaning retried client requests may be
>>> accepted as new, rather than having any chance of being recognized as
>>> retries.
>>>
>>> NFSv4.1+ don't have this issue, but removing the restrictions would
>>> seem to break the downlevel mounts.
>>>
>>> Tom.
>>>

2019-05-31 02:31:31

On Wed, Jun 12 2019, Chuck Lever wrote:

> Hi Neil-
>
>
>> On Jun 11, 2019, at 9:49 PM, NeilBrown <[email protected]> wrote:
>>
>> On Tue, Jun 11 2019, Chuck Lever wrote:
>>
>>> Hi Neil-
>>>
>>>
>>>> On Jun 10, 2019, at 9:09 PM, NeilBrown <[email protected]> wrote:
>>>>
>>>> On Fri, May 31 2019, Chuck Lever wrote:
>>>>
>>>>>> On May 30, 2019, at 6:56 PM, NeilBrown <[email protected]> wrote:
>>>>>>
>>>>>> On Thu, May 30 2019, Chuck Lever wrote:
>>>>>>
>>>>>>> Hi Neil-
>>>>>>>
>>>>>>> Thanks for chasing this a little further.
>>>>>>>
>>>>>>>
>>>>>>>> On May 29, 2019, at 8:41 PM, NeilBrown <[email protected]> wrote:
>>>>>>>>
>>>>>>>> This patch set is based on the patches in the multipath_tcp branch of
>>>>>>>> git://git.linux-nfs.org/projects/trondmy/nfs-2.6.git
>>>>>>>>
>>>>>>>> I'd like to add my voice to those supporting this work and wanting to
>>>>>>>> see it land.
>>>>>>>> We have had customers/partners wanting this sort of functionality for
>>>>>>>> years. In SLES releases prior to SLE15, we've provide a
>>>>>>>> "nosharetransport" mount option, so that several filesystem could be
>>>>>>>> mounted from the same server and each would get its own TCP
>>>>>>>> connection.
>>>>>>>
>>>>>>> Is it well understood why splitting up the TCP connections result
>>>>>>> in better performance?
>>>>>>>
>>>>>>>
>>>>>>>> In SLE15 we are using this 'nconnect' feature, which is much nicer.
>>>>>>>>
>>>>>>>> Partners have assured us that it improves total throughput,
>>>>>>>> particularly with bonded networks, but we haven't had any concrete
>>>>>>>> data until Olga Kornievskaia provided some concrete test data - thanks
>>>>>>>> Olga!
>>>>>>>>
>>>>>>>> My understanding, as I explain in one of the patches, is that parallel
>>>>>>>> hardware is normally utilized by distributing flows, rather than
>>>>>>>> packets. This avoid out-of-order deliver of packets in a flow.
>>>>>>>> So multiple flows are needed to utilizes parallel hardware.
>>>>>>>
>>>>>>> Indeed.
>>>>>>>
>>>>>>> However I think one of the problems is what happens in simpler scenarios.
>>>>>>> We had reports that using nconnect > 1 on virtual clients made things
>>>>>>> go slower. It's not always wise to establish multiple connections
>>>>>>> between the same two IP addresses. It depends on the hardware on each
>>>>>>> end, and the network conditions.
>>>>>>
>>>>>> This is a good argument for leaving the default at '1'. When
>>>>>> documentation is added to nfs(5), we can make it clear that the optimal
>>>>>> number is dependant on hardware.
>>>>>
>>>>> Is there any visibility into the NIC hardware that can guide this setting?
>>>>>
>>>>
>>>> I doubt it, partly because there is more than just the NIC hardware at issue.
>>>> There is also the server-side hardware and possibly hardware in the middle.
>>>
>>> So the best guidance is YMMV. :-)
>>>
>>>
>>>>>>> What about situations where the network capabilities between server and
>>>>>>> client change? Problem is that neither endpoint can detect that; TCP
>>>>>>> usually just deals with it.
>>>>>>
>>>>>> Being able to manually change (-o remount) the number of connections
>>>>>> might be useful...
>>>>>
>>>>> Ugh. I have problems with the administrative interface for this feature,
>>>>> and this is one of them.
>>>>>
>>>>> Another is what prevents your client from using a different nconnect=
>>>>> setting on concurrent mounts of the same server? It's another case of a
>>>>> per-mount setting being used to control a resource that is shared across
>>>>> mounts.
>>>>
>>>> I think that horse has well and truly bolted.
>>>> It would be nice to have a "server" abstraction visible to user-space
>>>> where we could adjust settings that make sense server-wide, and then a way
>>>> to mount individual filesystems from that "server" - but we don't.
>>>
>>> Even worse, there will be some resource sharing between containers that
>>> might be undesirable. The host should have ultimate control over those
>>> resources.
>>>
>>> But that is neither here nor there.
>>>
>>>
>>>> Probably the best we can do is to document (in nfs(5)) which options are
>>>> per-server and which are per-mount.
>>>
>>> Alternately, the behavior of this option could be documented this way:
>>>
>>> The default value is one. To resolve conflicts between nconnect settings on
>>> different mount points to the same server, the value set on the first mount
>>> applies until there are no more mounts of that server, unless nosharecache
>>> is specified. When following a referral to another server, the nconnect
>>> setting is inherited, but the effective value is determined by other mounts
>>> of that server that are already in place.
>>>
>>> I hate to say it, but the way to make this work deterministically is to
>>> ask administrators to ensure that the setting is the same on all mounts
>>> of the same server. Again I'd rather this take care of itself, but it
>>> appears that is not going to be possible.
>>>
>>>
>>>>> Adding user tunables has never been known to increase the aggregate
>>>>> amount of happiness in the universe. I really hope we can come up with
>>>>> a better administrative interface... ideally, none would be best.
>>>>
>>>> I agree that none would be best. It isn't clear to me that that is
>>>> possible.
>>>> At present, we really don't have enough experience with this
>>>> functionality to be able to say what the trade-offs are.
>>>> If we delay the functionality until we have the perfect interface,
>>>> we may never get that experience.
>>>>
>>>> We can document "nconnect=" as a hint, and possibly add that
>>>> "nconnect=1" is a firm guarantee that more will not be used.
>>>
>>> Agree that 1 should be the default. If we make this setting a
>>> hint, then perhaps it should be renamed; nconnect makes it sound
>>> like the client will always open N connections. How about "maxconn" ?
>>>
>>> Then, to better define the behavior:
>>>
>>> The range of valid maxconn values is 1 to 3? to 8? to NCPUS? to the
>>> count of the client’s NUMA nodes? I’d be in favor of a small number
>>> to start with. Solaris' experience with multiple connections is that
>>> there is very little benefit past 8.
>>>
>>> If maxconn is specified with a datagram transport, does the mount
>>> operation fail, or is the setting is ignored?
>>
>> With Trond's patches, the setting is ignored (as he said in a reply).
>> With my version, the setting is honoured.
>> Specifically, 'n' separate UDP sockets are created, each bound to a
>> different local port, each sending to the same server port.
>> If a bonding driver is using the source-port in the output hash
>> (xmit_policy=layer3+4 in the terminology of
>> linux/Documentation/net/bonding.txt),
>> then this would get better throughput over bonded network interfaces.
>
> One assumes the server end is careful to send a reply back
> to the same client UDP source port from whence came the
> matching request?
>
>
>>> If maxconn is a hint, when does the client open additional
>>> connections?
>>>
>>> IMO documentation should be clear that this setting is not for the
>>> purpose of multipathing/trunking (using multiple NICs on the client
>>> or server). The client has to do trunking detection/discovery in that
>>> case, and nconnect doesn't add that logic. This is strictly for
>>> enabling multiple connections between one client-server IP address
>>> pair.
>>>
>>> Do we need to state explicitly that all transport connections for a
>>> mount (or client-server pair) are the same connection type (i.e., all
>>> TCP or all RDMA, never a mix)?
>>>
>>>
>>>> Then further down the track, we might change the actual number of
>>>> connections automatically if a way can be found to do that without cost.
>>>
>>> Fair enough.
>>>
>>>
>>>> Do you have any objections apart from the nconnect= mount option?
>>>
>>> Well I realize my last e-mail sounded a little negative, but I'm
>>> actually in favor of adding the ability to open multiple connections
>>> per client-server pair. I just want to be careful about making this
>>> a feature that has as few downsides as possible right from the start.
>>> I'll try to be more helpful in my responses.
>>>
>>> Remaining implementation issues that IMO need to be sorted:
>>>
>>> • We want to take care that the client can recover network resources
>>> that have gone idle. Can we reuse the auto-close logic to close extra
>>> connections?
>>
>> Were you aware that auto-close was ineffective with NFSv4 as the regular
>> RENEW (or SEQUENCE for v4.1) keeps a connection open?
>> My patches already force session management requests onto a single xprt.
>> It probably makes sense to do the same for RENEW and SEQUENCE.
>> Then when there is no fs activity, the other connections will close.
>> There is no mechanism to re-open only some of them though. Any
>> non-trivial amount of traffic will cause all connection to re-open.
>
> This seems sensible.
>
>
>>> • How will the client schedule requests on multiple connections?
>>> Should we enable the use of different schedulers?
>>> • How will retransmits be handled?
>>> • How will the client recover from broken connections? Today's clients
>>> use disconnect to determine when to retransmit, thus there might be
>>> some unwanted interactions here that result in mount hangs.
>>> • Assume NFSv4.1 session ID rather than client ID trunking: is Linux
>>> client support in place for this already?
>>> • Are there any concerns about how the Linux server DRC will behave in
>>> multi-connection scenarios?
>>>
>>> None of these seem like a deal breaker. And possibly several of these
>>> are already decided, but just need to be published/documented.
>>
>> How about this:
>
> Thanks for writing this up.
>
>
>> NFS normally sends all requests to the server (and receives all replies)
>> over a single network connection, whether TCP, RDMA or (for NFSv3 and
>> earlier) UDP. Often this is sufficient to utilize all available
>> network bandwidth, but not always. When there is sufficient
>> parallelism in the server, the client, and the network connection, the
>> restriction to a single TCP stream can become a limitation.
>>
>> A simple scenario which portrays this limitation involves several
>> direct network connections between client and server where the multiple
>> interfaces on each end are bonded together. If this bonding diverts
>> different flows to different interfaces, then a single TCP connection
>> will be limited to a single network interface, while multiple
>> connections could make use of all interfaces. Various other scenarios
>> are possible including network controllers with multiple DMA/TSO
>> engines where a given flow can only be associated with a single engine
>> at a time, or Receive-side scaling which can direct different flows to
>> different receive queues and thence to different CPU cores.
>>
>> NFS has two distinct and complementary mechanisms to enable the use of
>> multiple connections to carry requests and replies. We will refer to
>> these as trunking and nconnect, though the NFS RFCs use the term
>> "trunking" in a way that covers both.
>>
>> With trunking (also known as multipathing), the server-side IP address
>> of each connection is different. RFC8587 (and other documents)
>> describe how a client can determine if two connections to different
>> addresses actually refer to the same server and so can be used for
>> trunking. The client can use explicit configuration, possibly using
>> the NFSv4 `fs_locations` attribute, to find the different addresses,
>> and can then establish multiple trunks. With trunking, the different
>> connections could conceivably be over different protocols, both TCP and
>> RDMA for example. Trunking makes use of explicit parallelism in the
>> network configuration.
>>
>> With nconnect, both the client and server side IP addresses are the
>> same on each connection, but the client side port number varies.
>
> Note that the client IP source port number is not relevant for RDMA
> connections. Multiple connections to the same service are de-
> multiplexed using other means.
>
> So then the goal of nconnect is specifically to enable multiple
> independent flows between the same two network endpoints.

Yes, focusing on "independent flows" is likely to be best. Multiple
ports must be just an example.

>
> Note that a server is also responsible for detecting when two
> unique IP addresses are the same client for purposes of open/lock
> state recovery. It's possible that the same client IP address can
> host multiple NFS client instances each at different source ports.
>
> NFSv4 has protocol to do this (SETCLIENTID and EXCHANGE_ID), but
> NFSv2/3 do not. This is one reason why Tom has been counseling
> caution about multichannel NFSv2/3. Perhaps that is only an issue
> for NLM, which is already a separate connection...
>

I don't think there are any interesting issues here. NLM and STATMON
remain separate for NFSv3 and don't change their behaviour at all.
NFSv3 has no concept of clients, only of permissions associated with
each individual request. The server cannot differentiate between
requests from different (privileged) ports on the same client.

It is really the client that has responsibility for identifying itself.
The server only needs to reliably track whatever the client claims.

>
>> This enables NFS to benefit from transparent parallelism in the network
>> stack, such as interface bonding and receive-side scaling as described
>> earlier.
>>
>> When multiple connections are available, NFS will send
>> session-management requests on a single connection (the first
>> connection opened)
>
> Maybe you meant "lease management" requests?

Probably I do .... though maybe I can be forgiven for mistakenly
thinking that CREATE_SESSION and DESTROY_SESSION could be described as
"session management" :-)

>
> EXCHANGE_ID, RECLAIM_COMPLETE, CREATE_SESSION, DESTROY_SESSION
> and DESTROY_CLIENTID will of course go over the main connection.
> However, each connection will need to use BIND_CONN_TO_SESSION
> to join an existing session. That's how the server knows
> the additional connections are from a client instance it has
> already recognized.
>
> For NFSv4.0, SETCLIENTID, SETCLIENTID_CONFIRM, and RENEW
> would go over the main connection (and those have nothing to do
> with sessions).

Well.... they have nothing to do with NFSv4.1 Sessions.
But it is useful to have a name for the collection of RPCs related to a
particular negotiated clientid, and "session" (small 's') seems as good
a name as any....

>
>
>> while general filesystem access requests will be
>> distrubuted over all available connections. When load is light (as
>> measured by the number of outstanding requests on each connection)
>> requests will be distributed in a round-robin fashion. When the number
>> of outstanding requests on any connection exceeds 2, and also exceeds
>> the average across all connection, that connection will be skipping in
>> the round-robin. As flows are likely to be distributed over hardware
>> in a non-fair manner (such as a hash on the port number), it is likely
>> that each hardware resource might serve a different number of flows.
>> Bypassing flows with above-average backlog goes some way to restoring
>> fairness to the distribution of requests across hardware resources.
>>
>> In the (hopefully rare) case that a retransmit is needed for an
>> (apparently) lost packet, the same connection - or at least the same
>> source port number - will be used for all retransmits. This ensures
>> that any Duplicate Reply Cache on the server has the best possible
>> chance of recognizing the retransmission for what it is. When a given
>> connection breaks and needs to be re-established, pending requests on
>> that connection will be resent. Pending requests on other connections
>> will not be affected.
>
> I'm having trouble with several points regarding retransmission.
>
> 1. Retransmission is also used to recover when a server or its
> backend storage drops a request. An NFSv3 server is permitted by
> spec to drop requests without notifying clients. That's a nit with
> your write-up, but...

Only if you think that when a server drops a request, it isn't "lost".

>
> 2. An NFSv4 server MUST NOT drop requests; if it ever does it is
> required to close the connection to force the client to retransmit.
> In fact, current clients depend on connection loss to know when
> to retransmit. Both Linux and Solaris no longer use retransmit
> timeouts to trigger retransmit; they will only retransmit after
> connection loss.
>
> 2a. IMO the spec is written such that a client is allowed to send
> a retransmission on another connection that already exists. But
> maybe that is not what we want to implement.

It certainly isn't what we *do* implement.
For v3 and v4.0, I think it is best to use the same xprt - which may or
may not be the same connection, but does have the same port numbers.
For v4.1 it might make sense to use another xprt if that is easy to implement.

>
> 3. RPC/RDMA clients always drop the connection before retransmitting
> because they have to reset the connection's credit accounting.
>
> 4. RPC/RDMA cannot depend on IP source port, because the RPC part
> of the stack has no visibility into the choice of source port that
> is chosen. Thus the server's DRC cannot use the source port. I
> think server DRC's need to be prepared to deal with multiple client
> connections.

OK, that could be an issue.
Linux uses an independent xid sequence for each xprt, so two separate
xprts can easily use the same xid for different requests.
If RDMA cannot see the source port, it might depend more on the xid and
so risk getting confused.

There was a patch floating around which reserved a few bits of the xid
for an xprt index to ensure all xids were unique, but Trond didn't like
sub-dividing the xid space (which is fair enough).
So maybe it isn't safe to use nconnect with RDMA and protocol versions
earlier than 4.1.

>
> 5. The DRC (and thus considerations about the client IP source port)
> does not come into play for NFSv4.1 sessions.
>
>
>> Trunking (as described here) is not currently supported by the Linux
>> NFS client except in pNFS configurations (I think - is that right?).
>> nconnect is supported and currently requires a mount option.
>>
>> If the "nonconnect" mount option is given, the nconnect is completely
>> disabled to the target server. If "nconnect=N" is given (for some N
>> from 1 to 256) then that many connections will initially be created and
>> used. Over time, the number of connections may be increased or
>> decreased depending on available resources and recent demand. This may
>> also happen if neither "nonconnect" or "nconnect=" is given. However
>> no design or implementation yet exists for this possibility.
>
> See my e-mail from earlier today on mount option behavior.
>
> I prefer "nconnect=1" to "nonconnect"....
>
>
>> Where multiple filesystem are mounted from the same server, the
>> "nconnect" option given for the first mount will apply to all mounts
>> from that server. If the option is given on subsequent mounts from the
>> server, it will be silently ignored.
>>
>>
>> What doesn't that cover?
>> Have written it, I wonder if I should change the terminology do
>> distinguish between "multipath trunking" where the server IP address
>> varies, and "connection trunking" where the server IP address is fixed.
>
> I agree that the write-up needs to be especially careful about
> terminology.
>
> "multi-path trunking" is probably not appropriate, but "connection
> trunking" might be close. I used "multi-flow" above, fwiw.

Why not "multi-path trunking" when the server IP varies?
I like "multi-flow trunking" when the server IP doesn't change!

Maybe the mount option should be flows=N ??

Thanks a lot,
NeilBrown

>
>
>> Suppose we do add multi-path (non-pNFS) trunking support. Would it
>> make sense to have multiple connections over each path?
>
> IMO, yes.
>
>> Would each path benefit from the same number of connections?
>
> Probably not, the client will need to have some mechanism for
> deciding how many connections to open for each trunk, or it
> will have to use a fixed number (like nconnect).
>
>
>> How do we manage that?
>
> Presumably via the same mechanism that the client would use
> to determine how many connections to open for a single pair
> of endpoints.
>
> Actually I suspect that for pNFS file and flexfile layouts,
> the client will want to use multi-flow when communicating
> with DS's. So this future may be here pretty quickly.
>
>
> --
> Chuck Lever

Attachments:

signature.asc (847.00 B)

2019-06-13 17:03:07

On Tue, Jul 23 2019, Schumaker, Anna wrote:

> Hi Neil,
>
> On Thu, 2019-05-30 at 10:41 +1000, NeilBrown wrote:
>> With NFSv4.1, different network connections need to be explicitly
>> bound to a session. During session startup, this is not possible
>> so only a single connection must be used for session startup.
>>
>> So add a task flag to disable the default round-robin choice of
>> connections (when nconnect > 1) and force the use of a single
>> connection.
>> Then use that flag on all requests for session management - for
>> consistence, include NFSv4.0 management (SETCLIENTID) and session
>> destruction
>>
>> Reported-by: Chuck Lever <[email protected]>
>> Signed-off-by: NeilBrown <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 22 +++++++++++++---------
>> include/linux/sunrpc/sched.h | 1 +
>> net/sunrpc/clnt.c | 24 +++++++++++++++++++++++-
>> 3 files changed, 37 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index c29cbef6b53f..22b3dbfc4fa1 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -5978,7 +5978,7 @@ int nfs4_proc_setclientid(struct nfs_client
>> *clp, u32 program,
>> .rpc_message = &msg,
>> .callback_ops = &nfs4_setclientid_ops,
>> .callback_data = &setclientid,
>> - .flags = RPC_TASK_TIMEOUT,
>> + .flags = RPC_TASK_TIMEOUT | RPC_TASK_NO_ROUND_ROBIN,
>> };
>> int status;
>>
>> @@ -6044,7 +6044,8 @@ int nfs4_proc_setclientid_confirm(struct
>> nfs_client *clp,
>> dprintk("NFS call setclientid_confirm auth=%s, (client ID
>> %llx)\n",
>> clp->cl_rpcclient->cl_auth->au_ops->au_name,
>> clp->cl_clientid);
>> - status = rpc_call_sync(clp->cl_rpcclient, &msg,
>> RPC_TASK_TIMEOUT);
>> + status = rpc_call_sync(clp->cl_rpcclient, &msg,
>> + RPC_TASK_TIMEOUT |
>> RPC_TASK_NO_ROUND_ROBIN);
>> trace_nfs4_setclientid_confirm(clp, status);
>> dprintk("NFS reply setclientid_confirm: %d\n", status);
>> return status;
>> @@ -7633,7 +7634,7 @@ static int _nfs4_proc_secinfo(struct inode
>> *dir, const struct qstr *name, struct
>> NFS_SP4_MACH_CRED_SECINFO, &clnt, &msg);
>>
>> status = nfs4_call_sync(clnt, NFS_SERVER(dir), &msg,
>> &args.seq_args,
>> - &res.seq_res, 0);
>> + &res.seq_res, RPC_TASK_NO_ROUND_ROBIN);
>
> I'm confused about what setting RPC_TASK_NO_ROUND_ROBIN as the
> "cache_reply" argument to nfs4_call_sync() actually does. As far as I
> can tell, it's passed to nfs4_init_sequence() which sets it to the
> nfs4_sequence_args "sa_cache_this" field, which is a one bit boolean
> (defined in include/linux/nfs_xdr.h). So why pass the flag?

Thanks for reviewing my patch! Yes, that is an error.
I think I had confused nfs4_call_sync() with rpc_call_sync().
Similar names, different args.

For those calls, I want to get RPC_TASK_NO_ROUND_ROBIN set in the .flags
field of 'task_setup' in nfs4_call_sync_sequence().

Maybe we could add a 'flags' arg to nfs4_call_sync_sequence() and
add a new nfs4_call_sync_state() which sets RPC_TASK_NO_ROUND_ROBIN.

If you are happy with this approach, let me know and I send a proper
patch.

Thanks,
NeilBrown

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 39896afc6edf..dd2725fe7a74 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1077,7 +1077,8 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
struct nfs_server *server,
struct rpc_message *msg,
struct nfs4_sequence_args *args,
- struct nfs4_sequence_res *res)
+ struct nfs4_sequence_res *res,
+ int flags)
{
int ret;
struct rpc_task *task;
@@ -1091,7 +1092,8 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
.rpc_client = clnt,
.rpc_message = msg,
.callback_ops = clp->cl_mvops->call_sync_ops,
- .callback_data = &data
+ .callback_data = &data,
+ .flags = flags,
};

task = rpc_run_task(&task_setup);
@@ -1112,7 +1114,20 @@ int nfs4_call_sync(struct rpc_clnt *clnt,
int cache_reply)
{
nfs4_init_sequence(args, res, cache_reply, 0);
- return nfs4_call_sync_sequence(clnt, server, msg, args, res);
+ return nfs4_call_sync_sequence(clnt, server, msg, args, res, 0);
+}
+
+int nfs4_call_sync_state(struct rpc_clnt *clnt,
+ struct nfs_server *server,
+ struct rpc_message *msg,
+ struct nfs4_sequence_args *args,
+ struct nfs4_sequence_res *res,
+ int cache_reply)
+{
+ /* State management commands are never round-robined */
+ nfs4_init_sequence(args, res, cache_reply, 0);
+ return nfs4_call_sync_sequence(clnt, server, msg, args, res,
+ RPC_TASK_NO_ROUND_ROBIN);
}

static void
@@ -7387,7 +7402,7 @@ static int _nfs40_proc_get_locations(struct inode *inode,

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status)
return status;

@@ -7440,7 +7455,7 @@ static int _nfs41_proc_get_locations(struct inode *inode,

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status == NFS4_OK &&
res.seq_res.sr_status_flags & SEQ4_STATUS_LEASE_MOVED)
status = -NFS4ERR_LEASE_MOVED;
@@ -7529,7 +7544,7 @@ static int _nfs40_proc_fsid_present(struct inode *inode, const struct cred *cred

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
nfs_free_fhandle(res.fh);
if (status)
return status;
@@ -7570,7 +7585,7 @@ static int _nfs41_proc_fsid_present(struct inode *inode, const struct cred *cred

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
nfs_free_fhandle(res.fh);
if (status == NFS4_OK &&
res.seq_res.sr_status_flags & SEQ4_STATUS_LEASE_MOVED)
@@ -7656,8 +7671,8 @@ static int _nfs4_proc_secinfo(struct inode *dir, const struct qstr *name, struct
nfs4_state_protect(NFS_SERVER(dir)->nfs_client,
NFS_SP4_MACH_CRED_SECINFO, &clnt, &msg);

- status = nfs4_call_sync(clnt, NFS_SERVER(dir), &msg, &args.seq_args,
- &res.seq_res, RPC_TASK_NO_ROUND_ROBIN);
+ status = nfs4_call_sync_state(clnt, NFS_SERVER(dir), &msg, &args.seq_args,
+ &res.seq_res, 0);
dprintk("NFS reply secinfo: %d\n", status);

put_cred(cred);
@@ -9357,8 +9372,8 @@ _nfs41_proc_secinfo_no_name(struct nfs_server *server, struct nfs_fh *fhandle,
}

dprintk("--> %s\n", __func__);
- status = nfs4_call_sync(clnt, server, &msg, &args.seq_args,
- &res.seq_res, RPC_TASK_NO_ROUND_ROBIN);
+ status = nfs4_call_sync_state(clnt, server, &msg, &args.seq_args,
+ &res.seq_res, 0);
dprintk("<-- %s status=%d\n", __func__, status);

put_cred(cred);
@@ -9497,7 +9512,7 @@ static int _nfs41_test_stateid(struct nfs_server *server,
dprintk("NFS call test_stateid %p\n", stateid);
nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(rpc_client, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status != NFS_OK) {
dprintk("NFS reply test_stateid: failed, %d\n", status);
return status;

Attachments:

signature.asc (847.00 B)

2019-07-31 05:35:19

by NeilBrown

[permalink] [raw]

Subject: [PATCH] NFS: add flags arg to nfs4_call_sync_sequence()

Adding a flags argument allows flags such as RPC_TASK_NO_ROUND_ROBIN
to be passed to rpc_run_task().
This is needed when calling nfs4_call_sync() for state-management
commands.
Rather than adding a flags argument to nfs4_call_sync(), add a new
nfs4_call_sync_state(), which passes RPC_TASK_NO_ROUND_ROBIN to
nfs4_call_sync_sequence().

A previous commit incorrectly passed RPC_TASK_NO_ROUND_ROBIN
in the last arg to nfs4_call_sync(), where that arg is not a general
flags argument. This patch fixes that by changing those call-sites
to call the new nfs4_call_sync_state()

Repored-by: Anna Schumaker <[email protected]>
Fixes: 5a0c257f8e0f ("NFS: send state management on a single connection.")
Signed-off-by: NeilBrown <[email protected]>
---
fs/nfs/nfs4proc.c | 39 +++++++++++++++++++++++++++------------
1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 39896afc6edf..dd2725fe7a74 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1077,7 +1077,8 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
struct nfs_server *server,
struct rpc_message *msg,
struct nfs4_sequence_args *args,
- struct nfs4_sequence_res *res)
+ struct nfs4_sequence_res *res,
+ int flags)
{
int ret;
struct rpc_task *task;
@@ -1091,7 +1092,8 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
.rpc_client = clnt,
.rpc_message = msg,
.callback_ops = clp->cl_mvops->call_sync_ops,
- .callback_data = &data
+ .callback_data = &data,
+ .flags = flags,
};

task = rpc_run_task(&task_setup);
@@ -1112,7 +1114,20 @@ int nfs4_call_sync(struct rpc_clnt *clnt,
int cache_reply)
{
nfs4_init_sequence(args, res, cache_reply, 0);
- return nfs4_call_sync_sequence(clnt, server, msg, args, res);
+ return nfs4_call_sync_sequence(clnt, server, msg, args, res, 0);
+}
+
+int nfs4_call_sync_state(struct rpc_clnt *clnt,
+ struct nfs_server *server,
+ struct rpc_message *msg,
+ struct nfs4_sequence_args *args,
+ struct nfs4_sequence_res *res,
+ int cache_reply)
+{
+ /* State management commands are never round-robined */
+ nfs4_init_sequence(args, res, cache_reply, 0);
+ return nfs4_call_sync_sequence(clnt, server, msg, args, res,
+ RPC_TASK_NO_ROUND_ROBIN);
}

static void
@@ -7387,7 +7402,7 @@ static int _nfs40_proc_get_locations(struct inode *inode,

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status)
return status;

@@ -7440,7 +7455,7 @@ static int _nfs41_proc_get_locations(struct inode *inode,

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status == NFS4_OK &&
res.seq_res.sr_status_flags & SEQ4_STATUS_LEASE_MOVED)
status = -NFS4ERR_LEASE_MOVED;
@@ -7529,7 +7544,7 @@ static int _nfs40_proc_fsid_present(struct inode *inode, const struct cred *cred

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
nfs_free_fhandle(res.fh);
if (status)
return status;
@@ -7570,7 +7585,7 @@ static int _nfs41_proc_fsid_present(struct inode *inode, const struct cred *cred

nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(clnt, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
nfs_free_fhandle(res.fh);
if (status == NFS4_OK &&
res.seq_res.sr_status_flags & SEQ4_STATUS_LEASE_MOVED)
@@ -7656,8 +7671,8 @@ static int _nfs4_proc_secinfo(struct inode *dir, const struct qstr *name, struct
nfs4_state_protect(NFS_SERVER(dir)->nfs_client,
NFS_SP4_MACH_CRED_SECINFO, &clnt, &msg);

- status = nfs4_call_sync(clnt, NFS_SERVER(dir), &msg, &args.seq_args,
- &res.seq_res, RPC_TASK_NO_ROUND_ROBIN);
+ status = nfs4_call_sync_state(clnt, NFS_SERVER(dir), &msg, &args.seq_args,
+ &res.seq_res, 0);
dprintk("NFS reply secinfo: %d\n", status);

put_cred(cred);
@@ -9357,8 +9372,8 @@ _nfs41_proc_secinfo_no_name(struct nfs_server *server, struct nfs_fh *fhandle,
}

dprintk("--> %s\n", __func__);
- status = nfs4_call_sync(clnt, server, &msg, &args.seq_args,
- &res.seq_res, RPC_TASK_NO_ROUND_ROBIN);
+ status = nfs4_call_sync_state(clnt, server, &msg, &args.seq_args,
+ &res.seq_res, 0);
dprintk("<-- %s status=%d\n", __func__, status);

put_cred(cred);
@@ -9497,7 +9512,7 @@ static int _nfs41_test_stateid(struct nfs_server *server,
dprintk("NFS call test_stateid %p\n", stateid);
nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 1);
status = nfs4_call_sync_sequence(rpc_client, server, &msg,
- &args.seq_args, &res.seq_res);
+ &args.seq_args, &res.seq_res, 0);
if (status != NFS_OK) {
dprintk("NFS reply test_stateid: failed, %d\n", status);
return status;
--
2.14.0.rc0.dirty

Attachments:

signature.asc (847.00 B)