2022-02-28 20:28:48

by Olga Kornievskaia

[permalink] [raw]
Subject: managing trunking

Hello folks,

I would like to ask what would be an acceptable way to deal with
established trunking connections when trunking membership changes,
specifically when a connection (i.e., its endpoint) is no longer part
of the same group. Inaction on the client's part would leave the
client unusable.

Would a proposal to destroy trunking connections as part of
DESTROY_SESSION be acceptable? The logic behind this solution is that
trunking membership was established as part of a session: each
connection was tested to belong to a trunkable server and added to that
session. Once the session is destroyed and a new one created, there is
no guarantee that the connections are to the same server that the new
session is created for. Trunking membership can be re-established at a
later time. I have some code that implements this solution, but it
still needs some testing.
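
To make the first option concrete, here is a minimal sketch in Python
rather than kernel C. It is only a toy model: the Session class, the
add_trunk() and destroy_session() helpers, and the idea of comparing a
single "server identity" string are all assumptions made for
illustration and are not the names or structures used in the client.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy model of a client session (hypothetical, not the kernel struct)."""
    server_id: str                 # identity of the server the session was created with
    main: str                      # endpoint of the original connection
    trunked: list = field(default_factory=list)  # extra endpoints that passed the trunking test


def add_trunk(session: Session, endpoint: str, endpoint_server_id: str) -> bool:
    """Add an endpoint only if it reports the same server identity as the session."""
    if endpoint_server_id == session.server_id:
        session.trunked.append(endpoint)
        return True
    return False


def destroy_session(session: Session) -> list:
    """Option 1: drop every trunked connection together with the session.

    Returns the endpoints that were dropped so that trunking can be
    re-established (and re-tested) later against the new session.
    """
    dropped = list(session.trunked)
    session.trunked.clear()
    return dropped
```

The point of the sketch is that nothing tested against the old session
survives into the new one, so a stale endpoint cannot be reused by
accident; the cost is that every trunk has to be rediscovered.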

Alternatively, if we keep connections past DESTROY_SESSION, then we
need a way to test that those connections belong to the new session
that has been created, meaning a probe of each connection on
CREATE_SESSION to see whether it still belongs to the same server the
new session was created with. Is that preferred over simply destroying
the connections? I'm going to work on implementing this too and post it
as an alternative.
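
The second option could be sketched roughly as follows, again as a
Python toy and not kernel code. The revalidate_trunks() function and
the probe callback are hypothetical; in this sketch the probe simply
returns a server identity string for an endpoint, and how that identity
is really obtained and compared is left out.

```python
from typing import Callable, Iterable, List, Tuple


def revalidate_trunks(
    endpoints: Iterable[str],
    new_server_id: str,
    probe: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Option 2: keep connections, but re-test each one for the new session.

    `probe` is a hypothetical callback returning the server identity seen
    on an endpoint. Endpoints whose probe fails or reports a different
    identity are dropped; the rest are kept for the new session.
    """
    kept, dropped = [], []
    for ep in endpoints:
        try:
            same = probe(ep) == new_server_id
        except OSError:
            # An unreachable endpoint is treated as no longer trunkable.
            same = False
        (kept if same else dropped).append(ep)
    return kept, dropped
```

Compared with destroying everything, this keeps healthy connections
alive across the session change, at the price of one probe per
connection before the new session can use it.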

It has been expressed several times that the ultimate goal is to do
transport management in userspace, so does that mean the solution to
this is also in userspace? Should there be upcalls to userland on
DESTROY_SESSION and CREATE_SESSION, so that trunking is destroyed and
created respectively but triggered from userland? With this approach,
while this happens at userland speed, will we be allowing the client to
get into a state where it's unusable (because its connections are
talking to servers that don't belong to the same trunking group)? Or,
to prevent this, will we be allowing userland to pause activity
in the kernel until the transports are squared away? I just don't see
how outsourcing trunking membership changes to userland is
better than handling them in the kernel when no operations can proceed
until trunking membership is corrected.
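
The "pause until the transports are squared away" behaviour could be
pictured with the small Python sketch below. It is purely illustrative:
TransportGate and its method names are invented here, and a
threading.Event stands in for whatever mechanism would actually hold
operations back while a userland helper fixes up the transports.

```python
import threading


class TransportGate:
    """Hypothetical gate that holds operations while trunking is being fixed."""

    def __init__(self) -> None:
        self._ready = threading.Event()
        self._ready.set()  # transports start out usable

    def membership_changed(self) -> None:
        """Called when trunking membership may be stale; new operations must wait."""
        self._ready.clear()

    def transports_fixed(self) -> None:
        """Called once the (userland) helper has revalidated the transports."""
        self._ready.set()

    def submit(self, op, timeout=None):
        """Run `op` only once the transports are known to be good."""
        if not self._ready.wait(timeout):
            raise TimeoutError("transports not yet revalidated")
        return op()
```

In this model every operation submitted between membership_changed()
and transports_fixed() is stalled for as long as the helper takes,
which is the cost of doing the fix-up at userland speed.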

Any feedback on these approaches or their alternatives would be much
appreciated.

Thank you.


2022-02-28 20:41:47

by Trond Myklebust

[permalink] [raw]
Subject: Re: managing trunking

On Mon, 2022-02-28 at 13:58 -0500, Olga Kornievskaia wrote:
> Hello folks,
>
> I would like to ask for what would be an acceptable solution to deal
> with created trunking connections when there is a change in trunking
> membership, specifically when a connection (ie., its endpoint) no
> longer is part of the same group. An inaction on the client's part
> would lead to unusable client.
>
> Would a proposal to destroy trunking connections in as part of
> DESTROY_SESSION be acceptable? The logic behind this solution is that
> trunking membership was established as a part of a session, each
> connection was tested to belong to trunkable server and added to that
> session. Once the session is destroy and new created there is no
> guarantee that the connections are to the same server that the new
> session is created for. Trunking membership can be re-established at
> a
> later time. I have some code that implements this solution but still
> needs some testing.
>
> Alternatively, if we keep connections past DESTROY_SESSION, then we
> need a way to test that the same connections belong to the new
> session
> that has been created, meaning that a probe for each connection on
> create_session to see if they still belong to the same server as the
> new session is created. Is that preferred over simply destroying
> connections? I'm going to work on implementing this too and posting
> as
> an alternative.
>
> It has been expressed several times that the ultimate goal is to do
> transport management in userspace so does it mean the solution to
> this
> is also in userspace? Should there be upcalls to user land on
> DESTROY_SESSION and CREATE_SESSION to destroy/create trunking
> respectively but triggered via user land. But in this approach, while
> this happens at user land speed,  will we be allowing the client to
> get into a state where it's unusable (because its connections are
> talking to servers that don't belong to the same trunking group)? Or
> to prevent this, will we be allowing the userland to pause activities
> in the kernel until the transports are squared away? I just don't see
> how out-sourcing trunking membership changes to the user land is
> better than handling it in the kernel when no operations can proceed
> until trunking membership is corrected.
>
> Any feedback on the approaches or its alternatives would be much
> appreciated.
>
> Thank you for the feedback.
>
> Thank you.

Right now, we only call DESTROY_SESSION on the final unmount of the
volumes on a given server, just before calling DESTROY_CLIENTID to
destroy the lease. So the point is really moot within the current
framework.

Is it therefore your intention to change when we call DESTROY_SESSION?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2022-02-28 20:42:15

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: managing trunking

On Mon, Feb 28, 2022 at 3:03 PM Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2022-02-28 at 13:58 -0500, Olga Kornievskaia wrote:
> > Hello folks,
> >
> > I would like to ask for what would be an acceptable solution to deal
> > with created trunking connections when there is a change in trunking
> > membership, specifically when a connection (ie., its endpoint) no
> > longer is part of the same group. An inaction on the client's part
> > would lead to unusable client.
> >
> > Would a proposal to destroy trunking connections in as part of
> > DESTROY_SESSION be acceptable? The logic behind this solution is that
> > trunking membership was established as a part of a session, each
> > connection was tested to belong to trunkable server and added to that
> > session. Once the session is destroy and new created there is no
> > guarantee that the connections are to the same server that the new
> > session is created for. Trunking membership can be re-established at
> > a
> > later time. I have some code that implements this solution but still
> > needs some testing.
> >
> > Alternatively, if we keep connections past DESTROY_SESSION, then we
> > need a way to test that the same connections belong to the new
> > session
> > that has been created, meaning that a probe for each connection on
> > create_session to see if they still belong to the same server as the
> > new session is created. Is that preferred over simply destroying
> > connections? I'm going to work on implementing this too and posting
> > as
> > an alternative.
> >
> > It has been expressed several times that the ultimate goal is to do
> > transport management in userspace so does it mean the solution to
> > this
> > is also in userspace? Should there be upcalls to user land on
> > DESTROY_SESSION and CREATE_SESSION to destroy/create trunking
> > respectively but triggered via user land. But in this approach, while
> > this happens at user land speed, will we be allowing the client to
> > get into a state where it's unusable (because its connections are
> > talking to servers that don't belong to the same trunking group)? Or
> > to prevent this, will we be allowing the userland to pause activities
> > in the kernel until the transports are squared away? I just don't see
> > how out-sourcing trunking membership changes to the user land is
> > better than handling it in the kernel when no operations can proceed
> > until trunking membership is corrected.
> >
> > Any feedback on the approaches or its alternatives would be much
> > appreciated.
> >
> > Thank you for the feedback.
> >
> > Thank you.
>
> Right now, we only call DESTROY_SESSION on the final unmount of the
> volumes on a given server, just before calling DESTROY_CLIENTID to
> destroy the lease. So the point is really moot within the current
> framework.
>
> Is it therefore your intention to change when we call DESTROY_SESSION?

No, I do not propose to change when we call destroy_session.
Destroy_session is triggered when the client receives a BAD_SESSION
error.

I'm not a fan of this approach. I think the 2nd approach that tests
the connections is a better one.

