2022-04-19 01:08:06

by Chuck Lever III

Subject: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag

Introduce a mechanism to cause xprt_transmit() to break out of its
sending loop at a specific rpc_rqst, rather than draining the whole
transmit queue.

This enables the client to send just an RPC TLS probe and then wait
for the response before proceeding with the rest of the queue.

Signed-off-by: Chuck Lever <[email protected]>
---
include/linux/sunrpc/sched.h | 2 ++
include/trace/events/sunrpc.h | 1 +
net/sunrpc/xprt.c | 2 ++
3 files changed, 5 insertions(+)

diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index 599133fb3c63..f8c09638fa69 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -125,6 +125,7 @@ struct rpc_task_setup {
#define RPC_TASK_TLSCRED 0x00000008 /* Use AUTH_TLS credential */
#define RPC_TASK_NULLCREDS 0x00000010 /* Use AUTH_NULL credential */
#define RPC_CALL_MAJORSEEN 0x00000020 /* major timeout seen */
+#define RPC_TASK_CORK 0x00000040 /* cork the xmit queue */
#define RPC_TASK_DYNAMIC 0x00000080 /* task was kmalloc'ed */
#define RPC_TASK_NO_ROUND_ROBIN 0x00000100 /* send requests on "main" xprt */
#define RPC_TASK_SOFT 0x00000200 /* Use soft timeouts */
@@ -137,6 +138,7 @@ struct rpc_task_setup {

#define RPC_IS_ASYNC(t) ((t)->tk_flags & RPC_TASK_ASYNC)
#define RPC_IS_SWAPPER(t) ((t)->tk_flags & RPC_TASK_SWAPPER)
+#define RPC_IS_CORK(t) ((t)->tk_flags & RPC_TASK_CORK)
#define RPC_IS_SOFT(t) ((t)->tk_flags & (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
#define RPC_IS_SOFTCONN(t) ((t)->tk_flags & RPC_TASK_SOFTCONN)
#define RPC_WAS_SENT(t) ((t)->tk_flags & RPC_TASK_SENT)
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 811187c47ebb..e8d6adff1a50 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -312,6 +312,7 @@ TRACE_EVENT(rpc_request,
{ RPC_TASK_TLSCRED, "TLSCRED" }, \
{ RPC_TASK_NULLCREDS, "NULLCREDS" }, \
{ RPC_CALL_MAJORSEEN, "MAJORSEEN" }, \
+ { RPC_TASK_CORK, "CORK" }, \
{ RPC_TASK_DYNAMIC, "DYNAMIC" }, \
{ RPC_TASK_NO_ROUND_ROBIN, "NO_ROUND_ROBIN" }, \
{ RPC_TASK_SOFT, "SOFT" }, \
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 86d62cffba0d..4b303b945b51 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1622,6 +1622,8 @@ xprt_transmit(struct rpc_task *task)
if (xprt_request_data_received(task) &&
!test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate))
break;
+ if (RPC_IS_CORK(task))
+ break;
cond_resched_lock(&xprt->queue_lock);
}
spin_unlock(&xprt->queue_lock);
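
To illustrate the effect of the xprt.c hunk outside the kernel, here is a minimal userspace model. It is only a sketch: demo_req, demo_task, DEMO_TASK_CORK and demo_transmit() are hypothetical stand-ins, not SUNRPC symbols, and the real loop also handles pinning, errors and rescheduling. The model assumes, as in the hunk, that the flag is tested on the calling task after each request is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; only the numeric value mirrors the patch. */
#define DEMO_TASK_CORK 0x00000040u

struct demo_req {
	bool sent;		/* set once the request has been "transmitted" */
};

struct demo_task {
	unsigned int flags;	/* models tk_flags of the calling task */
};

/*
 * Model of the send loop: transmit queued requests from the head.
 * If the calling task is corked, stop after the first request
 * instead of draining the whole queue. Returns the number sent.
 */
static size_t demo_transmit(struct demo_req *queue, size_t len,
			    const struct demo_task *caller)
{
	size_t sent = 0;

	for (size_t i = 0; i < len; i++) {
		queue[i].sent = true;
		sent++;
		if (caller->flags & DEMO_TASK_CORK)
			break;
	}
	return sent;
}
```

With a corked caller the model sends exactly one request and leaves the rest queued; with an uncorked caller it drains the queue.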



2022-04-19 10:08:25

by Trond Myklebust

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag

On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
> Introduce a mechanism to cause xprt_transmit() to break out of its
> sending loop at a specific rpc_rqst, rather than draining the whole
> transmit queue.
>
> This enables the client to send just an RPC TLS probe and then wait
> for the response before proceeding with the rest of the queue.

This is entirely the wrong place for this kind of control mechanism.

TLS vs not-TLS needs to be decided up front when we initialise the
transport (i.e. at mount time or whenever the pNFS channels are set
up). Otherwise, we're vulnerable to downgrade attacks.

Once we've decided that TLS is the right thing to do, then we shouldn't
declare to the RPC layer that the TLS-enabled transport is connected
until the underlying transport connection is established, and the TLS
handshake is done.
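
A rough userspace sketch of that rule, purely for illustration: the TLS choice is recorded when the transport is created, and the "connected" answer given to the upper layer depends on it. The names demo_stage, demo_xprt and demo_xprt_connected() are hypothetical and do not exist in SUNRPC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical setup stages; illustrative names, not kernel symbols. */
enum demo_stage {
	DEMO_TCP_CONNECTING,
	DEMO_TCP_ESTABLISHED,
	DEMO_TLS_HANDSHAKE_DONE,
};

struct demo_xprt {
	bool want_tls;		/* decided once, when the transport is set up */
	enum demo_stage stage;	/* how far connection setup has progressed */
};

/*
 * What the RPC layer is told. A TLS-enabled transport only reports
 * "connected" after the handshake; a plain one after TCP is up.
 */
static bool demo_xprt_connected(const struct demo_xprt *xprt)
{
	if (xprt->want_tls)
		return xprt->stage == DEMO_TLS_HANDSHAKE_DONE;
	return xprt->stage == DEMO_TCP_ESTABLISHED;
}
```

In this model a TLS transport whose TCP connection is established but whose handshake is still pending looks exactly like an unconnected transport from above.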

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2022-04-20 08:42:40

by Chuck Lever III

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag



> On Apr 18, 2022, at 10:57 PM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
>> Introduce a mechanism to cause xprt_transmit() to break out of its
>> sending loop at a specific rpc_rqst, rather than draining the whole
>> transmit queue.
>>
>> This enables the client to send just an RPC TLS probe and then wait
>> for the response before proceeding with the rest of the queue.
>
> This is entirely the wrong place for this kind of control mechanism.

I'm not sure I entirely understand your concern, so bear with
me while I try to clarify.


> TLS vs not-TLS needs to be decided up front when we initialise the
> transport (i.e. at mount time or whenever the pNFS channels are set
> up). Otherwise, we're vulnerable to downgrade attacks.

Downgrade attacks are prevented by using "xprtsec=tls" because
in that case, transport creation fails if either the AUTH_TLS
probe fails or the handshake fails.

The TCP connection has to be established first, though. Then the
client can send the RPC_AUTH_TLS probe, which is the same as the
NULL ping that it already sends. That mechanism is independent
of the lower layer transport (TCP in this case).

Therefore, RPC traffic must be stoppered while the client:

1. waits for the AUTH_TLS probe's reply, and

2. waits for the handshake to complete

Because an RPC message is involved in this interaction, I didn't
see a way to implement it completely within xprtsock's TCP
connection logic. IMO, driving the handshake has to be done by
the generic RPC client.
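
As a sketch of the ordering described above under the "must use TLS" policy: the steps run strictly in sequence and the first failure fails transport creation, with no fallback to cleartext. This is a userspace model only; demo_step, demo_outcomes and demo_create_tls_xprt() are hypothetical names, and the outcomes are supplied by the caller rather than coming from a real socket.

```c
#include <assert.h>
#include <stdbool.h>

/* The three setup steps, in the order they must happen. */
enum demo_step {
	DEMO_STEP_TCP,		/* TCP connection established */
	DEMO_STEP_PROBE,	/* AUTH_TLS NULL probe answered */
	DEMO_STEP_HANDSHAKE,	/* TLS handshake completed */
	DEMO_STEP_COUNT,
};

/* Simulated outcome of each step; true means the step succeeded. */
struct demo_outcomes {
	bool ok[DEMO_STEP_COUNT];
};

/*
 * Run the steps in order and stop at the first failure.
 * Returns 0 if every step succeeded, -1 otherwise.
 * *completed reports how many steps succeeded before stopping.
 */
static int demo_create_tls_xprt(const struct demo_outcomes *o, int *completed)
{
	*completed = 0;
	for (int step = DEMO_STEP_TCP; step < DEMO_STEP_COUNT; step++) {
		if (!o->ok[step])
			return -1;	/* fail closed: no cleartext fallback */
		(*completed)++;
	}
	return 0;
}
```

Ordinary RPC traffic would only be allowed once this returns 0; a failed probe or a failed handshake both end in the same error.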

So, do you mean that I need to replace RPC_TASK_CORK with a
special return code from xs_tcp_send_request() ?


> Once we've decided that TLS is the right thing to do, then we shouldn't
> declare to the RPC layer that the TLS-enabled transport is connected
> until the underlying transport connection is established, and the TLS
> handshake is done.

That logic is handled in patch 10/15.

Reconnecting and re-establishing a TLS session is handled in
patches 11/15 and 12/15. Again, if the transport's policy setting
is "must use TLS" then the client ensures that a TLS session is in
use before allowing more RPC traffic on the new connection.


--
Chuck Lever



2022-04-20 15:02:01

by Chuck Lever III

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag


> On Apr 19, 2022, at 6:09 PM, Trond Myklebust <[email protected]> wrote:
>
> On Tue, 2022-04-19 at 19:40 +0000, Chuck Lever III wrote:
>>
>>
>>> On Apr 19, 2022, at 3:04 PM, Trond Myklebust
>>> <[email protected]> wrote:
>>>
>>> On Tue, 2022-04-19 at 18:16 +0000, Chuck Lever III wrote:
>>>>
>>>>
>>>>> On Apr 18, 2022, at 10:57 PM, Trond Myklebust
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
>>>>>>> Introduce a mechanism to cause xprt_transmit() to break out of
>>>>>>> its sending loop at a specific rpc_rqst, rather than draining the
>>>>>>> whole transmit queue.
>>>>>>>
>>>>>>> This enables the client to send just an RPC TLS probe and then
>>>>>>> wait for the response before proceeding with the rest of the
>>>>>>> queue.
>>>>>>
>>>>>> This is entirely the wrong place for this kind of control
>>>>>> mechanism.
>>>>>
>>>>> I'm not sure I entirely understand your concern, so bear with
>>>>> me while I try to clarify.
>>>>>
>>>>>
>>>>>> TLS vs not-TLS needs to be decided up front when we initialise
>>>>>> the transport (i.e. at mount time or whenever the pNFS channels
>>>>>> are set up). Otherwise, we're vulnerable to downgrade attacks.
>>>>>
>>>>> Downgrade attacks are prevented by using "xprtsec=tls" because
>>>>> in that case, transport creation fails if either the AUTH_TLS
>>>>> fails or the handshake fails.
>>>>>
>>>>> The TCP connection has to be established first, though. Then the
>>>>> client can send the RPC_AUTH_TLS probe, which is the same as the
>>>>> NULL ping that it already sends. That mechanism is independent
>>>>> of the lower layer transport (TCP in this case).
>>>>>
>>>>> Therefore, RPC traffic must be stoppered while the client:
>>>>>
>>>>> 1. waits for the AUTH_TLS probe's reply, and
>>>>>
>>>>> 2. waits for the handshake to complete
>>>>>
>>>>> Because an RPC message is involved in this interaction, I didn't
>>>>> see a way to implement it completely within xprtsock's TCP
>>>>> connection logic. IMO, driving the handshake has to be done by
>>>>> the generic RPC client.
>>>>>
>>>>> So, do you mean that I need to replace RPC_TASK_CORK with a
>>>>> special return code from xs_tcp_send_request() ?
>>>
>>>
>>> I mean the right mechanism for controlling whether or not the
>>> transport is ready to serve RPC requests is through the
>>> XPRT_CONNECTED flag. All the existing generic RPC error handling,
>>> congestion handling, etc depends on that flag being set correctly.
>>>
>>> Until the TLS socket has completed its handshake protocol and is
>>> ready to transmit data, it should not be declared connected. The
>>> distinction between the two states 'TCP is unconnected' and 'TLS
>>> handshake is incomplete' is a socket/transport setup detail as far
>>> as the RPC xprt layer is concerned: just another set of
>>> intermediate states between SYN_SENT and ESTABLISHED.
>>
>> First, TLS is technically an upper layer protocol. It's not
>> part of the transport protocol. This is exactly how it's
>> implemented in the Linux kernel. And, TLS works on transports
>> other than TCP, so that makes it a reasonable candidate for
>> treatment in the generic client rather than in a particular
>> transport mechanism.
>
> Sorry, but no! As far as the RPC layer is concerned, there is no
> difference between a TLS socket and a TCP socket. The xprt layer should
> not have to know or care about the existence of TLS other than as a
> transport option to be configured at connection time.
>
>>
>> Second, the "intermediate states" would be /outside/ of SYN_SENT
>> and ESTABLISHED. A TCP transport has to be in the ESTABLISHED
>> state (ie, the transport's connection handshake has to be
>> complete) before any TLS traffic can go over it.
>>
>
> My point is we don't give a damn about the intermediate states in the
> RPC layer.
>
>> Most importantly, the client has to send an RPC message first
>> before it can start a TLS handshake. The RPC-with-TLS protocol
>> specification requires that the handshake be preceded with the
>> NULL AUTH_TLS request, which is an RPC. Otherwise, there's no
>> way for the server end to know when to expect a handshake.
>>
>
> Sure, but those are 2 non-overlapping states. The socket is first in a
> state where it needs to do a NULL ping using regular RPC/TCP. Then it
> needs to do the TLS handshake. Then it transitions into the state where
> it can act like any other transport.

Understood: this architecture more-or-less mimics what the RPC client
would see for a transport like QUIC where connection establishment and
the security handshake are integrated and handled concurrently.

The reality is that the RPC client’s transport layer will have to deal
with these steps separately for TLS-on-UDP and TLS-on-TCP, but hiding
the details under the transport switch is fine with me as long as
there is a way to send the AUTH_TLS probe before the transport is
marked “connected” (see below).


>> In today's RPC client, the underlying connection has to be in
>> the XPRT_CONNECTED state before the RPC client can exchange any
>> RPC transaction, including AUTH_TLS NULL.
>>
>> To make it work the way you've suggested, we would have to build
>> a mechanism that could send the AUTH_TLS NULL and receive and
>> parse its reply /before/ the client has put the transport into
>> the XPRT_CONNECTED state, and that NULL request would have to
>> be driven from inside the transport instance (not via the FSM
>> where all other RPC traffic originates).
>>
>> Do you have any suggestions about how to make this last point
>> less painful?
>
> This isn't too different from what we already do with the rpcbind call
> for performing port discovery. The only difference is that the NULL
> ping needs to happen on the same transport as the one being constructed
> and that it needs to happen after the TCP connection is complete.
>
> So I'd suggest that TLS/TCP needs to be a different xprt_class than the
> base TCP, then doing the whole "do-NULL-ping-and-TLS-handshake" in the
> connect() callback for that new class.
>
> The connect() callback can set up a private rpc client and do the NULL
> call asynchronously just like we do in rpcb_getport_async().

“Set up a separate rpc_client to handle the AUTH_TLS probe” was the
piece I was missing. The rest of this is pretty much what the RPC
changes in this patch series already do, but organized a little
differently (for example, they use a “done” callback as you describe
below, so we’re already about halfway there).

The idea is to stopper the stream of RPC messages by leaving the xprt
marked “not connected” instead of by adding and setting an
RPC_TASK_CORK flag. The FSM will no longer be involved at all in
dealing with TLS.


> When the
> RPC call completes, we steal the resulting socket from that private rpc
> client and kick off the TLS handshake on it. All that can be done in
> the rpc_call_done callback (i.e. the equivalent of rpcb_getport_done).
>
> Once the TLS handshake is done, you can set the XPRT_CONNECTED state
> and call xprt_wake_pending_tasks().

2022-04-21 07:41:27

by Trond Myklebust

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag

On Tue, 2022-04-19 at 19:40 +0000, Chuck Lever III wrote:
>
>
> > On Apr 19, 2022, at 3:04 PM, Trond Myklebust
> > <[email protected]> wrote:
> >
> > On Tue, 2022-04-19 at 18:16 +0000, Chuck Lever III wrote:
> > >
> > >
> > > > On Apr 18, 2022, at 10:57 PM, Trond Myklebust
> > > > <[email protected]> wrote:
> > > >
> > > > On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
> > > > > > Introduce a mechanism to cause xprt_transmit() to break out
> > > > > > of its sending loop at a specific rpc_rqst, rather than
> > > > > > draining the whole transmit queue.
> > > > > >
> > > > > > This enables the client to send just an RPC TLS probe and
> > > > > > then wait for the response before proceeding with the rest
> > > > > > of the queue.
> > > > >
> > > > > Signed-off-by: Chuck Lever <[email protected]>
> > > > > ---
> > > > >  include/linux/sunrpc/sched.h  |    2 ++
> > > > >  include/trace/events/sunrpc.h |    1 +
> > > > >  net/sunrpc/xprt.c             |    2 ++
> > > > >  3 files changed, 5 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/sunrpc/sched.h
> > > > > b/include/linux/sunrpc/sched.h
> > > > > index 599133fb3c63..f8c09638fa69 100644
> > > > > --- a/include/linux/sunrpc/sched.h
> > > > > +++ b/include/linux/sunrpc/sched.h
> > > > > @@ -125,6 +125,7 @@ struct rpc_task_setup {
> > > > >  #define RPC_TASK_TLSCRED               0x00000008      /*
> > > > > Use
> > > > > AUTH_TLS credential */
> > > > >  #define RPC_TASK_NULLCREDS             0x00000010      /*
> > > > > Use
> > > > > AUTH_NULL credential */
> > > > >  #define RPC_CALL_MAJORSEEN             0x00000020      /*
> > > > > major
> > > > > timeout seen */
> > > > > +#define RPC_TASK_CORK                  0x00000040      /*
> > > > > cork
> > > > > the
> > > > > xmit queue */
> > > > >  #define RPC_TASK_DYNAMIC               0x00000080      /*
> > > > > task
> > > > > was
> > > > > kmalloc'ed */
> > > > >  #define        RPC_TASK_NO_ROUND_ROBIN        
> > > > > 0x00000100    
> > > > > /*
> > > > > send requests on "main" xprt */
> > > > >  #define RPC_TASK_SOFT                  0x00000200      /*
> > > > > Use
> > > > > soft
> > > > > timeouts */
> > > > > @@ -137,6 +138,7 @@ struct rpc_task_setup {
> > > > >  
> > > > >  #define RPC_IS_ASYNC(t)                ((t)->tk_flags &
> > > > > RPC_TASK_ASYNC)
> > > > >  #define RPC_IS_SWAPPER(t)      ((t)->tk_flags &
> > > > > RPC_TASK_SWAPPER)
> > > > > +#define RPC_IS_CORK(t)         ((t)->tk_flags &
> > > > > RPC_TASK_CORK)
> > > > >  #define RPC_IS_SOFT(t)         ((t)->tk_flags &
> > > > > (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
> > > > >  #define RPC_IS_SOFTCONN(t)     ((t)->tk_flags &
> > > > > RPC_TASK_SOFTCONN)
> > > > >  #define RPC_WAS_SENT(t)                ((t)->tk_flags &
> > > > > RPC_TASK_SENT)
> > > > > diff --git a/include/trace/events/sunrpc.h
> > > > > b/include/trace/events/sunrpc.h
> > > > > index 811187c47ebb..e8d6adff1a50 100644
> > > > > --- a/include/trace/events/sunrpc.h
> > > > > +++ b/include/trace/events/sunrpc.h
> > > > > @@ -312,6 +312,7 @@ TRACE_EVENT(rpc_request,
> > > > >                 { RPC_TASK_TLSCRED, "TLSCRED"
> > > > > },                        \
> > > > >                 { RPC_TASK_NULLCREDS, "NULLCREDS"
> > > > > },                    \
> > > > >                 { RPC_CALL_MAJORSEEN, "MAJORSEEN"
> > > > > },                    \
> > > > > +               { RPC_TASK_CORK, "CORK"
> > > > > },                              \
> > > > >                 { RPC_TASK_DYNAMIC, "DYNAMIC"
> > > > > },                        \
> > > > >                 { RPC_TASK_NO_ROUND_ROBIN, "NO_ROUND_ROBIN"
> > > > > },          \
> > > > >                 { RPC_TASK_SOFT, "SOFT"
> > > > > },                              \
> > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > > > index 86d62cffba0d..4b303b945b51 100644
> > > > > --- a/net/sunrpc/xprt.c
> > > > > +++ b/net/sunrpc/xprt.c
> > > > > @@ -1622,6 +1622,8 @@ xprt_transmit(struct rpc_task *task)
> > > > >                 if (xprt_request_data_received(task) &&
> > > > >                     !test_bit(RPC_TASK_NEED_XMIT, &task-
> > > > > > tk_runstate))
> > > > >                         break;
> > > > > +               if (RPC_IS_CORK(task))
> > > > > +                       break;
> > > > >                 cond_resched_lock(&xprt->queue_lock);
> > > > >         }
> > > > >         spin_unlock(&xprt->queue_lock);
> > > > >
> > > > >
> > > >
> > > > This is entirely the wrong place for this kind of control
> > > > mechanism.
> > >
> > > I'm not sure I entirely understand your concern, so bear with
> > > me while I try to clarify.
> > >
> > >
> > > > > TLS vs not-TLS needs to be decided up front when we initialise
> > > > > the transport (i.e. at mount time or whenever the pNFS channels
> > > > > are set up). Otherwise, we're vulnerable to downgrade attacks.
> > >
> > > Downgrade attacks are prevented by using "xprtsec=tls" because
> > > in that case, transport creation fails if either the AUTH_TLS
> > > fails or the handshake fails.
> > >
> > > The TCP connection has to be established first, though. Then the
> > > client can send the RPC_AUTH_TLS probe, which is the same as the
> > > NULL ping that it already sends. That mechanism is independent
> > > of the lower layer transport (TCP in this case).
> > >
> > > Therefore, RPC traffic must be stoppered while the client:
> > >
> > > 1. waits for the AUTH_TLS probe's reply, and
> > >
> > > 2. waits for the handshake to complete
> > >
> > > Because an RPC message is involved in this interaction, I didn't
> > > see a way to implement it completely within xprtsock's TCP
> > > connection logic. IMO, driving the handshake has to be done by
> > > the generic RPC client.
> > >
> > > So, do you mean that I need to replace RPC_TASK_CORK with a
> > > special return code from xs_tcp_send_request() ?
> >
> >
> > > I mean the right mechanism for controlling whether or not the
> > > transport is ready to serve RPC requests is through the
> > > XPRT_CONNECTED flag. All the existing generic RPC error handling,
> > > congestion handling, etc depends on that flag being set correctly.
> > >
> > > Until the TLS socket has completed its handshake protocol and is
> > > ready to transmit data, it should not be declared connected. The
> > > distinction between the two states 'TCP is unconnected' and 'TLS
> > > handshake is incomplete' is a socket/transport setup detail as far
> > > as the RPC xprt layer is concerned: just another set of
> > > intermediate states between SYN_SENT and ESTABLISHED.
>
> First, TLS is technically an upper layer protocol. It's not
> part of the transport protocol. This is exactly how it's
> implemented in the Linux kernel. And, TLS works on transports
> other than TCP, so that makes it a reasonable candidate for
> treatment in the generic client rather than in a particular
> transport mechanism.

Sorry, but no! As far as the RPC layer is concerned, there is no
difference between a TLS socket and a TCP socket. The xprt layer should
not have to know or care about the existence of TLS other than as a
transport option to be configured at connection time.

>
> Second, the "intermediate states" would be /outside/ of SYN_SENT
> and ESTABLISHED. A TCP transport has to be in the ESTABLISHED
> state (ie, the transport's connection handshake has to be
> complete) before any TLS traffic can go over it.
>

My point is we don't give a damn about the intermediate states in the
RPC layer.

> Most importantly, the client has to send an RPC message first
> before it can start a TLS handshake. The RPC-with-TLS protocol
> specification requires that the handshake be preceded with the
> NULL AUTH_TLS request, which is an RPC. Otherwise, there's no
> way for the server end to know when to expect a handshake.
>

Sure, but those are 2 non-overlapping states. The socket is first in a
state where it needs to do a NULL ping using regular RPC/TCP. Then it
needs to do the TLS handshake. Then it transitions into the state where
it can act like any other transport.

> In today's RPC client, the underlying connection has to be in
> the XPRT_CONNECTED state before the RPC client can exchange any
> RPC transaction, including AUTH_TLS NULL.
>
> To make it work the way you've suggested, we would have to build
> a mechanism that could send the AUTH_TLS NULL and receive and
> parse its reply /before/ the client has put the transport into
> the XPRT_CONNECTED state, and that NULL request would have to
> be driven from inside the transport instance (not via the FSM
> where all other RPC traffic originates).
>
> Do you have any suggestions about how to make this last point
> less painful?

This isn't too different from what we already do with the rpcbind call
for performing port discovery. The only difference is that the NULL
ping needs to happen on the same transport as the one being constructed
and that it needs to happen after the TCP connection is complete.

So I'd suggest that TLS/TCP needs to be a different xprt_class than the
base TCP, then doing the whole "do-NULL-ping-and-TLS-handshake" in the
connect() callback for that new class.

The connect() callback can set up a private rpc client and do the NULL
call asynchronously just like we do in rpcb_getport_async(). When the
RPC call completes, we steal the resulting socket from that private rpc
client and kick off the TLS handshake on it. All that can be done in
the rpc_call_done callback (i.e. the equivalent of rpcb_getport_done).

Once the TLS handshake is done, you can set the XPRT_CONNECTED state
and call xprt_wake_pending_tasks().
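
The callback chain above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not kernel code: demo_xprt, demo_connect(), demo_probe_done() and demo_handshake_done() are hypothetical names, the probe and handshake results are passed in as plain status codes, and both "asynchronous" steps complete immediately. What happens on failure is not specified in this thread, so the model simply leaves the transport unconnected.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_xprt {
	bool connected;	/* models the XPRT_CONNECTED state */
	int pending;	/* tasks parked while the transport is not connected */
	int woken;	/* tasks released once setup has finished */
};

/* Models handshake completion: only now is the transport connected. */
static void demo_handshake_done(struct demo_xprt *xprt, int status)
{
	if (status != 0)
		return;		/* stay unconnected; tasks keep waiting */
	xprt->connected = true;
	xprt->woken += xprt->pending;	/* models waking the pending tasks */
	xprt->pending = 0;
}

/* Models the NULL probe's done callback: start the handshake on success. */
static void demo_probe_done(struct demo_xprt *xprt, int probe_status,
			    int handshake_status)
{
	if (probe_status != 0)
		return;		/* no handshake without a successful probe */
	/* In the model the handshake finishes right away. */
	demo_handshake_done(xprt, handshake_status);
}

/* Models the connect() callback of a hypothetical TLS-over-TCP class. */
static void demo_connect(struct demo_xprt *xprt, int probe_status,
			 int handshake_status)
{
	/* TCP is assumed established here; issue the probe next. */
	demo_probe_done(xprt, probe_status, handshake_status);
}
```

Tasks queued before setup finishes are released only from the handshake completion path, so nothing above the transport ever sees the intermediate states.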

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2022-04-21 21:18:17

by Trond Myklebust

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag

On Tue, 2022-04-19 at 18:16 +0000, Chuck Lever III wrote:
>
>
> > On Apr 18, 2022, at 10:57 PM, Trond Myklebust
> > <[email protected]> wrote:
> >
> > On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
> > > > Introduce a mechanism to cause xprt_transmit() to break out of
> > > > its sending loop at a specific rpc_rqst, rather than draining the
> > > > whole transmit queue.
> > > >
> > > > This enables the client to send just an RPC TLS probe and then
> > > > wait for the response before proceeding with the rest of the queue.
> > >
> > > Signed-off-by: Chuck Lever <[email protected]>
> > > ---
> > >  include/linux/sunrpc/sched.h  |    2 ++
> > >  include/trace/events/sunrpc.h |    1 +
> > >  net/sunrpc/xprt.c             |    2 ++
> > >  3 files changed, 5 insertions(+)
> > >
> > > diff --git a/include/linux/sunrpc/sched.h
> > > b/include/linux/sunrpc/sched.h
> > > index 599133fb3c63..f8c09638fa69 100644
> > > --- a/include/linux/sunrpc/sched.h
> > > +++ b/include/linux/sunrpc/sched.h
> > > @@ -125,6 +125,7 @@ struct rpc_task_setup {
> > >  #define RPC_TASK_TLSCRED               0x00000008      /* Use
> > > AUTH_TLS credential */
> > >  #define RPC_TASK_NULLCREDS             0x00000010      /* Use
> > > AUTH_NULL credential */
> > >  #define RPC_CALL_MAJORSEEN             0x00000020      /* major
> > > timeout seen */
> > > +#define RPC_TASK_CORK                  0x00000040      /* cork
> > > the
> > > xmit queue */
> > >  #define RPC_TASK_DYNAMIC               0x00000080      /* task
> > > was
> > > kmalloc'ed */
> > >  #define        RPC_TASK_NO_ROUND_ROBIN         0x00000100     
> > > /*
> > > send requests on "main" xprt */
> > >  #define RPC_TASK_SOFT                  0x00000200      /* Use
> > > soft
> > > timeouts */
> > > @@ -137,6 +138,7 @@ struct rpc_task_setup {
> > >  
> > >  #define RPC_IS_ASYNC(t)                ((t)->tk_flags &
> > > RPC_TASK_ASYNC)
> > >  #define RPC_IS_SWAPPER(t)      ((t)->tk_flags &
> > > RPC_TASK_SWAPPER)
> > > +#define RPC_IS_CORK(t)         ((t)->tk_flags & RPC_TASK_CORK)
> > >  #define RPC_IS_SOFT(t)         ((t)->tk_flags &
> > > (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
> > >  #define RPC_IS_SOFTCONN(t)     ((t)->tk_flags &
> > > RPC_TASK_SOFTCONN)
> > >  #define RPC_WAS_SENT(t)                ((t)->tk_flags &
> > > RPC_TASK_SENT)
> > > diff --git a/include/trace/events/sunrpc.h
> > > b/include/trace/events/sunrpc.h
> > > index 811187c47ebb..e8d6adff1a50 100644
> > > --- a/include/trace/events/sunrpc.h
> > > +++ b/include/trace/events/sunrpc.h
> > > @@ -312,6 +312,7 @@ TRACE_EVENT(rpc_request,
> > >                 { RPC_TASK_TLSCRED, "TLSCRED"
> > > },                        \
> > >                 { RPC_TASK_NULLCREDS, "NULLCREDS"
> > > },                    \
> > >                 { RPC_CALL_MAJORSEEN, "MAJORSEEN"
> > > },                    \
> > > +               { RPC_TASK_CORK, "CORK"
> > > },                              \
> > >                 { RPC_TASK_DYNAMIC, "DYNAMIC"
> > > },                        \
> > >                 { RPC_TASK_NO_ROUND_ROBIN, "NO_ROUND_ROBIN"
> > > },          \
> > >                 { RPC_TASK_SOFT, "SOFT"
> > > },                              \
> > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > index 86d62cffba0d..4b303b945b51 100644
> > > --- a/net/sunrpc/xprt.c
> > > +++ b/net/sunrpc/xprt.c
> > > @@ -1622,6 +1622,8 @@ xprt_transmit(struct rpc_task *task)
> > >                 if (xprt_request_data_received(task) &&
> > >                     !test_bit(RPC_TASK_NEED_XMIT, &task-
> > > > tk_runstate))
> > >                         break;
> > > +               if (RPC_IS_CORK(task))
> > > +                       break;
> > >                 cond_resched_lock(&xprt->queue_lock);
> > >         }
> > >         spin_unlock(&xprt->queue_lock);
> > >
> > >
> >
> > This is entirely the wrong place for this kind of control
> > mechanism.
>
> I'm not sure I entirely understand your concern, so bear with
> me while I try to clarify.
>
>
> > TLS vs not-TLS needs to be decided up front when we initialise the
> > transport (i.e. at mount time or whenever the pNFS channels are set
> > up). Otherwise, we're vulnerable to downgrade attacks.
>
> Downgrade attacks are prevented by using "xprtsec=tls" because
> in that case, transport creation fails if either the AUTH_TLS
> fails or the handshake fails.
>
> The TCP connection has to be established first, though. Then the
> client can send the RPC_AUTH_TLS probe, which is the same as the
> NULL ping that it already sends. That mechanism is independent
> of the lower layer transport (TCP in this case).
>
> Therefore, RPC traffic must be stoppered while the client:
>
> 1. waits for the AUTH_TLS probe's reply, and
>
> 2. waits for the handshake to complete
>
> Because an RPC message is involved in this interaction, I didn't
> see a way to implement it completely within xprtsock's TCP
> connection logic. IMO, driving the handshake has to be done by
> the generic RPC client.
>
> So, do you mean that I need to replace RPC_TASK_CORK with a
> special return code from xs_tcp_send_request() ?


I mean the right mechanism for controlling whether or not the transport
is ready to serve RPC requests is through the XPRT_CONNECTED flag. All
the existing generic RPC error handling, congestion handling, etc
depends on that flag being set correctly.

Until the TLS socket has completed its handshake protocol and is ready
to transmit data, it should not be declared connected. The distinction
between the two states 'TCP is unconnected' and 'TLS handshake is
incomplete' is a socket/transport setup detail as far as the RPC xprt
layer is concerned: just another set of intermediate states between
SYN_SENT and ESTABLISHED.
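
To illustrate the point about intermediate states, here is a minimal
userspace sketch in which the generic send path tests a single bit while
the transport tracks its own setup progress privately. Everything here
is an assumption made for illustration: MODEL_CONNECTED stands in for
XPRT_CONNECTED, and the gate_* names and setup states do not exist in
the kernel.

```c
#include <assert.h>
#include <errno.h>

/* Transport-private setup progress; invisible to the generic layer. */
enum setup_state {
	SETUP_TCP_CONNECTING,
	SETUP_TLS_HANDSHAKING,
	SETUP_READY,
};

#define MODEL_CONNECTED 0x1UL	/* stands in for XPRT_CONNECTED */

struct gate_xprt {
	unsigned long state;	/* the only field the generic layer reads */
	enum setup_state setup;	/* owned by the transport implementation */
};

/* Transport side: the bit is set only once the last setup step is done. */
static void gate_set_setup(struct gate_xprt *x, enum setup_state s)
{
	x->setup = s;
	if (s == SETUP_READY)
		x->state |= MODEL_CONNECTED;
	else
		x->state &= ~MODEL_CONNECTED;
	/* Invariant: the bit is set if and only if setup has finished. */
	assert(((x->state & MODEL_CONNECTED) != 0) == (s == SETUP_READY));
}

/* Generic side: does not care why the transport is not ready. */
static int gate_try_send(const struct gate_xprt *x)
{
	if (!(x->state & MODEL_CONNECTED))
		return -ENOTCONN;
	return 0;
}
```

In this model "TCP still connecting" and "TLS still handshaking" produce
the same -ENOTCONN result from gate_try_send(), which is the sense in
which the difference is a transport setup detail.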

> > Once we've decided that TLS is the right thing to do, then we
> > shouldn't
> > declare to the RPC layer that the TLS-enabled transport is
> > connected
> > until the underlying transport connection is established, and the
> > TLS
> > handshake is done.
>
> That logic is handled in patch 10/15.
>
> Reconnecting and re-establishing a TLS session is handled in
> patches 11/15 and 12/15. Again, if the transport's policy setting
> is "must use TLS" then the client ensures that a TLS session is in
> use before allowing more RPC traffic on the new connection.
>
>
> --
> Chuck Lever
>
>
>

--
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022

http://www.hammer.space

2022-04-22 18:01:38

by Chuck Lever III

Subject: Re: [PATCH RFC 08/15] SUNRPC: Add RPC_TASK_CORK flag



> On Apr 19, 2022, at 3:04 PM, Trond Myklebust <[email protected]> wrote:
>
> On Tue, 2022-04-19 at 18:16 +0000, Chuck Lever III wrote:
>>
>>
>>> On Apr 18, 2022, at 10:57 PM, Trond Myklebust
>>> <[email protected]> wrote:
>>>
>>> On Mon, 2022-04-18 at 12:52 -0400, Chuck Lever wrote:
>>>> Introduce a mechanism to cause xprt_transmit() to break out of its
>>>> sending loop at a specific rpc_rqst, rather than draining the whole
>>>> transmit queue.
>>>>
>>>> This enables the client to send just an RPC TLS probe and then wait
>>>> for the response before proceeding with the rest of the queue.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>> include/linux/sunrpc/sched.h | 2 ++
>>>> include/trace/events/sunrpc.h | 1 +
>>>> net/sunrpc/xprt.c | 2 ++
>>>> 3 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
>>>> index 599133fb3c63..f8c09638fa69 100644
>>>> --- a/include/linux/sunrpc/sched.h
>>>> +++ b/include/linux/sunrpc/sched.h
>>>> @@ -125,6 +125,7 @@ struct rpc_task_setup {
>>>> #define RPC_TASK_TLSCRED 0x00000008 /* Use AUTH_TLS credential */
>>>> #define RPC_TASK_NULLCREDS 0x00000010 /* Use AUTH_NULL credential */
>>>> #define RPC_CALL_MAJORSEEN 0x00000020 /* major timeout seen */
>>>> +#define RPC_TASK_CORK 0x00000040 /* cork the xmit queue */
>>>> #define RPC_TASK_DYNAMIC 0x00000080 /* task was kmalloc'ed */
>>>> #define RPC_TASK_NO_ROUND_ROBIN 0x00000100 /* send requests on "main" xprt */
>>>> #define RPC_TASK_SOFT 0x00000200 /* Use soft timeouts */
>>>> @@ -137,6 +138,7 @@ struct rpc_task_setup {
>>>>
>>>> #define RPC_IS_ASYNC(t) ((t)->tk_flags & RPC_TASK_ASYNC)
>>>> #define RPC_IS_SWAPPER(t) ((t)->tk_flags & RPC_TASK_SWAPPER)
>>>> +#define RPC_IS_CORK(t) ((t)->tk_flags & RPC_TASK_CORK)
>>>> #define RPC_IS_SOFT(t) ((t)->tk_flags & (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
>>>> #define RPC_IS_SOFTCONN(t) ((t)->tk_flags & RPC_TASK_SOFTCONN)
>>>> #define RPC_WAS_SENT(t) ((t)->tk_flags & RPC_TASK_SENT)
>>>> diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
>>>> index 811187c47ebb..e8d6adff1a50 100644
>>>> --- a/include/trace/events/sunrpc.h
>>>> +++ b/include/trace/events/sunrpc.h
>>>> @@ -312,6 +312,7 @@ TRACE_EVENT(rpc_request,
>>>> { RPC_TASK_TLSCRED, "TLSCRED" }, \
>>>> { RPC_TASK_NULLCREDS, "NULLCREDS" }, \
>>>> { RPC_CALL_MAJORSEEN, "MAJORSEEN" }, \
>>>> + { RPC_TASK_CORK, "CORK" }, \
>>>> { RPC_TASK_DYNAMIC, "DYNAMIC" }, \
>>>> { RPC_TASK_NO_ROUND_ROBIN, "NO_ROUND_ROBIN" }, \
>>>> { RPC_TASK_SOFT, "SOFT" }, \
>>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>>> index 86d62cffba0d..4b303b945b51 100644
>>>> --- a/net/sunrpc/xprt.c
>>>> +++ b/net/sunrpc/xprt.c
>>>> @@ -1622,6 +1622,8 @@ xprt_transmit(struct rpc_task *task)
>>>> if (xprt_request_data_received(task) &&
>>>> !test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate))
>>>> break;
>>>> + if (RPC_IS_CORK(task))
>>>> + break;
>>>> cond_resched_lock(&xprt->queue_lock);
>>>> }
>>>> spin_unlock(&xprt->queue_lock);
>>>>
>>>>
>>>
>>> This is entirely the wrong place for this kind of control
>>> mechanism.
>>
>> I'm not sure I entirely understand your concern, so bear with
>> me while I try to clarify.
>>
>>
>>> TLS vs not-TLS needs to be decided up front when we initialise the
>>> transport (i.e. at mount time or whenever the pNFS channels are set
>>> up). Otherwise, we're vulnerable to downgrade attacks.
>>
>> Downgrade attacks are prevented by using "xprtsec=tls" because
>> in that case, transport creation fails if either the AUTH_TLS
>> fails or the handshake fails.
>>
>> The TCP connection has to be established first, though. Then the
>> client can send the RPC_AUTH_TLS probe, which is the same as the
>> NULL ping that it already sends. That mechanism is independent
>> of the lower layer transport (TCP in this case).
>>
>> Therefore, RPC traffic must be stoppered while the client:
>>
>> 1. waits for the AUTH_TLS probe's reply, and
>>
>> 2. waits for the handshake to complete
>>
>> Because an RPC message is involved in this interaction, I didn't
>> see a way to implement it completely within xprtsock's TCP
>> connection logic. IMO, driving the handshake has to be done by
>> the generic RPC client.
>>
>> So, do you mean that I need to replace RPC_TASK_CORK with a
>> special return code from xs_tcp_send_request() ?
>
>
> I mean the right mechanism for controlling whether or not the transport
> is ready to serve RPC requests is through the XPRT_CONNECTED flag. All
> the existing generic RPC error handling, congestion handling, etc
> depends on that flag being set correctly.
>
> Until the TLS socket has completed its handshake protocol and is ready
> to transmit data, it should not be declared connected. The distinction
> between the two states 'TCP is unconnected' and 'TLS handshake is
> incomplete' is a socket/transport setup detail as far as the RPC xprt
> layer is concerned: just another set of intermediate states between
> SYN_SENT and ESTABLISHED.

First, TLS is technically an upper layer protocol. It's not
part of the transport protocol. This is exactly how it's
implemented in the Linux kernel. And, TLS works on transports
other than TCP, so that makes it a reasonable candidate for
treatment in the generic client rather than in a particular
transport mechanism.

Second, the "intermediate states" would be /outside/ of SYN_SENT
and ESTABLISHED. A TCP transport has to be in the ESTABLISHED
state (i.e., the transport's connection handshake has to be
complete) before any TLS traffic can go over it.

Most importantly, the client has to send an RPC message first
before it can start a TLS handshake. The RPC-with-TLS protocol
specification requires that the handshake be preceded by a
NULL AUTH_TLS request, which is an RPC. Otherwise, there's no
way for the server end to know when to expect a handshake.
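
For reference, a sketch of the check a client would make on the probe's
reply. It assumes the wire details of the RPC-with-TLS specification
(later published as RFC 9289): the probe's credential uses flavor
AUTH_TLS (7) with an empty body, and a willing server answers with an
AUTH_NONE verifier whose eight-byte body is the ASCII token "STARTTLS".
The struct and function names are hypothetical and are not the
kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Flavor numbers as assumed from the RPC-with-TLS specification. */
#define MODEL_AUTH_NONE	0u
#define MODEL_AUTH_TLS	7u

/* Hypothetical decoded form of an RPC opaque_auth field. */
struct model_opaque_auth {
	uint32_t flavor;
	uint32_t len;			/* length of body in bytes */
	const unsigned char *body;	/* may be NULL when len is 0 */
};

/* Credential carried by the NULL probe: AUTH_TLS with an empty body. */
static struct model_opaque_auth model_probe_cred(void)
{
	struct model_opaque_auth cred = { MODEL_AUTH_TLS, 0, NULL };

	return cred;
}

/* True only if the reply verifier tells the client to start TLS. */
static bool model_server_offers_tls(const struct model_opaque_auth *verf)
{
	static const char token[] = "STARTTLS";
	const size_t token_len = sizeof(token) - 1;	/* drop the NUL */

	assert(verf);
	if (verf->flavor != MODEL_AUTH_NONE)
		return false;
	if (verf->len != token_len || !verf->body)
		return false;
	return memcmp(verf->body, token, token_len) == 0;
}
```

A reply with an ordinary empty AUTH_NONE verifier makes
model_server_offers_tls() return false, so the client knows not to
begin a handshake the server is not expecting.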

In today's RPC client, the underlying connection has to be in
the XPRT_CONNECTED state before the RPC client can perform any
RPC transaction, including AUTH_TLS NULL.

To make it work the way you've suggested, we would have to build
a mechanism that could send the AUTH_TLS NULL and receive and
parse its reply /before/ the client has put the transport into
the XPRT_CONNECTED state, and that NULL request would have to
be driven from inside the transport instance (not via the FSM
where all other RPC traffic originates).

Do you have any suggestions about how to make this last point
less painful?


--
Chuck Lever