2024-04-20 10:48:20

by Lex Siegel

[permalink] [raw]
Subject: [PATCH] xprtsock: Fix a loop in xs_tcp_setup_socket()

When using a bpf on kernel_connect(), the call can return -EPERM.
This causes xs_tcp_setup_socket() to loop forever, filling up the
syslog and causing the kernel to freeze up.

Signed-off-by: Lex Siegel <[email protected]>
---
net/sunrpc/xprtsock.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index bb9b747d58a1..47b254806a08 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2446,6 +2446,8 @@ static void xs_tcp_setup_socket(struct work_struct *work)
/* Happens, for instance, if the user specified a link
* local IPv6 address without a scope-id.
*/
+ case -EPERM:
+ /* Happens, for instance, if a bpf is preventing the connect */
case -ECONNREFUSED:
case -ECONNRESET:
case -ENETDOWN:
--
2.39.3



2024-04-20 11:06:21

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] xprtsock: Fix a loop in xs_tcp_setup_socket()

On Sat, 2024-04-20 at 19:48 +0900, Lex Siegel wrote:
> When using a bpf on kernel_connect(), the call can return -EPERM.
> This causes xs_tcp_setup_socket() to loop forever, filling up the
> syslog and causing the kernel to freeze up.
>
> Signed-off-by: Lex Siegel <[email protected]>
> ---
> net/sunrpc/xprtsock.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index bb9b747d58a1..47b254806a08 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2446,6 +2446,8 @@ static void xs_tcp_setup_socket(struct work_struct *work)
> /* Happens, for instance, if the user specified a link
> * local IPv6 address without a scope-id.
> */
> + case -EPERM:
> + /* Happens, for instance, if a bpf is preventing the connect */
> case -ECONNREFUSED:
> case -ECONNRESET:
> case -ENETDOWN:

Yeah, I think we'd consider -EPERM a permanent error.

Reviewed-by: Jeff Layton <[email protected]>

2024-04-21 22:35:09

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] xprtsock: Fix a loop in xs_tcp_setup_socket()

On Sat, 2024-04-20 at 19:48 +0900, Lex Siegel wrote:
> [You don't often get email from [email protected]. Learn why this is
> important at https://aka.ms/LearnAboutSenderIdentification ]
>
> When using a bpf on kernel_connect(), the call can return -EPERM.
> This causes xs_tcp_setup_socket() to loop forever, filling up the
> syslog and causing the kernel to freeze up.
>
> Signed-off-by: Lex Siegel <[email protected]>
> ---
>  net/sunrpc/xprtsock.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index bb9b747d58a1..47b254806a08 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2446,6 +2446,8 @@ static void xs_tcp_setup_socket(struct
> work_struct *work)
>                 /* Happens, for instance, if the user specified a
> link
>                  * local IPv6 address without a scope-id.
>                  */
> +       case -EPERM:
> +               /* Happens, for instance, if a bpf is preventing the
> connect */
>         case -ECONNREFUSED:
>         case -ECONNRESET:
>         case -ENETDOWN:
> --
> 2.39.3
>

This looks as if it will have some rather subtle interactions if the
RPC_TASK_SOFTCONN flag is set. Have you tested this kind of scenario?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2024-04-21 23:23:04

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] xprtsock: Fix a loop in xs_tcp_setup_socket()

On Sat, 20 Apr 2024, Lex Siegel wrote:
> When using a bpf on kernel_connect(), the call can return -EPERM.
> This causes xs_tcp_setup_socket() to loop forever, filling up the
> syslog and causing the kernel to freeze up.
>
> Signed-off-by: Lex Siegel <[email protected]>
> ---
> net/sunrpc/xprtsock.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index bb9b747d58a1..47b254806a08 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2446,6 +2446,8 @@ static void xs_tcp_setup_socket(struct work_struct *work)
> /* Happens, for instance, if the user specified a link
> * local IPv6 address without a scope-id.
> */
> + case -EPERM:
> + /* Happens, for instance, if a bpf is preventing the connect */

This will propagate -EPERM up into other layers which might not be ready
to handle it.
It might be safer to map EPERM to an error we would be more likely to
expect from the network system - such as ECONNREFUSED or ENETDOWN.

Better still would be for kernel_connect() to return a more normal error
code - not EPERM. If that cannot be achieved, then I think it would be
best for the sunrpc code to map EPERM to something else at the place
where kernel_connect() is called - catch it early.

NeilBrown


> case -ECONNREFUSED:
> case -ECONNRESET:
> case -ENETDOWN:
> --
> 2.39.3
>
>


2024-04-22 03:45:10

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] xprtsock: Fix a loop in xs_tcp_setup_socket()

On Mon, 22 Apr 2024, Lex Siegel wrote:
> > Better still would be for kernel_connect() to return a more normal error
> > code - not EPERM. If that cannot be achieved, then I think it would be
> > best for the sunrpc code to map EPERM to something else at the place
> > where kernel_connect() is called - catch it early.
>
> The question is whether a permission error, EPERM, should cause a retry or
> return. Currently xs_tcp_setup_socket() is retrying. For the retry to clear,
> the connect call will have to not return a permission error to halt the retry
> attempts.
>
> This is a default behavior because EPERM is not an explicit case of the switch
> statement. Because bpf appropriately uses EPERM to show that the kernel_connect
> was not permitted, it highlights the return handling for this case is missing.
> It is unlikely that retry was ever the intended result.
>
> Upstream, the bpf that caused this is at:
> https://github.com/cilium/cilium/blob/v1.15/bpf/bpf_sock.c#L336
>
> This cilium bpf code has two return statuses, EPERM and ENXIO, that fall
> through to the default case of retrying. Here, cilium expects both of these
> statuses to indicate the connect failed. A retry is not the intended result.
>
> Handling this case without a retry aligns this code with the udp behavior. This
> precedence for passing EPERM back up the stack was set in 3dedbb5ca10ef.
>
> I will amend my patch to include an explicit case for ENXIO as well, as this is
> also in cilium's bpf and will cause the same bug to occur.
>

I think it should be up to cilium to report an errno that the kernel
understands, not up to the kernel to understand whatever errno cilium
chooses to return.

I don't think EPERM or ENXIO are appropriate errors for network
problems.
EHOSTUNREACH or ECONNREFUSED would make much more sense.

NeilBrown


>
> On Mon, Apr 22, 2024 at 8:22 AM NeilBrown <[email protected]> wrote:
> >
> > On Sat, 20 Apr 2024, Lex Siegel wrote:
> > > When using a bpf on kernel_connect(), the call can return -EPERM.
> > > This causes xs_tcp_setup_socket() to loop forever, filling up the
> > > syslog and causing the kernel to freeze up.
> > >
> > > Signed-off-by: Lex Siegel <[email protected]>
> > > ---
> > > net/sunrpc/xprtsock.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index bb9b747d58a1..47b254806a08 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -2446,6 +2446,8 @@ static void xs_tcp_setup_socket(struct work_struct *work)
> > > /* Happens, for instance, if the user specified a link
> > > * local IPv6 address without a scope-id.
> > > */
> > > + case -EPERM:
> > > + /* Happens, for instance, if a bpf is preventing the connect */
> >
> > This will propagate -EPERM up into other layers which might not be ready
> > to handle it.
> > It might be safer to map EPERM to an error we would be more likely to
> > expect from the network system - such as ECONNREFUSED or ENETDOWN.
> >
> > Better still would be for kernel_connect() to return a more normal error
> > code - not EPERM. If that cannot be achieved, then I think it would be
> > best for the sunrpc code to map EPERM to something else at the place
> > where kernel_connect() is called - catch it early.
> >
> > NeilBrown
> >
> >
> > > case -ECONNREFUSED:
> > > case -ECONNRESET:
> > > case -ENETDOWN:
> > > --
> > > 2.39.3
> > >
> > >
> >
>