2023-07-20 19:27:30

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next] mptcp: fix rcv buffer auto-tuning

On Thu, Jul 20, 2023 at 8:48 PM Matthieu Baerts
<[email protected]> wrote:
>
> From: Paolo Abeni <[email protected]>
>
> The MPTCP code uses the assumption that the tcp_win_from_space() helper
> does not use any TCP-specific field, and thus works correctly operating
> on an MPTCP socket.
>
> The commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
> broke such assumption, and as a consequence most MPTCP connections stall
> on zero-window event due to auto-tuning changing the rcv buffer size
> quite randomly.
>
> Address the issue syncing again the MPTCP auto-tuning code with the TCP
> one. To achieve that, factor out the windows size logic in socket
> independent helpers, and reuse them in mptcp_rcv_space_adjust(). The
> MPTCP level scaling_ratio is selected as the minimum one from the all
> the subflows, as a worst-case estimate.
>
> Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
> Signed-off-by: Paolo Abeni <[email protected]>
> Co-developed-by: Matthieu Baerts <[email protected]>
> Signed-off-by: Matthieu Baerts <[email protected]>
> ---
> include/net/tcp.h | 20 +++++++++++++++-----
> net/mptcp/protocol.c | 15 +++++++--------
> net/mptcp/protocol.h | 8 +++++++-
> net/mptcp/subflow.c | 2 +-
> 4 files changed, 30 insertions(+), 15 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index c5fb90079920..794642fbd724 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1430,22 +1430,32 @@ void tcp_select_initial_window(const struct sock *sk, int __space,
> __u32 *window_clamp, int wscale_ok,
> __u8 *rcv_wscale, __u32 init_rcv_wnd);
>
> -static inline int tcp_win_from_space(const struct sock *sk, int space)
> +static inline int __tcp_win_from_space(u8 scaling_ratio, int space)
> {
> - s64 scaled_space = (s64)space * tcp_sk(sk)->scaling_ratio;
> + s64 scaled_space = (s64)space * scaling_ratio;
>
> return scaled_space >> TCP_RMEM_TO_WIN_SCALE;
> }
>
> -/* inverse of tcp_win_from_space() */
> -static inline int tcp_space_from_win(const struct sock *sk, int win)
> +static inline int tcp_win_from_space(const struct sock *sk, int space)

Maybe in a follow up patch we could change the prototype of this helper
to avoid future mis use :)

static inline int tcp_win_from_space(const struct tcp_sock *tp, int space)
{
}

Reviewed-by: Eric Dumazet <[email protected]>


> +{
> + return __tcp_win_from_space(tcp_sk(sk)->scaling_ratio, space);
> +}
> +
> +/* inverse of __tcp_win_from_space() */
> +static inline int __tcp_space_from_win(u8 scaling_ratio, int win)
> {
> u64 val = (u64)win << TCP_RMEM_TO_WIN_SCALE;
>
> - do_div(val, tcp_sk(sk)->scaling_ratio);
> + do_div(val, scaling_ratio);
> return val;
> }
>
> +static inline int tcp_space_from_win(const struct sock *sk, int win

Same remark here.

Thanks for the fix !


2023-07-20 19:29:33

by Soheil Hassas Yeganeh

[permalink] [raw]
Subject: Re: [PATCH net-next] mptcp: fix rcv buffer auto-tuning

On Thu, Jul 20, 2023 at 3:07 PM Eric Dumazet <[email protected]> wrote:
>
> On Thu, Jul 20, 2023 at 8:48 PM Matthieu Baerts
> <[email protected]> wrote:
> >
> > From: Paolo Abeni <[email protected]>
> >
> > The MPTCP code uses the assumption that the tcp_win_from_space() helper
> > does not use any TCP-specific field, and thus works correctly operating
> > on an MPTCP socket.
> >
> > The commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
> > broke such assumption, and as a consequence most MPTCP connections stall
> > on zero-window event due to auto-tuning changing the rcv buffer size
> > quite randomly.
> >
> > Address the issue syncing again the MPTCP auto-tuning code with the TCP
> > one. To achieve that, factor out the windows size logic in socket
> > independent helpers, and reuse them in mptcp_rcv_space_adjust(). The
> > MPTCP level scaling_ratio is selected as the minimum one from the all
> > the subflows, as a worst-case estimate.
> >
> > Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
> > Signed-off-by: Paolo Abeni <[email protected]>
> > Co-developed-by: Matthieu Baerts <[email protected]>
> > Signed-off-by: Matthieu Baerts <[email protected]>
> > ---
> > include/net/tcp.h | 20 +++++++++++++++-----
> > net/mptcp/protocol.c | 15 +++++++--------
> > net/mptcp/protocol.h | 8 +++++++-
> > net/mptcp/subflow.c | 2 +-
> > 4 files changed, 30 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index c5fb90079920..794642fbd724 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -1430,22 +1430,32 @@ void tcp_select_initial_window(const struct sock *sk, int __space,
> > __u32 *window_clamp, int wscale_ok,
> > __u8 *rcv_wscale, __u32 init_rcv_wnd);
> >
> > -static inline int tcp_win_from_space(const struct sock *sk, int space)
> > +static inline int __tcp_win_from_space(u8 scaling_ratio, int space)
> > {
> > - s64 scaled_space = (s64)space * tcp_sk(sk)->scaling_ratio;
> > + s64 scaled_space = (s64)space * scaling_ratio;
> >
> > return scaled_space >> TCP_RMEM_TO_WIN_SCALE;
> > }
> >
> > -/* inverse of tcp_win_from_space() */
> > -static inline int tcp_space_from_win(const struct sock *sk, int win)
> > +static inline int tcp_win_from_space(const struct sock *sk, int space)
>
> Maybe in a follow up patch we could change the prototype of this helper
> to avoid future mis use :)
>
> static inline int tcp_win_from_space(const struct tcp_sock *tp, int space)
> {
> }
>
> Reviewed-by: Eric Dumazet <[email protected]>

Acked-by: Soheil Hassas Yeganeh <[email protected]>

Thank you for the fix!

> > +{
> > + return __tcp_win_from_space(tcp_sk(sk)->scaling_ratio, space);
> > +}
> > +
> > +/* inverse of __tcp_win_from_space() */
> > +static inline int __tcp_space_from_win(u8 scaling_ratio, int win)
> > {
> > u64 val = (u64)win << TCP_RMEM_TO_WIN_SCALE;
> >
> > - do_div(val, tcp_sk(sk)->scaling_ratio);
> > + do_div(val, scaling_ratio);
> > return val;
> > }
> >
> > +static inline int tcp_space_from_win(const struct sock *sk, int win
>
> Same remark here.
>
> Thanks for the fix !