Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp2344551pxu; Sun, 13 Dec 2020 23:06:40 -0800 (PST) X-Google-Smtp-Source: ABdhPJwhp/w6cjeEQjxQiglTvUqYv4Wud5UIm6p1ah89EbvIiNkOjrcpzk2MqHEpunNT60nMLbEl X-Received: by 2002:a17:906:fa8b:: with SMTP id lt11mr21456985ejb.94.1607929600262; Sun, 13 Dec 2020 23:06:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1607929600; cv=none; d=google.com; s=arc-20160816; b=F+tMMEAmrC0OJybDnvb1LuLEA3DbKjUMi75Z9qaab8wFz1ufHYJSSb1YkRZVIsCEen tipK19qxelbnQN1kGFwoxls1oVQ5cPReKB/3qQcS+0DL+BJQSAZ5kRfnOFnNBapjr9gz EZW6A2g7uvXPbGfffyqJIwttDB56e23IlyiJX2GlFrGh6wr1XLUfHwK+KwPKIHxQkj0P gFM4WHdPpy5WMQb55GBF5JX/CZFS23AbIYJ+2a2fiPzFFLHC87qQ3ru9fRKfqrLWBJmB e62E3mvsfUy4TaIG7jbeyXZVMyQ4ko7fTrSLXx37IAXVImyGEtUvnHaNzTV+bebBu60t 85gw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=qGFtyYlVmI62XiTpOKeYXRAQe6g42lFPimyQ1u1WOMA=; b=1ChWbQC1hd23DUNrZSXJGgulkfwX1DTSf+zQyVSbuGZz2KCPiP6yNee04BH/7Es8xS eqSuktCf4xFw/32TZOoc0PsJ2Xu4fpf5gI91A5FeI7QuksQaJf0woyslVbcF/GoS6G6+ qF6jnKTsBUBsTwB2Sb6MY9fyOkPKs/6LS5aMvG6SbjemqT+7IdeMVTLeGAZRe7AtbXdX J+H6yQldhXhmPLfvpIU0KvodXwFzDohnlD9SrRPGN/x/toZtelXHn0yTAHNbbslyJ1wX 4mdn+6eQv0yCJatdqre+KqzwQZRDbnTpJEsGo37YjnkQeemMITI10LhOkAB2huI2doF8 2EoA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=iNMUmWuB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c23si9516051edv.521.2020.12.13.23.06.18; Sun, 13 Dec 2020 23:06:40 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=iNMUmWuB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2407722AbgLLSfd (ORCPT + 99 others); Sat, 12 Dec 2020 13:35:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2407704AbgLLSfd (ORCPT ); Sat, 12 Dec 2020 13:35:33 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 37F0EC0613CF for ; Sat, 12 Dec 2020 10:34:53 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id q18so4815170wrn.1 for ; Sat, 12 Dec 2020 10:34:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=qGFtyYlVmI62XiTpOKeYXRAQe6g42lFPimyQ1u1WOMA=; b=iNMUmWuBoFJGDpNE9UAlebZ02dvNng1fcXaL1Wvi+k0jhUhxO2dc7cgGDXmNTFqck3 cd8Owu1ezY7Zd7N1+UENX0nTMV+lcojO5X/CNWNkNLKsuEn260UJeH4vjxSC9gaglAKr s9Zo86YeoaSWsbCb7TUadOz4hDDIDfzf7I/+zYXR0bZEMIhS2PP0clOKY8ixQ4WgixRB CQBeB+Sb2MeLcjDEqrYRRj4HOWvXHaCwuNH2MLpSpkazBYuJMKs9fltgSuVYSWFOqWId 5UV3hUowVAm6z9ZEj2oDoM2EL7tcKmJkVmxQyTC8dIUEDzMxqhfPejh6RvbG1PV/+067 G/FA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=qGFtyYlVmI62XiTpOKeYXRAQe6g42lFPimyQ1u1WOMA=; b=YtYXLS964ylSyl+GY+A3K4v+885i49b5PBzDR+cNvsXCw/NepUFOmsph3M2Gt41Twd I7ryllSdMlrplNX8PbbXo6SSsznh7utVal4jmVS3EH+/dI9NW85yHeuYY43BnPIOolGp ozut/C6K5JRtnrsDCdfOB2L5U1gGxqP1cqCSGYdGpJO/YmNqrm4jgZ/adbYiOgweODrI Om4Mj84Le6u3DqTYr8d68jVkHv1tTHQW8GMFJeWW/seb6kUoje8A1zMiCgS6ue9zyMTt J6Fqim8uafH7XFKoEF51brU9CDhw8emOF0tPprgh/OiIRDa1RM71no2YKcurAh4/iyx3 v6Ww== X-Gm-Message-State: AOAM5336zlZ0Y7CgXTKV4I7j+DIhvmZGDhwB9Ggaa7tiGE6vzpwI5hVv hsMuj6A940txrmaANO5MuqftnTIITRKPVDTMEIbqrg== X-Received: by 2002:adf:dc8b:: with SMTP id r11mr20694608wrj.131.1607798091560; Sat, 12 Dec 2020 10:34:51 -0800 (PST) MIME-Version: 1.0 References: <160773649920.2387.14668844101686155199.stgit@localhost.localdomain> In-Reply-To: <160773649920.2387.14668844101686155199.stgit@localhost.localdomain> From: Yuchung Cheng Date: Sat, 12 Dec 2020 10:34:14 -0800 Message-ID: Subject: Re: [net-next PATCH] tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit To: Alexander Duyck Cc: Eric Dumazet , David Miller , Jakub Kicinski , Alexey Kuznetsov , Hideaki YOSHIFUJI , netdev , LKML , Martin Lau , Kernel Team Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Dec 11, 2020 at 5:28 PM Alexander Duyck wrote: > > From: Alexander Duyck > > There are cases where a fastopen SYN may trigger either a ICMP_TOOBIG > message in the case of IPv6 or a fragmentation request in the case of > IPv4. This results in the socket stalling for a second or more as it does > not respond to the message by retransmitting the SYN frame. > > Normally a SYN frame should not be able to trigger a ICMP_TOOBIG or > ICMP_FRAG_NEEDED however in the case of fastopen we can have a frame that > makes use of the entire MSS. In the case of fastopen it does, and an > additional complication is that the retransmit queue doesn't contain the > original frames. As a result when tcp_simple_retransmit is called and > walks the list of frames in the queue it may not mark the frames as lost > because both the SYN and the data packet each individually are smaller than > the MSS size after the adjustment. This results in the socket being stalled > until the retransmit timer kicks in and forces the SYN frame out again > without the data attached. > > In order to resolve this we can generate our best estimate for the original > packet size by detecting the fastopen SYN frame and then adding the > overhead for MAX_TCP_OPTION_SPACE and verifying if the SYN w/ data would > have exceeded the MSS. If so we can mark the frame as lost and retransmit > it. > > Signed-off-by: Alexander Duyck > --- > net/ipv4/tcp_input.c | 30 +++++++++++++++++++++++++++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 9e8a6c1aa019..79375b58de84 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -2686,11 +2686,35 @@ static void tcp_mtup_probe_success(struct sock *sk) > void tcp_simple_retransmit(struct sock *sk) > { > const struct inet_connection_sock *icsk = inet_csk(sk); > + struct sk_buff *skb = tcp_rtx_queue_head(sk); > struct tcp_sock *tp = tcp_sk(sk); > - struct sk_buff *skb; > - unsigned int mss = tcp_current_mss(sk); > + unsigned int mss; > + > + /* A fastopen SYN request is stored as two separate packets within > + * the retransmit queue, this is done by tcp_send_syn_data(). > + * As a result simply checking the MSS of the frames in the queue > + * will not work for the SYN packet. So instead we must make a best > + * effort attempt by validating the data frame with the mss size > + * that would be computed now by tcp_send_syn_data and comparing > + * that against the data frame that would have been included with > + * the SYN. > + */ > + if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN && tp->syn_data) { > + struct sk_buff *syn_data = skb_rb_next(skb); > + > + mss = tcp_mtu_to_mss(sk, icsk->icsk_pmtu_cookie) + > + tp->tcp_header_len - sizeof(struct tcphdr) - > + MAX_TCP_OPTION_SPACE; nice comment! The original syn_data mss needs to be inferred which is a hassle to get right. my sense is path-mtu issue is enough to warrant they are lost. I suggest simply mark syn & its data lost if tcp_simple_retransmit is called during TFO handshake, i.e. diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 62f7aabc7920..7f0c4f2947eb 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2864,7 +2864,8 @@ void tcp_simple_retransmit(struct sock *sk) unsigned int mss = tcp_current_mss(sk); skb_rbtree_walk(skb, &sk->tcp_rtx_queue) { - if (tcp_skb_seglen(skb) > mss) + if (tcp_skb_seglen(skb) > mss || + (tp->syn_data && sk->sk_state == TCP_SYN_SENT)) tcp_mark_skb_lost(sk, skb); } We have a TFO packetdrill test that verifies my suggested fix should trigger an immediate retransmit vs 1s wait. > > - skb_rbtree_walk(skb, &sk->tcp_rtx_queue) { > + if (syn_data && syn_data->len > mss) > + tcp_mark_skb_lost(sk, skb); > + > + skb = syn_data; > + } else { > + mss = tcp_current_mss(sk); > + } > + > + skb_rbtree_walk_from(skb) { > if (tcp_skb_seglen(skb) > mss) > tcp_mark_skb_lost(sk, skb); > } > >