Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp1637245pxu; Sat, 12 Dec 2020 21:30:21 -0800 (PST) X-Google-Smtp-Source: ABdhPJx8fIi7wlVgRVSUC3jRDEGUfXeiFs4ZbpxRiwk7LNyxEa4nk/Ja2CznCIOwAPfK5IOETOLz X-Received: by 2002:a17:906:f0d0:: with SMTP id dk16mr17512877ejb.144.1607837421483; Sat, 12 Dec 2020 21:30:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1607837421; cv=none; d=google.com; s=arc-20160816; b=h0KLzfTV+/qv6D+PxCsO8DF7nz6VIDhEi9+NVhasqe9StSkDQwjSMImC68Tv9rO1X/ tQrBS/9G+PQnaWvoVvoEP0HwGTMucmfm5cFuXgfhxPqBWEeRdb3G3pAqcFz9GOMT9JyS sfODOvGzVJiPFx4LDRPMxjFvIXGYXere/txk80uGQowQPzoOMrCwur0MeLcX5bx/uu3d zMgwuHK00Hh07w+Ze89tsyIF0CT9US8n8GIFJ5mf/lQOexoDregF9r4D5cbIxnO8/ydD vTRN+KMlrSSmfa2L7rjbLaSk1EPRjqtQksINsdhq81EZpvAy2AAdes9BkYBRktzKyUlM 6KlQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=kHtjDDH7GvTHo4bSQkbyfDs6WmHuTRZuUmLBj4I09mM=; b=s85U1I24xB0k0S7kxULP/qc/aNROEKFWCLaZVF0gCcTcuLQel2U+xUCxpVesKR3e1M J4/MB/sqtzEtZOaJJlBsyQoHbm7V+9aqTbbd5Fe9HpC7bXU9+FslH3PwJNy4Fc5hM1np IHi5tZcpcCtVpOD5F/7Kf6gH0WSM3dJgwlheY90dhdKWgvRO7jOsPkTzZrooKEaihbB6 JmBChHQ9/TnCNGBHw7ZnZ23j0Jzt+5ZRTiDzfFlZT7QdNFrAyJ982ptxIVBOYbN23INH Y+RNbIl26GhZ1S3KJUADKaMA/kRAd0IM9NBTsWgCMqwHgxdiBgQXiPwrsbX9A1KBNK6p o3FQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=ZOYpIIJh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b8si7775623edx.538.2020.12.12.21.29.58; Sat, 12 Dec 2020 21:30:21 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=ZOYpIIJh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2407054AbgLKWyr (ORCPT + 99 others); Fri, 11 Dec 2020 17:54:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2394011AbgLKWyk (ORCPT ); Fri, 11 Dec 2020 17:54:40 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4655CC0613CF for ; Fri, 11 Dec 2020 14:54:00 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id g185so10012206wmf.3 for ; Fri, 11 Dec 2020 14:54:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=kHtjDDH7GvTHo4bSQkbyfDs6WmHuTRZuUmLBj4I09mM=; b=ZOYpIIJhdkjIiQrCDklbAJoCGhqr2KCzTSQCU7kEVzQ2UEt0EEFBiz6tWBA4FGQQxV zjLKwy1NJpfNzHOQ9525bmhx6NUFRRLkgJBxTvN9eOVo2OJ/5a7mvmpstAVjteuGu2TT MKbbWwhIWHog9cQIiXzuei0HHyPDdARN994uPx0h4nQKujNwwlqM28TD7I4wbqC+pQPe J3YFcJimPtUDOzfug20srnJ0J7LWGyqxGtOv9VUFOwQ5bs04TalKW4xynLxZjwuC2MiK 1L2HFvrimunl74Eo77FPdWNYpZo8phU+MlexnX4QMczOi0Gaym3eS5gnfvHCEo4BTJWt C5yA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kHtjDDH7GvTHo4bSQkbyfDs6WmHuTRZuUmLBj4I09mM=; b=AHUI/pPUkSicXfJr0qKZUqq6CdsyyF5+UwNj+XzJe9nSf1skA0PTNbO4MXmqydZye0 8pUreI3Sxsil6rjP1iieAv66g2ZED60Kvm+Imeh/dOE0sbOMZxZp2J++M4Otz4VzUwht XEYIftWRNWis55xQYJs094+NAYkO/DtLDhzKlGjRv7A/yXMO0vLrfeXURcQ6v9xDM60o Bj2VmwRmKszH0oKN02XGn1GaQip/2Fw4knkeJ+bOlKwgqoaNucReFnXU35QjqHidsOFI shin3D8OJXgJApQRDLF6tzaEFthwTlBcUgFN7TX7Ak7JKZBMzCPo9n/TnX0MR/PYPmYc WlgQ== X-Gm-Message-State: AOAM531OnQId9LyuTD4ugBa0jjoSlZt9I/GSAFmih0TFmlwixbqoq0Wx l87/xj7j8xk/0zaJHm30feADXq/5SXJVr8gt4oKOmQ== X-Received: by 2002:a1c:40c:: with SMTP id 12mr15508379wme.40.1607727238734; Fri, 11 Dec 2020 14:53:58 -0800 (PST) MIME-Version: 1.0 References: <160765171921.6905.7897898635812579754.stgit@localhost.localdomain> In-Reply-To: From: Yuchung Cheng Date: Fri, 11 Dec 2020 14:53:21 -0800 Message-ID: Subject: Re: [net PATCH] tcp: Mark fastopen SYN packet as lost when receiving ICMP_TOOBIG/ICMP_FRAG_NEEDED To: Alexander Duyck Cc: Eric Dumazet , David Miller , Jakub Kicinski , Alexey Kuznetsov , Hideaki YOSHIFUJI , netdev , LKML , Martin KaFai Lau , kernel-team Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Dec 11, 2020 at 1:51 PM Alexander Duyck wrote: > > On Fri, Dec 11, 2020 at 11:18 AM Eric Dumazet wrote: > > > > On Fri, Dec 11, 2020 at 6:15 PM Alexander Duyck > > wrote: > > > > > > On Fri, Dec 11, 2020 at 8:22 AM Eric Dumazet wrote: > > > > > > > > On Fri, Dec 11, 2020 at 5:03 PM Alexander Duyck > > > > wrote: > > > > > > > > > That's fine. I can target this for net-next. I had just selected net > > > > > since I had considered it a fix, but I suppose it could be considered > > > > > a behavioral change. > > > > > > > > We are very late in the 5.10 cycle, and we never handled ICMP in this > > > > state, so net-next is definitely better. > > > > > > > > Note that RFC 7413 states in 4.1.3 : > > > > > > > > The client MUST cache cookies from servers for later Fast Open > > > > connections. For a multihomed client, the cookies are dependent on > > > > the client and server IP addresses. Hence, the client should cache > > > > at most one (most recently received) cookie per client and server IP > > > > address pair. > > > > > > > > When caching cookies, we recommend that the client also cache the > > > > Maximum Segment Size (MSS) advertised by the server. The client can > > > > cache the MSS advertised by the server in order to determine the > > > > maximum amount of data that the client can fit in the SYN packet in > > > > subsequent TFO connections. Caching the server MSS is useful > > > > because, with Fast Open, a client sends data in the SYN packet before > > > > the server announces its MSS in the SYN-ACK packet. If the client > > > > sends more data in the SYN packet than the server will accept, this > > > > will likely require the client to retransmit some or all of the data. > > > > Hence, caching the server MSS can enhance performance. > > > > > > > > Without a cached server MSS, the amount of data in the SYN packet is > > > > limited to the default MSS of 536 bytes for IPv4 [RFC1122] and 1220 > > > > bytes for IPv6 [RFC2460]. Even if the client complies with this > > > > limit when sending the SYN, it is known that an IPv4 receiver > > > > advertising an MSS less than 536 bytes can receive a segment larger > > > > than it is expecting. > > > > > > > > If the cached MSS is larger than the typical size (1460 bytes for > > > > IPv4 or 1440 bytes for IPv6), then the excess data in the SYN packet > > > > may cause problems that offset the performance benefit of Fast Open. > > > > For example, the unusually large SYN may trigger IP fragmentation and > > > > may confuse firewalls or middleboxes, causing SYN retransmission and > > > > other side effects. Therefore, the client MAY limit the cached MSS > > > > to 1460 bytes for IPv4 or 1440 for IPv6. > > > > > > > > > > > > Relying on ICMP is fragile, since they can be filtered in some way. > > > > > > In this case I am not relying on the ICMP, but thought that since I > > > have it I should make use of it. WIthout the ICMP we would still just > > > be waiting on the retransmit timer. > > > > > > The problem case has a v6-in-v6 tunnel between the client and the > > > endpoint so both ends assume an MTU 1500 and advertise a 1440 MSS > > > which works fine until they actually go to send a large packet between > > > the two. At that point the tunnel is triggering an ICMP_TOOBIG and the > > > endpoint is stalling since the MSS is dropped to 1400, but the SYN and > > > data payload were already smaller than that so no retransmits are > > > being triggered. This results in TFO being 1s slower than non-TFO > > > because of the failure to trigger the retransmit for the frame that > > > violated the PMTU. The patch is meant to get the two back into > > > comparable times. > > > > Okay... Have you studied why tcp_v4_mtu_reduced() (and IPv6 equivalent) > > code does not yet handle the retransmit in TCP_SYN_SENT state ? > > The problem lies in tcp_simple_retransmit(). Specifically the loop at > the start of the function goes to check the retransmit queue to see if > there are any packets larger than MSS and finds none since we don't > place the SYN w/ data in there and instead have a separate SYN and > data packet. > > I'm debating if I should take an alternative approach and modify the > loop at the start of tcp_simple_transmit to add a check for a SYN > packet, tp->syn_data being set, and then comparing the next frame > length + MAX_TCP_HEADER_OPTIONS versus mss. Thanks for bringing up this tricky issue. The root cause seems to be the special arrangement of storing SYN-data as one-(pure)-SYN and one non-SYN data segment. Given tcp_simple_transmit probably is not called frequently, your alternative approach sounds more appealing to me. Replacing that strange syn|data arrangement for TFO has been on my wish list for a long time... Ideally it's better to just store the SYN+data and just carve out the SYN for retransmit.