Received: by 2002:a05:6a10:17d3:0:0:0:0 with SMTP id hz19csp3364066pxb; Tue, 20 Apr 2021 06:48:25 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw+AXi6RfNymA/xoQYyyIcF6m7KJGjVzwF5/7sdLsrSgcirR6gwt2iWWW2mI1GZmGS5IXO3 X-Received: by 2002:a17:906:3a45:: with SMTP id a5mr12913146ejf.288.1618926505114; Tue, 20 Apr 2021 06:48:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1618926505; cv=none; d=google.com; s=arc-20160816; b=lAkUi9OA4coybp1adR3JgcpXa44HD4nWHcAz5AQ9HvE8DHV3Lrc9CfovVgVcpdBoMO 2WWaWaR5TLFA4q27V2ez0qI/OiHqvoELImmlOvJ05+tNH7VUeNdpsq1Vx1X1XV4gvnGO f4DJJKBPoNBWWykj7ClPYS8Y+QKe98MhQvnTak54X2dsCmisvv6LqM7akG6rkP96J1JW TNmLmUXBAqoVwwP63Xuiwwd2CPk2OkZFsCEnyYJizI96u+TBGluYAmj8KIDrPb/RNYcG tObzJLirzNIydU53+PFedugQwr/PWQPiW4dKiJReskaj6kQK7K2x7TI+i8aMXUMJJbYv sKJg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=npQOAk4f1Iv7DwBnRcXaic7lK/TQsKHITZ6kiHMeceE=; b=W7OArAP2X1qtkHeKAbCK3m2/joHkACoQcbMM7yiUESl4qmRk45ZnqG40vfAOeXA3HW U1tRiqgR8mm40wfzLCB2w+mZtP1N/0yFUtlZ1d8gx+cPUQMfSOXic8nEKdw+/b5H/EU5 pPfTerwDfC+Nk9sI9Nnj7CdFrMCezDzXCWWcf2rqfRqYTjumqDIslfmKZnccWr3+Y3KM D63aN2lRw8wctlQA2VmVdbrrQOrpjZx1as7yUCOGyor3ErKxZekR0nLXS+EQl82O3X3S dv8zrsem2NdyV8eWnuGcyY1Zm5bQcOYKTZ0UN3jWbBkn08XzouoVYfBwhKCA1IVqfwhJ OQIw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HEneWeYm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u18si12206453edq.402.2021.04.20.06.48.02; Tue, 20 Apr 2021 06:48:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HEneWeYm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232462AbhDTNqG (ORCPT + 99 others); Tue, 20 Apr 2021 09:46:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41826 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232449AbhDTNqB (ORCPT ); Tue, 20 Apr 2021 09:46:01 -0400 Received: from mail-yb1-xb31.google.com (mail-yb1-xb31.google.com [IPv6:2607:f8b0:4864:20::b31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A96A0C06138A for ; Tue, 20 Apr 2021 06:45:29 -0700 (PDT) Received: by mail-yb1-xb31.google.com with SMTP id n12so43004600ybf.8 for ; Tue, 20 Apr 2021 06:45:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=npQOAk4f1Iv7DwBnRcXaic7lK/TQsKHITZ6kiHMeceE=; b=HEneWeYmGUlCkBIPOZ3QG56s33aFaziBUk3egJBDs4ZyRwYJUgOAQJnM7OLuWiOLfJ Yfu3zPbVV+S2ZZDnwjK+8ap8uy/fWjP0xX63TIHjF12UzKFA5FsOSlWn8zqzj4IYZubP +6ff47X14pCrfJ0ka1buFiqUxbGDNJGaSC2up8qjT7cIcz/izyD4LLoVuyjxPvKH06nY K7/nzFU+H13iRopQMP2v3l4FEe5Wji3GSY2QxolsR5x4OpWxi3rYXEtKxCtUlPQJojvv 5YjFvuIUAST9mYOVcCKLPXIfev73xtkIuVJGC9A2nfIIoRMZwvw8Q06nFe74JOHdSzGm WWZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=npQOAk4f1Iv7DwBnRcXaic7lK/TQsKHITZ6kiHMeceE=; b=RwnLfjUNS0v/UgtgVrDndwtL/a3XRMg6EljgP8uGqWxl8PBLST21cMEvS/A8AJIDBF BAAmuSKu/if0uZkaywiwOXbtqWcuFlPH29G02sCqVDyi5YdrKIVnR65m0KpNzheQo/PO OLqh+8M5oubX/2gS8ltZBbV02zoouGYOmzK4wfB+GYxabVkGMZvjBzGTBsTcYt7OplYW sOSW2d90IRTrxocxZlbeAzrI3EO4dxxPktxy56wofHg39A1hf1wV47aYMi0Yz84l18nG RiY+BuS5m6X2MQW7NHACGiVWWdFuR2jr2+qqsm1183VtnraZYLAR/6VL32ptvomByIJR FF0A== X-Gm-Message-State: AOAM532UrFBRMVpInH2N5zW4Z4IuchPbPE9Vnclw2tCqCpvrGGhGnHlK Ipy7ITh0kB/OvT5R7MqOhvkSIKOBrZLxhqQRQ35FWg== X-Received: by 2002:a25:4244:: with SMTP id p65mr24674712yba.452.1618926328570; Tue, 20 Apr 2021 06:45:28 -0700 (PDT) MIME-Version: 1.0 References: <20210420120043.48151-1-zhaoya.gaius@bytedance.com> In-Reply-To: <20210420120043.48151-1-zhaoya.gaius@bytedance.com> From: Eric Dumazet Date: Tue, 20 Apr 2021 15:45:17 +0200 Message-ID: Subject: Re: [PATCH] tcp: prevent loss of bytes when using syncookie To: =?UTF-8?B?6LW15Lqa?= Cc: David Miller , Hideaki YOSHIFUJI , netdev , LKML Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 20, 2021 at 2:00 PM zhaoya wrote: > > When using syncookie, since $MSSID is spliced into cookie and > the legal index of msstab is 0,1,2,3, this gives client 3 bytes > of freedom, resulting in at most 3 bytes of silent loss. > > C ------------seq=12345-------------> S > C <------seq=cookie/ack=12346-------- S S generated the cookie > [RFC4987 Appendix A] > C ---seq=123456/ack=cookie+1-->X S The first byte was loss. > C -----seq=123457/ack=cookie+1------> S The second byte was received and > cookie-check was still okay and > handshake was finished. > C <--------seq=.../ack=12348--------- S acknowledge the second byte. > > Here is a POC: > > $ cat poc.c > #include > #include > #include > #include > #include > #include > #include > #include > #include > > void *serverfunc(void *arg) > { > int sd = -1; > int csd = -1; > struct sockaddr_in servaddr, cliaddr; > int len = sizeof(cliaddr); > > sd = socket(AF_INET, SOCK_STREAM, 0); > servaddr.sin_family = AF_INET; > servaddr.sin_addr.s_addr = htonl(INADDR_ANY); > servaddr.sin_port = htons(1234); > bind(sd, (struct sockaddr *)&servaddr, sizeof(servaddr)); > listen(sd, 1); > > while (1) { > char buf[2]; > int ret; > csd = accept(sd, (struct sockaddr *)&cliaddr, &len); > memset(buf, 0, 2); > ret = recv(csd, buf, 1, 0); > // but unexpected char is 'b' > if (ret && strncmp(buf, "a", 1)) { > printf("unexpected:%s\n", buf); > close(csd); > exit(0); > } > close(csd); > } > } > > void *connectfunc(void *arg) > { > struct sockaddr_in addr; > int sd; > int i; > > for (i = 0; i < 500; i++) { > sd = socket(AF_INET, SOCK_STREAM, 0); > addr.sin_family = AF_INET; > addr.sin_addr.s_addr = inet_addr("127.0.0.1"); > addr.sin_port = htons(1234); > > connect(sd, (struct sockaddr *)&addr, sizeof(addr)); > > send(sd, "a", 1, 0); // expected char is 'a' > send(sd, "b", 1, 0); > close(sd); > } > return NULL; > } > > int main(int argc, char *argv[]) > { > int i; > pthread_t id; > > pthread_create(&id, NULL, serverfunc, NULL); > sleep(1); > for (i = 0; i < 500; i++) { > pthread_create(&id, NULL, connectfunc, NULL); > } > sleep(5); > } > > $ sudo gcc poc.c -lpthread > $ sudo sysctl -w net.ipv4.tcp_syncookies=1 > $ sudo ./a.out # please try as many times. > POC is distracting, you could instead give a link to https://kognitio.com/blog/syn-cookies-ate-my-dog-breaking-tcp-on-linux/ where all the details can be found. Also it gives credits to past work. If you feel a program needs to be written and recorded for posterity, add a selftest. (in tools/testing/selftests/net ) > This problem can be fixed by having $SSEQ itself participate in the > hash operation. > > The feature is protected by a sysctl so that admins can choose > the preferred behavior. > > Signed-off-by: zhaoya > --- > Documentation/networking/ip-sysctl.rst | 9 ++++++++ > include/linux/netfilter_ipv6.h | 24 +++++++++++++------- > include/net/netns/ipv4.h | 1 + > include/net/tcp.h | 20 ++++++++++------- > net/core/filter.c | 6 +++-- > net/ipv4/syncookies.c | 31 ++++++++++++++++---------- > net/ipv4/sysctl_net_ipv4.c | 7 ++++++ > net/ipv4/tcp_ipv4.c | 4 +++- > net/ipv6/syncookies.c | 29 +++++++++++++++--------- > net/ipv6/tcp_ipv6.c | 3 ++- > net/netfilter/nf_synproxy_core.c | 10 +++++---- > 11 files changed, 97 insertions(+), 47 deletions(-) > > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst > index c7952ac5bd2f..a430e8736c2b 100644 > --- a/Documentation/networking/ip-sysctl.rst > +++ b/Documentation/networking/ip-sysctl.rst > @@ -732,6 +732,15 @@ tcp_syncookies - INTEGER > network connections you can set this knob to 2 to enable > unconditionally generation of syncookies. > > +tcp_syncookies_enorder - INTEGER enorder ? What does it mean ? > + Only valid when the kernel was compiled with CONFIG_SYN_COOKIES > + Prevent the loss of at most 3 bytes of which sent by client when > + syncookies as generated to complete TCP-Handshake. > + Default: 0 > + > + Note that if it was enabled, any out-of-order bytes sent by client > + to complete TCP-Handshake could get the session killed. Technically speaking, the client does not send out-of-order packets. The issue here is that loss of packets sent after 3WHS can lead to RST and session being killed. > + > tcp_fastopen - INTEGER > Enable TCP Fast Open (RFC7413) to send and accept data in the opening > SYN packet. > diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h > index 48314ade1506..97b67672dee7 100644 > --- a/include/linux/netfilter_ipv6.h > +++ b/include/linux/netfilter_ipv6.h > @@ -49,9 +49,11 @@ struct nf_ipv6_ops { > int (*route)(struct net *net, struct dst_entry **dst, struct flowi *fl, > bool strict); > u32 (*cookie_init_sequence)(const struct ipv6hdr *iph, > - const struct tcphdr *th, u16 *mssp); > + const struct tcphdr *th, u16 *mssp, > + int enorder); > Patch is too big, and passing either a ' struct net' or an ' int enorder' is not pretty/consistent. Please send a patch series : 1) Add ' struct net' pointers to all helpers which will need it later. This is a pure preparation work, easy to review. 2) Patch itself, adding the sysctl. I would prefer you add a helper like this and use it every where you need to use the sysctl so that it is factorized static inline u32 tcp_cond_seq(const struct net *net, u32 seq) { return net->ipv4.sysctl_tcp_syncookie_no_eat ? seq : 0; } And use this helper in functions right before using (or not) the seq in hash functions/ (Do not pass around the result of the helper) Thanks.