From: Menglong Dong
Date: Thu, 18 May 2023 22:25:18 +0800
Subject: Re: [PATCH net-next 2/3] net: tcp: send zero-window when no memory
To: Eric Dumazet
Cc: kuba@kernel.org, davem@davemloft.net, pabeni@redhat.com, dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Menglong Dong
References: <20230517124201.441634-1-imagedong@tencent.com> <20230517124201.441634-3-imagedong@tencent.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 17,
2023 at 10:45 PM Eric Dumazet wrote:
>
> On Wed, May 17, 2023 at 2:42 PM wrote:
> >
> > From: Menglong Dong
> >
> > For now, skb will be dropped when no memory, which makes client keep
> > retrans util timeout and it's not friendly to the users.
>
> Yes, networking needs memory. Trying to deny it is recipe for OOM.
>
> >
> > Therefore, now we force to receive one packet on current socket when
> > the protocol memory is out of the limitation. Then, this socket will
> > stay in 'no mem' status, util protocol memory is available.
> >
>
> I think you missed one old patch.
>
> commit ba3bb0e76ccd464bb66665a1941fabe55dadb3ba tcp: fix
> SO_RCVLOWAT possible hangs under high mem pressure
>
> >
> > When a socket is in 'no mem' status, it's receive window will become
> > 0, which means window shrink happens. And the sender need to handle
> > such window shrink properly, which is done in the next commit.
> >
> > Signed-off-by: Menglong Dong
> > ---
> >  include/net/sock.h    |  1 +
> >  net/ipv4/tcp_input.c  | 12 ++++++++++++
> >  net/ipv4/tcp_output.c |  7 +++++++
> >  3 files changed, 20 insertions(+)
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 5edf0038867c..90db8a1d7f31 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -957,6 +957,7 @@ enum sock_flags {
> >         SOCK_XDP, /* XDP is attached */
> >         SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
> >         SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
> > +       SOCK_NO_MEM, /* protocol memory limitation happened */
> >  };
> >
> >  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index a057330d6f59..56e395cb4554 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
> >                 if (skb_queue_len(&sk->sk_receive_queue) == 0)
> >                         sk_forced_mem_schedule(sk, skb->truesize);
>
> I think you missed this part : We accept at least one packet,
> regardless of memory pressure,
> if the queue is empty.
>
> So your changelog is misleading.
>
> >                 else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
> > +                       if (sysctl_tcp_wnd_shrink)
>
> We no longer add global sysctls for TCP. All new sysctls must per net-ns.
>
> > +                               goto do_wnd_shrink;
> > +
> >                         reason = SKB_DROP_REASON_PROTO_MEM;
> >                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
> >                         sk->sk_data_ready(sk);
> >                         goto drop;
> > +do_wnd_shrink:
> > +                       if (sock_flag(sk, SOCK_NO_MEM)) {
> > +                               NET_INC_STATS(sock_net(sk),
> > +                                             LINUX_MIB_TCPRCVQDROP);
> > +                               sk->sk_data_ready(sk);
> > +                               goto out_of_window;
> > +                       }
> > +                       sk_forced_mem_schedule(sk, skb->truesize);
>
> So now we would accept two packets per TCP socket, and yet EPOLLIN
> will not be sent in time ?
>
> packets can consume about 45*4K each, I do not think it is wise to
> double receive queue sizes.
>
> What you want instead is simply to send EPOLLIN sooner (when the first
> packet is queued instead when the second packet is dropped)
> by changing sk_forced_mem_schedule() a bit.
>
> This might matter for applications using SO_RCVLOWAT, but not for
> other applications.

To be clearer, what I am talking about here is not sending EPOLLIN
sooner. The goal is to make a TCP connection whose receiver has "hung"
while under TCP protocol memory pressure enter the zero-window probe
state. This commit is the first step: make the receiver shrink the
window by sending a zero-window ACK.

Thanks!
Menglong Dong
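
For reference, the receive-path decision discussed above can be modelled
in userspace roughly as follows. This is only a sketch under stated
assumptions: model_sock, rx_verdict and model_data_queue() are
hypothetical names, not kernel APIs. The quoted hunk does not show where
SOCK_NO_MEM is set or cleared, so those two points are guesses, and the
zero-window advertisement itself belongs to the tcp_output.c part of the
patch, which is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum rx_verdict {
    RX_QUEUED,        /* packet accepted, memory was available */
    RX_QUEUED_FORCED, /* packet accepted with forced memory accounting */
    RX_DROPPED,       /* packet dropped, sender keeps retransmitting */
    RX_ZERO_WINDOW,   /* packet dropped and ACKed; window assumed to be 0 */
};

struct model_sock {
    size_t queue_len; /* packets sitting in the receive queue */
    bool no_mem;      /* models the proposed SOCK_NO_MEM flag */
};

/*
 * mem_ok:     stands in for tcp_try_rmem_schedule() succeeding
 * wnd_shrink: stands in for the proposed sysctl_tcp_wnd_shrink knob
 */
static enum rx_verdict model_data_queue(struct model_sock *sk,
                                        bool mem_ok, bool wnd_shrink)
{
    assert(sk != NULL);

    if (sk->queue_len == 0) {
        /* Existing behaviour: an empty queue always accepts one packet
         * through forced accounting; memory pressure is not consulted. */
        sk->queue_len++;
        return RX_QUEUED_FORCED;
    }

    if (mem_ok) {
        /* Assumption: the flag clears once memory is available again. */
        sk->no_mem = false;
        sk->queue_len++;
        return RX_QUEUED;
    }

    if (!wnd_shrink)
        return RX_DROPPED; /* unpatched path: plain drop */

    if (sk->no_mem)
        return RX_ZERO_WINDOW; /* models the "goto out_of_window" branch */

    /* First failure with window shrink enabled: force-accept one more
     * packet. Assumption: this is where the flag would be set. */
    sk->no_mem = true;
    sk->queue_len++;
    return RX_QUEUED_FORCED;
}
```

In this model a socket under pressure ends up holding two force-accepted
packets (one from the empty-queue path, one from the first failed
schedule) before it starts answering with zero-window ACKs, which is the
doubling that Eric points out above.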