References: <1539086718-4119-1-git-send-email-laoar.shao@gmail.com> <1539086718-4119-2-git-send-email-laoar.shao@gmail.com>
From: Yafang Shao
Date: Tue, 9 Oct 2018 22:52:35 +0800
Subject: Re: [PATCH net-next] tcp: forbid direct reclaim if MSG_DONTWAIT is set in send path
To: Eric Dumazet
Cc: David Miller, netdev, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 9, 2018 at 10:12 PM Eric Dumazet wrote:
>
> On Tue, Oct 9, 2018 at 5:05 AM Yafang Shao wrote:
> >
> > By default, the sk->sk_allocation is GFP_KERNEL, that means if there's
> > not enough memory it will do both direct reclaim and background reclaim.
> > If the size of system memory is great, the direct reclaim may cause great
> > latency spike.
> >
> > When we set MSG_DONTWAIT in send syscalls, we really don't want it to be
> > blocked, so we'd better clear __GFP_DIRECT_RECLAIM when allocating skb in the
> > send path. Then, it will return immediately if there's not enough memory to
> > be allocated, and then the application has a chance to do some other stuff
> > instead of being blocked here.
> >
> > Signed-off-by: Yafang Shao
> > ---
> >  net/ipv4/tcp.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 43ef83b..fe4f5ce 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -1182,6 +1182,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> >         bool process_backlog = false;
> >         bool zc = false;
> >         long timeo;
> > +       gfp_t gfp;
> >
> >         flags = msg->msg_flags;
> >
> > @@ -1255,6 +1256,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> >         /* Ok commence sending. */
> >         copied = 0;
> >
> > +       gfp = flags & MSG_DONTWAIT ? sk->sk_allocation & ~__GFP_DIRECT_RECLAIM :
> > +               sk->sk_allocation;
> > +
> >  restart:
> >         mss_now = tcp_send_mss(sk, &size_goal, flags);
> >
> > @@ -1283,8 +1287,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> >                         }
> >                         first_skb = tcp_rtx_and_write_queues_empty(sk);
> >                         linear = select_size(first_skb, zc);
> > -                       skb = sk_stream_alloc_skb(sk, linear, sk->sk_allocation,
> > -                                                 first_skb);
> > +                       skb = sk_stream_alloc_skb(sk, linear, gfp, first_skb);
> >                         if (!skb)
> >                                 goto wait_for_memory;
> >
>
> How have you tested this patch exactly ?
>

We recently saw network latency spikes (hundreds of milliseconds,
sometimes a full second) in our production environment, and I
eventually traced them to direct reclaim in tcp_sendmsg.
That issue could be resolved by keeping some memory reserved, but it
made me wonder why we don't simply forbid direct reclaim when
MSG_DONTWAIT is set.
So I made this change and tested it: the application got an errno
returned instead of being blocked in the send path.
That's why I submitted this patch.

> Most of TCP payloads are added in page fragments, and you have not
> changed the page allocation fragments.
>
> Also, I do not see how an application will get future notifications
> that it can retry the failed system call ?
> How are you really going to deal with this in high performance applications ?
>

I think returning immediately with an errno is better than blocking.
Maybe this solution is not good enough, but at least it tells the
application that something is wrong and that it can't send right now.

> I would rather prefer a socket setsockopt() to eventually be able to
> flip __GFP_DIRECT_RECLAIM in sk->sk_allocation,
> to not add all these tests in fast path, but honestly I do not see how
> applications can really make use of this.

Maybe an event is needed to tell the application that it can send
again. I don't have a better idea either.

Thanks
Yafang
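
The retry question raised above can be made concrete from the userspace
side. Below is a minimal sketch, not taken from the patch, of how an
application commonly handles a would-block error from send() with
MSG_DONTWAIT: it waits for POLLOUT and tries again. The helper names
(is_retryable, wait_writable, send_all_nowait) are hypothetical. The
pattern relies on the kernel reporting writability once send-buffer
space frees up; whether any such wakeup would follow an allocation
failure caused by skipping direct reclaim is the open question in this
thread, and the sketch does not answer it.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: does this errno value mean "try again later"? */
static int is_retryable(int err)
{
	return err == EAGAIN || err == EWOULDBLOCK;
}

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become writable.
 * Returns 1 if writable, 0 on timeout, -1 on error or hangup.
 * For simplicity the full timeout restarts if poll() is interrupted.
 */
static int wait_writable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int rc;

	do {
		rc = poll(&pfd, 1, timeout_ms);
	} while (rc < 0 && errno == EINTR);

	if (rc <= 0)
		return rc;
	return (pfd.revents & POLLOUT) ? 1 : -1;
}

/*
 * Hypothetical helper: send len bytes without ever blocking inside send().
 * On a would-block error it waits for POLLOUT for at most timeout_ms and
 * retries. Returns the number of bytes queued (possibly fewer than len if
 * the wait timed out), or -1 on a hard error.
 */
static ssize_t send_all_nowait(int fd, const char *buf, size_t len,
			       int timeout_ms)
{
	size_t off = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		ssize_t n = send(fd, buf + off, len - off,
				 MSG_DONTWAIT | MSG_NOSIGNAL);

		if (n > 0) {
			off += (size_t)n;
			continue;
		}
		if (n == 0)
			break;			/* nothing accepted; stop here */
		if (errno == EINTR)
			continue;		/* interrupted; just retry */
		if (is_retryable(errno)) {
			if (wait_writable(fd, timeout_ms) == 1)
				continue;	/* space reported; retry */
			break;			/* timeout or poll failure */
		}
		return -1;			/* hard error; errno is set */
	}
	return (ssize_t)off;
}
```

With a timeout of 0, send_all_nowait() reports how many bytes were
queued so far instead of waiting, so the caller can do other work and
come back later.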