From: Eric Dumazet
Subject: Re: [PATCH] net/sock: move memory_allocated over to percpu_counter variables
Date: Fri, 7 Sep 2018 00:21:46 -0700
Message-ID:
References: <20180906192034.8467-1-olof@lixom.net> <20180907033257.2nlgiqm2t4jiwhzc@gondor.apana.org.au>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Herbert Xu, David Miller, Neil Horman, Marcelo Ricardo Leitner,
 Vladislav Yasevich, Alexey Kuznetsov, Hideaki YOSHIFUJI,
 linux-crypto@vger.kernel.org, LKML, linux-sctp@vger.kernel.org, netdev,
 linux-decnet-user@lists.sourceforge.net, kernel-team, Yuchung Cheng,
 Neal Cardwell
To: Olof Johansson
Return-path:
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On Fri, Sep 7, 2018 at 12:03 AM Eric Dumazet wrote:
> Problem is : we have platforms with more than 100 cpus, and
> sk_memory_allocated() cost will be too expensive,
> especially if the host is under memory pressure, since all cpus will
> touch their private counter.
>
> per cpu variables do not really scale, they were ok 10 years ago when
> no more than 16 cpus were the norm.
>
> I would prefer change TCP to not aggressively call
> __sk_mem_reduce_allocated() from tcp_write_timer()
>
> Ideally only tcp_retransmit_timer() should attempt to reduce forward
> allocations, after recurring timeout.
>
> Note that after 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d ("net: avoid
> sk_forward_alloc overflows")
> we have better control over sockets having huge forward allocations.
>
> Something like :

Or something less risky :

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 7fdf222a0bdfe9775970082f6b5dcdcc82b2ae1a..0aee80b6966cb2898e46350c761f9eb431ff1206 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -604,7 +604,8 @@ void tcp_write_timer_handler(struct sock *sk)
 	}
 
 out:
-	sk_mem_reclaim(sk);
+	if (tcp_under_memory_pressure(sk))
+		sk_mem_reclaim(sk);
 }
 
 static void tcp_write_timer(struct timer_list *t)
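
For readers who want to see the scaling concern in isolation, here is a
rough userspace sketch. It is not the kernel's percpu_counter code: the
names sim_percpu_counter, sim_counter_add, sim_counter_sum and the
NR_CPUS_SIM value are all made up for illustration. It only shows that an
update touches one slot while an exact read has to walk one slot per cpu,
so the read cost grows with the cpu count.

```c
#include <assert.h>

/* Hypothetical cpu count, chosen only to mirror the "more than 100 cpus" case. */
#define NR_CPUS_SIM 128

/* Toy stand-in for a per-cpu counter: one private slot per simulated cpu. */
struct sim_percpu_counter {
	long slot[NR_CPUS_SIM];
};

/* An update touches only the slot of the cpu doing the update. */
static void sim_counter_add(struct sim_percpu_counter *c, int cpu, long amount)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	c->slot[cpu] += amount;
}

/*
 * An exact read has to visit every slot. The number of slots visited is
 * stored in *slots_touched so the caller can see the cost of one read.
 */
static long sim_counter_sum(const struct sim_percpu_counter *c, int *slots_touched)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SIM; cpu++)
		sum += c->slot[cpu];

	*slots_touched = NR_CPUS_SIM;
	return sum;
}
```

In this toy model every exact read costs NR_CPUS_SIM slot visits no matter
how few cpus actually updated the counter, which is the kind of cost the
quoted text worries about when sk_memory_allocated() is called often.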