From: Olof Johansson
Date: Sat, 8 Sep 2018 10:02:42 -0700
Subject: Re: [PATCH] net/sock: move memory_allocated over to percpu_counter variables
To: Eric Dumazet
Cc: Herbert Xu, David Miller, Neil Horman, Marcelo Ricardo Leitner,
    Vladislav Yasevich, Alexey Kuznetsov, Hideaki YOSHIFUJI,
    linux-crypto@vger.kernel.org, LKML, linux-sctp@vger.kernel.org, netdev,
    linux-decnet-user@lists.sourceforge.net, kernel-team, Yuchung Cheng,
    Neal Cardwell
References: <20180906192034.8467-1-olof@lixom.net>
    <20180907033257.2nlgiqm2t4jiwhzc@gondor.apana.org.au>

Hi,

On Fri, Sep 7, 2018 at 12:21 AM, Eric Dumazet wrote:
> On Fri, Sep 7, 2018 at 12:03 AM Eric Dumazet wrote:
>
>> Problem is : we have platforms with more than 100 cpus, and
>> sk_memory_allocated() cost will be too expensive,
>> especially if the host is under memory pressure, since all cpus will
>> touch their private counter.
>>
>> per cpu variables do not really scale, they were ok 10 years ago when
>> no more than 16 cpus were the norm.
>>
>> I would prefer change TCP to not aggressively call
>> __sk_mem_reduce_allocated() from tcp_write_timer()
>>
>> Ideally only tcp_retransmit_timer() should attempt to reduce forward
>> allocations, after recurring timeout.
>>
>> Note that after 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d ("net: avoid
>> sk_forward_alloc overflows")
>> we have better control over sockets having huge forward allocations.
>>
>> Something like :
>
> Or something less risky :

I gave both of these patches a run, and neither does as well on the
system that has slower atomics. :(

The percpu version:

     8.05%  workload  [kernel.vmlinux]  [k] __do_softirq
     7.04%  swapper   [kernel.vmlinux]  [k] cpuidle_enter_state
     5.54%  workload  [kernel.vmlinux]  [k] _raw_spin_unlock_irqrestore
     1.66%  swapper   [kernel.vmlinux]  [k] __do_softirq
     1.55%  workload  [kernel.vmlinux]  [k] finish_task_switch
     1.24%  swapper   [kernel.vmlinux]  [k] finish_task_switch
     1.07%  workload  [kernel.vmlinux]  [k] net_rx_action

The first patch from you still has a significant amount of time spent in
the atomics paths (non-inlined versions used):

     7.87%  workload  [kernel.vmlinux]  [k] __ll_sc_atomic64_sub
     7.48%  workload  [kernel.vmlinux]  [k] __do_softirq
     5.05%  workload  [kernel.vmlinux]  [k] _raw_spin_unlock_irqrestore
     2.42%  workload  [kernel.vmlinux]  [k] __ll_sc_atomic64_add_return
     1.49%  swapper   [kernel.vmlinux]  [k] cpuidle_enter_state
     1.31%  workload  [kernel.vmlinux]  [k] finish_task_switch
     1.09%  workload  [kernel.vmlinux]  [k] tcp_sendmsg_locked
     1.08%  workload  [kernel.vmlinux]  [k] __arch_copy_from_user
     1.02%  workload  [kernel.vmlinux]  [k] net_rx_action

I think a lot of the overhead from the percpu approach can be alleviated
if we can use percpu_counter_read() instead of _sum() (i.e. no need to
iterate through each CPU's recent local delta). I don't know the TCP
stack well enough to tell where it's OK to use a bit of slack in the
numbers though -- by default the count will at most be off by 32*online
cpus. Might not be a significant number in reality.

-Olof
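
For illustration only, here is a minimal single-threaded user-space model
of the read-vs-sum trade-off mentioned above. It is a sketch under stated
assumptions, not the kernel implementation: every model_* name,
MODEL_NR_CPUS, and the fixed MODEL_BATCH of 32 (taken from the "32" above)
are hypothetical. The real percpu_counter uses per-CPU operations and
locking that this model leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for this model */
#define MODEL_BATCH   32  /* assumed per-CPU slack before folding */

struct model_percpu_counter {
	int64_t count;                 /* shared, approximate global value */
	int32_t local[MODEL_NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/*
 * Add 'amount' on behalf of 'cpu'. The shared count is only touched
 * once the CPU-local delta reaches the batch size in either direction.
 */
static void model_add(struct model_percpu_counter *c, int cpu, int32_t amount)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	int32_t delta = c->local[cpu] + amount;

	if (delta >= MODEL_BATCH || delta <= -MODEL_BATCH) {
		c->count += delta;     /* fold the local delta into the shared count */
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = delta; /* stay CPU-local, no shared write */
	}
}

/* Cheap O(1) read: ignores whatever is still pending in the local deltas. */
static int64_t model_read(const struct model_percpu_counter *c)
{
	return c->count;
}

/* Exact read: walks every CPU's local delta, so cost grows with CPU count. */
static int64_t model_sum(const struct model_percpu_counter *c)
{
	int64_t total = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += c->local[cpu];
	return total;
}
```

In this model the cheap read can lag the exact sum by strictly less than
MODEL_BATCH * MODEL_NR_CPUS, which is the kind of slack a caller would
have to tolerate in exchange for not touching every CPU's counter.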