References: <20220619150456.GB34471@xsang-OptiPlex-9020> <20220622172857.37db0d29@kernel.org> <20220623185730.25b88096@kernel.org>
From: Eric Dumazet
Date: Fri, 24 Jun 2022 06:22:42 +0200
Subject: Re: [net] 4890b686f4: netperf.Throughput_Mbps -69.4% regression
To: Jakub Kicinski
Cc: Xin Long, Marcelo Ricardo Leitner, kernel test robot, Shakeel Butt,
 Soheil Hassas Yeganeh, LKML, Linux Memory Management List, network dev,
 linux-s390@vger.kernel.org, MPTCP Upstream, linux-sctp@vger.kernel.org,
 lkp@lists.01.org, kbuild test robot, Huang Ying, "Tang, Feng",
 zhengjun.xing@linux.intel.com, fengwei.yin@intel.com, Ying Xu
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 24, 2022 at 6:13 AM Eric Dumazet wrote:
>
> On Fri, Jun 24, 2022 at 3:57 AM Jakub Kicinski wrote:
> >
> > On Thu, 23 Jun 2022 18:50:07 -0400 Xin Long wrote:
> > > From the perf data, we can see __sk_mem_reduce_allocated() is the
> > > one using the most CPU, more than before, and mem_cgroup APIs are
> > > also called in this function. It means the mem cgroup must be
> > > enabled in the test env, which may explain why I couldn't
> > > reproduce it.
> > >
> > > Commit 4890b686f4 ("net: keep sk->sk_forward_alloc as small as
> > > possible") uses sk_mem_reclaim() (checking reclaimable >=
> > > PAGE_SIZE) to reclaim the memory, which calls
> > > __sk_mem_reduce_allocated() *more frequently* than before
> > > (checking reclaimable >= SK_RECLAIM_THRESHOLD). It might be cheap
> > > when mem_cgroup_sockets_enabled is false, but I'm not sure it is
> > > still cheap when mem_cgroup_sockets_enabled is true.
> > >
> > > I think SCTP netperf could trigger this, as the CPU is the
> > > bottleneck for SCTP netperf testing, which is more sensitive to
> > > the extra function calls than TCP.
> > >
> > > Can we re-run this testing without mem cgroup enabled?
> >
> > FWIW I defer to Eric, thanks a lot for double checking the report
> > and digging in!
>
> I did tests with TCP + memcg and noticed a very small additional cost
> in memcg functions, because of suboptimal layout:
>
> Extract of an internal Google bug, update from June 9th:
>
> --------------------------------
> I have noticed a minor false sharing to fetch (struct
> mem_cgroup)->css.parent, at offset 0xc0, because it shares the cache
> line containing struct mem_cgroup.memory, at offset 0xd0.
>
> Ideally, memcg->socket_pressure and memcg->parent should sit in a
> read-mostly cache line.
> -----------------------
>
> But nothing that could explain a "-69.4% regression".

I guess the test now hits memcg limits more often, forcing expensive
reclaim, and the memcg limits need some adjustment. Overall, tests
enabling memcg probably need fine tuning; I will defer to the Intel
folks.

> memcg has a very similar strategy of per-cpu reserves, with
> MEMCG_CHARGE_BATCH being 32 pages per cpu.
>
> It is not clear why SCTP with 10K writes would overflow this reserve
> constantly.
>
> Presumably, memcg experts will have to rework structure alignments to
> make sure they can cope better with more charge/uncharge operations,
> because we are not going back to gigantic per-socket reserves; this
> simply does not scale.