Date: Fri, 24 Jun 2022 13:13:51 +0800
From: Feng Tang
To: Eric Dumazet
Cc: Jakub Kicinski, Xin Long, Marcelo Ricardo Leitner, kernel test robot, Shakeel Butt, Soheil Hassas Yeganeh, LKML, Linux Memory Management List, network dev, linux-s390@vger.kernel.org, MPTCP Upstream, "linux-sctp @ vger . kernel . org", lkp@lists.01.org, kbuild test robot, Huang Ying, zhengjun.xing@linux.intel.com, fengwei.yin@intel.com, Ying Xu
Subject: Re: [net] 4890b686f4: netperf.Throughput_Mbps -69.4% regression
Message-ID: <20220624051351.GA72171@shbuild999.sh.intel.com>
References: <20220619150456.GB34471@xsang-OptiPlex-9020> <20220622172857.37db0d29@kernel.org> <20220623185730.25b88096@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Eric,

On Fri, Jun 24, 2022 at 06:13:51AM +0200, Eric Dumazet wrote:
> On Fri, Jun 24, 2022 at 3:57 AM Jakub Kicinski wrote:
> >
> > On Thu, 23 Jun 2022 18:50:07 -0400 Xin Long wrote:
> > > From the perf data, we can see __sk_mem_reduce_allocated() is the one
> > > using CPU the most more than before, and mem_cgroup APIs are also
> > > called in this function. It means the mem cgroup must be enabled in
> > > the test env, which may explain why I couldn't reproduce it.
> > >
> > > The Commit 4890b686f4 ("net: keep sk->sk_forward_alloc as small as
> > > possible") uses sk_mem_reclaim(checking reclaimable >= PAGE_SIZE) to
> > > reclaim the memory, which is *more frequent* to call
> > > __sk_mem_reduce_allocated() than before (checking reclaimable >=
> > > SK_RECLAIM_THRESHOLD). It might be cheap when
> > > mem_cgroup_sockets_enabled is false, but I'm not sure if it's still
> > > cheap when mem_cgroup_sockets_enabled is true.
> > >
> > > I think SCTP netperf could trigger this, as the CPU is the bottleneck
> > > for SCTP netperf testing, which is more sensitive to the extra
> > > function calls than TCP.
> > >
> > > Can we re-run this testing without mem cgroup enabled?
> >
> > FWIW I defer to Eric, thanks a lot for double checking the report
> > and digging in!
>
> I did tests with TCP + memcg and noticed a very small additional cost
> in memcg functions,
> because of suboptimal layout:
>
> Extract of an internal Google bug, update from June 9th:
>
> --------------------------------
> I have noticed a minor false sharing to fetch (struct
> mem_cgroup)->css.parent, at offset 0xc0,
> because it shares the cache line containing struct mem_cgroup.memory,
> at offset 0xd0
>
> Ideally, memcg->socket_pressure and memcg->parent should sit in a read
> mostly cache line.
> -----------------------
>
> But nothing that could explain a "-69.4% regression"

We can double check that.

> memcg has a very similar strategy of per-cpu reserves, with
> MEMCG_CHARGE_BATCH being 32 pages per cpu.

We have proposed a patch to increase the batch number for stats
updates, but it was not accepted because it hurts the accuracy and
the data is used by many tools.

> It is not clear why SCTP with 10K writes would overflow this reserve constantly.
>
> Presumably memcg experts will have to rework structure alignments to
> make sure they can cope better
> with more charge/uncharge operations, because we are not going back to
> gigantic per-socket reserves,
> this simply does not scale.

Yes, the memcg statistics and charge/uncharge updates are very sensitive
to data alignment and layout, and can easily trigger performance
changes; we have seen quite a few similar cases over the past several
years.

One pattern we've seen is that even if a memcg stats update or charge
function takes only about 2%~3% of the CPU cycles in the perf-profile
data, once it is affected, the performance change can be amplified to
60% or more.

Thanks,
Feng
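
For reference, the false-sharing note quoted above (css.parent at offset
0xc0 sharing a cache line with mem_cgroup.memory at offset 0xd0) can be
checked with simple offset arithmetic. Below is a minimal sketch, assuming
64-byte cache lines. The structs demo_shared and demo_separated, their
field names, and the helper print_layouts() are hypothetical stand-ins for
illustration only; they are not the real struct mem_cgroup layout or any
kernel API.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint64_t */
#include <stdio.h>   /* printf */

/* Assumption: 64-byte cache lines, typical for current x86 servers. */
#define CACHE_LINE_SIZE 64

/* Index of the cache line that contains byte offset `off`. */
#define CACHE_LINE_OF(off) ((size_t)(off) / CACHE_LINE_SIZE)

/* Non-zero when two byte offsets fall into the same cache line. */
#define SAME_CACHE_LINE(a, b) (CACHE_LINE_OF(a) == CACHE_LINE_OF(b))

/*
 * Hypothetical stand-in, NOT the real struct mem_cgroup: a read-mostly
 * field at 0xc0 and a frequently written counter at 0xd0, matching the
 * two offsets quoted in the thread.
 */
struct demo_shared {
	char     pad[0xc0];
	uint64_t parent;   /* offset 0xc0, read-mostly */
	uint64_t filler;   /* offset 0xc8 */
	uint64_t counter;  /* offset 0xd0, written on every charge */
};

/* Same fields, with the hot counter forced onto its own cache line. */
struct demo_separated {
	char     pad[0xc0];
	uint64_t parent;   /* offset 0xc0, read-mostly */
	uint64_t filler;   /* offset 0xc8 */
	_Alignas(CACHE_LINE_SIZE) uint64_t counter;  /* next 64-byte boundary */
};

/* Print which cache line each field lands on in both layouts. */
void print_layouts(void)
{
	printf("shared:    parent on line %zu, counter on line %zu\n",
	       CACHE_LINE_OF(offsetof(struct demo_shared, parent)),
	       CACHE_LINE_OF(offsetof(struct demo_shared, counter)));
	printf("separated: parent on line %zu, counter on line %zu\n",
	       CACHE_LINE_OF(offsetof(struct demo_separated, parent)),
	       CACHE_LINE_OF(offsetof(struct demo_separated, counter)));
}
```

Under the 64-byte assumption, offsets 0xc0 and 0xd0 both fall on cache
line 3, so every write to the counter invalidates the line that readers
of the parent pointer need. In the second layout the counter moves to the
next 64-byte boundary and the two fields no longer share a line. Calling
print_layouts() from a small main() prints the line index of each field.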