From: Shakeel Butt
Date: Mon, 27 Jun 2022 07:52:55 -0700
Subject: Re: [net] 4890b686f4: netperf.Throughput_Mbps -69.4% regression
To: Feng Tang
Cc: Eric Dumazet, Linux MM, Andrew Morton, Roman Gushchin, Michal Hocko,
    Johannes Weiner, Muchun Song, Jakub Kicinski, Xin Long,
    Marcelo Ricardo Leitner, kernel test robot, Soheil Hassas Yeganeh,
    LKML, network dev, linux-s390@vger.kernel.org, MPTCP Upstream,
    "linux-sctp @ vger . kernel . org", lkp@lists.01.org,
    kbuild test robot, Huang Ying, Xing Zhengjun, Yin Fengwei, Ying Xu
In-Reply-To: <20220627123415.GA32052@shbuild999.sh.intel.com>
References: <20220623185730.25b88096@kernel.org>
    <20220624070656.GE79500@shbuild999.sh.intel.com>
    <20220624144358.lqt2ffjdry6p5u4d@google.com>
    <20220625023642.GA40868@shbuild999.sh.intel.com>
    <20220627023812.GA29314@shbuild999.sh.intel.com>
    <20220627123415.GA32052@shbuild999.sh.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 27, 2022 at 5:34 AM Feng Tang wrote:
>
> On Mon, Jun 27, 2022 at 10:46:21AM +0200, Eric Dumazet wrote:
> > On Mon, Jun 27, 2022 at 4:38 AM Feng Tang wrote:
> [snip]
> > > > >
> > > > > Thanks Feng. Can you check the value of memory.kmem.tcp.max_usage_in_bytes
> > > > > in /sys/fs/cgroup/memory/system.slice/lkp-bootstrap.service after making
> > > > > sure that the netperf test has already run?
> > > >
> > > > memory.kmem.tcp.max_usage_in_bytes:0
> > >
> > > Sorry, I made a mistake: the original report from Oliver was run
> > > with 'cgroup v2' on a 'debian-11.1' rootfs.
> > >
> > > When you asked about cgroup info, I tried the job on another tbox, and
> > > the original 'job.yaml' didn't work, so I kept the 'netperf' test
> > > parameters and started a new job, which somehow ran with a 'debian-10.4'
> > > rootfs and actually ran with cgroup v1.
> > >
> > > And as you mentioned, the cgroup version does make a big difference:
> > > with v1, the regression is reduced to 1% ~ 5% on different generations
> > > of test platforms. Eric mentioned they also got a regression report,
> > > but a much smaller one. Maybe it's due to the cgroup version?
> >
> > This was using the current net-next tree.
> > The recipe used was something like:
> >
> > Make sure cgroup2 is mounted or mount it by mount -t cgroup2 none $MOUNT_POINT.
> > Enable memory controller by echo +memory > $MOUNT_POINT/cgroup.subtree_control.
> > Create a cgroup by mkdir $MOUNT_POINT/job.
> > Jump into that cgroup by echo $$ > $MOUNT_POINT/job/cgroup.procs.
> >
> > The regression was smaller than 1%, so considered noise compared to
> > the benefits of the bug fix.
>
> Yes, 1% is just around noise level for a microbenchmark.
>
> I went to check the original test data of Oliver's report. The tests were
> run for 6 rounds and the performance data is pretty stable (0Day's report
> will show any std deviation bigger than 2%).
>
> The test platform is a 4-socket 72C/144T machine, and I ran the
> same job (nr_tasks = 25% * nr_cpus) on one CascadeLake AP (4 nodes)
> and one Icelake 2-socket platform, and saw 75% and 53% regression on
> them.
>
> In the first email, there is a file named 'reproduce', which shows the
> basic test process:
>
> "
>   use 'performance' cpufreq governor for all CPUs
>
>   netserver -4 -D
>   modprobe sctp
>   netperf -4 -H 127.0.0.1 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K &
>   netperf -4 -H 127.0.0.1 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K &
>   netperf -4 -H 127.0.0.1 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K &
>   (repeat 36 times in total)
>   ...
>
> "
>
> This starts 36 (25% of nr_cpus) netperf clients. The number of clients
> also matters: I tried increasing it from 36 to 72 (50%), and the
> regression changed from 69.4% to 73.7%.
>

Am I understanding correctly that this 69.4% (or 73.7%) regression is
with cgroup v2?

Eric did the experiments on v2 but on real hardware where the
performance impact was negligible.

BTW do you see similar regression for tcp as well or just sctp?
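
For anyone re-running this, the quoted 'reproduce' steps can be written as a loop so the client count follows the 25% (or 50%) of nr_cpus rule instead of being pasted 36 times. This is only a rough sketch, not the actual 0Day job: the function names nr_clients and launch_clients and the DRY_RUN variable are made up for illustration, the cpufreq governor step is left out, and the netserver, modprobe and netperf command lines are copied from the quoted file. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the quoted 'reproduce' steps. Hypothetical helper names;
# only the netserver/modprobe/netperf command lines come from the thread.
# DRY_RUN=1 (the default) prints commands instead of running them.

# nr_clients NR_CPUS PERCENT: number of clients, e.g. 25% of nr_cpus.
nr_clients() {
    n=$(( $1 * $2 / 100 ))
    # Always start at least one client on very small machines.
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# launch_clients COUNT: start COUNT netperf SCTP clients in parallel.
launch_clients() {
    i=0
    while [ "$i" -lt "$1" ]; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "netperf -4 -H 127.0.0.1 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K &"
        else
            netperf -4 -H 127.0.0.1 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K &
        fi
        i=$(( i + 1 ))
    done
    # Wait for all background clients (returns at once in dry-run mode).
    wait
}

# Server side and SCTP module, as in the quoted file.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "netserver -4 -D"
    echo "modprobe sctp"
else
    netserver -4 -D
    modprobe sctp
fi

# 25% of the online CPUs; fall back to 1 CPU if getconf is unavailable.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
launch_clients "$(nr_clients "$cpus" 25)"
```

On the 72C/144T machine from the report, nr_clients 144 25 gives the 36 clients of the original job, and nr_clients 144 50 gives the 72 clients of the second experiment. Setting DRY_RUN=0 would actually start netserver and the clients, which needs root for modprobe.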