From: Chen Yu
Date: Sat, 9 Apr 2022 23:09:07 +0800
Subject: Re: [PATCH v2][RFC] sched/fair: Change SIS_PROP to search idle CPU based on sum of util_avg
To: Yicong Yang
Cc: Chen Yu, Yicong Yang, yangyicong@hisilicon.com, Linux Kernel Mailing List, Tim Chen, Peter Zijlstra, Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Mel Gorman, Viresh Kumar, Barry Song <21cnbao@gmail.com>, Barry Song, Srikar Dronamraju, Len Brown, Ben Segall, Daniel Bristot de Oliveira, Aubrey Li, K Prateek Nayak, "shenyang (M)"
References: <20220310005228.11737-1-yu.c.chen@intel.com> <20220318034335.GA12577@chenyu5-mobl1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 5, 2022 at 9:05 AM Yicong Yang wrote:
>
> FYI, shenyang has done some investigation on whether we can get an idle cpu if the nr is 4.
> For netperf running on node 0-1 (32 cores on each node) with 32, 64, 128 threads, the success
> rate of finding an idle cpu is about 61.8%, 7.4%, <0.1%, the CPU utilization is 70.7%, 87.4%
> and 99.9% respectively.
>
Thanks for the testing. This indicates that nr = 4 would not improve the
idle CPU search efficiency much when the load is extremely high. Stopping
the search entirely when utilization is nearly 100% may be more appropriate.

> I have tested this patch based on 5.17-rc7 on Kunpeng 920. The benchmarks are bound to node 0
> or node 0-1. The tbench result has some oscillation so I need to have a further check.
> For netperf I see performance enhancement when the number of threads equals the cpu number.
>
The benefit might come from returning the previous CPU earlier when
nr_threads equals nr_cpu. When the number of threads exceeds the number
of CPUs, the previous CPU might already have been returned without this
patch, so we didn't see much improvement there (in Shenyang's test, the
success rate is only 7.4% when the number of threads equals the number
of CPUs).

> For netperf:
> TCP_RR 2 nodes
> threads    base           patched        pct
> 16         50335.56667    49970.63333    -0.73%
> 32         47281.53333    48191.93333     1.93%
> 64         18907.7        34263.63333    81.22%
> 128        14391.1        14480.8         0.62%
> 256        6905.286667    6853.83        -0.75%
>
> TCP_RR 1 node
> threads    base           patched        pct
> 16         50086.06667    49648.13333    -0.87%
> 32         24983.3        39489.43333    58.06%
> 64         18340.03333    18399.56667     0.32%
> 128        7174.713333    7390.09         3.00%
> 256        3433.696667    3404.956667    -0.84%
>
> UDP_RR 2 nodes
> threads    base           patched        pct
> 16         81448.7        82659.43333     1.49%
> 32         75351.13333    76812.36667     1.94%
> 64         25539.46667    41835.96667    63.81%
> 128        25081.56667    23595.56667    -5.92%
> 256        11848.23333    11017.13333    -7.01%
>
> UDP_RR 1 node
> threads    base           patched        pct
> 16         87288.96667    88719.83333      1.64%
> 32         22891.73333    68854.33333    200.78%
> 64         33853.4        35891.6          6.02%
> 128        12108.4        11885.76667     -1.84%
> 256        5620.403333    5531.006667     -1.59%
>
> mysql on node 0-1
>                        base         patched      pct
> 16threads-TPS          7100.27      7224.31       1.75%
> 16threads-QPS          142005.45    144486.19     1.75%
> 16threads-avg lat      2.25         2.22          1.63%
> 16threads-99th lat     2.46         2.43          1.08%
> 24threads-TPS          10424.70     10312.20     -1.08%
> 24threads-QPS          208493.86    206243.93    -1.08%
> 24threads-avg lat      2.30         2.32         -0.87%
> 24threads-99th lat     2.52         2.57         -1.85%
> 32threads-TPS          12528.79     12228.88     -2.39%
> 32threads-QPS          250575.92    244577.59    -2.39%
> 32threads-avg lat      2.55         2.61         -2.35%
> 32threads-99th lat     2.88         2.99         -3.82%
> 64threads-TPS          21386.17     21789.99      1.89%
> 64threads-QPS          427723.41    435799.85     1.89%
> 64threads-avg lat      2.99         2.94          1.78%
> 64threads-99th lat     5.00         4.69          6.33%
> 128threads-TPS         20865.13     20781.24     -0.40%
> 128threads-QPS         417302.73    415624.83    -0.40%
> 128threads-avg lat     6.13         6.16         -0.38%
> 128threads-99th lat    8.90         8.95         -0.60%
> 256threads-TPS         19258.15     19295.11      0.19%
> 256threads-QPS         385162.92    385902.27     0.19%
> 256threads-avg lat     13.29        13.26         0.23%
> 256threads-99th lat    20.12        20.12         0.00%
>
> I also had a look on a machine with 2-socket Xeon 6148 (80 threads in total)
> For TCP_RR, the best enhancement also happens when the threads equals to
> the cpu number.
>
May I know whether the test was run with turbo enabled or disabled? If
turbo is disabled, there might be some issues when calculating the
util_avg. I have a workaround at
https://lore.kernel.org/all/20220407234258.569681-1-yu.c.chen@intel.com/
I'm working on the v3 patch, which will include the above workaround,
and will send it out later.

--
Thanks,
Chenyu
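
To make the "stop searching when utilization is nearly 100%" idea above concrete, here is a rough user-space sketch in Python. It is not the code from the patch: the function name scan_depth, the 95% cutoff, the floor of 4 CPUs and the linear scaling are all assumptions chosen for illustration. The only value taken from the kernel is 1024 as the per-CPU capacity scale.

```python
def scan_depth(llc_cpus: int, sum_util: float, cpu_capacity: float = 1024.0,
               cutoff: float = 0.95, floor: int = 4) -> int:
    """Return how many CPUs to probe when looking for an idle one.

    Hypothetical policy, not the formula used by the patch:
      - utilization ratio = sum of util_avg / total capacity of the LLC domain
      - at or above `cutoff`, give up immediately (depth 0)
      - otherwise scale the depth with the spare capacity, with a small floor
    """
    if llc_cpus <= 0:
        return 0
    util_ratio = min(sum_util / (llc_cpus * cpu_capacity), 1.0)
    if util_ratio >= cutoff:
        # Nearly saturated: a scan is very unlikely to find an idle CPU.
        return 0
    depth = int(llc_cpus * (1.0 - util_ratio))
    return max(depth, min(floor, llc_cpus))


if __name__ == "__main__":
    # 64 CPUs, as in the "node 0-1, 32 cores on each node" netperf setup.
    for pct in (70.7, 87.4, 99.9):
        util = pct / 100 * 64 * 1024
        print(f"utilization {pct}% -> scan depth {scan_depth(64, util)}")
```

With these assumed parameters, the 99.9% utilization case from Shenyang's measurement yields a depth of 0, so the search is skipped and the previous CPU would be returned right away, while the 70.7% case still scans a good part of the domain.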