From: Srikar Dronamraju
To: Michael Ellerman
Cc: linuxppc-dev, Srikar Dronamraju, Christophe Leroy, Josh Poimboeuf,
    linux-kernel@vger.kernel.org, Mark Rutland, Nicholas Piggin,
    "Paul E. McKenney", Peter Zijlstra, Rohan McLure, Valentin Schneider,
    Vincent Guittot, Aneesh
Subject: [PATCH v5 0/5] powerpc/smp: Topology and shared processor optimizations
Date: Thu, 14 Dec 2023 23:37:10 +0530
Message-ID: <20231214180720.310852-1-srikar@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.43.0
X-Mailing-List: linux-kernel@vger.kernel.org

PowerVM systems configured in shared processor mode have some unique
challenges. Some device-tree properties will be missing on a shared
processor.
Hence some sched domains may not make sense for shared processor
systems.

Most shared processor systems are over-provisioned. The underlying PowerVM
hypervisor schedules at Big Core granularity. The most recent Power
processors support two almost independent cores per Big Core. In a lightly
loaded condition, it helps overall system performance if we pack onto
fewer Big Cores.

Since each thread-group is independent, running threads on both
thread-groups of an SMT8 core should have minimal adverse impact in
non-over-provisioned scenarios. The changes in this patchset do not affect
the over-provisioned scenario: if there are more threads than SMT domains,
asym_packing will not kick in.

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=96 mem=1066409344 kB cpus=96 ent=64.00
So *64 Entitled cores/ 96 Virtual processor* Scenario

lscpu
Architecture:            ppc64le
Byte Order:              Little Endian
CPU(s):                  768
On-line CPU(s) list:     0-767
Model name:              POWER10 (architected), altivec supported
Model:                   2.0 (pvr 0080 0200)
Thread(s) per core:      8
Core(s) per socket:      16
Socket(s):               6
Hypervisor vendor:       pHyp
Virtualization type:     para
L1d cache:               6 MiB (192 instances)
L1i cache:               9 MiB (192 instances)
NUMA node(s):            6
NUMA node0 CPU(s):       0-7,32-39,80-87,128-135,176-183,224-231,272-279,320-327,368-375,416-423,464-471,512-519,560-567,608-615,656-663,704-711,752-759
NUMA node1 CPU(s):       8-15,40-47,88-95,136-143,184-191,232-239,280-287,328-335,376-383,424-431,472-479,520-527,568-575,616-623,664-671,712-719,760-767
NUMA node4 CPU(s):       64-71,112-119,160-167,208-215,256-263,304-311,352-359,400-407,448-455,496-503,544-551,592-599,640-647,688-695,736-743
NUMA node5 CPU(s):       16-23,48-55,96-103,144-151,192-199,240-247,288-295,336-343,384-391,432-439,480-487,528-535,576-583,624-631,672-679,720-727
NUMA node6 CPU(s):       72-79,120-127,168-175,216-223,264-271,312-319,360-367,408-415,456-463,504-511,552-559,600-607,648-655,696-703,744-751
NUMA node7 CPU(s):       24-31,56-63,104-111,152-159,200-207,248-255,296-303,344-351,392-399,440-447,488-495,536-543,584-591,632-639,680-687,728-735

ebizzy -t 32 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  3840178  4059268  3978042  3973936.6  84264.456
+patch     5  3768393  3927901  3874994  3854046    71532.926  -3.01692

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw       phint
6.6.0-rc3  4.16   0.00  0.00   95.84  26.06  40.72  4.16   69.88  276906989  578
+patch     4.16   0.00  0.00   95.83  17.70  27.66  4.17   78.26  70436663   119

ebizzy -t 128 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  5520692  5981856  5717709  5727053.2  176093.2
+patch     5  5305888  6259610  5854590  5843311    375917.03  2.02998

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw       phint
6.6.0-rc3  16.66  0.00  0.00   83.33  45.49  71.08  16.67  50.50  288778533  581
+patch     16.65  0.00  0.00   83.35  30.15  47.11  16.65  65.76  85196150   133

ebizzy -t 512 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min       Max       Median    Avg       Stddev     %Change
6.6.0-rc3  5  19563921  20049955  19701510  19728733  198295.18
+patch     5  19455992  20176445  19718427  19832017  304094.05  0.523521

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc   lbusy  app   vcsw       phint
6.6.0-rc3  66.44  0.01  0.00   33.55  94.14  147.09  66.45  1.33  313345175  621
+patch     66.44  0.01  0.00   33.55  94.15  147.11  66.45  1.33  109193889  309

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=40 mem=1067539392 kB cpus=96 ent=40.00
So *40 Entitled cores/ 40 Virtual processor* Scenario

lscpu
Architecture:            ppc64le
Byte Order:              Little Endian
CPU(s):                  320
On-line CPU(s) list:     0-319
Model name:              POWER10 (architected), altivec supported
Model:                   2.0 (pvr 0080 0200)
Thread(s) per core:      8
Core(s) per socket:      10
Socket(s):               4
Hypervisor vendor:       pHyp
Virtualization type:     para
L1d cache:               2.5 MiB (80 instances)
L1i cache:               3.8 MiB (80 instances)
NUMA node(s):            4
NUMA node0 CPU(s):       0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295
NUMA node1 CPU(s):       8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303
NUMA node4 CPU(s):       16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279,304-311
NUMA node5 CPU(s):       24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287,312-319

ebizzy -t 32 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  3535518  3864532  3745967  3704233.2  130216.76
+patch     5  3608385  3708026  3649379  3651596.6  37862.163  -1.42099

Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw     phint
6.6.0-rc3  10.00  0.01  0.00   89.99  22.98  57.45  10.01  41.01  1135139  262
+patch     10.00  0.00  0.00   90.00  16.95  42.37  10.00  47.05  925561   19

ebizzy -t 64 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  4434984  4957281  4548786  4591298.2  211770.2
+patch     5  4461115  4835167  4544716  4607795.8  151474.85  0.359323

Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw     phint
6.6.0-rc3  20.01  0.00  0.00   79.99  38.22  95.55  20.01  25.77  1287553  265
+patch     19.99  0.00  0.00   80.01  25.55  63.88  19.99  38.44  1077341  20

ebizzy -t 256 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  8850648  8982659  8951911  8936869.2  52278.031
+patch     5  8751038  9060510  8981409  8942268.4  117070.6   0.0604149

Kernel     %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw     phint
6.6.0-rc3  80.02  0.01  0.01   19.96  40.00  100.00  80.03  24.00  1597665  276
+patch     80.02  0.01  0.01   19.96  40.00  100.00  80.03  23.99  1383921  63

Observation:
We see comparable ebizzy throughput with noticeably lower core utilization
(physc is roughly 26-34% lower in the tables above) in low utilization
scenarios, while throughput is retained in mid and higher utilization
scenarios.

Note: The numbers are for the Uncapped + no-noise case. In the Capped
and/or noise case, due to contention on the cores, the numbers are expected
to improve further.

Note: The numbers were collected with the following patch applied:
(sched/fair: Enable group_asym_packing in find_idlest_group)
https://lore.kernel.org/all/20231018155036.2314342-1-srikar@linux.vnet.ibm.com/

Changelog
v4 -> v5:
1. Updated commit msg of patch 1 based on comments from Aneesh

v3 (https://lore.kernel.org/all/20231026101843.56784-1-srikar@linux.vnet.ibm.com) -> v4:
1. SPLPAR specific asym packing only for MC and DIE domains.
2. Changes due to rebase (DIE became PKG)

v2 (https://lore.kernel.org/all/20231018163751.2423181-1-srikar@linux.vnet.ibm.com) -> v3:
1. Handle comments from Peter Zijlstra / Michael Ellerman
2. Use __ro_after_init attribute instead of read_mostly
3. Use cpu_has_feature static_key instead of a new one.
4. "Build topology dynamically" patch added to this patchset.

v1 (https://lore.kernel.org/all/20230830105244.62477-1-srikar@linux.vnet.ibm.com) -> v2:
1. Last two patches were added in this version
2. This version uses static keys

Cc: Christophe Leroy
Cc: Josh Poimboeuf
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Rutland
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: "Paul E. McKenney"
Cc: Peter Zijlstra (Intel)
Cc: Rohan McLure
Cc: Valentin Schneider
Cc: Vincent Guittot
Cc: Aneesh

Srikar Dronamraju (5):
  powerpc/smp: Enable Asym packing for cores on shared processor
  powerpc/smp: Disable MC domain for shared processor
  powerpc/smp: Add __ro_after_init attribute
  powerpc/smp: Avoid asym packing within thread_group of a core
  powerpc/smp: Dynamically build Powerpc topology

 arch/powerpc/kernel/smp.c | 124 +++++++++++++++++++++-----------------
 1 file changed, 70 insertions(+), 54 deletions(-)

base-commit: 3c0fd4382b584d4bdc9564526841df32e9b6d817
-- 
2.35.3
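
For readers who want to cross-check the tables in this cover letter, the
%Change and %entc columns can be recomputed from the reported values. The
sketch below is not part of the patchset. The helper names are hypothetical,
and it assumes %Change is the relative difference of the Avg columns and
%entc is physc divided by the entitlement (ent), which is consistent with
the reported numbers.

```python
def pct_change(baseline_avg, patched_avg):
    """Relative change of the patched average vs. the baseline, in percent."""
    return (patched_avg - baseline_avg) / baseline_avg * 100.0


def entc(physc, entitled_cores):
    """Physical cores consumed as a percentage of the entitlement (assumed formula)."""
    return physc / entitled_cores * 100.0


def physc_reduction(baseline_physc, patched_physc):
    """How much lower physc is with the patch, in percent of the baseline."""
    return (baseline_physc - patched_physc) / baseline_physc * 100.0


if __name__ == "__main__":
    # ebizzy -t 32 on the 64 entitled cores / 96 virtual processors system
    print("%%Change:        %.5f" % pct_change(3973936.6, 3854046))
    print("%%entc baseline: %.2f" % entc(26.06, 64.00))
    print("%%entc +patch:   %.2f" % entc(17.70, 64.00))
    print("physc reduction: %.1f%%" % physc_reduction(26.06, 17.70))
```

Running the same helpers over the other rows gives a quick sanity check on
the transcribed tables.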