Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp2026014pxa; Fri, 7 Aug 2020 00:47:25 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyre31SvQYq5w481rySx3cxIHxi/eh2GGqoxUYECOU7pOlW2pR8APcmR/UVD6aZi7pGX1OR X-Received: by 2002:a17:906:7752:: with SMTP id o18mr7890017ejn.150.1596786445693; Fri, 07 Aug 2020 00:47:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596786445; cv=none; d=google.com; s=arc-20160816; b=Kd3ldfLc0i6clS1yIrWlQgCCBuHRBMabz3MnUVU/NzjhdOAQ1qWsyQjQreg2+sKfPi FNC5rjpzAKRSIuAMYuqmOm/HgcTbBQfFG0Sh+6JibqueHi3DrBgx80xUyTM4f34S+u8H kKGUGRjr/G1OEIqcn5iy3lnldgHCvAayrhzmycc5hsAObHAZBU50EGILJHyNJ9kfhd1c +RuBjS2/Cs7chkueBgxwxjLPRJWp2FTO5faWYjNf0+yt5fOBfM4Vw/7ETOwjLkeSr4Ig n9gYk9+woKpzPBUcKWQ7NLM5BNkp0KWoWFBFvThYSlKeaJtH45QOVmcgEyTevLoq4P+0 /BMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=aUlNL7sr0ua7PsthmyzDvPCr3i3J9piy9BAkH+ZcaXs=; b=jmr7NG/EiCih5K8CoxfmYKLSZDHPNQ/Sn0/DMDsRLI0T5Ujh43PnkL3gkJ5+IB9/k9 5xLIkVZRv3ncs1r3n3ijXVu+wwGKEkOwICwOddgpLFvD6//FKT4mUjN1w6nqBSg2kJMu LSbFVKHEiJLyY0yqUfGfU80H9qWJliMIbuwV+N0xn/HfpCgegsM0dJ7xdMyuBDRHlayO 2NvgybZ/KH2slmwt/eDqpKjr+m/i+Jf2FQgC8AqreDErqacfzHpBQ4Ut8tRSJo0fjXW0 fQ8nBh86eimvA+WTNpq8bxkUl28nq/e9pDNW8xxD4p1tpaIcMB6psPAATgHCoXlSmxs7 UYiQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=omKXMc7x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o17si4847903ejg.156.2020.08.07.00.47.02; Fri, 07 Aug 2020 00:47:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=omKXMc7x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727120AbgHGHp4 (ORCPT + 99 others); Fri, 7 Aug 2020 03:45:56 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:17654 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727079AbgHGHpx (ORCPT ); Fri, 7 Aug 2020 03:45:53 -0400 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 0777VYKW083756; Fri, 7 Aug 2020 03:45:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=aUlNL7sr0ua7PsthmyzDvPCr3i3J9piy9BAkH+ZcaXs=; b=omKXMc7xADJ6av7ZYWwzKaKFvp/9fMaNOOOm0V1BqOhEXydtUq81eaeamMm/iuEYUCpI 5MlI3N80cGxOdU8T2Jf5ooJX5wXLA47dpj+gEbrNrsPz9tNDaz6cyf8dLABmX1uv0Ldo 3Da2ef3wNMZdm5D1Rvf8M5khcEWokhisIPH8gN9FToCci23q6j1LryvMNFMrWW/r/haF CeXk8tsM75HglKMmTy3+r6NPaEzGIQ2YXM277LKAeI+nrUgZfERp0m+3bxeRLh1+ho6w AbDZBiT9636XmE035GiBKi9c+Of6/ywRJsH0/bpEOIdmNRSd4StXWHMk06eom82MUR/F SQ== Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72]) by mx0b-001b2d01.pphosted.com with ESMTP id 32rnu9meuj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 07 Aug 2020 03:45:34 -0400 Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1]) by ppma06fra.de.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 0777fTOP013194; Fri, 7 Aug 2020 07:45:32 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma06fra.de.ibm.com with ESMTP id 32mynhbwrv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 07 Aug 2020 07:45:32 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0777jUVe31195456 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 7 Aug 2020 07:45:30 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5993CAE053; Fri, 7 Aug 2020 07:45:30 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A6F79AE056; Fri, 7 Aug 2020 07:45:26 +0000 (GMT) Received: from srikart450.in.ibm.com (unknown [9.199.45.118]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 7 Aug 2020 07:45:26 +0000 (GMT) From: Srikar Dronamraju To: Michael Ellerman Cc: linuxppc-dev , Srikar Dronamraju , LKML , Michael Neuling , Gautham R Shenoy , Ingo Molnar , Peter Zijlstra , Valentin Schneider , Dietmar Eggemann , Mel Gorman , Vincent Guittot , Vaidyanathan Srinivasan Subject: [PATCH v2 2/2] powerpc/topology: Override cpu_smt_mask Date: Fri, 7 Aug 2020 13:15:17 +0530 Message-Id: <20200807074517.27957-2-srikar@linux.vnet.ibm.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200807074517.27957-1-srikar@linux.vnet.ibm.com> References: <20200807074517.27957-1-srikar@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235,18.0.687 definitions=2020-08-07_02:2020-08-06,2020-08-07 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 phishscore=0 lowpriorityscore=0 clxscore=1015 malwarescore=0 mlxlogscore=999 impostorscore=0 adultscore=0 spamscore=0 priorityscore=1501 mlxscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2008070051 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Power9, a pair of SMT4 cores can be presented by the firmware as a SMT8 core for backward compatibility reasons, with the fusion of two SMT4 cores. Powerpc allows LPARs to be live migrated from Power8 to Power9. Existing software developed/configured for Power8, expects to see a SMT8 core. In order to maintain userspace backward compatibility (with Power8 chips in case of Power9) in enterprise Linux systems, the topology_sibling_cpumask has to be set to SMT8 core. cpu_smt_mask() should generally point to the cpu mask of the SMT4 core. Hence override the default cpu_smt_mask() to be powerpc specific allowing for better scheduling behaviour on Power. schbench (latency measured in usecs, so lesser is better) Without patch With patch Latency percentiles (usec) Latency percentiles (usec) 50.0000th: 34 50.0000th: 38 75.0000th: 47 75.0000th: 52 90.0000th: 54 90.0000th: 60 95.0000th: 57 95.0000th: 64 *99.0000th: 62 *99.0000th: 72 99.5000th: 65 99.5000th: 75 99.9000th: 76 99.9000th: 3452 min=0, max=9205 min=0, max=9344 schbench (With Cede disabled) Without patch With patch Latency percentiles (usec) Latency percentiles (usec) 50.0000th: 20 50.0000th: 21 75.0000th: 28 75.0000th: 29 90.0000th: 33 90.0000th: 34 95.0000th: 35 95.0000th: 37 *99.0000th: 40 *99.0000th: 40 99.5000th: 48 99.5000th: 42 99.9000th: 94 99.9000th: 79 min=0, max=791 min=0, max=791 perf bench sched pipe usec/ops : lesser is better Without patch N Min Max Median Avg Stddev 101 5.095113 5.595269 5.204842 5.2298776 0.10762713 5.10 - 5.15 : ################################################## 23% (24) 5.15 - 5.20 : ############################################# 21% (22) 5.20 - 5.25 : ################################################## 23% (24) 5.25 - 5.30 : ######################### 11% (12) 5.30 - 5.35 : ########## 4% (5) 5.35 - 5.40 : ######## 3% (4) 5.40 - 5.45 : ######## 3% (4) 5.45 - 5.50 : #### 1% (2) 5.50 - 5.55 : ## 0% (1) 5.55 - 5.60 : #### 1% (2) With patch N Min Max Median Avg Stddev 101 5.134675 8.524719 5.207658 5.2780985 0.34911969 5.1 - 5.5 : ################################################## 94% (95) 5.5 - 5.8 : ## 3% (4) 5.8 - 6.2 : 0% (1) 6.2 - 6.5 : 6.5 - 6.8 : 6.8 - 7.2 : 7.2 - 7.5 : 7.5 - 7.8 : 7.8 - 8.2 : 8.2 - 8.5 : perf bench sched pipe (cede disabled) usec/ops : lesser is better Without patch N Min Max Median Avg Stddev 101 7.884227 12.576538 7.956474 8.0170722 0.46159054 7.9 - 8.4 : ################################################## 99% (100) 8.4 - 8.8 : 8.8 - 9.3 : 9.3 - 9.8 : 9.8 - 10.2 : 10.2 - 10.7 : 10.7 - 11.2 : 11.2 - 11.6 : 11.6 - 12.1 : 12.1 - 12.6 : With patch N Min Max Median Avg Stddev 101 7.956021 8.217284 8.015615 8.0283866 0.049844967 7.96 - 7.98 : ###################### 12% (13) 7.98 - 8.01 : ################################################## 28% (29) 8.01 - 8.03 : #################################### 20% (21) 8.03 - 8.06 : ######################### 14% (15) 8.06 - 8.09 : ###################### 12% (13) 8.09 - 8.11 : ###### 3% (4) 8.11 - 8.14 : ### 1% (2) 8.14 - 8.17 : ### 1% (2) 8.17 - 8.19 : 8.19 - 8.22 : # 0% (1) Observations: With the patch, the initial run/iteration takes a slight longer time. This can be attributed to the fact that now we pick a CPU from a idle core which could be sleep mode. Once we remove the cede, state the numbers improve in favour of the patch. ebizzy: transactions per second (higher is better) without patch N Min Max Median Avg Stddev 100 1018433 1304470 1193208 1182315.7 60018.733 1018433 - 1047037 : ###### 3% (3) 1047037 - 1075640 : ######## 4% (4) 1075640 - 1104244 : ######## 4% (4) 1104244 - 1132848 : ############### 7% (7) 1132848 - 1161452 : #################################### 17% (17) 1161452 - 1190055 : ########################## 12% (12) 1190055 - 1218659 : ############################################# 21% (21) 1218659 - 1247263 : ################################################## 23% (23) 1247263 - 1275866 : ######## 4% (4) 1275866 - 1304470 : ######## 4% (4) with patch N Min Max Median Avg Stddev 100 967014 1292938 1208819 1185281.8 69815.851 967014 - 999606 : ## 1% (1) 999606 - 1032199 : ## 1% (1) 1032199 - 1064791 : ############ 6% (6) 1064791 - 1097384 : ########## 5% (5) 1097384 - 1129976 : ################## 9% (9) 1129976 - 1162568 : #################### 10% (10) 1162568 - 1195161 : ########################## 13% (13) 1195161 - 1227753 : ############################################ 22% (22) 1227753 - 1260346 : ################################################## 25% (25) 1260346 - 1292938 : ############## 7% (7) Observations: Not much changes, ebizzy is not much impacted. Cc: linuxppc-dev Cc: LKML Cc: Michael Ellerman Cc: Michael Neuling Cc: Gautham R Shenoy Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Valentin Schneider Cc: Dietmar Eggemann Cc: Mel Gorman Cc: Vincent Guittot Cc: Vaidyanathan Srinivasan Signed-off-by: Srikar Dronamraju --- Changelog v1->v2: Modified commit msg as per mailing list discussion. Added performance numbers arch/powerpc/include/asm/cputhreads.h | 1 - arch/powerpc/include/asm/smp.h | 13 +++++++++++++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h index deb99fd6e060..98c8bd155bf9 100644 --- a/arch/powerpc/include/asm/cputhreads.h +++ b/arch/powerpc/include/asm/cputhreads.h @@ -23,7 +23,6 @@ extern int threads_per_core; extern int threads_per_subcore; extern int threads_shift; -extern bool has_big_cores; extern cpumask_t threads_core_mask; #else #define threads_per_core 1 diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 9cd0765633c5..bb06aa875131 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -131,6 +131,19 @@ static inline struct cpumask *cpu_smallcore_mask(int cpu) extern int cpu_to_core_id(int cpu); +extern bool has_big_cores; + +#define cpu_smt_mask cpu_smt_mask +#ifdef CONFIG_SCHED_SMT +static inline const struct cpumask *cpu_smt_mask(int cpu) +{ + if (has_big_cores) + return per_cpu(cpu_smallcore_map, cpu); + + return per_cpu(cpu_sibling_map, cpu); +} +#endif /* CONFIG_SCHED_SMT */ + /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers. * * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up -- 2.18.2