From: "Gautham R. Shenoy"
To: Srikar Dronamraju, Anton Blanchard, Vaidyanathan Srinivasan,
    Michael Ellerman, Michael Neuling, Nicholas Piggin, Nathan Lynch,
    Peter Zijlstra, Valentin Schneider
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
    "Gautham R. Shenoy"
Subject: [PATCH 2/3] powerpc/smp: Add support detecting thread-groups sharing L2 cache
Date: Fri, 4 Dec 2020 10:18:46 +0530
Message-Id: <1607057327-29822-3-git-send-email-ego@linux.vnet.ibm.com>
In-Reply-To: <1607057327-29822-1-git-send-email-ego@linux.vnet.ibm.com>
References: <1607057327-29822-1-git-send-email-ego@linux.vnet.ibm.com>

From: "Gautham R. Shenoy"

On POWER systems, groups of threads within a core that share the
L2-cache can be indicated by the "ibm,thread-groups" property array
with the identifier "2".

This patch adds support for detecting this and, when present,
populates the cpu_l2_cache_mask of every CPU with the core-siblings
that share the L2-cache with that CPU, as specified by the
"ibm,thread-groups" property array.
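For illustration only (this sketch is not part of the patch; it is a
minimal, standalone userspace C program with hypothetical names), the
following shows how a flattened "ibm,thread-groups" array decodes into
[property, nr_groups, threads_per_group, thread-list] tuples. It uses
the same values as the platform example further below:

	#include <stdio.h>

	int main(void)
	{
		/*
		 * Two concatenated entries: property 1 (L1/translation
		 * cache) and property 2 (L2 cache), each describing 2
		 * groups of 4 threads.
		 */
		unsigned int tg[] = {
			1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7,
			2, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7,
		};
		unsigned int n = sizeof(tg) / sizeof(tg[0]);
		unsigned int i = 0;

		/* Assumes a well-formed array: each tuple header is
		 * followed by exactly nr_groups * threads_per_group
		 * thread IDs. */
		while (i + 3 <= n) {
			unsigned int property = tg[i++];
			unsigned int nr_groups = tg[i++];
			unsigned int threads_per_group = tg[i++];
			unsigned int g, t;

			for (g = 0; g < nr_groups; g++) {
				printf("property %u, group %u:", property, g);
				for (t = 0; t < threads_per_group; t++)
					printf(" %u", tg[i++]);
				printf("\n");
			}
		}
		return 0;
	}

This prints the groups {0,2,4,6} and {1,3,5,7} for both properties,
matching the L2 thread-groups in the example below.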
On a platform with the following "ibm,thread-groups" configuration:

	00000001 00000002 00000004 00000000
	00000002 00000004 00000006 00000001
	00000003 00000005 00000007 00000002
	00000002 00000004 00000000 00000002
	00000004 00000006 00000001 00000003
	00000005 00000007

Without this patch, the sched-domain hierarchy for CPUs 0,1 would be

	CPU0 attaching sched-domain(s):
	 domain-0: span=0,2,4,6 level=SMT
	  domain-1: span=0-7 level=CACHE
	   domain-2: span=0-15,24-39,48-55 level=MC
	    domain-3: span=0-55 level=DIE
	CPU1 attaching sched-domain(s):
	 domain-0: span=1,3,5,7 level=SMT
	  domain-1: span=0-7 level=CACHE
	   domain-2: span=0-15,24-39,48-55 level=MC
	    domain-3: span=0-55 level=DIE

The CACHE domain at 0-7 is incorrect, since the ibm,thread-groups
sub-array

	[00000002 00000002 00000004
	 00000000 00000002 00000004 00000006
	 00000001 00000003 00000005 00000007]

indicates that L2 (property "2") is shared only between the threads
of a single group. There are "2" groups of threads, where each group
contains "4" threads, the groups being {0,2,4,6} and {1,3,5,7}.

With this patch, the sched-domain hierarchy for CPUs 0,1 would be

	CPU0 attaching sched-domain(s):
	 domain-0: span=0,2,4,6 level=SMT
	  domain-1: span=0-15,24-39,48-55 level=MC
	   domain-2: span=0-55 level=DIE
	CPU1 attaching sched-domain(s):
	 domain-0: span=1,3,5,7 level=SMT
	  domain-1: span=0-15,24-39,48-55 level=MC
	   domain-2: span=0-55 level=DIE

The CACHE domain with span=0,2,4,6 for CPU 0 (span=1,3,5,7 for CPU 1
resp.) gets degenerated into the SMT domain. Furthermore, the
last-level-cache domain gets correctly set to the SMT sched-domain.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/smp.c | 66 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 58 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6a242a3..a116d2d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -76,6 +76,7 @@ struct task_struct *secondary_current;
 
 bool has_big_cores;
 bool coregroup_enabled;
+bool thread_group_shares_l2;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
@@ -99,6 +100,7 @@ enum {
 #define MAX_THREAD_LIST_SIZE	8
 #define THREAD_GROUP_SHARE_L1	1
+#define THREAD_GROUP_SHARE_L2	2
 
 struct thread_groups {
 	unsigned int property;
 	unsigned int nr_groups;
@@ -107,7 +109,7 @@ struct thread_groups {
 };
 
 /* Maximum number of properties that groups of threads within a core can share */
-#define MAX_THREAD_GROUP_PROPERTIES 1
+#define MAX_THREAD_GROUP_PROPERTIES 2
 
 struct thread_groups_list {
 	unsigned int nr_properties;
@@ -121,6 +123,13 @@ struct thread_groups_list {
  */
 DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
+/*
+ * On some big-core systems, thread_group_l2_cache_map for each CPU
+ * corresponds to the set of its siblings within the core that share
+ * the L2-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map);
+
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
 
@@ -718,7 +727,9 @@ static void or_cpumasks_related(int i, int j, struct cpumask *(*srcmask)(int),
  *
  * ibm,thread-groups[i + 0] tells us the property based on which the
  * threads are being grouped together. If this value is 1, it implies
- * that the threads in the same group share L1, translation cache.
+ * that the threads in the same group share L1, translation cache. If
+ * the value is 2, it implies that the threads in the same group share
+ * the same L2 cache.
  *
  * ibm,thread-groups[i+1] tells us how many such thread groups exist for the
  * property ibm,thread-groups[i]
@@ -745,10 +756,10 @@ static void or_cpumasks_related(int i, int j, struct cpumask *(*srcmask)(int),
  * 12}.
  *
  * b) there are 2 groups of 4 threads each, where each group of
- * threads share some property indicated by the first value 2. The
- * "ibm,ppc-interrupt-server#s" of the first group is {5,7,9,11}
- * and the "ibm,ppc-interrupt-server#s" of the second group is
- * {6,8,10,12} structure
+ * threads share some property indicated by the first value 2 (L2
+ * cache). The "ibm,ppc-interrupt-server#s" of the first group is
+ * {5,7,9,11} and the "ibm,ppc-interrupt-server#s" of the second
+ * group is {6,8,10,12} structure
  *
  * Returns 0 on success, -EINVAL if the property does not exist,
  * -ENODATA if property does not have a value, and -EOVERFLOW if the
@@ -840,7 +851,8 @@ static int init_cpu_cache_map(int cpu, unsigned int cache_property)
 	if (!dn)
 		return -ENODATA;
 
-	if (!(cache_property == THREAD_GROUP_SHARE_L1))
+	if (!(cache_property == THREAD_GROUP_SHARE_L1 ||
+	      cache_property == THREAD_GROUP_SHARE_L2))
 		return -EINVAL;
 
 	if (!cpu_tgl->nr_properties) {
@@ -867,7 +879,10 @@ static int init_cpu_cache_map(int cpu, unsigned int cache_property)
 		goto out;
 	}
 
-	mask = &per_cpu(cpu_l1_cache_map, cpu);
+	if (cache_property == THREAD_GROUP_SHARE_L1)
+		mask = &per_cpu(cpu_l1_cache_map, cpu);
+	else if (cache_property == THREAD_GROUP_SHARE_L2)
+		mask = &per_cpu(thread_group_l2_cache_map, cpu);
 
 	zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
 
@@ -973,6 +988,16 @@ static int init_big_cores(void)
 	}
 
 	has_big_cores = true;
+
+	for_each_possible_cpu(cpu) {
+		int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L2);
+
+		if (err)
+			return err;
+	}
+
+	thread_group_shares_l2 = true;
+	pr_info("Thread-groups in a core share L2-cache\n");
 	return 0;
 }
 
@@ -1287,6 +1312,31 @@ static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
 	if (has_big_cores)
 		submask_fn = cpu_smallcore_mask;
 
+	/*
+	 * If the threads in a thread-group share L2 cache, then the
+	 * L2-mask can be obtained from thread_group_l2_cache_map.
+	 */
+	if (thread_group_shares_l2) {
+		/* Siblings that share L1 are a subset of siblings that share L2. */
+		or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
+		if (*mask) {
+			cpumask_andnot(*mask,
+				       per_cpu(thread_group_l2_cache_map, cpu),
+				       cpu_l2_cache_mask(cpu));
+		} else {
+			mask = &per_cpu(thread_group_l2_cache_map, cpu);
+		}
+
+		for_each_cpu(i, *mask) {
+			if (!cpu_online(i))
+				continue;
+			set_cpus_related(i, cpu, cpu_l2_cache_mask);
+		}
+
+		return true;
+	}
+
 	l2_cache = cpu_to_l2cache(cpu);
 	if (!l2_cache || !*mask) {
 		/* Assume only core siblings share cache with this CPU */
-- 
1.9.4