Message-ID: <1526268071.30369.20.camel@neuling.org>
Subject: Re: [PATCH 1/2] powerpc: Detect the presence of big-core with interleaved threads
From: Michael Neuling
To: "Gautham R.
Shenoy", Michael Ellerman, Benjamin Herrenschmidt, Vaidyanathan Srinivasan, Akshay Adiga, Shilpasri G Bhat, Balbir Singh, Oliver O'Halloran, Nicholas Piggin
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Date: Mon, 14 May 2018 13:21:11 +1000
In-Reply-To: <1526037444-22876-2-git-send-email-ego@linux.vnet.ibm.com>
References: <1526037444-22876-1-git-send-email-ego@linux.vnet.ibm.com>
 <1526037444-22876-2-git-send-email-ego@linux.vnet.ibm.com>

Thanks for posting this... A couple of comments below.

On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy"
>
> A pair of IBM POWER9 SMT4 cores can be fused together to form a
> big-core with 8 SMT threads. This can be discovered via the
> "ibm,thread-groups" CPU property in the device tree which will
> indicate which group of threads that share the L1 cache, translation
> cache and instruction data flow. If there are multiple such group of
> threads, then the core is a big-core. The thread-ids of the threads of
> the big-core can be obtained by interleaving the thread-ids of the
> thread-groups (component small core).
>
> Eg: Threads in the pair of component SMT4 cores of an interleaved
> big-core are numbered {0,2,4,6} and {1,3,5,7} respectively.
>
> This patch introduces a function to check if a given device tree node
> corresponding to a CPU node represents an interleaved big-core.
>
> This function is invoked during the boot-up to detect the presence of
> interleaved big-cores. The presence of such an interleaved big-core is
> recorded in a global variable for later use.
>
> Signed-off-by: Gautham R. Shenoy
> ---
>  arch/powerpc/include/asm/cputhreads.h |  8 +++--
>  arch/powerpc/kernel/setup-common.c    | 63 +++++++++++++++++++++++++++++++++--
>  2 files changed, 66 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
> index d71a909..b706f0a 100644
> --- a/arch/powerpc/include/asm/cputhreads.h
> +++ b/arch/powerpc/include/asm/cputhreads.h
> @@ -23,11 +23,13 @@
>  extern int threads_per_core;
>  extern int threads_per_subcore;
>  extern int threads_shift;
> +extern bool has_interleaved_big_core;
>  extern cpumask_t threads_core_mask;
>  #else
> -#define threads_per_core 1
> -#define threads_per_subcore 1
> -#define threads_shift 0
> +#define threads_per_core         1
> +#define threads_per_subcore      1
> +#define threads_shift            0
> +#define has_interleaved_big_core 0
>  #define threads_core_mask (*get_cpu_mask(0))
>  #endif
>
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 0af5c11..884dff2 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -408,10 +408,12 @@ void __init check_for_initrd(void)
>  #ifdef CONFIG_SMP
>
>  int threads_per_core, threads_per_subcore, threads_shift;
> +bool has_interleaved_big_core;
>  cpumask_t threads_core_mask;
>  EXPORT_SYMBOL_GPL(threads_per_core);
>  EXPORT_SYMBOL_GPL(threads_per_subcore);
>  EXPORT_SYMBOL_GPL(threads_shift);
> +EXPORT_SYMBOL_GPL(has_interleaved_big_core);
>  EXPORT_SYMBOL_GPL(threads_core_mask);
>
>  static void __init cpu_init_thread_core_maps(int tpc)
> @@ -436,8 +438,56 @@ static void __init cpu_init_thread_core_maps(int tpc)
>          printk(KERN_DEBUG " (thread shift is %d)\n", threads_shift);
>  }
>
> -
>  u32 *cpu_to_phys_id = NULL;
> +/*
> + * check_for_interleaved_big_core - Checks if the core represented by
> + *      dn is a big-core whose threads are interleavings of the
> + *      threads of the component small cores.
> + *
> + * @dn: device node corresponding to the core.
> + *
> + * Returns true if the core is a interleaved big-core.
> + * Returns false otherwise.
> + */
> +static inline bool check_for_interleaved_big_core(struct device_node *dn)
> +{
> +        int len, nr_groups, threads_per_group;
> +        const __be32 *thread_groups;
> +        __be32 *thread_list, *first_cpu_idx;
> +        int cur_cpu, next_cpu, i, j;
> +
> +        thread_groups = of_get_property(dn, "ibm,thread-groups", &len);
> +        if (!thread_groups)
> +                return false;

Can you document what this property looks like? Seems to be nr_groups,
threads_per_group, thread_list. Can you explain what each of these means?

If we get configured with an SMT2 big-core (i.e. two interleaved SMT1 normal
cores), will this code also work there?

> +
> +        nr_groups = be32_to_cpu(*(thread_groups + 1));
> +        if (nr_groups <= 1)
> +                return false;
> +
> +        threads_per_group = be32_to_cpu(*(thread_groups + 2));
> +        thread_list = (__be32 *)thread_groups + 3;
> +
> +        /*
> +         * In case of an interleaved big-core, the thread-ids of the
> +         * big-core can be obtained by interleaving the the thread-ids
> +         * of the component small
> +         *
> +         * Eg: On a 8-thread big-core with two SMT4 small cores, the
> +         * threads of the two component small cores will be
> +         * {0, 2, 4, 6} and {1, 3, 5, 7}.
> +         */
> +        for (i = 0; i < nr_groups; i++) {
> +                first_cpu_idx = thread_list + i * threads_per_group;
> +
> +                for (j = 0; j < threads_per_group - 1; j++) {
> +                        cur_cpu = be32_to_cpu(*(first_cpu_idx + j));
> +                        next_cpu = be32_to_cpu(*(first_cpu_idx + j + 1));
> +                        if (next_cpu != cur_cpu + nr_groups)
> +                                return false;
> +                }
> +        }
> +        return true;
> +}
>
>  /**
>   * setup_cpu_maps - initialize the following cpu maps:
> @@ -565,7 +615,16 @@ void __init smp_setup_cpu_maps(void)
>          vdso_data->processorCount = num_present_cpus();
>  #endif /* CONFIG_PPC64 */
>
> -        /* Initialize CPU <=> thread mapping/
> +        dn = of_find_node_by_type(NULL, "cpu");
> +        if (dn) {
> +                if (check_for_interleaved_big_core(dn)) {
> +                        has_interleaved_big_core = true;
> +                        pr_info("Detected interleaved big-cores\n");

Is there a runtime way to check this also? If the dmesg buffer overflows, we
lose this.

Mikey

> +                }
> +                of_node_put(dn);
> +        }
> +
> +        /* Initialize CPU <=> thread mapping/
>           *
>           * WARNING: We assume that the number of threads is the same for
>           * every CPU in the system. If that is not the case, then some code