From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Srikar Dronamraju, Michael Ellerman, Sasha Levin
Subject: [PATCH 5.10 096/306] powerpc/smp: Update cpu_core_map on all PowerPc systems
Date: Thu, 16 Sep 2021 17:57:21 +0200
Message-Id: <20210916155757.332816390@linuxfoundation.org>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20210916155753.903069397@linuxfoundation.org>
References: <20210916155753.903069397@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Srikar Dronamraju

[ Upstream commit b8b928030332a0ca16d42433eb2c3085600d8704 ]

lscpu uses core_siblings to list the number of sockets in the
system. core_siblings is set using topology_core_cpumask.

While optimizing the powerpc bootup path in commit 4ca234a9cbd7
("powerpc/smp: Stop updating cpu_core_mask"), it was found that
updating cpu_core_mask() took a lot of time. It was thought that on
PowerPC, cpu_core_mask() would always be the same as cpu_cpu_mask(),
i.e. the number of sockets would always equal the number of nodes.
As an optimization, cpu_core_mask() was made a snapshot of
cpu_cpu_mask().

However, that assumption was found to be false with PowerPC KVM
guests, where each node could have more than one socket. So with
commit c47f892d7aa6 ("powerpc/smp: Reintroduce cpu_core_mask"),
cpu_core_mask was updated based on chip_id, but in an optimized way
using some mask manipulations and chip_id caching.

However, systems that are neither PowerNV nor pseries KVM guests
(i.e. those not implementing cpu_to_chip_id()) continued to use a
copy of cpu_cpu_mask().

Two issues were noticed on such systems:

1. lscpu would report one extra socket.

On an IBM,9009-42A (aka zz system), which has only 2 chips/sockets/
nodes, lscpu would report:

Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              160
On-line CPU(s) list: 0-159
Thread(s) per core:  8
Core(s) per socket:  6
Socket(s):           3                <--------------
NUMA node(s):        2
Model:               2.2 (pvr 004e 0202)
Model name:          POWER9 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            10240K
NUMA node0 CPU(s):   0-79
NUMA node1 CPU(s):   80-159

2. Currently cpu_cpu_mask is updated when a core is added/removed.
However, it is not updated on SMT mode switching or when CPUs are
explicitly offlined.
However, all other percpu masks are updated to ensure only
active/online CPUs are in the masks. This results in
build_sched_domain traces, since there will be CPUs in
cpu_cpu_mask() that are not present in the SMT / CACHE / MC / NUMA
domains. A loop of threads running SMT mode switching and core
add/remove will soon show this trace.

Hence cpu_cpu_mask has to be updated at SMT mode switch. This will
have an impact on cpu_core_mask(), which is a snapshot of
cpu_cpu_mask. Different CPUs within the same socket will end up
having different cpu_core_masks, since they are snapshots taken at
different points in time. This means lscpu will start reporting many
more sockets than the actual number of sockets / nodes / chips.

Different ways to handle this problem:

A. Update the snapshot, aka cpu_core_mask, for all CPUs whenever
cpu_cpu_mask is updated. This would be a non-optimal solution.

B. Instead of a cpumask_var_t, make cpu_core_map a cpumask pointer
pointing to cpu_cpu_mask. However, a percpu cpumask pointer is
frowned upon, and we need a clean way to handle PowerPC KVM guests,
where the mask is not a snapshot.

C. Update cpu_core_masks on all PowerPC systems, as is done for
PowerPC KVM guests, using mask manipulations. This approach is
relatively simple and unifies with the existing code.

D. On top of C, we could also resurrect get_physical_package_id,
which could return a nid for the said CPU. However, this is not
needed at this time.

Option C is the preferred approach for now.

While this is somewhat a revert of commit 4ca234a9cbd7
("powerpc/smp: Stop updating cpu_core_mask"):

1. A plain revert has some conflicts.
2. For chip_id == -1, the cpu_core_mask is made identical to
cpu_cpu_mask, unlike previously, where cpu_core_mask was set to a
core if chip_id doesn't exist.

This goes by the principle that if chip_id is not exposed, then
sockets / chip / node share the same set of CPUs.
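
To make the snapshot problem concrete, here is a minimal sketch in plain userspace C, not kernel code. It assumes that a tool counts one socket per distinct core_siblings mask among online CPUs, and it models masks as 64-bit integers on a hypothetical 8-CPU, 2-node system. All toy_* names and TOY_* macros are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8			/* hypothetical toy size, not a kernel constant */
#define TOY_BIT(cpu) (UINT64_C(1) << (cpu))

/* Count one socket per distinct core mask seen among the online CPUs. */
static int toy_count_sockets(const uint64_t *core_mask, uint64_t online)
{
	uint64_t seen[TOY_NR_CPUS];
	int nseen = 0;

	assert(core_mask);
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		int dup = 0;

		if (!(online & TOY_BIT(cpu)))
			continue;
		for (int j = 0; j < nseen; j++)
			if (seen[j] == core_mask[cpu])
				dup = 1;
		if (!dup)
			seen[nseen++] = core_mask[cpu];
	}
	return nseen;
}

/* Each CPU keeps a private copy of its node mask, taken at different times. */
static int toy_stale_snapshot_sockets(void)
{
	uint64_t core_mask[TOY_NR_CPUS];
	uint64_t online = 0xff;

	/* Boot: node 0 = CPUs 0-3, node 1 = CPUs 4-7; all snapshots agree. */
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		core_mask[cpu] = cpu < 4 ? 0x0f : 0xf0;

	/* CPU 3 goes offline, so the live node 0 mask shrinks to 0x07. */
	online &= ~TOY_BIT(3);

	/* CPU 0 takes a fresh snapshot; CPUs 1-2 still hold the old 0x0f. */
	core_mask[0] = 0x07;

	return toy_count_sockets(core_mask, online);
}

/* Same event, but every core mask is kept in step with the online set. */
static int toy_consistent_sockets(void)
{
	uint64_t core_mask[TOY_NR_CPUS];
	uint64_t online = 0xff & ~TOY_BIT(3);

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		core_mask[cpu] = (cpu < 4 ? 0x0f : 0xf0) & online;

	return toy_count_sockets(core_mask, online);
}
```

In this toy model toy_stale_snapshot_sockets() yields 3 for a 2-node system, while toy_consistent_sockets() yields 2, which mirrors the extra socket described above.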
With the fix, lscpu o/p would be

Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              160
On-line CPU(s) list: 0-159
Thread(s) per core:  8
Core(s) per socket:  6
Socket(s):           2                <--------------
NUMA node(s):        2
Model:               2.2 (pvr 004e 0202)
Model name:          POWER9 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            10240K
NUMA node0 CPU(s):   0-79
NUMA node1 CPU(s):   80-159

Fixes: 4ca234a9cbd7 ("powerpc/smp: Stop updating cpu_core_mask")
Signed-off-by: Srikar Dronamraju
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20210826100401.412519-3-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin
---
 arch/powerpc/kernel/smp.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 26a028a9233a..91f274134884 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1385,6 +1385,7 @@ static void add_cpu_to_masks(int cpu)
 	 * add it to it's own thread sibling mask.
 	 */
 	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
 
 	for (i = first_thread; i < first_thread + threads_per_core; i++)
 		if (cpu_online(i))
@@ -1399,11 +1400,6 @@ static void add_cpu_to_masks(int cpu)
 	if (has_coregroup_support())
 		update_coregroup_mask(cpu, &mask);
 
-	if (chip_id == -1 || !ret) {
-		cpumask_copy(per_cpu(cpu_core_map, cpu), cpu_cpu_mask(cpu));
-		goto out;
-	}
-
 	if (shared_caches)
 		submask_fn = cpu_l2_cache_mask;
 
@@ -1413,6 +1409,10 @@ static void add_cpu_to_masks(int cpu)
 	/* Skip all CPUs already part of current CPU core mask */
 	cpumask_andnot(mask, cpu_online_mask, cpu_core_mask(cpu));
 
+	/* If chip_id is -1; limit the cpu_core_mask to within DIE*/
+	if (chip_id == -1)
+		cpumask_and(mask, mask, cpu_cpu_mask(cpu));
+
 	for_each_cpu(i, mask) {
 		if (chip_id == cpu_to_chip_id(i)) {
 			or_cpumasks_related(cpu, i, submask_fn, cpu_core_mask);
@@ -1422,7 +1422,6 @@ static void add_cpu_to_masks(int cpu)
 		}
 	}
 
-out:
 	free_cpumask_var(mask);
 }
 
-- 
2.30.2
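
The mask manipulation in the patched add_cpu_to_masks() can be approximated in userspace as follows. This is a simplified sketch under stated assumptions, not the kernel implementation: masks are plain 64-bit integers, all toy_* names are hypothetical, and the kernel's submask-based or_cpumasks_related() step is replaced by a plain pairwise link between two CPUs. What it keeps from the patch is the order of operations: add the CPU to its own core mask, take the online CPUs not yet in that mask as candidates, restrict the candidates to the node mask when chip_id is -1, then merge CPUs whose chip_id matches.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8			/* hypothetical toy size, not a kernel constant */
#define TOY_BIT(cpu) (UINT64_C(1) << (cpu))

struct toy_topology {
	int chip_id[TOY_NR_CPUS];	 /* -1 when no chip id is exposed */
	uint64_t node_mask[TOY_NR_CPUS]; /* stands in for cpu_cpu_mask(cpu) */
	uint64_t core_mask[TOY_NR_CPUS]; /* stands in for cpu_core_mask(cpu) */
	uint64_t online;		 /* stands in for cpu_online_mask */
};

/* Bring one CPU online and update the core masks, loosely following the patch. */
static void toy_add_cpu_to_core_mask(struct toy_topology *t, int cpu)
{
	uint64_t candidates;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	t->online |= TOY_BIT(cpu);

	/* A CPU is always part of its own core mask. */
	t->core_mask[cpu] |= TOY_BIT(cpu);

	/* Skip all CPUs already part of the current CPU's core mask. */
	candidates = t->online & ~t->core_mask[cpu];

	/* Without a chip id, limit the core mask to the CPU's own node. */
	if (t->chip_id[cpu] == -1)
		candidates &= t->node_mask[cpu];

	for (int i = 0; i < TOY_NR_CPUS; i++) {
		if (!(candidates & TOY_BIT(i)))
			continue;
		if (t->chip_id[i] != t->chip_id[cpu])
			continue;
		/* Simplification: link the two CPUs pairwise, in both directions. */
		t->core_mask[cpu] |= TOY_BIT(i);
		t->core_mask[i] |= TOY_BIT(cpu);
	}
}

/* Online CPUs 0..7 in order and return the resulting core mask of one CPU. */
static uint64_t toy_core_mask_after_boot(const int *chip_id,
					 const uint64_t *node_mask, int cpu)
{
	struct toy_topology t = { .online = 0 };

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		t.chip_id[i] = chip_id[i];
		t.node_mask[i] = node_mask[i];
	}
	for (int i = 0; i < TOY_NR_CPUS; i++)
		toy_add_cpu_to_core_mask(&t, i);

	return t.core_mask[cpu];
}
```

With chip_id set to -1 everywhere and two 4-CPU nodes, each CPU's core mask ends up equal to its node mask. With one 8-CPU node and two distinct chip ids, the core masks split per chip instead, which corresponds to the KVM guest case of more than one socket per node.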