From: "Song Bao Hua (Barry Song)"
To: Peter Zijlstra, Barry Song <21cnbao@gmail.com>
CC: Tom Lendacky, LKML, "linux-tip-commits@vger.kernel.org", Tim Chen, x86
Subject: RE: [tip: sched/core] sched: Add cluster scheduler level for x86
Date: Thu, 21 Oct 2021 22:23:07 +0000
Message-ID: <73dec318e90d45ae93f2931f0e25171b@hisilicon.com>

> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Friday, October 22, 2021 2:23 AM
> To: Barry Song <21cnbao@gmail.com>
> Cc: Tom Lendacky; LKML; linux-tip-commits@vger.kernel.org; Tim Chen;
> Song Bao Hua (Barry Song); x86
> Subject: Re: [tip: sched/core] sched: Add cluster scheduler level for x86
>
> On Thu, Oct 21, 2021 at 11:32:36PM +1300, Barry Song wrote:
> > On Thu, Oct 21, 2021 at 9:43 AM Peter Zijlstra wrote:
> > >
> > > On Wed, Oct 20, 2021 at 10:36:19PM +0200, Peter Zijlstra wrote:
> > >
> > > > OK, I think I see what's happening.
> > > >
> > > > AFAICT cacheinfo.c does *NOT* set l2c_id on AMD/Hygon hardware, this
> > > > means it's set to BAD_APICID.
> > > >
> > > > This then results in match_l2c() to never match. And as a direct
> > > > consequence set_cpu_sibling_map() will generate cpu_l2c_shared_mask with
> > > > just the one CPU set.
> > > >
> > > > And we have the above result and things come unstuck if we assume:
> > > > SMT <= L2 <= LLC
> > > >
> > > > Now, the big question, how to fix this... Does AMD have means of
> > > > actually setting l2c_id or should we fall back to using match_smt() for
> > > > l2c_id == BAD_APICID ?
> > >
> > > The latter looks something like the below and ought to make EPYC at
> > > least function as it did before.
> > >
> > >
> > > ---
> > >  arch/x86/kernel/smpboot.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > index 849159797101..c2671b2333d1 100644
> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -472,7 +472,7 @@ static bool match_l2c(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
> > >
> > >          /* Do not match if we do not have a valid APICID for cpu: */
> > >          if (per_cpu(cpu_l2c_id, cpu1) == BAD_APICID)
> > > -                return false;
> > > +                return match_smt(c, o); /* assume at least SMT shares L2 */
> >
> > Rather than making a fake cluster_cpus and cluster_cpus_list which
> > will expose to userspace through /sys/devices/cpus/cpux/topology,
> > could we just fix the sched_domain mask by the below?
>
> I don't think it's fake; SMT fundamentally has to share all cache
> levels. And having the sched domains differ in setup from the reported
> (nonsensical) topology also isn't appealing.

Fair enough.
I was actually inspired by cpu_coregroup_mask(), which is a combination
of a couple of cpumask sets:

drivers/base/arch_topology.c

const struct cpumask *cpu_coregroup_mask(int cpu)
{
        const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

        /* Find the smaller of NUMA, core or LLC siblings */
        if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
                /* not numa in package, lets use the package siblings */
                core_mask = &cpu_topology[cpu].core_sibling;
        }
        if (cpu_topology[cpu].llc_id != -1) {
                if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
                        core_mask = &cpu_topology[cpu].llc_sibling;
        }

        return core_mask;
}

Thanks
Barry
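
The narrowing that cpu_coregroup_mask() performs can be tried outside the kernel. What follows is only a rough userspace sketch, not kernel code: mask_t, mask_subset() and coregroup_mask() are hypothetical names, a 64-bit integer stands in for cpumask_t (one bit per CPU), and a boolean replaces the llc_id != -1 check. It starts from the NUMA node mask and switches to the core-sibling or LLC-sibling mask only when that mask is a subset of the current one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU, at most 64 CPUs. */
typedef uint64_t mask_t;

/* Userspace analogue of cpumask_subset(): true if every CPU in a is in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Mirrors the selection order of cpu_coregroup_mask(): begin with the
 * NUMA node mask, then narrow to the package (core sibling) mask and to
 * the LLC sibling mask, each only if it is a subset of the current
 * choice. llc_valid is an assumed replacement for "llc_id != -1".
 */
static mask_t coregroup_mask(mask_t numa, mask_t core_sibling,
			     mask_t llc_sibling, bool llc_valid)
{
	mask_t m = numa;

	if (mask_subset(core_sibling, m))
		m = core_sibling;	/* package fits inside the node */

	if (llc_valid && mask_subset(llc_sibling, m))
		m = llc_sibling;	/* LLC group fits inside the current mask */

	return m;
}
```

For example, with a node of CPUs 0-7 (0xFF), a package of CPUs 0-3 (0x0F) and an LLC group of CPUs 0-1 (0x03), the call returns 0x03. With the LLC marked invalid it stops at the package mask, 0x0F.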