Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer
To: Valentin Schneider, "Zengtao (B)", Morten Rasmussen
Cc: Sudeep Holla, Linuxarm, Greg Kroah-Hartman, "Rafael J. Wysocki",
    linux-kernel@vger.kernel.org
From: Dietmar Eggemann
Message-ID: <1fbe4475-363d-e800-8295-a1591d5e52d9@arm.com>
Date: Mon, 13 Jan 2020 15:49:09 +0100
In-Reply-To: <1a8f7963-97e9-62cc-12d2-39f816dfaf67@arm.com>

On 11.01.20 21:56, Valentin Schneider wrote:
> On 09/01/2020 12:58, Zengtao (B) wrote:
>>> IIUC, the problem is that virt can set up a broken topology in some
>>> cases where MPIDR doesn't line up correctly with the defined NUMA
>>> nodes.
>>>
>>> We could argue that it is a qemu/virt problem, but it would be nice
>>> if we could at least detect it. The proposed patch isn't really the
>>> right solution, as it warns on some valid topologies, as Sudeep
>>> already pointed out.
>>>
>>> It sounds more like we need a mask subset check in the sched_domain
>>> building code, if there isn't already one?
>>
>> Currently no. It's a bit complex to do the check in the sched_domain
>> building code; I need to think about it.
>> Suggestions welcome.
>>
> Doing a search on the sched_domain spans themselves should look
> something like the following, completely untested:

[...]

LGTM. This code detects the issue in cpu_coregroup_mask(), which is
the cpumask function of the MC-level struct sched_domain_topology_level
in ARM64's (and other archs') default_topology[].

I wonder how x86 copes with such a config error. Maybe they handle it
inside their cpu_coregroup_mask()?

Could we move validate_topology_spans() into the existing
for_each_cpu(i, cpu_map)/for_each_sd_topology(tl) loop in
build_sched_domains() to save some code?
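For reference, the table in question looks roughly like this in
kernel/sched/topology.c (v5.5-era; quoted from memory, so check your
tree) - the MC entry's cpumask function is cpu_coregroup_mask():

static struct sched_domain_topology_level default_topology[] = {
#ifdef CONFIG_SCHED_SMT
	/* hyperthread siblings */
	{ cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
#endif
#ifdef CONFIG_SCHED_MC
	/* cores sharing a cache / cluster */
	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
#endif
	/* all CPUs in the package */
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};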
---8<---

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index e6ff114e53f2..5f2764433a3d 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1880,37 +1880,34 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 }
 
 /* Ensure topology masks are sane; non-NUMA spans shouldn't overlap */
-static int validate_topology_spans(const struct cpumask *cpu_map)
+static int validate_topology_spans(struct sched_domain_topology_level *tl,
+				   const struct cpumask *cpu_map, int cpu)
 {
-	struct sched_domain_topology_level *tl;
-	int i, j;
+	const struct cpumask *mask = tl->mask(cpu);
+	int i;
 
-	for_each_sd_topology(tl) {
-		/* NUMA levels are allowed to overlap */
-		if (tl->flags & SDTL_OVERLAP)
-			break;
+	/* NUMA levels are allowed to overlap */
+	if (tl->flags & SDTL_OVERLAP)
+		return 0;
 
+	/*
+	 * Non-NUMA levels cannot partially overlap - they must be
+	 * either equal or wholly disjoint. Otherwise we can end up
+	 * breaking the sched_group lists - i.e. a later get_group()
+	 * pass breaks the linking done for an earlier span.
+	 */
+	for_each_cpu(i, cpu_map) {
+		if (i == cpu)
+			continue;
 		/*
-		 * Non-NUMA levels cannot partially overlap - they must be
-		 * either equal or wholly disjoint. Otherwise we can end up
-		 * breaking the sched_group lists - i.e. a later get_group()
-		 * pass breaks the linking done for an earlier span.
+		 * We should 'and' all those masks with 'cpu_map'
+		 * to exactly match the topology we're about to
+		 * build, but that can only remove CPUs, which
+		 * only lessens our ability to detect overlaps
 		 */
-		for_each_cpu(i, cpu_map) {
-			for_each_cpu(j, cpu_map) {
-				if (i == j)
-					continue;
-				/*
-				 * We should 'and' all those masks with 'cpu_map'
-				 * to exactly match the topology we're about to
-				 * build, but that can only remove CPUs, which
-				 * only lessens our ability to detect overlaps
-				 */
-				if (!cpumask_equal(tl->mask(i), tl->mask(j)) &&
-				    cpumask_intersects(tl->mask(i), tl->mask(j)))
-					return -1;
-			}
-		}
+		if (!cpumask_equal(mask, tl->mask(i)) &&
+		    cpumask_intersects(mask, tl->mask(i)))
+			return -1;
 	}
 
 	return 0;
@@ -1990,8 +1987,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct sched_domain_topology_level *tl_asym;
 	bool has_asym = false;
 
-	if (WARN_ON(cpumask_empty(cpu_map)) ||
-	    WARN_ON(validate_topology_spans(cpu_map)))
+	if (WARN_ON(cpumask_empty(cpu_map)))
 		goto error;
 
 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
@@ -2013,6 +2009,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 			has_asym = true;
 		}
 
+		if (WARN_ON(validate_topology_spans(tl, cpu_map, i)))
+			goto error;
+
 		sd = build_sched_domain(tl, cpu_map, attr, sd, dflags, i);
 
 		if (tl == sched_domain_topology)
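To make the "equal or wholly disjoint" invariant concrete, here is a
tiny standalone userspace sketch (my illustration, not kernel code;
cpumasks modeled as plain bitmasks). The values mirror the qemu/virt
case above: CPU2's MC span gets clamped to its NUMA node {2,3} while
CPU3's stays {3}, so the two spans intersect without being equal:

#include <stdio.h>

/*
 * Two spans at the same non-NUMA topology level must be either equal
 * or wholly disjoint; any other intersection is what the WARN_ON()
 * in the patch catches.
 */
static int spans_sane(unsigned long a, unsigned long b)
{
	return a == b || (a & b) == 0;
}

int main(void)
{
	unsigned long mc_cpu2 = 0xc;	/* {2,3}: clamped to the NUMA node */
	unsigned long mc_cpu3 = 0x8;	/* {3}: CPU3's core siblings */

	printf("spans sane: %d\n", spans_sane(mc_cpu2, mc_cpu3)); /* 0 */
	return 0;
}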