Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer
To: "Zengtao (B)", Morten Rasmussen
Cc: Sudeep Holla, Linuxarm, Greg Kroah-Hartman, "Rafael J. Wysocki",
 "linux-kernel@vger.kernel.org"
From: Valentin Schneider
Date: Sat, 11 Jan 2020 20:56:28 +0000
Message-ID: <1a8f7963-97e9-62cc-12d2-39f816dfaf67@arm.com>
In-Reply-To: <678F3D1BB717D949B966B68EAEB446ED340BEDD6@dggemm526-mbx.china.huawei.com>
References: <1577088979-8545-1-git-send-email-prime.zeng@hisilicon.com>
 <20191231164051.GA4864@bogus>
 <678F3D1BB717D949B966B68EAEB446ED340AE1D3@dggemm526-mbx.china.huawei.com>
 <20200102112955.GC4864@bogus>
 <678F3D1BB717D949B966B68EAEB446ED340AEB67@dggemm526-mbx.china.huawei.com>
 <678F3D1BB717D949B966B68EAEB446ED340AFCA0@dggemm526-mbx.china.huawei.com>
 <20200103114011.GB19390@bogus>
 <678F3D1BB717D949B966B68EAEB446ED340B31E9@dggemm526-mbx.china.huawei.com>
 <20200109104306.GA10914@e105550-lin.cambridge.arm.com>
 <678F3D1BB717D949B966B68EAEB446ED340BEDD6@dggemm526-mbx.china.huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/01/2020 12:58, Zengtao (B) wrote:
>> IIUC, the problem is that virt can set up a broken topology in some
>> cases where MPIDR doesn't line up correctly with the defined NUMA
>> nodes.
>>
>> We could argue that it is a qemu/virt problem, but it would be nice if
>> we could at least detect it. The proposed patch isn't really the right
>> solution as it warns on some valid topologies as Sudeep already pointed
>> out.
>>
>> It sounds more like we need a mask subset check in the sched_domain
>> building code, if there isn't already one?
>
> Currently no, it's a bit complex to do the check in the sched_domain building code,
> I need to take a think of that.
> Suggestion welcomed.
>

Doing a search on the sched_domain spans themselves should look something
like this (completely untested):

---8<---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6ec1e595b1d4..96128d12ec23 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1879,6 +1879,43 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 	return sd;
 }
 
+/* Ensure topology masks are sane; non-NUMA spans shouldn't overlap */
+static int validate_topology_spans(const struct cpumask *cpu_map)
+{
+	struct sched_domain_topology_level *tl;
+	int i, j;
+
+	for_each_sd_topology(tl) {
+		/* NUMA levels are allowed to overlap */
+		if (tl->flags & SDTL_OVERLAP)
+			break;
+
+		/*
+		 * Non-NUMA levels cannot partially overlap - they must be
+		 * either equal or wholly disjoint. Otherwise we can end up
+		 * breaking the sched_group lists - i.e. a later get_group()
+		 * pass breaks the linking done for an earlier span.
+		 */
+		for_each_cpu(i, cpu_map) {
+			for_each_cpu(j, cpu_map) {
+				if (i == j)
+					continue;
+				/*
+				 * We should 'and' all those masks with 'cpu_map'
+				 * to exactly match the topology we're about to
+				 * build, but that can only remove CPUs, which
+				 * only lessens our ability to detect overlaps
+				 */
+				if (!cpumask_equal(tl->mask(i), tl->mask(j)) &&
+				    cpumask_intersects(tl->mask(i), tl->mask(j)))
+					return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Find the sched_domain_topology_level where all CPU capacities are visible
  * for all CPUs.
@@ -1953,7 +1990,8 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct sched_domain_topology_level *tl_asym;
 	bool has_asym = false;
 
-	if (WARN_ON(cpumask_empty(cpu_map)))
+	if (WARN_ON(cpumask_empty(cpu_map)) ||
+	    WARN_ON(validate_topology_spans(cpu_map)))
 		goto error;
 
 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
--->8---

Alternatively, the assertion on the sched_group linking I suggested earlier
in the thread should suffice, since it should trigger whenever we have
overlapping non-NUMA sched domains.

Since you have a setup where you can reproduce the issue, could you please
give either (ideally both!) a try?

Thanks.

> Thanks
> Zengtao
>
>>
>> Morten
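
For anyone who wants to poke at the "equal or wholly disjoint" rule outside
the kernel, here is a minimal userspace sketch of the same pairwise check.
It is only a model: plain 64-bit bitmasks stand in for struct cpumask, and
TOY_NR_CPUS, struct toy_level, toy_validate_spans(), toy_good and toy_bad
are all made-up names for illustration, not kernel API. Unlike the patch it
only visits each pair once (j > i), which is enough because the condition
is symmetric.

```c
#include <stdint.h>

#define TOY_NR_CPUS 8	/* hypothetical: small enough for one 64-bit mask */

/* Hypothetical stand-in for one topology level: one span bitmask per CPU. */
struct toy_level {
	uint64_t span[TOY_NR_CPUS];
};

/*
 * Return 0 if, for every pair of CPUs in cpu_map, the two spans are either
 * equal or disjoint; return -1 on the first partial overlap. As in the
 * patch, the spans themselves are not masked with cpu_map.
 */
static int toy_validate_spans(const struct toy_level *tl, uint64_t cpu_map)
{
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		if (!(cpu_map & (UINT64_C(1) << i)))
			continue;

		for (int j = i + 1; j < TOY_NR_CPUS; j++) {
			if (!(cpu_map & (UINT64_C(1) << j)))
				continue;

			/* Different spans that still share a CPU: broken. */
			if (tl->span[i] != tl->span[j] &&
			    (tl->span[i] & tl->span[j]))
				return -1;
		}
	}

	return 0;
}

/* CPUs 0-3 and CPUs 4-7 form two disjoint groups: a sane level. */
static const struct toy_level toy_good = {
	.span = { 0x0f, 0x0f, 0x0f, 0x0f, 0xf0, 0xf0, 0xf0, 0xf0 },
};

/* CPU 3 reports span 0x38 (CPUs 3-5), straddling both groups. */
static const struct toy_level toy_bad = {
	.span = { 0x0f, 0x0f, 0x0f, 0x38, 0xf0, 0xf0, 0xf0, 0xf0 },
};
```

With all eight CPUs in the map (0xff), toy_good passes and toy_bad is
rejected; dropping CPU 3 from the map (0xf7) makes toy_bad pass again,
since the only offending span is never visited.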