Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer
From: Dietmar Eggemann
To: Valentin Schneider, "Zengtao (B)", Sudeep Holla
Cc: Linuxarm, Greg Kroah-Hartman, "Rafael J. Wysocki", "linux-kernel@vger.kernel.org", Morten Rasmussen
Date: Fri, 3 Jan 2020 18:20:35 +0100

On 03/01/2020 13:14, Valentin Schneider wrote:
> On 03/01/2020 10:57, Valentin Schneider wrote:
>> I'm juggling with other things atm, but let me have a think and see if we
>> couldn't detect that in the scheduler itself. If this is a common problem,
>> we should detect it in the scheduler rather than in the arch code.
>
> Something like this ought to catch your case; might need to compare group
> spans rather than pure group pointers.
>
> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6ec1e595b1d4..c4151e11afcd 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1120,6 +1120,13 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>  
>  		sg = get_group(i, sdd);
>  
> +		/* sg's are inited as self-looping. If 'last' is not self
> +		 * looping, we set it in a previous visit. No further visit
> +		 * should change the link order, if we do then the topology
> +		 * description is terminally broken.
> +		 */
> +		BUG_ON(last && last->next != last && last->next != sg);
> +
>  		cpumask_or(covered, covered, sched_group_span(sg));
>  
>  		if (!first)
>

Still don't see the actual problem case. The closest I came is:

qemu-system-aarch64 -kernel ... -append ' ... loglevel=8 sched_debug'
  -smp cores=4,sockets=2 ... -numa node,cpus=0-2,nodeid=0
  -numa node,cpus=3-7,nodeid=1

but this behaves sanely. Since DIE and NUMA have the same span, the
former degenerates.

[    0.654451] CPU0 attaching sched-domain(s):
[    0.654483]  domain-0: span=0-2 level=MC
[    0.654635]   groups: 0:{ span=0 cap=1008 }, 1:{ span=1 cap=1015 }, 2:{ span=2 cap=1014 }
[    0.654787]   domain-1: span=0-7 level=NUMA
[    0.654805]    groups: 0:{ span=0-2 cap=3037 }, 3:{ span=3-7 cap=5048 }
[    0.655326] CPU1 attaching sched-domain(s):
[    0.655339]  domain-0: span=0-2 level=MC
[    0.655356]   groups: 1:{ span=1 cap=1015 }, 2:{ span=2 cap=1014 }, 0:{ span=0 cap=1008 }
[    0.655391]   domain-1: span=0-7 level=NUMA
[    0.655407]    groups: 0:{ span=0-2 cap=3037 }, 3:{ span=3-7 cap=5048 }
[    0.655480] CPU2 attaching sched-domain(s):
[    0.655492]  domain-0: span=0-2 level=MC
[    0.655507]   groups: 2:{ span=2 cap=1014 }, 0:{ span=0 cap=1008 }, 1:{ span=1 cap=1015 }
[    0.655541]   domain-1: span=0-7 level=NUMA
[    0.655556]    groups: 0:{ span=0-2 cap=3037 }, 3:{ span=3-7 cap=5048 }
[    0.655603] CPU3 attaching sched-domain(s):
[    0.655614]  domain-0: span=3-7 level=MC
[    0.655628]   groups: 3:{ span=3 cap=984 }, 4:{ span=4 cap=1015 }, 5:{ span=5 cap=1016 }, 6:{ span=6 cap=1016 }, 7:{ span=7 cap=1017 }
[    0.655693]   domain-1: span=0-7 level=NUMA
[    0.655721]    groups: 3:{ span=3-7 cap=5048 }, 0:{ span=0-2 cap=3037 }
[    0.655769] CPU4 attaching sched-domain(s):
[    0.655780]  domain-0: span=3-7 level=MC
[    0.655795]   groups: 4:{ span=4 cap=1015 }, 5:{ span=5 cap=1016 }, 6:{ span=6 cap=1016 }, 7:{ span=7 cap=1017 }, 3:{ span=3 cap=984 }
[    0.655841]   domain-1: span=0-7 level=NUMA
[    0.655855]    groups: 3:{ span=3-7 cap=5048 }, 0:{ span=0-2 cap=3037 }
[    0.655902] CPU5 attaching sched-domain(s):
[    0.655916]  domain-0: span=3-7 level=MC
[    0.655930]   groups: 5:{ span=5 cap=1016 }, 6:{ span=6 cap=1016 }, 7:{ span=7 cap=1017 }, 3:{ span=3 cap=984 }, 4:{ span=4 cap=1015 }
[    0.656545]   domain-1: span=0-7 level=NUMA
[    0.656562]    groups: 3:{ span=3-7 cap=5048 }, 0:{ span=0-2 cap=3037 }
[    0.656775] CPU6 attaching sched-domain(s):
[    0.656796]  domain-0: span=3-7 level=MC
[    0.656835]   groups: 6:{ span=6 cap=1016 }, 7:{ span=7 cap=1017 }, 3:{ span=3 cap=984 }, 4:{ span=4 cap=1015 }, 5:{ span=5 cap=1016 }
[    0.656881]   domain-1: span=0-7 level=NUMA
[    0.656911]    groups: 3:{ span=3-7 cap=5048 }, 0:{ span=0-2 cap=3037 }
[    0.657102] CPU7 attaching sched-domain(s):
[    0.657113]  domain-0: span=3-7 level=MC
[    0.657128]   groups: 7:{ span=7 cap=1017 }, 3:{ span=3 cap=984 }, 4:{ span=4 cap=1015 }, 5:{ span=5 cap=1016 }, 6:{ span=6 cap=1016 }
[    0.657172]   domain-1: span=0-7 level=NUMA
[    0.657186]    groups: 3:{ span=3-7 cap=5048 }, 0:{ span=0-2 cap=3037 }
[    0.657241] root domain span: 0-7 (max cpu_capacity = 1024)
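
For reference, the invariant behind the BUG_ON() in the quoted diff can be
modelled in userspace. This is only a sketch under stated assumptions:
struct group, init_group(), link_groups() and the -1 return value are made
up for illustration and are not kernel API, and it takes the diff's comment
at face value that every group starts out pointing at itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_group: only the ring link. */
struct group {
	int id;
	struct group *next;
};

/* Groups start out self-looping, as the comment in the diff assumes. */
static void init_group(struct group *g, int id)
{
	g->id = id;
	g->next = g;
}

/*
 * Link the n groups in 'order' into a ring, the way one visit of
 * build_sched_groups() chains 'last' to 'sg'.
 *
 * Returns 0 on success, or -1 where the quoted diff would BUG_ON():
 * 'last' was already linked by an earlier visit and this visit
 * wants to point it at a different group.
 */
static int link_groups(struct group **order, size_t n)
{
	struct group *first = NULL, *last = NULL;
	size_t i;

	assert(n > 0);

	for (i = 0; i < n; i++) {
		struct group *sg = order[i];

		/* Same condition as the BUG_ON() in the diff. */
		if (last && last->next != last && last->next != sg)
			return -1;

		if (!first)
			first = sg;
		if (last)
			last->next = sg;
		last = sg;
	}

	/* Close the ring; this final link is not checked, as in the diff. */
	last->next = first;
	return 0;
}
```

In this model, linking the MC groups in the order seen from CPU0 (0, 1, 2)
and then from CPU1 (1, 2, 0) succeeds both times because both visits agree
on the ring, whereas a later visit in the order 0, 2, 1 is rejected. Like
the diff, it compares group pointers only, not group spans.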