From: Valentin Schneider
To: Barry Song
Cc: catalin.marinas@arm.com, will@kernel.org, rjw@rjwysocki.net,
    lenb@kernel.org, gregkh@linuxfoundation.org, Jonathan.Cameron@huawei.com,
    mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, mgorman@suse.de, mark.rutland@arm.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-acpi@vger.kernel.org, linuxarm@huawei.com, xuwei5@huawei.com,
    prime.zeng@hisilicon.com
Subject: Re: [RFC PATCH v2 2/2] scheduler: add scheduler level for clusters
References: <20201201025944.18260-1-song.bao.hua@hisilicon.com>
    <20201201025944.18260-3-song.bao.hua@hisilicon.com>
User-agent: mu4e 0.9.17; emacs 26.3
In-reply-to: <20201201025944.18260-3-song.bao.hua@hisilicon.com>
Date: Tue, 01 Dec 2020 16:04:04 +0000

On 01/12/20 02:59, Barry Song wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1a68a05..ae8ec910 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6106,6 +6106,37 @@ static inline int select_idle_smt(struct task_struct *p, int target)
>
>  #endif /* CONFIG_SCHED_SMT */
>
> +#ifdef CONFIG_SCHED_CLUSTER
> +/*
> + * Scan the local CLUSTER mask for idle CPUs.
> + */
> +static int select_idle_cluster(struct task_struct *p, int target)
> +{
> +	int cpu;
> +
> +	/* right now, no hardware with both cluster and smt to run */
> +	if (sched_smt_active())
> +		return -1;
> +
> +	for_each_cpu_wrap(cpu, cpu_cluster_mask(target), target) {

Gating this behind a new config option that only arm64 selects doesn't make
it very generic. Note that powerpc also has a newish "CACHE" level which
seems to overlap in function with your "CLUSTER" one (both are arch
specific, though).

I think what you are after here is an SD_SHARE_PKG_RESOURCES domain walk,
i.e. scanning CPUs by increasing cache "distance". We already have that in
some form, as we scan the SMT & LLC domains; AFAICT LLC always maps to MC,
except for said powerpc CACHE level.

*If* we are to generally support more levels with SD_SHARE_PKG_RESOURCES,
we could frob something into select_idle_cpu().
I'm thinking of something like the incomplete, untested below:

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ae7ceba8fd4f..70692888db00 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6120,7 +6120,7 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	struct sched_domain *this_sd;
+	struct sched_domain *this_sd, *child = NULL;
 	u64 avg_cost, avg_idle;
 	u64 time;
 	int this = smp_processor_id();
@@ -6150,14 +6150,22 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	time = cpu_clock(this);
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	do {
+		/* XXX: sd should start as SMT's parent */
+		cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		if (child)
+			cpumask_andnot(cpus, cpus, sched_domain_span(child));
+
+		for_each_cpu_wrap(cpu, cpus, target) {
+			if (!--nr)
+				return -1;
+			if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+				break;
+		}
-	for_each_cpu_wrap(cpu, cpus, target) {
-		if (!--nr)
-			return -1;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			break;
-	}
+		child = sd;
+		sd = sd->parent;
+	} while (sd && sd->flags & SD_SHARE_PKG_RESOURCES);
 
 	time = cpu_clock(this) - time;
 	update_avg(&this_sd->avg_scan_cost, time);