Subject: Re: [RFC PATCH v1 1/1] sched/fair: select idle cpu from idle cpumask in sched domain
To: Vincent Guittot, Jiang Biao
Cc: Aubrey Li, Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
    Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tim Chen,
    linux-kernel
References:
 <20200910054203.525420-1-aubrey.li@intel.com>
 <20200910054203.525420-2-aubrey.li@intel.com>
From: "Li, Aubrey"
Message-ID: <3c9c8db6-86c0-3f62-4a8e-a5df4cb03715@linux.intel.com>
Date: Tue, 15 Sep 2020 21:38:04 +0800

On 2020/9/15 17:23, Vincent Guittot wrote:
> On Tue, 15 Sep 2020 at 10:47, Jiang Biao wrote:
>>
>> Hi, Vincent
>>
>> On Mon, 14 Sep 2020 at 20:26, Vincent Guittot wrote:
>>>
>>> On Sun, 13 Sep 2020 at 05:59, Jiang Biao wrote:
>>>>
>>>> Hi, Aubrey
>>>>
>>>> On Fri, 11 Sep 2020 at 23:48, Aubrey Li wrote:
>>>>>
>>>>> Added an idle cpumask to track idle cpus in a sched domain. When a CPU
>>>>> enters idle, its corresponding bit in the idle cpumask will be set,
>>>>> and when the CPU exits idle, its bit will be cleared.
>>>>>
>>>>> When a task wakes up and selects an idle cpu, scanning the idle cpumask
>>>>> has a lower cost than scanning all the cpus in the last level cache
>>>>> domain, especially when the system is heavily loaded.
>>>>>
>>>>> Signed-off-by: Aubrey Li
>>>>> ---
>>>>>  include/linux/sched/topology.h | 13 +++++++++++++
>>>>>  kernel/sched/fair.c            |  4 +++-
>>>>>  kernel/sched/topology.c        |  2 +-
>>>>>  3 files changed, 17 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
>>>>> index fb11091129b3..43a641d26154 100644
>>>>> --- a/include/linux/sched/topology.h
>>>>> +++ b/include/linux/sched/topology.h
>>>>> @@ -65,8 +65,21 @@ struct sched_domain_shared {
>>>>>         atomic_t        ref;
>>>>>         atomic_t        nr_busy_cpus;
>>>>>         int             has_idle_cores;
>>>>> +       /*
>>>>> +        * Span of all idle CPUs in this domain.
>>>>> +        *
>>>>> +        * NOTE: this field is variable length. (Allocated dynamically
>>>>> +        * by attaching extra space to the end of the structure,
>>>>> +        * depending on how many CPUs the kernel has booted up with)
>>>>> +        */
>>>>> +       unsigned long   idle_cpus_span[];
>>>>>  };
>>>>>
>>>>> +static inline struct cpumask *sds_idle_cpus(struct sched_domain_shared *sds)
>>>>> +{
>>>>> +       return to_cpumask(sds->idle_cpus_span);
>>>>> +}
>>>>> +
>>>>>  struct sched_domain {
>>>>>         /* These fields must be setup */
>>>>>         struct sched_domain __rcu *parent;      /* top domain must be null terminated */
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 6b3b59cc51d6..3b6f8a3589be 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -6136,7 +6136,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>>>>
>>>>>         time = cpu_clock(this);
>>>>>
>>>>> -       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>>>>> +       cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);
>>>> Is the sds_idle_cpus() always empty if nohz=off?
>>>
>>> Good point
>>>
>>>> Do we need to initialize the idle_cpus_span with sched_domain_span(sd)?
>>>>
>>>>>
>>>>>         for_each_cpu_wrap(cpu, cpus, target) {
>>>>>                 if (!--nr)
>>>>> @@ -10182,6 +10182,7 @@ static void set_cpu_sd_state_busy(int cpu)
>>>>>                 sd->nohz_idle = 0;
>>>>>
>>>>>         atomic_inc(&sd->shared->nr_busy_cpus);
>>>>> +       cpumask_clear_cpu(cpu, sds_idle_cpus(sd->shared));
>>>>>  unlock:
>>>>>         rcu_read_unlock();
>>>>>  }
>>>>> @@ -10212,6 +10213,7 @@ static void set_cpu_sd_state_idle(int cpu)
>>>>>                 sd->nohz_idle = 1;
>>>>>
>>>>>         atomic_dec(&sd->shared->nr_busy_cpus);
>>>>> +       cpumask_set_cpu(cpu, sds_idle_cpus(sd->shared));
>>>> This only works when entering/exiting tickless mode? :)
>>>> Why not update idle_cpus_span during tick_nohz_idle_enter()/exit()?
>>>
>>> set_cpu_sd_state_busy is only called during a tick in order to limit
>>> the rate of the update to once per tick per cpu at most, and to prevent
>>> any kind of storm of updates if short-running tasks wake/sleep all the
>>> time. We don't want to update a cpumask at each and every enter/leave
>>> idle.
>>>
>> Agree. But set_cpu_sd_state_busy seems not to be reached when
>> nohz=off, which means it will not work for that case? :)
>
> Yes, set_cpu_sd_state_idle/busy are nohz functions.

Thanks, Biao, for pointing this out. If the shared idle cpumask is
initialized with sched_domain_span(sd), then the nohz=off case will retain
the previous behavior.

Thanks,
-Aubrey
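[Editor's note] The fix discussed above can be sketched in userspace. This is not the kernel patch itself: `domain_mask`, `idle_mask`, `cpu_set_busy`, `cpu_set_idle`, and `pick_idle_cpu` are invented stand-ins for `sched_domain_span()`, `sds_idle_cpus()`, `set_cpu_sd_state_busy()`, `set_cpu_sd_state_idle()`, and `select_idle_cpu()`, using a plain `uint64_t` in place of a `struct cpumask`. The point it models is the thread's conclusion: if the idle mask starts as the full domain span, a system where the nohz set/clear hooks never run (nohz=off) degenerates to scanning the whole domain, i.e. the pre-patch behavior.

```c
#include <stdint.h>

#define NR_CPUS 8

static uint64_t domain_mask;  /* all CPUs in the LLC domain */
static uint64_t idle_mask;    /* CPUs currently believed idle */

static void domain_init(void)
{
	domain_mask = (1ULL << NR_CPUS) - 1;
	/* Proposed fix: start from the full span, not from empty, so a
	 * kernel that never calls the set/clear hooks still scans
	 * every CPU in the domain. */
	idle_mask = domain_mask;
}

/* Analogue of set_cpu_sd_state_busy(): clear the CPU's idle bit. */
static void cpu_set_busy(int cpu)
{
	idle_mask &= ~(1ULL << cpu);
}

/* Analogue of set_cpu_sd_state_idle(): set the CPU's idle bit. */
static void cpu_set_idle(int cpu)
{
	idle_mask |= 1ULL << cpu;
}

/*
 * Analogue of the modified select_idle_cpu(): intersect the idle mask
 * with the task's affinity mask and return the first candidate, or -1
 * if none. Restricting the scan to idle candidates is what makes the
 * wakeup path cheaper than walking the whole domain under load.
 */
static int pick_idle_cpu(uint64_t affinity)
{
	uint64_t cand = idle_mask & domain_mask & affinity;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cand & (1ULL << cpu))
			return cpu;
	return -1;
}
```

With an empty initial mask, `pick_idle_cpu()` would return -1 for every wakeup under nohz=off; initializing it to the span makes that configuration behave as before the patch.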