Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp1684361ybz; Thu, 16 Apr 2020 13:38:44 -0700 (PDT) X-Google-Smtp-Source: APiQypKYeb8Q75ucwEej4mLCYGRtvDEQg1id88/FDzreklP7/2oKeX99/YSrCrkxXOaNvpXr/e/q X-Received: by 2002:aa7:d788:: with SMTP id s8mr14140edq.311.1587069524495; Thu, 16 Apr 2020 13:38:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587069524; cv=none; d=google.com; s=arc-20160816; b=skXQQ6J2nzBdZ0uNVVqc9njh+hgbDDzenWkb5LnXytPSy6IAbxsjYeb11lbCrEXZiH gXYBYLiIdiCVggsxLi7jXrjczef58UU5I36bjF2yP03qpZHTy13gfvBl6W2zcZeSgyTJ r6eoyIis7xHJE85PEshEqZdwIbvWynB1MD8DpLd3izxzs2tihEx8mN/dTFRALB9JwwPN XfayAzuxMTP6cVBFwsqDrcCeRFeyt8QODhhnoUUWzm2P/AGrwzLSZ5khsUSjuQ6nn7Gw kCJd6l2xfZqIeXq0B7HhrNYnvT75rdxBZzVgMn9iiFiZtpICPJRYaOUfym7Fwcwdl3M5 DuhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=RKkzesvk5cYfKkW+cuVfyJJ4B3TfqgCX4qySMMGKGlg=; b=RqQGwwIQ92J07vsszxo4iJcWZ1QUS/ZLMDGnYJ2t2+uY822In9646tQQ7W0MX6ltB5 qmLgP0l9f+R8qMqbgfOjx4fX2Gm3AOWs/LGXVJM9rDNSdRNod/mektTIbPlft3soVpuF 96LzyWJysUfPsnS2n/3qvAcjx9lvLFUawjJgShwWwb5KcV9TzhWTFyLhQsYHGyIbSlZz 8KVW/jyCNo1jbMadnXOugC5FNQ5/fcMhdakNOJFB5TtXvznUk3LWteoqG3AX0Dhsy9Uc qHUcywi0A9rX8VloD1bAYnnlgMaVVr43dOLMvYmk+wSiKMS5BdqBKdDMox+9e1Cpsumv clSg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=MGeTnGvI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b1si13596080edh.318.2020.04.16.13.38.21; Thu, 16 Apr 2020 13:38:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=MGeTnGvI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2895967AbgDPP7R (ORCPT + 99 others); Thu, 16 Apr 2020 11:59:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S2896089AbgDPP7A (ORCPT ); Thu, 16 Apr 2020 11:59:00 -0400 Received: from mail-oi1-x241.google.com (mail-oi1-x241.google.com [IPv6:2607:f8b0:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23B5FC061A0C for ; Thu, 16 Apr 2020 08:59:00 -0700 (PDT) Received: by mail-oi1-x241.google.com with SMTP id t199so13578171oif.7 for ; Thu, 16 Apr 2020 08:59:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=RKkzesvk5cYfKkW+cuVfyJJ4B3TfqgCX4qySMMGKGlg=; b=MGeTnGvI/2k6bxRBviIyd1LSkMZNOawQBC63cTOw3eNT0E2cqiGkqP5WM1LyY9cMCd O8yJOXkdTY9vlwgcQZEdCBZkjsiMs3JQEKv/w07/loPznlTIv0qQfB+83EaslJL1jHkk QH0H/Oka1VIrsQPXERMd/sv/kAn1gX1q4CJAHf8eRMVOIBks6PtE+Pks7rIMGXpx+kdr gm+GETzmhGS5ichB041ZwaGZFAxNB4gCONZf2UN1wWrfDm+u6Jo6rrQ3HgGJaQNta4yz Oyg+mE8xJ43OI3nc5YKILs3NLr2uH1ld2+XpCXKn0sHlE3V5pdNa7vaex7HxtTiNbdU5 YI8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=RKkzesvk5cYfKkW+cuVfyJJ4B3TfqgCX4qySMMGKGlg=; b=KjcAGPdSFXh9/fKvaNf/GSsgsVjpp9tOTEJDXTRhFn6R1WcEYEqBW+ulTDbVAYbTve +9AYP1o4/MChVukN36+iNErV6EcTC/Uc71r3S9qvVgj3oVgAzaeu+9lOvfbk3bLlUzD5 6wGT8TmBEZTNVM+YTLvIB7qC615fHh7VqoKIV7Ar700VC6JyUtrdvMj6Ytgh1IxTAbaG 52Lq/fdMxwcoe2wkuGaTpwTb63OzbViwAMaVUg7a3FpzMzXhV/pGzO3H0vv6rHOslwhA LjUW8NGB11Z6Fe4nS5DbPjSXKtaf5UKRx/IFqh+YIP9kZJY7XLBWukZ60QitCIggz1M8 KTug== X-Gm-Message-State: AGi0PuaV1QQ8u/s11wneXWJLHhhjCSJ7qFjWu2t1Ud7jAU6DDRPBgyQS 7Mv7HpahKh1xKLIBSIUWEAhn3YD2AXUI+6mGaNmaLQ== X-Received: by 2002:aca:56c2:: with SMTP id k185mr3169314oib.141.1587052739351; Thu, 16 Apr 2020 08:58:59 -0700 (PDT) MIME-Version: 1.0 References: <20200415210512.805-1-valentin.schneider@arm.com> <20200415210512.805-10-valentin.schneider@arm.com> In-Reply-To: From: Vincent Guittot Date: Thu, 16 Apr 2020 17:58:47 +0200 Message-ID: Subject: Re: [PATCH v3 9/9] sched/topology: Define and use shortcut pointers for wakeup sd_flag scan To: Valentin Schneider Cc: Dietmar Eggemann , linux-kernel , Ingo Molnar , Peter Zijlstra Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 16 Apr 2020 at 17:27, Valentin Schneider wrote: > > > On 16/04/20 14:36, Vincent Guittot wrote: > >> Coming back to the v2 discussion on this patch > >> > >> https://lore.kernel.org/r/20200311181601.18314-10-valentin.schneider@arm.com > >> > >> SD_BALANCE_WAKE is not used in mainline anymore, so wakeups are always > >> fast today. > >> > >> I.e. you wouldn't need a per_cpu(sd_balance_wake, cpu) since it's always > >> NULL. > >> > >> I.e. want_affine logic and the 'for_each_domain(cpu, tmp)' isn't needed > >> anymore. > >> > >> This will dramatically simplify the code in select_task_rq_fair(). > >> > >> But I guess Vincent wants to keep the functionality so we're able to > >> enable SD_BALANCE_WAKE on certain sd's? > > > > I looked too quickly what was done by this patch. I thought that it > > was adding a per_cpu pointer for all cases including the fast path > > with wake affine but it only skips the for_each_domain loop for the > > slow paths which don't need it because they are already slow. > > > > It would be better to keep the for_each_domain loop for slow paths and > > to use a per_cpu pointer for fast_path/wake affine. Regarding the > > wake_affine path, we don't really care about looping all domains and > > we could directly use the highest domain because wake_affine() that is > > used in the loop, only uses the imbalance_pct of the sched domain for > > wake_affine_weight() and it should not harm to use only the highest > > domain and then select_idle_sibling doesn't use it but the llc or > > asym_capacity pointer instead. > > So Dietmar's pointing out that sd will always be NULL for want_affine, > because want_affine can only be true at wakeups and we don't have any > topologies with SD_BALANCE_WAKE anymore. We would still want to call > wake_affine() though, because that can change new_cpu. > > What you are adding on top is that the only sd field used in wake_affine() > is the imbalance_pct, so we could take a shortcut and just go for the > highest domain with SD_WAKE_AFFINE; i.e. something like this: > > --- > if (want_affine) { > // We can cache that at topology buildup > sd = highest_flag_domain(cpu, SD_WAKE_AFFINE); Yes and this one should be cached at topology buildup > > if (cpumask_test_cpu(prev_cpu, sched_domain_span(sd) && > cpu != prev_cpu) > new_cpu = wake_affine(); > > // Directly go to select_idle_sibling() > goto sis; > } > > // !want_affine logic here > --- > > As for the !want_affine part, we could either keep the current domain walk > (IIUC this is what you are suggesting) or go for the extra cached pointers > I'm introducing. > > Now if we are a bit bolder than that, because there are no more > (mainline) topologies w/ SD_BALANCE_WAKE, we could even turn the above > into: > > --- > if (wake_flags & WF_TTWU) { > if (want_affine) { > // We can cache that at topology buildup > sd = highest_flag_domain(cpu, SD_WAKE_AFFINE); > > if (cpumask_test_cpu(prev_cpu, sched_domain_span(sd) && > cpu != prev_cpu) > new_cpu = wake_affine(); > > } > // Directly go to select_idle_sibling() > goto sis; > } > > // !want_affine logic here > --- > > This in turns mean we could get rid of SD_BALANCE_WAKE entirely... I'm a > bit more reluctant to that only because the last SD_BALANCE_WAKE setter was For now, we should probably skip the additional test above: "if (wake_flags & WF_TTWU) {" and keep SD_BALANCE_WAKE so we will continue to loop in case of !want_affine. We can imagine that we might want at the end to be a bit more smart for SD_BALANCE_WAKE and the slow path... like with the latency nice proposal and latency-nice=19 as a example > removed fairly recently, see > a526d466798d ("sched/topology: Remove SD_BALANCE_WAKE on asymmetric capacity systems")