From: Vincent Guittot
Date: Wed, 10 Mar 2021 16:37:15 +0100
Subject: Re: [PATCH] sched/fair: Prefer idle CPU to cache affinity
To: Srikar Dronamraju
Cc: Ingo Molnar, Peter Zijlstra, LKML, Mel Gorman, Rik van Riel,
    Thomas Gleixner, Valentin Schneider, Dietmar Eggemann, Michael Ellerman,
    Michael Neuling, Gautham R Shenoy, Parth Shah
In-Reply-To: <20210310055241.GO2028034@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 10 Mar 2021 at 06:53, Srikar Dronamraju wrote:
>
> * Vincent Guittot [2021-03-08 14:52:39]:
>
> > On Fri, 26 Feb 2021 at 17:41, Srikar Dronamraju wrote:
> >
>
> Thanks Vincent for your review comments.
>
> > > +static int prefer_idler_llc(int this_cpu, int prev_cpu, int sync)
> > > +{
> > > +        struct sched_domain_shared *tsds, *psds;
> > > +        int pnr_busy, pllc_size, tnr_busy, tllc_size, diff;
> > > +
> > > +        tsds = rcu_dereference(per_cpu(sd_llc_shared, this_cpu));
> > > +        tnr_busy = atomic_read(&tsds->nr_busy_cpus);
> > > +        tllc_size = per_cpu(sd_llc_size, this_cpu);
> > > +
> > > +        psds = rcu_dereference(per_cpu(sd_llc_shared, prev_cpu));
> > > +        pnr_busy = atomic_read(&psds->nr_busy_cpus);
> > > +        pllc_size = per_cpu(sd_llc_size, prev_cpu);
> > > +
> > > +        /* No need to compare, if both LLCs are fully loaded */
> > > +        if (pnr_busy == pllc_size && tnr_busy == pllc_size)
> > > +                return nr_cpumask_bits;
> > > +
> > > +        if (sched_feat(WA_WAKER) && tnr_busy < tllc_size)
> > > +                return this_cpu;
> >
> > Why have you chosen to favor this_cpu instead of prev_cpu, unlike for wake_idle?
>
> At this point, we know the waker running on this_cpu and the wakee, which was
> running on prev_cpu, are affine to each other, and that this_cpu and prev_cpu
> don't share cache. I chose to move them close to each other to benefit from
> the cache sharing. Based on feedback from Peter and Rik, I made the check more
> conservative, i.e. tnr_busy <= tllc_size/smt_weight (where smt_weight is the
> cpumask weight of the smt domain for this_cpu), i.e. if we have a free core in
> this llc domain, choose this_cpu.

yeah, makes sense

> select_idle_sibling() should pick an idle cpu/core/smt within the llc domain
> for this_cpu.
>
> Do you feel this may not be the correct option?

I was worried that we would end up pulling tasks into the same llc, but the
condition above and wake_wide should prevent such behavior.
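As an aside on the "free core" check discussed above: a minimal, standalone
sketch of that condition in plain C (the function name, the standalone framing
and the example values are illustrative assumptions, not code from the patch):

#include <stdbool.h>
#include <stdio.h>

/*
 * The conservative check treats the waker's LLC as still having room when
 * the number of busy CPUs does not exceed the number of cores, i.e.
 * llc_size / smt_weight.
 */
static bool llc_has_free_core(int nr_busy, int llc_size, int smt_weight)
{
        return nr_busy <= llc_size / smt_weight;
}

int main(void)
{
        /* 16 CPUs with SMT2 => 8 cores */
        printf("%d\n", llc_has_free_core(7, 16, 2));    /* 1: choose this_cpu */
        printf("%d\n", llc_has_free_core(9, 16, 2));    /* 0: fall through    */
        return 0;
}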
>
> We are also experimenting with another option, where we call prefer_idler_cpu
> after wa_weight, i.e.:
> 1. if wake_affine_weight chooses this_cpu, but the llc of prev_cpu has an idle
>    smt/CPU while there is no idle smt/CPU in this_cpu's llc, then choose the
>    idle smt/CPU in prev_cpu
> 2. if wake_affine_weight chooses nr_cpumask (aka prev_cpu), but the llc of
>    this_cpu has an idle smt/CPU while there is no idle smt/CPU in prev_cpu's
>    llc, then choose the idle smt/CPU in this_cpu
>
> >
> > > +
> > > +        /* For better wakeup latency, prefer idler LLC to cache affinity */
> > > +        diff = tnr_busy * pllc_size - sync - pnr_busy * tllc_size;
> > > +        if (!diff)
> > > +                return nr_cpumask_bits;
> > > +        if (diff < 0)
> > > +                return this_cpu;
> > > +
> > > +        return prev_cpu;
> > > +}
> > > +
> > >  static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> > >                         int this_cpu, int prev_cpu, int sync)
> > >  {
> > > @@ -5877,6 +5907,10 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> > >          if (sched_feat(WA_IDLE))
> > >                  target = wake_affine_idle(this_cpu, prev_cpu, sync);
> > >
> > > +        if (sched_feat(WA_IDLER_LLC) && target == nr_cpumask_bits &&
> > > +                                        !cpus_share_cache(this_cpu, prev_cpu))
> > > +                target = prefer_idler_llc(this_cpu, prev_cpu, sync);
> >
> > could you use the same naming convention as the other functions?
> > wake_affine_llc as an example
>
> I guess you meant s/prefer_idler_llc/wake_affine_llc/

yes

> Sure. I can modify.
>
> >
> > > +
> > >          if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
> > >                  target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);
> > >
> > > @@ -5884,8 +5918,11 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> > >          if (target == nr_cpumask_bits)
> > >                  return prev_cpu;
> > >
> > > -        schedstat_inc(sd->ttwu_move_affine);
> > > -        schedstat_inc(p->se.statistics.nr_wakeups_affine);
> > > +        if (target == this_cpu) {
> >
> > How is this condition related to $subject?
>
> Before this change, wake_affine_weight and wake_affine_idle would either
> return this_cpu or nr_cpumask_bits. Just before this check, we check whether
> target is nr_cpumask_bits and return prev_cpu. So the stats were only
> incremented when target was this_cpu.
>
> However, with prefer_idler_llc, we may return this_cpu, prev_cpu or
> nr_cpumask_bits. Now we only want to update the stats when we have chosen to
> migrate the task to this_cpu. Hence I had this check.

ok, got it. Maybe return earlier in this case, like for the
if (target == nr_cpumask_bits) check above.

>
> If we use the slightly lazier approach, which is to check wa_weight first
> before wa_idler_llc, then we may not need this change at all.
>
> --
> Thanks and Regards
> Srikar Dronamraju
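As an aside on the diff computation quoted above: it cross-multiplies so the
busy fractions of the two LLCs (tnr_busy/tllc_size vs. pnr_busy/pllc_size) can
be compared without integer division, with sync nudging a tie toward the
waker's LLC. A minimal, standalone sketch in plain C (prefer_llc() and the
example values are illustrative assumptions, not the kernel implementation):

#include <stdio.h>

/* Returns 1 to prefer this_cpu's LLC, -1 to prefer prev_cpu's, 0 for no preference. */
static int prefer_llc(int tnr_busy, int tllc_size,
                      int pnr_busy, int pllc_size, int sync)
{
        /* diff < 0  <=>  tnr_busy/tllc_size < pnr_busy/pllc_size (sync breaks ties) */
        int diff = tnr_busy * pllc_size - sync - pnr_busy * tllc_size;

        if (!diff)
                return 0;
        return diff < 0 ? 1 : -1;
}

int main(void)
{
        /* waker's LLC 2/8 busy vs. wakee's previous LLC 6/8 busy */
        printf("%d\n", prefer_llc(2, 8, 6, 8, 0));      /* 1: waker's LLC is idler */
        /* both 4/8 busy; a sync wakeup tips the tie toward the waker's LLC */
        printf("%d\n", prefer_llc(4, 8, 4, 8, 1));      /* 1 */
        return 0;
}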