Subject: Re: [PATCH 4/4] sched,fair: remove effective_load
From: Rik van Riel <riel@redhat.com>
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, jhladky@redhat.com, mingo@kernel.org,
    mgorman@suse.de
Date: Mon, 26 Jun 2017 15:34:49 -0400
Message-ID: <1498505689.13083.49.camel@redhat.com>
In-Reply-To: <20170626161250.GD4941@worktop>

On Mon, 2017-06-26 at 18:12 +0200, Peter Zijlstra wrote:
> On Mon, Jun 26, 2017 at 11:20:54AM -0400, Rik van Riel wrote:
>
> > Oh, indeed.  I guess in wake_affine() we should test
> > whether the CPUs are in the same NUMA node, rather than
> > doing cpus_share_cache() ?
>
> Well, since select_idle_sibling() is on LLC; the early test on
> cpus_share_cache(prev, this) seems to actually make sense.
>
> But then cutting out all the other bits seems wrong. Not in the least
> because !NUMA_BALANCING should also still keep working.

Even with !NUMA_BALANCING, I suspect it makes little sense to
compare the loads of just the two cores in question, since
select_idle_sibling() will likely move the task somewhere else
within the LLC anyway.

I suspect we want to compare the load on the whole LLC for that
reason, even with NUMA_BALANCING disabled.  A rough sketch of what
I have in mind is at the end of this mail.

> > Or, alternatively, have an update_numa_stats() variant
> > for numa_wake_affine() that works on the LLC level?
>
> I think we want to retain the existing behaviour for everything
> larger than LLC, and when NUMA_BALANCING, smaller than NUMA.

What do you mean by this, exactly?  How does the "existing
behaviour" of only looking at the load on two cores make sense
when doing LLC-level task placement?

> Also note that your use of task_h_load() in the new numa thing
> suffers from exactly the problem effective_load() is trying to
> solve.

Are you saying task_h_load() is wrong in task_numa_compare()
too, then?  Should both use effective_load()?
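
To check that I understand the problem you are pointing at: if I
read the code right, task_h_load() scales the task's load by the
group's *current* hierarchical weight, roughly

	task_h_load(p) ~= p->se.avg.load_avg * cfs_rq->h_load /
			  (cfs_rq_load_avg(cfs_rq) + 1)

while effective_load() tries to estimate how much the root-level
load would actually change once the task's weight is added to, or
removed from, a group, re-weighting every level of the hierarchy.
Because task_h_load() uses the pre-migration shares, it can over-
or under-estimate the load delta that wake_affine() really wants.
Is that the issue you mean?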
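
On the LLC load comparison above, the untested sketch below shows
the kind of check I mean.  struct llc_stats and update_llc_stats()
do not exist; they stand in for the update_numa_stats() variant
mentioned earlier, with fields mirroring the node-level numa_stats:

	/*
	 * Untested sketch only.  update_llc_stats() would aggregate
	 * load and capacity over the LLC domain of the given CPU,
	 * the way update_numa_stats() does for a NUMA node.
	 */
	struct llc_stats {
		unsigned long load;
		unsigned long compute_capacity;
	};

	static bool llc_wake_affine(int this_cpu, int prev_cpu)
	{
		struct llc_stats prev_stats, this_stats;

		update_llc_stats(&prev_stats, prev_cpu);
		update_llc_stats(&this_stats, this_cpu);

		/*
		 * Pull the task towards the waker's LLC only when
		 * that LLC is relatively less loaded; cross-multiply
		 * to avoid dividing by the capacities.
		 */
		return this_stats.load * prev_stats.compute_capacity <
		       prev_stats.load * this_stats.compute_capacity;
	}

select_idle_sibling() would then still pick the actual CPU within
whichever LLC wins.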