Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751858AbdGGP0m (ORCPT ); Fri, 7 Jul 2017 11:26:42 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:40574 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750938AbdGGP0k (ORCPT ); Fri, 7 Jul 2017 11:26:40 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org EB7E76081B Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=jhugo@codeaurora.org Subject: Re: [PATCH V5 2/2] sched/fair: Remove group imbalance from calculate_imbalance() To: Peter Zijlstra Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Dietmar Eggemann , Austin Christ , Tyler Baicar , Timur Tabi , Morten Rasmussen References: <1496863138-11322-1-git-send-email-jhugo@codeaurora.org> <1496863138-11322-3-git-send-email-jhugo@codeaurora.org> <20170705112205.tyeprtki4vel5kpx@hirez.programming.kicks-ass.net> From: Jeffrey Hugo Message-ID: <77212993-a875-f26d-cc54-2607acc881e7@codeaurora.org> Date: Fri, 7 Jul 2017 09:26:37 -0600 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: <20170705112205.tyeprtki4vel5kpx@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2807 Lines: 56 On 7/5/2017 5:22 AM, Peter Zijlstra wrote: > On Wed, Jun 07, 2017 at 01:18:58PM -0600, Jeffrey Hugo wrote: >> The group_imbalance path in calculate_imbalance() made sense when it was >> added back in 2007 with commit 908a7c1b9b80 ("sched: fix improper load >> balance across sched domain") because busiest->load_per_task factored into >> the amount of imbalance that was calculated. That is not the case today. > > It would be nice to have some more information on which patch(es) > changed that. From the history it looks like commit dd5feea14a7d ("sched: Fix SCHED_MC regression caused by change in sched cpu_power") removed load_per_task from imbalance calculations (was factored into max_pull). After this change, the group imbalance modifications to this variable appear to no longer work as originally intended. > >> The group_imbalance path can only affect the outcome of >> calculate_imbalance() when the average load of the domain is less than the >> original busiest->load_per_task. In this case, busiest->load_per_task is >> overwritten with the scheduling domain load average. Thus >> busiest->load_per_task no longer represents actual load that can be moved. >> >> At the final comparison between env->imbalance and busiest->load_per_task, >> imbalance may be larger than the new busiest->load_per_task causing the >> check to fail under the assumption that there is a task that could be >> migrated to satisfy the imbalance. However env->imbalance may still be >> smaller than the original busiest->load_per_task, thus it is unlikely that >> there is a task that can be migrated to satisfy the imbalance. >> Calculate_imbalance() would not choose to run fix_small_imbalance() when we >> expect it should. In the worst case, this can result in idle cpus. >> >> Since the group imbalance path in calculate_imbalance() is at best a NOP >> but otherwise harmful, remove it. > > load_per_task is horrible and should die. Ever since we did cgroup > support the number is complete crap, but even before that the concept > was dubious. > > Most of the logic that uses the number stems from the pre-smp-nice era. > > This also of course means that fix_small_imbalance() is probably a load > of crap. Digging through all that has been on the todo list for a long > while but somehow not something I've ever gotten to :/ Based on the state of the code today, our change fixes a current issue that we are encoutering. If we add the above history to the commit text, is our change sufficent as a short term solution between now and whenever the load_per_task path is rearchitected? -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.