From: Valentin Schneider
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
    Steven Rostedt, Ben Segall, Mel Gorman, linux-kernel
Subject: Re: [PATCH] sched/fair: handle case of task_h_load() returning 0
Date: Wed, 08 Jul 2020 11:34:10 +0100
References: <20200702144258.19326-1-vincent.guittot@linaro.org>

On 07/07/20 14:30, Vincent Guittot wrote:
> On Thu, 2 Jul 2020 at 18:28, Vincent Guittot wrote:
>>
>> On Thu, 2 Jul 2020 at 18:11, Valentin Schneider wrote:
>> >
>> > On 02/07/20 15:42, Vincent Guittot wrote:
>> > > task_h_load() can return 0 in some situations, like running stress-ng
>> > > mmapfork, which forks thousands of threads, in a sched group on a
>> > > 224-core system. Load balance doesn't handle this correctly because
>> > > env->imbalance never decreases, so it stops pulling tasks only after
>> > > reaching loop_max, which can be equal to the number of running tasks
>> > > on the cfs_rq. Make sure that imbalance is decreased by at least 1.
>> > >
>> > > Misfit task is the other feature that doesn't handle this situation
>> > > correctly, although it is probably harder to hit the problem there
>> > > because of the smaller number of CPUs and running tasks on
>> > > heterogeneous systems.
>> > >
>> > > We can't simply ensure that task_h_load() returns at least one,
>> > > because that would imply handling underruns in other places.
>> >
>> > Nasty one, that...
>> >
>> > Random thought: isn't that the kind of thing we have scale_load() and
>> > scale_load_down() for? There's more uses of task_h_load() than I would
>> > like for this, but if we upscale its output (or introduce an upscaled
>> > variant), we could do something like:
>> >
>> > ---
>> > detach_tasks()
>> > {
>> >         long imbalance = env->imbalance;
>> >
>> >         if (env->migration_type == migrate_load)
>> >                 imbalance = scale_load(imbalance);
>> >
>> >         while (!list_empty(tasks)) {
>> >                 /* ... */
>> >                 switch (env->migration_type) {
>> >                 case migrate_load:
>> >                         load = task_h_load_upscaled(p);
>> >                         /* ... usual bits here ... */
>> >                         lsub_positive(&env->imbalance, load);
>> >                         break;
>> >                 /* ... */
>> >                 }
>> >
>> >                 if (!scale_load_down(env->imbalance))
>> >                         break;
>> >         }
>> > }
>> > ---
>> >
>> > It's not perfect, and there's still the misfit situation to sort out -
>> > still, do you think this is something we could go towards?
>>
>> This will not work on 32-bit systems.
>>
>> For 64-bit, I have to think a bit more about whether the upscale would
>> fix all cases and support propagation across a hierarchy. In that case
>> we could also consider making scale_load()/scale_load_down() a nop all
>> the time.
>
> In addition to the problem remaining on 32-bit, the problem can still
> happen after extending the scale, so this current patch still makes
> sense.
>

Right, I think we'd want to have that at the very least for 32-bit anyway.
I haven't done the math, but doesn't it require an obscene number of tasks
for that to still happen on 64-bit with the increased resolution?

> Then, if we want to reduce the cases where task_h_load() returns 0, we
> had better make scale_load_down() a nop; otherwise we will have to
> maintain two values, h_load and a scaled h_load, across the hierarchy.
>

I don't fully grasp yet how much surgery that would require, but it does
sound like something we've been meaning to do; see e.g. the comment above
se_weight():

  * XXX we want to get rid of these helpers and use the full load
  * resolution.
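
For completeness, the minimal fix the changelog describes boils down to
clamping the load contribution in detach_tasks()'s migrate_load case, so
that env->imbalance is guaranteed to shrink on every iteration. A rough
sketch of that approach (simplified from kernel/sched/fair.c of that era;
not necessarily the exact patch as posted, surrounding cases elided):

---
                case migrate_load:
                        /*
                         * With a deep enough cgroup hierarchy and enough
                         * runnable tasks, scale_load_down() truncation can
                         * make task_h_load() return 0. Clamp the load to
                         * at least 1 so env->imbalance always decreases
                         * and detach_tasks() can't keep looping all the
                         * way up to loop_max.
                         */
                        load = max_t(unsigned long, task_h_load(p), 1);

                        if (sched_feat(LB_MIN) &&
                            load < 16 && !env->sd->nr_balance_failed)
                                goto next;

                        if (load/2 > env->imbalance)
                                goto next;

                        env->imbalance -= load;
                        break;
---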