Subject: Re: [PATCH 1/1] sched/fair: Fix unfairness caused by missing load decay
From: Dietmar Eggemann
To: Odin Ugedal
Cc: Vincent Guittot, Odin Ugedal, Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, "open list:CONTROL GROUP (CGROUP)", open list
Date: Wed, 5 May 2021 11:43:00 +0200
Message-ID: <4b0d6562-db41-b4fc-ae51-694946c9255d@arm.com>
References: <20210425080902.11854-1-odin@uged.al> <20210425080902.11854-2-odin@uged.al> <20210427142611.GA22056@vingu-book> <4ba77def-c7e9-326e-7b5c-cd491b063888@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/05/2021 16:41, Odin Ugedal wrote:
> Hi,
>
>> I think what I see on my Juno running the unfairness_missing_load_decay.sh script is
>> in sync which what you discussed here:
>
> Thanks for taking a look!
>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 794c2cb945f8..7214e6e89820 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -10854,6 +10854,8 @@ static void propagate_entity_cfs_rq(struct sched_entity *se)
>>  			break;
>>
>>  		update_load_avg(cfs_rq, se, UPDATE_TG);
>> +		if (!cfs_rq_is_decayed(cfs_rq))
>> +			list_add_leaf_cfs_rq(cfs_rq);
>>  	}
>>  }
>
> This might however lead to "loss" at /slice/cg-2/sub and
> slice/cg-1/sub in this particular case tho, since
> propagate_entity_cfs_rq skips one cfs_rq
> by taking the parent of the provided se. The loss in that case would
> however not be equally big, but will still often contribute to some
> unfairness.

Yeah, that's true.

By moving stopped `stress` tasks into

  /sys/fs/cgroup/cpu/slice/cg-{1,2}/sub

and then into

  /sys/fs/cgroup/cpuset/A

whose cpuset.cpus {0-1,4-5} does not contain the CPUs {2,3} the `stress`
tasks were attached to, and then restarting the `stress` tasks, I get:

cfs_rq[1]:/slice/cg-1/sub
  .load_avg                : 1024
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 1024  <---
  .tg_load_avg             : 2047  <---
  .se->avg.load_avg        : 511

cfs_rq[1]:/slice/cg-1
  .load_avg                : 512
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 512   <---
  .tg_load_avg             : 1022  <---
  .se->avg.load_avg        : 512

cfs_rq[1]:/slice
  .load_avg                : 513
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 513
  .tg_load_avg             : 1024
  .se->avg.load_avg        : 512

cfs_rq[5]:/slice/cg-1/sub
  .load_avg                : 1024
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 1023  <---
  .tg_load_avg             : 2047  <---
  .se->avg.load_avg        : 511

cfs_rq[5]:/slice/cg-1
  .load_avg                : 512
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 510   <---
  .tg_load_avg             : 1022  <---
  .se->avg.load_avg        : 511

cfs_rq[5]:/slice
  .load_avg                : 512
  .removed.load_avg        : 0
  .tg_load_avg_contrib     : 511
  .tg_load_avg             : 1024
  .se->avg.load_avg        : 510

I saw that your v2 patch takes care of that.