From: Odin Ugedal
Date: Tue, 27 Apr 2021 10:36:23 +0200
Subject: Re: [PATCH 0/1] sched/fair: Fix unfairness caused by missing load decay
To: Vincent Guittot
Cc: Odin Ugedal, Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, "open list:CONTROL GROUP (CGROUP)", linux-kernel
References: <20210425080902.11854-1-odin@uged.al>
X-Mailing-List: linux-kernel@vger.kernel.org

Also, instead of bpftrace, one can look at the /proc/sched_debug file, and infer from there.
Something like:

  $ cat /proc/sched_debug | grep ":/slice" -A 28 | egrep "(:/slice)|load_avg"

gives me the output (when one stress proc gets 99%, and the other one 1%):

  cfs_rq[0]:/slice/cg-2/sub
    .load_avg                      : 1023
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1035
    .tg_load_avg                   : 1870
    .se->avg.load_avg              : 56391
  cfs_rq[0]:/slice/cg-1/sub
    .load_avg                      : 1023
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1024
    .tg_load_avg                   : 1847
    .se->avg.load_avg              : 4
  cfs_rq[0]:/slice/cg-1
    .load_avg                      : 4
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 4
    .tg_load_avg                   : 794
    .se->avg.load_avg              : 5
  cfs_rq[0]:/slice/cg-2
    .load_avg                      : 56401
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 56700
    .tg_load_avg                   : 57496
    .se->avg.load_avg              : 1008
  cfs_rq[0]:/slice
    .load_avg                      : 1015
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1009
    .tg_load_avg                   : 2314
    .se->avg.load_avg              : 447

As can be seen here, no other cfs_rq for the relevant cgroups is "active" and listed, but they still contribute to e.g. the "tg_load_avg". In this example, "cfs_rq[0]:/slice/cg-1" has a load_avg of 4, and contributes 4 to tg_load_avg. However, the total tg_load_avg is 794. That means that the other 790 have to come from somewhere, and in this example, they come from the cfs_rq on another cpu.

Hopefully that clarified a bit.
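
The subtraction described above (tg_load_avg minus tg_load_avg_contrib for each listed cfs_rq) can be sketched in a few lines of Python. This is only an illustration, not part of the patch: the helper names parse_cfs_rqs and unaccounted_load are hypothetical, the sample text is copied from the output above, and the parser assumes the field layout shown there (a "cfs_rq[N]:/path" header followed by indented ".name : value" lines).

```python
import re

# Two sections copied from the sched_debug excerpt above.
SAMPLE = """\
cfs_rq[0]:/slice/cg-1
  .load_avg                      : 4
  .removed.load_avg              : 0
  .tg_load_avg_contrib           : 4
  .tg_load_avg                   : 794
  .se->avg.load_avg              : 5
cfs_rq[0]:/slice/cg-2
  .load_avg                      : 56401
  .removed.load_avg              : 0
  .tg_load_avg_contrib           : 56700
  .tg_load_avg                   : 57496
  .se->avg.load_avg              : 1008
"""

HEADER = re.compile(r"^cfs_rq\[(\d+)\]:(\S+)")
FIELD = re.compile(r"^\s+\.(\S+)\s*:\s*(-?\d+)\s*$")


def parse_cfs_rqs(text):
    """Return {(cpu, path): {field: int}} for each cfs_rq section (hypothetical helper)."""
    sections = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = sections.setdefault(key, {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
        elif not line[:1].isspace():
            # A blank or non-indented, non-cfs_rq line ends the current section.
            current = None
    return sections


def unaccounted_load(text):
    """Load in tg_load_avg that this cfs_rq did not contribute itself."""
    result = {}
    for key, fields in parse_cfs_rqs(text).items():
        if "tg_load_avg" in fields and "tg_load_avg_contrib" in fields:
            result[key] = fields["tg_load_avg"] - fields["tg_load_avg_contrib"]
    return result


if __name__ == "__main__":
    for (cpu, path), extra in sorted(unaccounted_load(SAMPLE).items()):
        print(f"cpu {cpu} {path}: {extra} from other cfs_rqs")
```

For "/slice/cg-1" on cpu 0 this gives 794 - 4 = 790, the same figure as in the text. When only one cfs_rq per cgroup carries load, the difference should be zero or close to it.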
For reference, here is the output when the issue is not occurring:

  cfs_rq[1]:/slice/cg-2/sub
    .load_avg                      : 1024
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1039
    .tg_load_avg                   : 1039
    .se->avg.load_avg              : 1
  cfs_rq[1]:/slice/cg-1/sub
    .load_avg                      : 1023
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1034
    .tg_load_avg                   : 1034
    .se->avg.load_avg              : 49994
  cfs_rq[1]:/slice/cg-1
    .load_avg                      : 49998
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 49534
    .tg_load_avg                   : 49534
    .se->avg.load_avg              : 1023
  cfs_rq[1]:/slice/cg-2
    .load_avg                      : 1
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 1
    .tg_load_avg                   : 1
    .se->avg.load_avg              : 1023
  cfs_rq[1]:/slice
    .load_avg                      : 2048
    .removed.load_avg              : 0
    .tg_load_avg_contrib           : 2021
    .tg_load_avg                   : 2021
    .se->avg.load_avg              : 1023

Odin