From: Vincent Guittot
Date: Thu, 3 Jun 2021 14:51:50 +0200
Subject: Re: [PATCH v2] sched/fair: Correctly insert cfs_rq's to list on unthrottle
To: Odin Ugedal
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
    Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
    "open list:CONTROL GROUP (CGROUP)", linux-kernel
In-Reply-To: <20210603113847.163512-1-odin@uged.al>
References: <20210603113847.163512-1-odin@uged.al>

On Thu, 3 Jun 2021 at 13:41, Odin Ugedal wrote:
>
> This fixes an issue where fairness is decreased since cfs_rq's can
> end up not being decayed properly. For two sibling control groups with
> the same priority, this can often lead to a load ratio of 99/1 (!!).
>
> This happens because when a cfs_rq is throttled, all the descendant
> cfs_rq's are removed from the leaf list. When the throttled cfs_rq is
> unthrottled, it will currently only re-add descendant cfs_rq's if they
> have one or more entities enqueued. This is not a perfect heuristic.
>
> Instead, we insert all cfs_rq's that contain one or more enqueued
> entities, or that contribute to the load of the task group.
>
> For equally weighted control groups, the current behaviour can often
> lead to situations like this:
>
> $ ps u -C stress
> USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START  TIME COMMAND
> root     10009 88.8  0.0  3676  100 pts/1 R+   11:04  0:13 stress --cpu 1
> root     10023  3.0  0.0  3676  104 pts/1 R+   11:04  0:00 stress --cpu 1
>
> Fixes: 31bc6aeaab1d ("sched/fair: Optimize update_blocked_averages()")
> Signed-off-by: Odin Ugedal
> ---
>
> Original thread: https://lore.kernel.org/lkml/20210518125202.78658-3-odin@uged.al/
> Changes since v1:
> - Replaced the new cfs_rq field with the existing tg_load_avg_contrib
> - Went from 3 to 1 patches; one is merged and one is replaced
>   by a new patchset.
>
>  kernel/sched/fair.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 794c2cb945f8..0f1b39ca5ca8 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4719,8 +4719,11 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>                 cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
>                                              cfs_rq->throttled_clock_task;
>
> -               /* Add cfs_rq with already running entity in the list */
> -               if (cfs_rq->nr_running >= 1)
> +               /*
> +                * Add cfs_rq with tg load avg contribution or one or more
> +                * already running entities to the list
> +                */
> +               if (cfs_rq->tg_load_avg_contrib || cfs_rq->nr_running)

Out of curiosity, why did you decide to use cfs_rq->tg_load_avg_contrib
instead of !cfs_rq_is_decayed(cfs_rq), which is used to delete the cfs_rq
from the list when updating blocked load?

>                 list_add_leaf_cfs_rq(cfs_rq);
>         }
>
> --
> 2.31.1
>
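
[The intent of the new condition, restated as comments; this annotated
version is an editorial reading aid paraphrasing the changelog above, not
part of the patch itself:]

	/*
	 * Put this cfs_rq (back) on the leaf list if either:
	 *  - tg_load_avg_contrib != 0: it still contributes load to the
	 *    task group, so update_blocked_averages() must keep walking
	 *    it in order to decay that load away; or
	 *  - nr_running != 0: it has entities enqueued, so it is
	 *    trivially still active.
	 */
	if (cfs_rq->tg_load_avg_contrib || cfs_rq->nr_running)
		list_add_leaf_cfs_rq(cfs_rq);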
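
[For context on the question above: cfs_rq_is_decayed() is the test used
on the removal side, in update_blocked_averages(). From memory, in kernels
of this era it looked roughly like the sketch below; the exact set of
fields checked may differ:]

	static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
	{
		if (cfs_rq->load.weight)
			return false;

		if (cfs_rq->avg.load_sum)
			return false;

		if (cfs_rq->avg.util_sum)
			return false;

		if (cfs_rq->avg.runnable_sum)
			return false;

		return true;
	}

[The visible difference is that cfs_rq_is_decayed() inspects the raw PELT
sums, whereas tg_load_avg_contrib is a cached value that, if memory serves,
is only refreshed once the cfs_rq's load average has changed by a large
enough delta, so the two tests can briefly disagree.]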