Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp1262912ybf; Thu, 27 Feb 2020 07:43:32 -0800 (PST) X-Google-Smtp-Source: APXvYqzPPUJgq3V9vF2uRYuEtkcXxPLyyZj6wBgj2neTxLdaANbcI2rELBMsrXnkohPRtav10vq6 X-Received: by 2002:a9d:811:: with SMTP id 17mr189343oty.369.1582818212291; Thu, 27 Feb 2020 07:43:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582818212; cv=none; d=google.com; s=arc-20160816; b=n/TQFnkJ9zJqcgCQp2HjmL5e3rz8qI7j6/5s6VD0P65/GSWHO+Wxkfe5MrRlvmyjCm 8hZjm2u7KRQYw3omzLEVdMezeiLsc8bPug9fs/NzkwXBDHgYktcrNoBKoiVXJ4cGMZ/x 4DHTPR7yCQzaYZu6ELyX5JOb6l40XZqQCbSndfvlXl/uYi5lThDxeZ1eVro8RCq7Ep5K 7GbjayHCb9IxKtvs6phcb0xkTTvSXRT+Gi4OuwyiDIV1ZaBofA+d5ADY+xMolxGvuohR R3BJF420q46gDr1rviDitIFZo8pHnqFEDLvcqICFTcKulQcsPs0W6ahOgGA/H5VExzxs tgyQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=Ph6zNvRlMjBgEifT68quDtBVSaXm9rcB1KivMTIcqD0=; b=J2hjhnUAoNMl0EYzlxfG8ffHNTIn3R0cX34yud8WQjcIBiyh28isWTStpk3DS6HcmE Pd8sL32wO26smM3e6z8fX+qu1bHpvIQovzXwvD3Gjiej+WvVwyabkH1GT4EkRakGZct4 12jSAHRbgb7gW+J6en533NMsR0xQeLw7pWlcXv6Tgax9BCw2QFKHPA2t6ulr+br7RmKC O3pxqARazO3+hzWAA5HMaAY5i6cd5sPVhG0KkHv3zWIcJ7gw9W6xgNcIC0NTjCR9rxVx bsdlHyun9maLTWMgiMCQTh7ITaDgIsMqp4esoWHGeWHyQdbfSxwKxB+04FEr1aujgNUL Wi7g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=LOGSoHuD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q66si24168oig.65.2020.02.27.07.43.07; Thu, 27 Feb 2020 07:43:32 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=LOGSoHuD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729274AbgB0Pe5 (ORCPT + 99 others); Thu, 27 Feb 2020 10:34:57 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:58665 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727592AbgB0Pe5 (ORCPT ); Thu, 27 Feb 2020 10:34:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582817695; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Ph6zNvRlMjBgEifT68quDtBVSaXm9rcB1KivMTIcqD0=; b=LOGSoHuDUjG9nlx8wQgtGFh9ZhZE/sWaGYeYT8IktyIS1oLoq/FtiNS/4s1p/x0Jxn/PgA +9fe2ujQ2Cyp3tL4a+Gd8mm5HnrA/QeIyJszobOXYslugWkNpI99x5q7X7jg+Ubxf8Inhs GZbXKgN2JCd+wgRrb8+OiMQZIQGdPg0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-288-gczZdh52PeySscz9vobOOw-1; Thu, 27 Feb 2020 10:34:52 -0500 X-MC-Unique: gczZdh52PeySscz9vobOOw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2171E1005514; Thu, 27 Feb 2020 15:34:51 +0000 (UTC) Received: from pauld.bos.csb (dhcp-17-51.bos.redhat.com [10.18.17.51]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 622C787B08; Thu, 27 Feb 2020 15:34:46 +0000 (UTC) Date: Thu, 27 Feb 2020 10:34:44 -0500 From: Phil Auld To: Vincent Guittot Cc: Dietmar Eggemann , Ben Segall , Ingo Molnar , Peter Zijlstra , Juri Lelli , Steven Rostedt , Mel Gorman , linux-kernel , Parth Shah , Valentin Schneider , Hillf Danton , zhout@vivaldi.net Subject: Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs Message-ID: <20200227153444.GB30178@pauld.bos.csb> References: <20200226181640.21664-1-vincent.guittot@linaro.org> <8f72ea72-f36d-2611-e026-62ddff5c3422@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Feb 27, 2020 at 03:58:02PM +0100 Vincent Guittot wrote: > On Thu, 27 Feb 2020 at 14:10, Vincent Guittot > wrote: > > > > On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann wrote: > > > > > > On 26.02.20 21:01, Vincent Guittot wrote: > > > > On Wed, 26 Feb 2020 at 20:04, wrote: > > > >> > > > >> Vincent Guittot writes: > > > >> > > > >>> When a cfs_rq is throttled, its group entity is dequeued and its running > > > >>> tasks are removed. We must update runnable_avg with current h_nr_running > > > >>> and update group_se->runnable_weight with new h_nr_running at each level > > > > > > ^^^ > > > > > > Shouldn't this be 'current' rather 'new' h_nr_running for > > > group_se->runnable_weight? IMHO, you want to cache the current value > > > before you add/subtract task_delta. > > > > hmm... it can't be current in both places. In my explanation, > > "current" means the current situation when we started to throttle cfs > > and "new" means the new situation after we finished to throttle the > > cfs. I should probably use old and new to prevent any > > misunderstanding. > > I'm about to send a new version to fix some minor changes: The if > statement should have some { } as there are some on the else part > > Would it be better for you if i use old and new instead of current and > new in the commit message ? > Seems better to me. You could also consider "the old" and "the new". Cheers, Phil > > > > That being said, we need to update runnable_avg with the old > > h_nr_running: the one before we started to throttle the cfs which is > > the value saved in group_se->runnable_weight. Once we have updated > > runnable_avg, we save the new h_nr_running in > > group_se->runnable_weight that will be used for next updates. > > > > > > > > >>> of the hierarchy. > > > >> > > > >> You'll also need to do this for task enqueue/dequeue inside of a > > > >> throttled hierarchy, I'm pretty sure. > > > > > > > > AFAICT, this is already done with patch "sched/pelt: Add a new > > > > runnable average signal" when task is enqueued/dequeued inside a > > > > throttled hierarchy > > > > > > > >> > > > >>> > > > >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal") > > > >>> Signed-off-by: Vincent Guittot > > > >>> --- > > > >>> This patch applies on top of tip/sched/core > > > >>> > > > >>> kernel/sched/fair.c | 10 ++++++++++ > > > >>> 1 file changed, 10 insertions(+) > > > >>> > > > >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > >>> index fcc968669aea..6d46974e9be7 100644 > > > >>> --- a/kernel/sched/fair.c > > > >>> +++ b/kernel/sched/fair.c > > > >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq) > > > >>> > > > >>> if (dequeue) > > > >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP); > > > >>> + else { > > > >>> + update_load_avg(qcfs_rq, se, 0); > > > >>> + se_update_runnable(se); > > > >>> + } > > > >>> + > > > >>> qcfs_rq->h_nr_running -= task_delta; > > > >>> qcfs_rq->idle_h_nr_running -= idle_task_delta; > > > >>> > > > >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) > > > >>> cfs_rq = cfs_rq_of(se); > > > >>> if (enqueue) > > > >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP); > > > >>> + else { > > > >>> + update_load_avg(cfs_rq, se, 0); > > > >> > > > >> > > > >>> + se_update_runnable(se); > > > >>> + } > > > >>> + > > > >>> cfs_rq->h_nr_running += task_delta; > > > >>> cfs_rq->idle_h_nr_running += idle_task_delta; > --