Message-Id: <20240405110010.522077707@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 05 Apr 2024 12:28:01 +0200
From: Peter Zijlstra
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 linux-kernel@vger.kernel.org
Cc: kprateek.nayak@amd.com, wuyun.abel@bytedance.com, tglx@linutronix.de,
 efault@gmx.de
Subject: [RFC][PATCH 07/10] sched/fair: Re-organize dequeue_task_fair()
References: <20240405102754.435410987@infradead.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

Working towards delaying dequeue, notably also inside the hierarchy,
rework dequeue_task_fair() such that it can 'resume' an interrupted
hierarchy walk.

Signed-off-by: Peter Zijlstra (Intel)
---
 kernel/sched/fair.c |   82 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 55 insertions(+), 27 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6824,33 +6824,45 @@ enqueue_task_fair(struct rq *rq, struct
 static void set_next_buddy(struct sched_entity *se);
 
 /*
- * The dequeue_task method is called before nr_running is
- * decreased. We remove the task from the rbtree and
- * update the fair scheduling stats:
+ * Basically dequeue_task_fair(), except it can deal with dequeue_entity()
+ * failing half-way through and resume the dequeue later.
+ *
+ * Returns:
+ * -1 - dequeue delayed
+ *  0 - dequeue throttled
+ *  1 - dequeue complete
  */
-static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
+static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 {
-	struct cfs_rq *cfs_rq;
-	struct sched_entity *se = &p->se;
-	int task_sleep = flags & DEQUEUE_SLEEP;
-	int idle_h_nr_running = task_has_idle_policy(p);
 	bool was_sched_idle = sched_idle_rq(rq);
+	bool task_sleep = flags & DEQUEUE_SLEEP;
+	struct task_struct *p = NULL;
+	struct cfs_rq *cfs_rq;
+	int idle_h_nr_running;
 
-	util_est_dequeue(&rq->cfs, p);
+	if (entity_is_task(se)) {
+		p = task_of(se);
+		idle_h_nr_running = task_has_idle_policy(p);
+	} else {
+		idle_h_nr_running = cfs_rq_is_idle(group_cfs_rq(se));
+	}
 
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
 		dequeue_entity(cfs_rq, se, flags);
 
-		cfs_rq->h_nr_running--;
-		cfs_rq->idle_h_nr_running -= idle_h_nr_running;
+		/* h_nr_running is the hierarchical count of tasks */
+		if (p) {
+			cfs_rq->h_nr_running--;
+			cfs_rq->idle_h_nr_running -= idle_h_nr_running;
 
-		if (cfs_rq_is_idle(cfs_rq))
-			idle_h_nr_running = 1;
+			if (cfs_rq_is_idle(cfs_rq))
+				idle_h_nr_running = 1;
+		}
 
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			goto dequeue_throttle;
+			return 0;
 
 		/* Don't dequeue parent if it has other entities besides us */
 		if (cfs_rq->load.weight) {
@@ -6870,33 +6882,49 @@ static bool dequeue_task_fair(struct rq
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
 
+		// XXX avoid these load updates for delayed dequeues ?
 		update_load_avg(cfs_rq, se, UPDATE_TG);
 		se_update_runnable(se);
 		update_cfs_group(se);
 
-		cfs_rq->h_nr_running--;
-		cfs_rq->idle_h_nr_running -= idle_h_nr_running;
+		if (p) {
+			cfs_rq->h_nr_running--;
+			cfs_rq->idle_h_nr_running -= idle_h_nr_running;
 
-		if (cfs_rq_is_idle(cfs_rq))
-			idle_h_nr_running = 1;
+			if (cfs_rq_is_idle(cfs_rq))
+				idle_h_nr_running = 1;
+		}
 
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			goto dequeue_throttle;
+			return 0;
+	}
+
+	if (p) {
+		sub_nr_running(rq, 1);
+		/* balance early to pull high priority tasks */
+		if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
+			rq->next_balance = jiffies;
 	}
 
-	/* At this point se is NULL and we are at root level*/
-	sub_nr_running(rq, 1);
+	return 1;
+}
 
-	/* balance early to pull high priority tasks */
-	if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
-		rq->next_balance = jiffies;
+/*
+ * The dequeue_task method is called before nr_running is
+ * decreased. We remove the task from the rbtree and
+ * update the fair scheduling stats:
+ */
+static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
+{
+	util_est_dequeue(&rq->cfs, p);
 
-dequeue_throttle:
-	util_est_update(&rq->cfs, p, task_sleep);
-	hrtick_update(rq);
+	if (dequeue_entities(rq, &p->se, flags) < 0)
+		return false;
 
+	util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP);
+	hrtick_update(rq);
 	return true;
 }
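[Editor's note, not part of the patch: a minimal userspace sketch of the tri-state return convention the patch gives dequeue_entities(). All names below (toy_cfs_rq, toy_dequeue_entities, the flat array standing in for the cgroup hierarchy) are illustrative assumptions, not kernel code; the point is only that a throttled level stops the walk part-way, matching the -1/0/1 return contract above.]

```c
#include <assert.h>

/*
 * Toy model of the return convention introduced by the patch:
 *   -1 - dequeue delayed
 *    0 - dequeue throttled (walk stopped part-way, to be resumed later)
 *    1 - dequeue complete
 */
enum { DEQ_DELAYED = -1, DEQ_THROTTLED = 0, DEQ_COMPLETE = 1 };

struct toy_cfs_rq {
	int throttled;      /* stand-in for cfs_rq_throttled() */
	int h_nr_running;   /* hierarchical count of tasks */
};

/*
 * Walk the hierarchy from the leaf (index 0) toward the root,
 * decrementing the hierarchical task count at each level; end
 * evaluation on encountering a throttled level, as the patch does.
 */
static int toy_dequeue_entities(struct toy_cfs_rq *hier, int depth)
{
	for (int i = 0; i < depth; i++) {
		hier[i].h_nr_running--;
		if (hier[i].throttled)
			return DEQ_THROTTLED;
	}
	return DEQ_COMPLETE;
}
```

An unthrottled walk touches every level and reports completion; a throttle at an inner level leaves the outer counts untouched, which is exactly why the real code must be able to resume later.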