Date: Tue, 20 Apr 2021 12:07:05 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: Kernel Team, Peter Zijlstra, Ingo Molnar, Dietmar Eggemann, Vincent Guittot, Mel Gorman, Valentin Schneider
Subject: [PATCH v3] sched,fair: skip newidle_balance if a wakeup is pending
Message-ID: <20210420120705.5c705d4b@imladris.surriel.com>

The try_to_wake_up function has an optimization where it can queue
a task for wakeup on its previous CPU, if the task is still in the
middle of going to sleep inside schedule().

Once schedule() re-enables IRQs, the task will be woken up with an
IPI, and placed back on the runqueue.

If we have such a wakeup pending, there is no need to search other
CPUs for runnable tasks. Just skip (or bail out early from) newidle
balancing, and run the just woken up task.

For a memcache like workload test, this reduces total CPU use by
about 2%, proportionally split between user and system time,
and p99 and p95 application response time by 10% on average.
The schedstats run_delay number shows a similar improvement.

Signed-off-by: Rik van Riel
---
 kernel/sched/fair.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 69680158963f..fd80175c3b3e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10594,6 +10594,14 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 	u64 curr_cost = 0;
 
 	update_misfit_status(NULL, this_rq);
+
+	/*
+	 * There is a task waiting to run. No need to search for one.
+	 * Return 0; the task will be enqueued when switching to idle.
+	 */
+	if (this_rq->ttwu_pending)
+		return 0;
+
 	/*
 	 * We must set idle_stamp _before_ calling idle_balance(), such that we
 	 * measure the duration of idle_balance() as idle time.
@@ -10661,7 +10669,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 		 * Stop searching for tasks to pull if there are
 		 * now runnable tasks on this rq.
 		 */
-		if (pulled_task || this_rq->nr_running > 0)
+		if (pulled_task || this_rq->nr_running > 0 ||
+		    this_rq->ttwu_pending)
 			break;
 	}
 	rcu_read_unlock();
@@ -10688,7 +10697,12 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (this_rq->nr_running != this_rq->cfs.h_nr_running)
 		pulled_task = -1;
 
-	if (pulled_task)
+	/*
+	 * If we are no longer idle, do not let the time spent here pull
+	 * down this_rq->avg_idle. That could lead to newidle_balance not
+	 * doing enough work, and the CPU actually going idle.
+	 */
+	if (pulled_task || this_rq->ttwu_pending)
 		this_rq->idle_stamp = 0;
 
 	rq_repin_lock(this_rq, rf);
-- 
2.25.4
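
For readers who want to step through the three checks outside the kernel,
the control flow of the patch can be approximated by the userspace sketch
below. It is an illustration only, not kernel code: struct toy_rq,
toy_newidle_balance(), the pullable[] array and the wake_at parameter are
all hypothetical names invented here. The sketch assumes that a remote
wakeup can be modelled as the ttwu_pending flag flipping after a given
domain has been scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_running;	/* tasks already on this runqueue */
	bool ttwu_pending;		/* a wakeup is queued for this CPU */
	unsigned long idle_stamp;	/* nonzero while the CPU counts as idle */
};

/*
 * Toy model of the patched flow (assumption: not the real function).
 * pullable[i] > 0 means domain i has a task that can be pulled.
 * wake_at is the index of the domain after whose scan a simulated
 * wakeup arrives; pass a value >= ndomains for "no wakeup arrives".
 * *scanned reports how many domains were examined.
 * Returns 1 if a task was pulled, 0 otherwise.
 */
static int toy_newidle_balance(struct toy_rq *rq, const int *pullable,
			       size_t ndomains, size_t wake_at,
			       unsigned long now, size_t *scanned)
{
	int pulled_task = 0;

	assert(rq != NULL && scanned != NULL);
	*scanned = 0;

	/* Check 1: a wakeup is already pending, so skip the search. */
	if (rq->ttwu_pending)
		return 0;

	/* From here on, time spent searching is accounted as idle time. */
	rq->idle_stamp = now;

	for (size_t i = 0; i < ndomains; i++) {
		(*scanned)++;
		if (pullable[i] > 0) {
			pulled_task = 1;
			rq->nr_running++;
		}

		/* Simulated wakeup arriving while the search is running. */
		if (i == wake_at)
			rq->ttwu_pending = true;

		/* Check 2: stop searching once there is something to run. */
		if (pulled_task || rq->nr_running > 0 || rq->ttwu_pending)
			break;
	}

	/* Check 3: no longer idle, so do not count the search as idle time. */
	if (pulled_task || rq->ttwu_pending)
		rq->idle_stamp = 0;

	return pulled_task;
}
```

With three empty domains and a wakeup arriving after the second one is
scanned (wake_at == 1), the sketch stops after two domains and clears
idle_stamp. With ttwu_pending already set on entry it returns 0 without
scanning any domain.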