Date: Tue, 19 Dec
 2023 16:18:23 -0800
In-Reply-To: <20231220001856.3710363-1-jstultz@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20231220001856.3710363-1-jstultz@google.com>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
Message-ID: <20231220001856.3710363-13-jstultz@google.com>
Subject: [PATCH v7 12/23] sched: Fix proxy/current (push,pull)ability
From: John Stultz
To: LKML
Cc: Valentin Schneider, Joel Fernandes, Qais Yousef, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Valentin Schneider, Steven Rostedt, Ben Segall, Zimuzo Ezeozue,
 Youssef Esmat, Mel Gorman, Daniel Bristot de Oliveira, Will Deacon,
 Waiman Long, Boqun Feng, "Paul E. McKenney", Metin Kaya, Xuewen Yan,
 K Prateek Nayak, Thomas Gleixner, kernel-team@android.com,
 "Connor O'Brien", John Stultz
Content-Type: text/plain; charset="UTF-8"

From: Valentin Schneider

Proxy execution forms atomic pairs of tasks: The selected task
(scheduling context) and a proxy (execution context). The
selected task, along with the rest of the blocked chain,
follows the proxy wrt CPU placement.

They can be the same task, in which case push/pull doesn't need any
modification. When they are different, however,
FIFO1 & FIFO42:

              ,->  RT42
              |     | blocked-on
              |     v
blocked_donor |   mutex
              |     | owner
              |     v
              `--  RT1

   RT1
   RT42

  CPU0            CPU1
   ^                ^
   |                |
  overloaded    !overloaded
  rq prio = 42  rq prio = 0

RT1 is eligible to be pushed to CPU1, but should that happen it will
"carry" RT42 along. Clearly here neither RT1 nor RT42 must be seen as
push/pullable.

Unfortunately, only the selected task is usually dequeued from the
rq, and the proxy'ed execution context (rq->curr) remains on the rq.
This can cause RT1 to be selected for migration from logic like the
rt pushable_list.
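
For illustration only (this is not part of the patch), the FIFO1/FIFO42
scenario above can be modelled in a few lines of userspace C. Every
toy_* name below is a hypothetical stand-in rather than a kernel
structure or API, and the sketch assumes each task waits on at most one
mutex owner. It contrasts a naive rule, which only excludes the selected
task, with a rule that treats the selected task and the proxy as a pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's struct task_struct or struct rq. */
struct toy_task {
	const char *name;
	int prio;                          /* higher value = higher RT prio, as in the diagram */
	struct toy_task *blocked_on_owner; /* owner of the mutex we wait on, or NULL */
};

struct toy_rq {
	struct toy_task *selected; /* scheduling context chosen by the scheduler */
	struct toy_task *curr;     /* execution context that actually runs (the proxy) */
};

/* Walk the blocked-on chain from the selected task to the runnable owner. */
static struct toy_task *toy_find_proxy(struct toy_task *selected)
{
	struct toy_task *p = selected;

	assert(selected != NULL);
	while (p->blocked_on_owner)
		p = p->blocked_on_owner;
	return p;
}

/* Naive rule: any queued task other than the selected one may be pushed. */
static bool toy_pushable_naive(const struct toy_rq *rq, const struct toy_task *p)
{
	return p != rq->selected;
}

/* Paired rule: neither the selected task nor the proxy may be pushed. */
static bool toy_pushable_paired(const struct toy_rq *rq, const struct toy_task *p)
{
	return p != rq->selected && p != rq->curr;
}
```

With RT42 blocked on a mutex owned by RT1, toy_find_proxy() returns RT1.
The naive rule then reports RT1 as pushable, which is the situation
described above; the paired rule reports neither task as pushable.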

This patch adds a dequeue/enqueue cycle on the proxy task before
__schedule returns, which allows the sched class logic to avoid adding
the now current task to the pushable_list.

Furthermore, tasks becoming blocked on a mutex don't need an explicit
dequeue/enqueue cycle to be made (push/pull)able: they have to be running
to block on a mutex, thus they will eventually hit put_prev_task().

XXX: pinned tasks becoming unblocked should be removed from the push/pull
lists, but those don't get to see __schedule() straight away.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Valentin Schneider
Signed-off-by: Connor O'Brien
Signed-off-by: John Stultz
---
v3:
* Tweaked comments & commit message
v5:
* Minor simplifications to utilize the fix earlier
  in the patch series.
* Rework the wording of the commit message to match selected/
  proxy terminology and expand a bit to make it more clear how
  it works.
v6:
* Dropped now-unused proxied value, to be re-added later in the
  series when it is used, as caught by Dietmar
v7:
* Unused function argument fixup
* Commit message nit pointed out by Metin Kaya
* Dropped unproven unlikely() and use sched_proxy_exec()
  in proxy_tag_curr, suggested by Metin Kaya
---
 kernel/sched/core.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 12f5a0618328..f6bf3b62194c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6674,6 +6674,23 @@ find_proxy_task(struct rq *rq, struct task_struct *next, struct rq_flags *rf)
 }
 #endif /* SCHED_PROXY_EXEC */
 
+static inline void proxy_tag_curr(struct rq *rq, struct task_struct *next)
+{
+	if (sched_proxy_exec()) {
+		/*
+		 * pick_next_task() calls set_next_task() on the selected task
+		 * at some point, which ensures it is not push/pullable.
+		 * However, the selected task *and* the mutex owner form an
+		 * atomic pair wrt push/pull.
+		 *
+		 * Make sure owner is not pushable. Unfortunately we can only
+		 * deal with that by means of a dequeue/enqueue cycle. :-/
+		 */
+		dequeue_task(rq, next, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
+		enqueue_task(rq, next, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
+	}
+}
+
 /*
  * __schedule() is the main scheduler function.
  *
@@ -6796,6 +6813,10 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		 * changes to task_struct made by pick_next_task().
 		 */
 		RCU_INIT_POINTER(rq->curr, next);
+
+		if (!task_current_selected(rq, next))
+			proxy_tag_curr(rq, next);
+
 		/*
 		 * The membarrier system call requires each architecture
 		 * to have a full memory barrier after updating
@@ -6820,6 +6841,10 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		/* Also unlocks the rq: */
 		rq = context_switch(rq, prev, next, &rf);
 	} else {
+		/* In case next was already curr but just got blocked_donor */
+		if (!task_current_selected(rq, next))
+			proxy_tag_curr(rq, next);
+
 		rq_unpin_lock(rq, &rf);
 		__balance_callbacks(rq);
 		raw_spin_rq_unlock_irq(rq);
-- 
2.43.0.472.g3155946c3a-goog
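
As a reader's aid, and not as part of the patch, the effect of the
dequeue/enqueue cycle in proxy_tag_curr() can be sketched with a toy
userspace model. It assumes, as a simplification, that the class logic
decides pushability only at enqueue time and from the rq state at that
moment; the toy_* names are hypothetical and do not correspond to
kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's struct task_struct or struct rq. */
struct toy_task {
	int id;
	bool on_pushable_list;
};

struct toy_rq {
	struct toy_task *selected; /* scheduling context */
	struct toy_task *curr;     /* execution context (rq->curr in the patch) */
};

/* Removing a task from the rq also removes it from the pushable list. */
static void toy_dequeue(struct toy_rq *rq, struct toy_task *p)
{
	(void)rq;
	p->on_pushable_list = false;
}

/* Assumption: pushability is evaluated only here, from the current rq state. */
static void toy_enqueue(struct toy_rq *rq, struct toy_task *p)
{
	p->on_pushable_list = (p != rq->curr && p != rq->selected);
}

/*
 * Same shape as proxy_tag_curr(): once rq->curr points at next, cycle
 * it through dequeue/enqueue so the enqueue-time decision is redone.
 */
static void toy_proxy_tag_curr(struct toy_rq *rq, struct toy_task *next)
{
	assert(rq->curr == next);
	toy_dequeue(rq, next);
	toy_enqueue(rq, next);
}
```

In this model a mutex owner that was enqueued before it became the proxy
keeps a stale on_pushable_list flag until toy_proxy_tag_curr() runs,
after which the flag is cleared.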