From: "Joel Fernandes (Google)"
To: Nishanth Aravamudan, Julien Desfossez, Peter Zijlstra, Tim Chen,
    Vineeth Pillai, Aaron Lu, Aubrey Li, tglx@linutronix.de,
    linux-kernel@vger.kernel.org
Cc: mingo@kernel.org, torvalds@linux-foundation.org, fweisbec@gmail.com,
    keescook@chromium.org, kerrnel@google.com, Phil Auld,
    Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini,
    joel@joelfernandes.org, vineeth@bitbyteword.org, Chen Yu,
    Christian Brauner, Agata Gruza, Antonio Gomez Iglesias,
    graf@amazon.com, konrad.wilk@oracle.com, dfaggioli@suse.com,
    pjt@google.com, rostedt@goodmis.org, derkling@google.com,
    benbjiang@tencent.com, Alexandre Chartre,
    James.Bottomley@hansenpartnership.com, OWeisse@umich.edu,
    Dhaval Giani, Junaid Shahid, jsbarnes@google.com,
    chris.hyser@oracle.com, Ben Segall, Josh Don, Hao Luo, Tom Lendacky,
    Aubrey Li, "Paul E. McKenney", Tim Chen
Subject: [PATCH -tip 10/32] sched: Fix priority inversion of cookied task with sibling
Date: Tue, 17 Nov 2020 18:19:40 -0500
Message-Id: <20201117232003.3580179-11-joel@joelfernandes.org>
X-Mailer: git-send-email 2.29.2.299.gdc1121823c-goog
In-Reply-To: <20201117232003.3580179-1-joel@joelfernandes.org>
References: <20201117232003.3580179-1-joel@joelfernandes.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra

The rationale is as follows. In the core-wide pick logic, even if
need_sync == false, we need to look at the other (non-local) CPUs to
see whether they could be running RT.

Say the RQs in a particular core look like this. Let CFS1 and CFS2 be
two tagged CFS tasks, and let RT1 be an untagged RT task.

	rq0			rq1
	CFS1 (tagged)		RT1 (not tagged)
	CFS2 (tagged)

Say schedule() runs on rq0. It enters the class loop in
pick_next_task(), and pick_task(RT) returns NULL for 'p'. It then
enters the if() block that handles a NULL pick, sees that
need_sync == false, and skips RT entirely.

The end result of the selection will be (say prio(CFS1) > prio(CFS2)):

	rq0	rq1
	CFS1	IDLE

When it should have selected:

	rq0	rq1
	IDLE	RT

Joel saw this issue in real-world use cases on ChromeOS, where an RT
task gets constantly force-idled and breaks RT.

Let's cure it.

NOTE: This problem will be fixed differently in a later patch. This
patch is kept only for reference about the issue, and to make applying
later patches easier.

Reported-by: Joel Fernandes (Google)
Signed-off-by: Peter Zijlstra
Signed-off-by: Joel Fernandes (Google)
---
 kernel/sched/core.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4ee4902c2cf5..53af817740c0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5195,6 +5195,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 	need_sync = !!rq->core->core_cookie;
 
 	/* reset state */
+reset:
 	rq->core->core_cookie = 0UL;
 	if (rq->core->core_forceidle) {
 		need_sync = true;
@@ -5242,14 +5243,8 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 				/*
 				 * If there weren't no cookies; we don't need to
 				 * bother with the other siblings.
-				 * If the rest of the core is not running a tagged
-				 * task, i.e.  need_sync == 0, and the current CPU
-				 * which called into the schedule() loop does not
-				 * have any tasks for this class, skip selecting for
-				 * other siblings since there's no point. We don't skip
-				 * for RT/DL because that could make CFS force-idle RT.
 				 */
-				if (i == cpu && !need_sync && class == &fair_sched_class)
+				if (i == cpu && !need_sync)
 					goto next_class;
 
 				continue;
@@ -5259,7 +5254,20 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 			 * Optimize the 'normal' case where there aren't any
 			 * cookies and we don't need to sync up.
 			 */
-			if (i == cpu && !need_sync && !p->core_cookie) {
+			if (i == cpu && !need_sync) {
+				if (p->core_cookie) {
+					/*
+					 * This optimization is only valid as
+					 * long as there are no cookies
+					 * involved. We may have skipped
+					 * non-empty higher priority classes on
+					 * siblings, which are empty on this
+					 * CPU, so start over.
+					 */
+					need_sync = true;
+					goto reset;
+				}
+
 				next = p;
 				goto done;
 			}
@@ -5299,7 +5307,6 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 					 */
 					need_sync = true;
 				}
-
 			}
 		}
 next_class:;
-- 
2.29.2.299.gdc1121823c-goog
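
For anyone who wants to poke at the rq0/rq1 scenario from the changelog
outside the kernel, here is a toy userspace model of it. This is only a
sketch and not the kernel's pick_next_task(): struct task, pick_core(),
example_rqs and the restart_on_cookie switch are all made-up names, each
runqueue holds at most one task per class (CFS2 is left out since it loses
to CFS1), and "compatible with the core" is reduced to "same cookie as the
first task picked".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { CLASS_RT, CLASS_CFS, NR_CLASSES };	/* highest priority first */
enum { NR_CPUS = 2 };				/* two SMT siblings */

struct task {
	const char *name;
	unsigned long cookie;			/* 0 means untagged */
};

static const struct task cfs1 = { "CFS1", 1UL };
static const struct task rt1 = { "RT1", 0UL };

/* rqs[cpu][class]: best runnable task of that class, NULL if none. */
static const struct task *const example_rqs[NR_CPUS][NR_CLASSES] = {
	{ NULL, &cfs1 },	/* rq0: CFS1 (tagged) */
	{ &rt1, NULL },		/* rq1: RT1 (untagged) */
};

/*
 * Toy core-wide pick, run from 'cpu'. out[i] == NULL means rq i idles.
 * With restart_on_cookie == false, a cookied local pick carries on with
 * the classes already skipped; with true, selection starts over with
 * need_sync set, as the patch does with "goto reset".
 */
static void pick_core(const struct task *const rqs[NR_CPUS][NR_CLASSES],
		      int cpu, bool restart_on_cookie,
		      const struct task *out[NR_CPUS])
{
	bool need_sync = false;
	const struct task *max;

	assert(cpu >= 0 && cpu < NR_CPUS);
reset:
	max = NULL;
	for (int i = 0; i < NR_CPUS; i++)
		out[i] = NULL;

	for (int cls = 0; cls < NR_CLASSES; cls++) {
		for (int k = 0; k < NR_CPUS; k++) {
			int i = (cpu + k) % NR_CPUS;	/* local CPU first */
			const struct task *p = rqs[i][cls];

			if (out[i])
				continue;
			/* Simplification: a pick must share max's cookie. */
			if (p && max && p->cookie != max->cookie)
				p = NULL;
			if (!p) {
				if (i == cpu && !need_sync)
					break;	/* skip siblings for this class */
				continue;
			}
			if (i == cpu && !need_sync) {
				if (!p->cookie) {
					/* Fast path: only out[cpu] is meaningful. */
					out[i] = p;
					return;
				}
				if (restart_on_cookie) {
					need_sync = true;
					goto reset;
				}
			}
			out[i] = p;
			if (!max)
				max = p;
			need_sync = true;
		}
	}
}
```

Calling pick_core(example_rqs, 0, false, out) leaves rq0 with CFS1 and rq1
idle, which is the inversion described in the changelog. With
restart_on_cookie set to true the second pass sees RT1 on rq1 first, so
rq0 idles and rq1 runs RT1.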