From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
    Mathieu Desnoyers, Neeraj Upadhyay, "Paul E. McKenney",
    Steven Rostedt, Uladzislau Rezki, rcu, Zqiang, "Liam R. Howlett",
    Peter Zijlstra
Subject: [PATCH 2/4] rcu/tasks: Handle new PF_IDLE semantics
Date: Fri, 27 Oct 2023 16:40:48 +0200
Message-Id: <20231027144050.110601-3-frederic@kernel.org>
In-Reply-To: <20231027144050.110601-1-frederic@kernel.org>
References: <20231027144050.110601-1-frederic@kernel.org>

The commit:

	cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")

has changed the semantics of what is to be considered an idle task in
such a way that CPU boot code preceding the actual idle loop is excluded
from it.

This has however introduced new potential RCU-tasks stalls when either:

1) A grace period is started before init/0 has had a chance to set
   PF_IDLE, keeping it stuck in the holdout list until the idle task
   eventually schedules.

2) A grace period is started when some possible CPUs have never been
   online, keeping their idle tasks stuck in the holdout list until the
   CPU boots up, if it ever does.

3) Similar to 1) but with secondary CPUs: a grace period is started
   concurrently with a secondary CPU booting, putting its idle task in
   the holdout list because PF_IDLE isn't yet observed on it. It then
   stays stuck in the holdout list until that CPU eventually schedules.
   The effect is mitigated here by the hotplug AP thread that must run
   to bring the CPU up.

Fix this by handling the new semantics of PF_IDLE, keeping in mind that
it may or may not be set on an idle task. Take advantage of that to
strengthen the coverage of an RCU-tasks quiescent state within an idle
task, excluding the CPU boot code from it. Only the code running within
the idle loop is now a quiescent state, along with offline CPUs.

Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
Suggested-by: Joel Fernandes
Suggested-by: Paul E. McKenney
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tasks.h | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index bf5f178fe723..a604f59aee0b 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -895,10 +895,36 @@ static void rcu_tasks_pregp_step(struct list_head *hop)
 	synchronize_rcu();
 }
 
+/* Check for quiescent states since the pregp's synchronize_rcu() */
+static bool rcu_tasks_is_holdout(struct task_struct *t)
+{
+	int cpu;
+
+	/* Has the task been seen voluntarily sleeping? */
+	if (!READ_ONCE(t->on_rq))
+		return false;
+
+	/*
+	 * Idle tasks (or idle injection) within the idle loop are RCU-tasks
+	 * quiescent states. But CPU boot code performed by the idle task
+	 * isn't a quiescent state.
+	 */
+	if (is_idle_task(t))
+		return false;
+
+	cpu = task_cpu(t);
+
+	/* Idle tasks on offline CPUs are RCU-tasks quiescent states. */
+	if (t == idle_task(cpu) && !rcu_cpu_online(cpu))
+		return false;
+
+	return true;
+}
+
 /* Per-task initial processing. */
 static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
 {
-	if (t != current && READ_ONCE(t->on_rq) && !is_idle_task(t)) {
+	if (t != current && rcu_tasks_is_holdout(t)) {
 		get_task_struct(t);
 		t->rcu_tasks_nvcsw = READ_ONCE(t->nvcsw);
 		WRITE_ONCE(t->rcu_tasks_holdout, true);
@@ -947,7 +973,7 @@ static void check_holdout_task(struct task_struct *t,
 
 	if (!READ_ONCE(t->rcu_tasks_holdout) ||
 	    t->rcu_tasks_nvcsw != READ_ONCE(t->nvcsw) ||
-	    !READ_ONCE(t->on_rq) ||
+	    !rcu_tasks_is_holdout(t) ||
 	    (IS_ENABLED(CONFIG_NO_HZ_FULL) &&
 	     !is_idle_task(t) && READ_ONCE(t->rcu_tasks_idle_cpu) >= 0)) {
 		WRITE_ONCE(t->rcu_tasks_holdout, false);
-- 
2.34.1
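
For anyone who wants to poke at the order of the checks outside the
kernel, here is a rough user-space sketch of the decision made by
rcu_tasks_is_holdout(). It is not kernel code: struct model_task and
model_is_holdout() are hypothetical names, and the four booleans are
assumed stand-ins for READ_ONCE(t->on_rq), is_idle_task(t),
t == idle_task(task_cpu(t)) and rcu_cpu_online(task_cpu(t)). It ignores
concurrency and memory ordering entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task state the patch looks at. */
struct model_task {
	bool on_rq;       /* models READ_ONCE(t->on_rq) */
	bool pf_idle;     /* models is_idle_task(t), i.e. PF_IDLE is set */
	bool is_cpu_idle; /* models t == idle_task(task_cpu(t)) */
	bool cpu_online;  /* models rcu_cpu_online(task_cpu(t)) */
};

/* Same order of checks as the patch; true means "keep as a holdout". */
static bool model_is_holdout(const struct model_task *t)
{
	assert(t != NULL);

	/* A task that is not on a runqueue was seen voluntarily sleeping. */
	if (!t->on_rq)
		return false;

	/* Inside the idle loop (PF_IDLE set): quiescent state. */
	if (t->pf_idle)
		return false;

	/* Idle task of an offline CPU: quiescent state, PF_IDLE or not. */
	if (t->is_cpu_idle && !t->cpu_online)
		return false;

	/* Everything else, including CPU boot code, is a holdout. */
	return true;
}
```

Under these assumptions, the idle task of a CPU that has never been
online (scenario 2) is no longer reported as a holdout, while an idle
task still running boot code on an online CPU (scenarios 1 and 3) is,
until it sets PF_IDLE on entering the idle loop.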