From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
	Frederic Weisbecker, "Paul E . McKenney"
Subject: [PATCH rcu 9/9] rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
Date: Tue, 4 Jun 2024 15:23:55 -0700
Message-Id: <20240604222355.2370768-9-paulmck@kernel.org>
In-Reply-To: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>
References: <657595c8-e86c-4594-a5b1-3c64a8275607@paulmck-laptop>

From: Frederic Weisbecker

When rcu_barrier() calls rcu_rdp_cpu_online() and observes that a CPU is
off rnp->qsmaskinitnext, it means that all accesses from the offline CPU
preceding CPUHP_TEARDOWN_CPU are visible to rcu_barrier(), including
callback expiration and counter updates.

However, interrupts can still fire after stop_machine() re-enables
interrupts and before rcutree_report_cpu_dead(). The related accesses
happening between CPUHP_TEARDOWN_CPU and the clearing of
rnp->qsmaskinitnext are _NOT_ guaranteed to be seen by rcu_barrier()
without proper ordering, especially when all remaining callbacks are
invoked in that window, which makes rcutree_migrate_callbacks() bypass
barrier_lock.

The following theoretical race example can make rcu_barrier() hang:

CPU 0                                               CPU 1
-----                                               -----
//cpu_down()
smpboot_park_threads()
//ksoftirqd is parked now
rcu_sched_clock_irq()
   invoke_rcu_core()
do_softirq()
   rcu_core()
      rcu_do_batch()
         // callback storm
         // rcu_do_batch() returns
         // before completing all
         // of them
   // do_softirq also returns early because of
   // timeout. It defers to ksoftirqd but
   // it's parked
stop_machine()
   take_cpu_down()
                                                    rcu_barrier()
                                                        spin_lock(barrier_lock)
                                                        // observes rcu_segcblist_n_cbs(&rdp->cblist) != 0
do_softirq()
   rcu_core()
      rcu_do_batch()
         //completes all pending callbacks
         //smp_mb() implied _after_ callback number dec

rcutree_report_cpu_dead()
   rnp->qsmaskinitnext &= ~rdp->grpmask;

rcutree_migrate_callbacks()
   // no callback, early return without locking
   // barrier_lock
                                                    //observes !rcu_rdp_cpu_online(rdp)
                                                    rcu_barrier_entrain()
                                                       rcu_segcblist_entrain()
                                                          // Observe rcu_segcblist_n_cbs(rsclp) == 0
                                                          // because no barrier between reading
                                                          // rnp->qsmaskinitnext and rsclp->len
                                                          rcu_segcblist_add_len()
                                                             smp_mb__before_atomic()
                                                             // will now observe the 0 count and empty
                                                             // list, but too late, we enqueue regardless
                                                             WRITE_ONCE(rsclp->len, rsclp->len + v);
                                                        // ignored barrier callback
                                                        // rcu barrier stall...

This could be solved with a read memory barrier, enforcing the message
passing between rnp->qsmaskinitnext and rsclp->len, matching the full
memory barrier after the rsclp->len addition in rcu_segcblist_add_len()
performed at the end of rcu_do_batch().

However, rcu_barrier() is complicated enough and probably doesn't need
too many more subtleties. CPU down is a slowpath and barrier_lock is
seldom contended. Solve the issue by unconditionally taking barrier_lock
in rcutree_migrate_callbacks(). This makes sure that either rcu_barrier()
sees the empty queue or its entrained callback will be migrated.

Signed-off-by: Frederic Weisbecker
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 408b020c9501f..c58fc10fb5969 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -5147,11 +5147,15 @@ void rcutree_migrate_callbacks(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	bool needwake;
 
-	if (rcu_rdp_is_offloaded(rdp) ||
-	    rcu_segcblist_empty(&rdp->cblist))
-		return;  /* No callbacks to migrate. */
+	if (rcu_rdp_is_offloaded(rdp))
+		return;
 
 	raw_spin_lock_irqsave(&rcu_state.barrier_lock, flags);
+	if (rcu_segcblist_empty(&rdp->cblist)) {
+		raw_spin_unlock_irqrestore(&rcu_state.barrier_lock, flags);
+		return;  /* No callbacks to migrate. */
+	}
+
 	WARN_ON_ONCE(rcu_rdp_cpu_online(rdp));
 	rcu_barrier_entrain(rdp);
 	my_rdp = this_cpu_ptr(&rcu_data);
-- 
2.40.1
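
For readers who want to see the "check emptiness under the lock" pattern outside the kernel, here is a minimal userspace sketch. It is not the kernel code: a pthread mutex stands in for rcu_state.barrier_lock, and struct cb, struct cb_queue, queue_init(), entrain() and migrate_callbacks() are hypothetical names made up for illustration. Because both sides take the same lock, an entrain either completes before the migration (and its callback is moved) or runs after it; the unlocked early return that let the two sides miss each other is gone.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct cb {
	struct cb *next;
};

struct cb_queue {
	struct cb *head;
	struct cb **tail;	/* points at the last ->next, or at head */
	long len;
};

/* Plays the role of rcu_state.barrier_lock in this sketch. */
static pthread_mutex_t barrier_lock = PTHREAD_MUTEX_INITIALIZER;

static void queue_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
}

/* "rcu_barrier()" side: enqueue one callback while holding the lock. */
static void entrain(struct cb_queue *q, struct cb *cb)
{
	assert(cb != NULL);
	pthread_mutex_lock(&barrier_lock);
	cb->next = NULL;
	*q->tail = cb;
	q->tail = &cb->next;
	q->len++;
	pthread_mutex_unlock(&barrier_lock);
}

/*
 * "CPU down" side: move everything from src to dst.  The emptiness
 * check happens only after the lock is taken, so it cannot race with
 * a concurrent entrain().  Returns the number of callbacks moved.
 */
static long migrate_callbacks(struct cb_queue *src, struct cb_queue *dst)
{
	long moved;

	pthread_mutex_lock(&barrier_lock);
	if (src->len == 0) {
		pthread_mutex_unlock(&barrier_lock);
		return 0;	/* No callbacks to migrate. */
	}
	*dst->tail = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	moved = src->len;
	queue_init(src);
	pthread_mutex_unlock(&barrier_lock);
	return moved;
}
```

The cost is one extra lock acquisition on the empty-queue path, which matches the reasoning in the changelog that CPU down is a slowpath.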