Date: Fri, 24 May 2024 16:05:24 +0200
From: Frederic Weisbecker
To: linux-kernel@vger.kernel.org
Cc: Frederic Weisbecker, "Paul E. McKenney", Boqun Feng, Joel Fernandes, Neeraj Upadhyay, Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH] rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation

Hi,

I have found a potential race, but I might have missed something on the
way that makes it actually impossible... Please double check.

Thanks.

---
Subject: [PATCH] rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation

When rcu_barrier() calls rcu_rdp_cpu_online() and observes a CPU off
rnp->qsmaskinitnext, it means that all accesses from the offline CPU
preceding the CPUHP_TEARDOWN_CPU are visible to RCU barrier, including
callback expiration and counter updates.

However, interrupts can still fire after stop_machine() re-enables
interrupts and before rcutree_report_cpu_dead(). The related accesses
happening between CPUHP_TEARDOWN_CPU and rnp->qsmaskinitnext clearing
are _NOT_ guaranteed to be seen by rcu_barrier() without proper
ordering, especially when all remaining callbacks are invoked there,
making rcutree_migrate_callbacks() bypass barrier_lock.

The following theoretical race example can make rcu_barrier() hang:

CPU 0                                           CPU 1
-----                                           -----
//cpu_down()
smpboot_park_threads()
//ksoftirqd is parked now
rcu_sched_clock_irq()
   invoke_rcu_core()
do_softirq()
   rcu_core()
      rcu_do_batch()
         // callback storm
         // rcu_do_batch() returns
         // before completing all
         // of them
// do_softirq also returns early because of
// timeout. It defers to ksoftirqd but
// it's parked
stop_machine()
   take_cpu_down()
                                                rcu_barrier()
                                                   spin_lock(barrier_lock)
                                                   // observes rcu_segcblist_n_cbs(&rdp->cblist) != 0
do_softirq()
   rcu_core()
      rcu_do_batch()
         //completes all pending callbacks
         //smp_mb() implied _after_ callback number dec
rcutree_report_cpu_dead()
   rnp->qsmaskinitnext &= ~rdp->grpmask;
rcutree_migrate_callbacks()
   // no callback, early return without locking
   // barrier_lock
                                                   //observes !rcu_rdp_cpu_online(rdp)
                                                   rcu_barrier_entrain()
                                                      rcu_segcblist_entrain()
                                                         // Observe rcu_segcblist_n_cbs(rsclp) != 0 (stale)
                                                         // because no barrier between reading
                                                         // rnp->qsmaskinitnext and rsclp->len
                                                         rcu_segcblist_add_len()
                                                            smp_mb__before_atomic()
                                                            // will now observe the 0 count and empty
                                                            // list, but too late, we enqueue regardless
                                                            WRITE_ONCE(rsclp->len, rsclp->len + v);
                                                   // ignored barrier callback
                                                   // rcu barrier stall...

This could be solved with a read memory barrier, enforcing the message
passing between rnp->qsmaskinitnext and rsclp->len, matching the full
memory barrier after rsclp->len addition in rcu_segcblist_add_len()
performed at the end of rcu_do_batch().

However, rcu_barrier() is complicated enough and probably doesn't need
too many more subtleties. CPU down is a slowpath and barrier_lock is
seldom contended. Solve the issue by unconditionally locking
barrier_lock in rcutree_migrate_callbacks(). This makes sure that
either rcu_barrier() sees the empty queue or its entrained callback
will be migrated.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 944e55085262..925e006b64f9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4736,11 +4736,15 @@ void rcutree_migrate_callbacks(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	bool needwake;
 
-	if (rcu_rdp_is_offloaded(rdp) ||
-	    rcu_segcblist_empty(&rdp->cblist))
-		return;  /* No callbacks to migrate. */
+	if (rcu_rdp_is_offloaded(rdp))
+		return;
 
 	raw_spin_lock_irqsave(&rcu_state.barrier_lock, flags);
+	if (rcu_segcblist_empty(&rdp->cblist)) {
+		raw_spin_unlock_irqrestore(&rcu_state.barrier_lock, flags);
+		return;  /* No callbacks to migrate. */
+	}
+
 	WARN_ON_ONCE(rcu_rdp_cpu_online(rdp));
 	rcu_barrier_entrain(rdp);
 	my_rdp = this_cpu_ptr(&rcu_data);
-- 
2.34.1
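
For readers who want to step through the interleaving outside the kernel, here is a minimal single-threaded sketch of the race and of the fix. It is a toy model under loud assumptions, not kernel code: struct toy, toy_trylock(), migrate_old(), migrate_new() and replay() are all hypothetical names, the lock is a plain flag, and the missing memory ordering on the rcu_barrier() side is modelled as an explicit stale snapshot of the callback count.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not kernel API. */
struct toy {
	bool lock_held;	/* stands in for rcu_state.barrier_lock */
	bool online;	/* stands in for the CPU's rnp->qsmaskinitnext bit */
	int dead_len;	/* stands in for rsclp->len of the outgoing CPU */
	int live_len;	/* callbacks queued on the surviving CPU */
};

static bool toy_trylock(struct toy *t)
{
	if (t->lock_held)
		return false;
	t->lock_held = true;
	return true;
}

static void toy_unlock(struct toy *t)
{
	t->lock_held = false;
}

/* Pre-patch shape: the emptiness test runs before the lock is taken. */
static bool migrate_old(struct toy *t)
{
	if (t->dead_len == 0)
		return true;	/* early return, lock bypassed */
	if (!toy_trylock(t))
		return false;	/* a real lock would spin; caller retries */
	t->live_len += t->dead_len;
	t->dead_len = 0;
	toy_unlock(t);
	return true;
}

/* Post-patch shape: take the lock first, then look at the queue. */
static bool migrate_new(struct toy *t)
{
	if (!toy_trylock(t))
		return false;	/* a real lock would spin; caller retries */
	t->live_len += t->dead_len;	/* adds zero when the queue is empty */
	t->dead_len = 0;
	toy_unlock(t);
	return true;
}

/* Replays the interleaving; returns callbacks stranded on the dead CPU. */
static int replay(bool (*migrate)(struct toy *))
{
	struct toy t = { .online = true, .dead_len = 1 };
	int stale_len;
	bool done;

	toy_trylock(&t);	/* barrier side takes the lock ... */
	stale_len = t.dead_len;	/* ... and observes a non-zero count */

	t.dead_len = 0;		/* outgoing CPU drains its queue */
	t.online = false;	/* then clears its online bit */
	done = migrate(&t);	/* migration attempted while lock is held */

	if (!t.online && stale_len != 0)
		t.dead_len++;	/* barrier callback entrained on stale count */
	toy_unlock(&t);		/* barrier side drops the lock */

	while (!done)
		done = migrate(&t);	/* a blocked migration now proceeds */

	assert(!t.lock_held);
	return t.dead_len;
}
```

In this model replay(migrate_old) returns 1: the barrier callback is left on the dead CPU's queue, which corresponds to the stall. replay(migrate_new) returns 0, because the migration has to wait for the lock and therefore sees and moves the entrained callback.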