Date: Wed, 20 Dec 2023 16:57:11 -0800
From: "Paul E. McKenney" <paulmck@kernel.org>
McKenney" To: Frederic Weisbecker Cc: Joel Fernandes , LKML , Boqun Feng , Neeraj Upadhyay , Uladzislau Rezki , Zqiang , rcu , Thomas Gleixner , Peter Zijlstra Subject: Re: [PATCH 2/3] rcu: Defer RCU kthreads wakeup when CPU is dying Message-ID: <6b613378-e21a-426a-9989-46c3fb9c45a7@paulmck-laptop> Reply-To: paulmck@kernel.org References: <20231218231916.11719-1-frederic@kernel.org> <20231218231916.11719-3-frederic@kernel.org> <65811051.d40a0220.75c79.66cf@mx.google.com> <65825924.050a0220.222f1.dc9d@mx.google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: On Wed, Dec 20, 2023 at 04:50:41PM +0100, Frederic Weisbecker wrote: > Le Tue, Dec 19, 2023 at 10:01:55PM -0500, Joel Fernandes a ?crit : > > > (Though right now I'm missing the flush_smp_call_function_queue() call that flushes > > > the ttwu queue between sched_cpu_deactivate() and sched_cpu_wait_empty()) > > > > Possible. I saw your IRC message to Peter on that as well, thanks for > > following up. I need to find some time to look more into that, but that does > > sound concerning. > > Found it! It's smpcfd_dying_cpu(). > > > > But note this patch does something different, it doesn't defer the runqueue > > > enqueue like ttwu queue does. It defers the whole actual wakeup. This means that the > > > decision as to where to queue the task is delegated to an online CPU. So it's > > > not the same constraints. Waking up a task _from_ a CPU that is active or not but > > > at least online is supposed to be fine. > > > > Agreed, thanks for the clarifications. But along similar lines (and at the > > risk of oversimplifying), is it not possible to send an IPI to an online CPU > > to queue the hrtimer locally there if you detect that the current CPU is > > going down? In the other thread to Hilf, you mentioned the hrtimer infra has > > to have equal or earlier deadline, but you can just queue the hrtimer from > > the IPI handler and that should take care of it? > > This is something that Thomas wanted to avoid IIRC, because the IPI can make > it miss the deadline. But I guess in the case of an offline CPU, it can be a > last resort. > > > Let me know if I missed something which should make for some good holiday > > reading material. ;-) > > Let me summarize the possible fixes we can have: > > 1) It's RCU's fault! We must check and fix all the wake ups performed by RCU > from rcutree_report_cpu_dead(). But beware other possible wake-ups/timer > enqueue from the outgoing CPU after hrtimers are migrated. > > 2) It's scheduler's fault! do_start_rt_bandwidth() should check if the current > CPU is offline and place manually the timer to an online CPU (through an > IPI? yuck) > > 3) It's hrtimer's fault! If the current CPU is offline, it must arrange for > queueing to an online CPU. Not easy to do as we must find one whose next > expiry is below/equal the scheduler timer. As a last resort, this could be > force queued to any and then signalled through an IPI, even though it's > something we've tried to avoid until now. > > Also It's hard for me to think about another way to fix the deadlock fixed > by 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94. Hrtimers migration can't happen > after rcutree_report_cpu_dead(), because it may use RCU... > > None of the above look pretty anyway. Thoughts? 
Make one of the surviving CPUs grab any leftover timers from the outgoing
CPU, possibly checking periodically. Not pretty either, but three ugly
options deserve a fourth one!

							Thanx, Paul
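A minimal sketch of what that fourth option might look like, assuming a
hypothetical scavenge_offline_hrtimers() helper that does the actual pull
(presumably by reusing the existing hrtimer migration machinery); the
periodic work item and the one-second period are likewise illustrative:

#include <linux/cpumask.h>
#include <linux/workqueue.h>

/* Hypothetical: pull any hrtimers still queued on an offline CPU. */
void scavenge_offline_hrtimers(int cpu);

static void timer_scavenge_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(timer_scavenge_work, timer_scavenge_fn);

/*
 * Runs on a surviving CPU: look for CPUs that have gone offline, pull
 * any hrtimers still queued on them, then re-arm to check again in
 * roughly a second.
 */
static void timer_scavenge_fn(struct work_struct *work)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		if (!cpu_online(cpu))
			scavenge_offline_hrtimers(cpu);
	}
	schedule_delayed_work(&timer_scavenge_work, HZ);
}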