Date: Fri, 1 Mar 2024 17:16:33 -0800
From: "Paul E.
 McKenney"
To: Mark Rutland
Cc: Steven Rostedt, Ankur Arora, linux-kernel@vger.kernel.org,
	tglx@linutronix.de, peterz@infradead.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, willy@infradead.org, mgorman@suse.de,
	jpoimboe@kernel.org, jgross@suse.com, andrew.cooper3@citrix.com,
	bristot@kernel.org, mathieu.desnoyers@efficios.com,
	glaubitz@physik.fu-berlin.de, anton.ivanov@cambridgegreys.com,
	mattst88@gmail.com, krypton@ulrich-teichert.org,
	David.Laight@aculab.com, richard@nod.at, jon.grimm@amd.com,
	bharata@amd.com, boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling
Message-ID: <91437fa8-c192-4a71-8073-bdd9c3889926@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <7db5c057-8bd4-4209-8484-3a0f9f3cd02d@paulmck-laptop>
	<2b735ba4-8081-4ddb-9397-4fe83143d97f@paulmck-laptop>
	<20240221131901.69c80c47@gandalf.local.home>
	<8f30ecd8-629b-414e-b6ea-b526b265b592@paulmck-laptop>
	<20240221151157.042c3291@gandalf.local.home>
	<53020731-e9a9-4561-97db-8848c78172c7@paulmck-laptop>
	<1ec4dc29-8868-4d82-8c5e-c17ad025bc22@paulmck-laptop>
	<5641c2f4-3453-4b04-ab0d-db9e5b464b9c@paulmck-laptop>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5641c2f4-3453-4b04-ab0d-db9e5b464b9c@paulmck-laptop>

On Fri, Feb 23, 2024 at 07:31:50AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 23, 2024 at 11:05:45AM +0000, Mark Rutland wrote:
> > On Thu, Feb 22, 2024 at 11:11:34AM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 22, 2024 at 03:50:02PM +0000, Mark Rutland wrote:
> > > > On Wed, Feb 21, 2024 at 12:22:35PM -0800, Paul E. McKenney wrote:
> > > > > On Wed, Feb 21, 2024 at 03:11:57PM -0500, Steven Rostedt wrote:
> > > > > > On Wed, 21 Feb 2024 11:41:47 -0800
> > > > > > "Paul E. McKenney" wrote:
> > > > > >
> > > > > > > > I wonder if we can just see if the instruction pointer at preemption is at
> > > > > > > > something that was allocated? That is, if __is_kernel(addr) returns
> > > > > > > > false, then we need to do more work. Of course that means modules will also
> > > > > > > > trigger this. We could check __is_module_text() but that does a bit more
> > > > > > > > work and may cause too much overhead. But who knows, if the module check is
> > > > > > > > only done if the __is_kernel() check fails, maybe it's not that bad.
> > > > > > > >
> > > > > > > I do like very much that idea, but it requires that we be able to identify
> > > > > > > this instruction pointer perfectly, no matter what. It might also require
> > > > > > > that we be able to perfectly identify any IRQ return addresses as well,
> > > > > > > for example, if the preemption was triggered within an interrupt handler.
> > > > > > > And interrupts from softirq environments might require identifying an
> > > > > > > additional level of IRQ return address. The original IRQ might have
> > > > > > > interrupted a trampoline, and then after transitioning into softirq,
> > > > > > > another IRQ might also interrupt a trampoline, and this last IRQ handler
> > > > > > > might have instigated a preemption.
> > > > > >
> > > > > > Note, softirqs still require a real interrupt to happen in order to preempt
> > > > > > executing code. Otherwise it should never be running from a trampoline.
> > > > >
> > > > > Yes, the first interrupt interrupted a trampoline. Then, on return,
> > > > > that interrupt transitioned to softirq (as opposed to ksoftirqd).
> > > > > While a softirq handler was executing within a trampoline, we got
> > > > > another interrupt. We thus have two interrupted trampolines.
> > > > >
> > > > > Or am I missing something that prevents this?
> > > >
> > > > Surely the problematic case is where the first interrupt is taken from a
> > > > trampoline, but the inner interrupt is taken from not-a-trampoline? If the
> > > > innermost interrupt context is a trampoline, that's the same as that without
> > > > any nesting.
> > >
> > > It depends. If we wait for each task to not have a trampoline in effect
> > > then yes, we only need to know whether or not a given task has at least
> > > one trampoline in use. One concern with this approach is that a given
> > > task might have at least one trampoline in effect every time it is
> > > checked, unlikely though that might seem.
> > >
> > > If this is a problem, one way around it is to instead ask whether the
> > > current task still has a reference to one of a set of trampolines that
> > > has recently been removed. This avoids the problem of a task always
> > > being on some trampoline or another, but requires exact identification
> > > of any and all trampolines a given task is currently using.
> > >
> > > Either way, we need some way of determining whether or not a given
> > > PC value resides in a trampoline. This likely requires some data
> > > structure (hash table? tree? something else?) that must be traversed
> > > in order to carry out that determination. Depending on the traversal
> > > overhead, it might (or might not) be necessary to make sure that the
> > > traversal is not on the entry/exit/scheduler fast paths. It is also
> > > necessary to keep the trampoline-use overhead low and the trampoline
> > > call points small.
> >
> > Thanks; I hadn't thought about that shape of livelock problem; with that in
> > mind my suggestion using flags was inadequate.
> >
> > I'm definitely in favour of just using Tasks RCU! That's what arm64 does today,
> > anyhow!
>
> Full speed ahead, then!!! But if you come up with a nicer solution,
> please do not keep it a secret!
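
To make the quoted PC-lookup requirement a bit more concrete, here is a
minimal user-space sketch, not kernel code. It assumes that trampoline
address ranges can be kept in a sorted, non-overlapping table, does the
cheap core-text check first so that the table is searched only when that
check fails, and treats a task as being in a trampoline if any of its
nested interrupted PCs lands in one. Every name and address below
(struct text_range, core_text, tramps, pc_in_trampoline(),
task_in_trampoline(), MAX_NEST) is hypothetical and made up for
illustration.

```c
/* User-space sketch only; all names and addresses are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/* Stand-in for the core kernel text range (made-up addresses). */
static const struct text_range core_text = { 0x1000, 0x9000 };

/* Trampoline ranges, sorted by start address and non-overlapping. */
static const struct text_range tramps[] = {
	{ 0xa000, 0xa100 },
	{ 0xb000, 0xb080 },
	{ 0xc400, 0xc500 },
};
#define NR_TRAMPS (sizeof(tramps) / sizeof(tramps[0]))

static bool in_range(const struct text_range *r, uintptr_t pc)
{
	return pc >= r->start && pc < r->end;
}

/* Return true if pc lies within one of the trampoline ranges. */
static bool pc_in_trampoline(uintptr_t pc)
{
	size_t lo = 0, hi = NR_TRAMPS;

	/* Cheap check first: a PC in core text is not in a trampoline. */
	if (in_range(&core_text, pc))
		return false;

	/* Otherwise binary-search the sorted table. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pc < tramps[mid].start)
			hi = mid;
		else if (pc >= tramps[mid].end)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}

#define MAX_NEST 4	/* arbitrary bound on nested interrupted PCs */

/*
 * pcs[] holds the interrupted PC of each nesting level.  The task counts
 * as being in a trampoline if any level was interrupted within one, which
 * covers an outer trampoline hidden behind an inner non-trampoline PC.
 */
static bool task_in_trampoline(const uintptr_t *pcs, size_t depth)
{
	size_t i;

	assert(depth <= MAX_NEST);
	for (i = 0; i < depth; i++)
		if (pc_in_trampoline(pcs[i]))
			return true;
	return false;
}
```

With this table, a task whose outer interrupt came from 0xa010 and whose
inner interrupt came from 0x2000 is still reported as being in a
trampoline. A sorted array is only one of the candidate data structures
mentioned above, and the sketch says nothing about how such a table would
be updated safely or about whether the lookup is cheap enough for the
fast paths.
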

The networking NAPI code ends up needing special help to avoid starving
Tasks RCU grace periods [1]. I am therefore revisiting trying to make
Tasks RCU directly detect trampoline usage, but without quite as much
need to identify specific trampolines...

I am putting this information in a Google document for future
reference [2].

Thoughts?

							Thanx, Paul

[1] https://lore.kernel.org/all/Zd4DXTyCf17lcTfq@debian.debian/
[2] https://docs.google.com/document/d/1kZY6AX-AHRIyYQsvUX6WJxS1LsDK4JA2CHuBnpkrR_U/edit?usp=sharing