Date: Mon, 11 Mar 2024 12:26:36 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@kernel.org
To: Ankur Arora
Cc: Joel Fernandes, linux-kernel@vger.kernel.org, tglx@linutronix.de,
    peterz@infradead.org, torvalds@linux-foundation.org,
    akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
    dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
    juri.lelli@redhat.com, vincent.guittot@linaro.org, willy@infradead.org,
    mgorman@suse.de, jpoimboe@kernel.org, mark.rutland@arm.com,
    jgross@suse.com, andrew.cooper3@citrix.com, bristot@kernel.org,
    mathieu.desnoyers@efficios.com, geert@linux-m68k.org,
    glaubitz@physik.fu-berlin.de, anton.ivanov@cambridgegreys.com,
    mattst88@gmail.com, krypton@ulrich-teichert.org, rostedt@goodmis.org,
    David.Laight@aculab.com, richard@nod.at, mjguzik@gmail.com,
    jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
    boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH 26/30] sched: handle preempt=voluntary under PREEMPT_AUTO
Message-ID: <36eef8c5-8ecd-4c90-8851-1c2ab342e2bb@paulmck-laptop>
References: <20240213055554.1802415-27-ankur.a.arora@oracle.com>
    <65e3cd87.050a0220.bc052.7a29@mx.google.com>
    <87frx514jz.fsf@oracle.com>
    <12a20651-5429-43df-88d7-9d01ff6212c6@joelfernandes.org>
    <63380f0a-329c-43df-8e6c-4818de5eb371@paulmck-laptop>
    <6054a8e0-eb95-45a3-9901-fe2a31b6fe4e@paulmck-laptop>
    <87plw5pd2x.fsf@oracle.com>
    <87wmq9mkx2.fsf@oracle.com>
In-Reply-To: <87wmq9mkx2.fsf@oracle.com>

On Sun, Mar 10, 2024 at 09:50:33PM -0700, Ankur Arora wrote:
>
> Paul E. McKenney writes:
>
> > On Thu, Mar 07, 2024 at 08:22:30PM -0800, Ankur Arora wrote:
> >>
> >> Paul E. McKenney writes:
> >>
> >> > On Thu, Mar 07, 2024 at 07:15:35PM -0500, Joel Fernandes wrote:
> >> >>
> >> >>
> >> >> On 3/7/2024 2:01 PM, Paul E. McKenney wrote:
> >> >> > On Wed, Mar 06, 2024 at 03:42:10PM -0500, Joel Fernandes wrote:
> >> >> >> Hi Ankur,
> >> >> >>
> >> >> >> On 3/5/2024 3:11 AM, Ankur Arora wrote:
> >> >> >>>
> >> >> >>> Joel Fernandes writes:
> >> >> >>>
> >> >> >> [..]
> >> >> >>>> IMO, just kill 'voluntary' if PREEMPT_AUTO is enabled. There is no
> >> >> >>>> 'voluntary' business because
> >> >> >>>> 1. The behavior vs =none is to allow higher scheduling class to preempt, it
> >> >> >>>> is not about the old voluntary.
> >> >> >>>
> >> >> >>> What do you think about folding the higher scheduling class preemption logic
> >> >> >>> into preempt=none? As Juri pointed out, prioritization of at least the leftmost
> >> >> >>> deadline task needs to be done for correctness.
> >> >> >>>
> >> >> >>> (That'll get rid of the current preempt=voluntary model, at least until
> >> >> >>> there's a separate use for it.)
> >> >> >>
> >> >> >> Yes I am all in support for that. It's less confusing for the user as well, and
> >> >> >> scheduling higher priority class at the next tick for preempt=none sounds good
> >> >> >> to me. That is still an improvement for folks using SCHED_DEADLINE for whatever
> >> >> >> reason, with a vanilla CONFIG_PREEMPT_NONE=y kernel. :-P. If we want a new mode
> >> >> >> that is more aggressive, it could be added in the future.
> >> >> >
> >> >> > This would be something that happens only after removing cond_resched()
> >> >> > might_sleep() functionality from might_sleep(), correct?
> >> >>
> >> >> Firstly, maybe I misunderstood Ankur completely. Re-reading his comments above,
> >> >> he seems to be suggesting preempting instantly for higher scheduling CLASSES
> >> >> even for preempt=none mode, without having to wait till the next
> >> >> scheduling-clock interrupt. Not sure if that makes sense to me, I was asking not
> >> >> to treat "higher class" any differently than "higher priority" for preempt=none.
> >> >>
> >> >> And if SCHED_DEADLINE has a problem with that, then it already happens so with
> >> >> CONFIG_PREEMPT_NONE=y kernels, so no need for special treatment for higher class any
> >> >> more than the treatment given to higher priority within same class. Ankur/Juri?
> >> >>
> >> >> Re: cond_resched(), I did not follow you Paul, why does removing the proposed
> >> >> preempt=voluntary mode (i.e. dropping this patch) have to happen only after
> >> >> cond_resched()/might_sleep() modifications?
> >> >
> >> > Because right now, one large difference between CONFIG_PREEMPT_NONE
> >> > and CONFIG_PREEMPT_VOLUNTARY is that for the latter might_sleep() is a
> >> > preemption point, but not for the former.
> >>
> >> True. But, there is no difference between either of those with
> >> PREEMPT_AUTO=y (at least right now).
> >>
> >> For (PREEMPT_AUTO=y, PREEMPT_VOLUNTARY=y, DEBUG_ATOMIC_SLEEP=y),
> >> might_sleep() is:
> >>
> >> # define might_resched() do { } while (0)
> >> # define might_sleep() \
> >>         do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
> >>
> >> And, cond_resched() for (PREEMPT_AUTO=y, PREEMPT_VOLUNTARY=y,
> >> DEBUG_ATOMIC_SLEEP=y):
> >>
> >> static inline int _cond_resched(void)
> >> {
> >>         klp_sched_try_switch();
> >>         return 0;
> >> }
> >> #define cond_resched() ({ \
> >>         __might_resched(__FILE__, __LINE__, 0); \
> >>         _cond_resched(); \
> >> })
> >>
> >> And, no change for (PREEMPT_AUTO=y, PREEMPT_NONE=y, DEBUG_ATOMIC_SLEEP=y).
> >
> > As long as it is easy to restore the prior cond_resched() functionality
> > for testing in the meantime, I should be OK. For example, it would
> > be great to have the commit removing the old functionality from
> > cond_resched() at the end of the series,
>
> I would, of course, be happy to make any changes that help testing,
> but I think I'm missing something that you are saying wrt
> cond_resched()/might_sleep().
>
> There's no commit explicitly removing the core cond_resched()
> functionality: PREEMPT_AUTO explicitly selects PREEMPT_BUILD and selects
> out PREEMPTION_{NONE,VOLUNTARY}_BUILD.
> (That's patch-1 "preempt: introduce CONFIG_PREEMPT_AUTO".)
>
> For the rest it just piggybacks on the CONFIG_PREEMPT_DYNAMIC work
> and just piggybacks on (!CONFIG_PREEMPT_DYNAMIC && CONFIG_PREEMPTION):
>
> #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
> /* ... */
> #if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
> /* ... */
> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
> /* ... */
> #else /* !CONFIG_PREEMPTION */
> /* ... */
> #endif /* PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
>
> #else /* CONFIG_PREEMPTION && !CONFIG_PREEMPT_DYNAMIC */
> static inline int _cond_resched(void)
> {
>         klp_sched_try_switch();
>         return 0;
> }
> #endif /* !CONFIG_PREEMPTION || CONFIG_PREEMPT_DYNAMIC */
>
> Same for might_sleep() (which really amounts to might_resched()):
>
> #ifdef CONFIG_PREEMPT_VOLUNTARY_BUILD
> /* ... */
> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
> /* ... */
> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
> /* ... */
> #else
> # define might_resched() do { } while (0)
> #endif /* CONFIG_PREEMPT_* */
>
> But, I doubt that I'm telling you anything new. So, what am I missing?

It is really a choice at your end.

Suppose we enable CONFIG_PREEMPT_AUTO on our fleet, and find that there
was some small set of cond_resched() calls that provided sub-jiffy
preemption that matter to some of our workloads. At that point, what
are our options?

1.      Revert CONFIG_PREEMPT_AUTO.

2.      Revert only the part that disables the voluntary preemption
        semantics of cond_resched(). Which, as you point out, ends up
        being the same as #1 above.

3.      Hotwire a voluntary preemption into the required locations.
        Which we would avoid doing due to upstream-acceptance concerns.

So, how easy would you like to make it for us to use as much of
CONFIG_PREEMPT_AUTO=y as possible under various possible problem
scenarios?

Yes, in a perfect world, we would have tested this already, but I am
still chasing down problems induced by simple rcutorture testing.
Cowardly of us, isn't it? ;-)

                                                        Thanx, Paul
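
For readers following the thread, here is a minimal userspace sketch of the
behavior being debated. It is not kernel code. The SKETCH_* macros and
sketch_* functions are hypothetical names made up for illustration, the
elided "/* ... */" branches quoted above are not reproduced, and the tick
handling is a deliberate simplification. With the defaults, which are meant
to model PREEMPT_AUTO=y, both sketch_cond_resched() and
sketch_might_resched() do nothing, so a pending reschedule waits for the
modeled scheduling-clock tick.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for CONFIG_PREEMPTION and
 * CONFIG_PREEMPT_VOLUNTARY_BUILD. These are not real Kconfig symbols.
 * The defaults model PREEMPT_AUTO=y as described in the thread.
 */
#ifndef SKETCH_PREEMPTION
#define SKETCH_PREEMPTION 1
#endif
#ifndef SKETCH_PREEMPT_VOLUNTARY_BUILD
#define SKETCH_PREEMPT_VOLUNTARY_BUILD 0
#endif

static bool sketch_need_resched;  /* models a pending reschedule request */
static int sketch_schedule_calls; /* counts modeled context switches */

/* Models a real preemption point: switch only if a reschedule is pending. */
static int sketch_do_resched(void)
{
	if (!sketch_need_resched)
		return 0;
	sketch_need_resched = false;
	sketch_schedule_calls++;
	return 1;
}

/* Simplified model of the scheduling-clock tick acting on a pending request. */
static int sketch_tick(void)
{
	return sketch_do_resched();
}

#if !SKETCH_PREEMPTION
/* Models a non-preemptible build, where cond_resched() can reschedule. */
static int sketch_cond_resched(void) { return sketch_do_resched(); }
#else
/* Models the quoted stub for CONFIG_PREEMPTION && !CONFIG_PREEMPT_DYNAMIC. */
static int sketch_cond_resched(void) { return 0; }
#endif

#if SKETCH_PREEMPT_VOLUNTARY_BUILD
/* Models voluntary preemption, where might_sleep() is a preemption point. */
#define sketch_might_resched() ((void)sketch_do_resched())
#else
#define sketch_might_resched() do { } while (0)
#endif

#ifdef SKETCH_DEMO_MAIN
int main(void)
{
	sketch_need_resched = true;
	sketch_might_resched();
	printf("cond_resched() rescheduled: %d\n", sketch_cond_resched());
	printf("tick rescheduled: %d\n", sketch_tick());
	printf("total switches: %d\n", sketch_schedule_calls);
	return 0;
}
#endif
```

Building with -DSKETCH_DEMO_MAIN gives a small demo program. Adding
-DSKETCH_PREEMPTION=0 models the old CONFIG_PREEMPT_NONE=y behavior, in
which the cond_resched() call itself performs the switch. That difference
is the sub-jiffy preemption that options 1 through 3 above are about.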