Date: Tue, 21 Nov 2023 07:51:52 -0800
From: "Paul E. McKenney"
To: Mathieu Desnoyers
Cc: Peter Zijlstra, Steven Rostedt, Masami Hiramatsu, linux-kernel@vger.kernel.org, Michael Jeanson, Alexei Starovoitov, Yonghong Song, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, bpf@vger.kernel.org, Joel Fernandes
Subject: Re: [PATCH v4 1/5] tracing: Introduce faultable tracepoints
Reply-To: paulmck@kernel.org

On Tue, Nov 21, 2023 at 09:56:55AM -0500,
Mathieu Desnoyers wrote:
> On 2023-11-21 09:46, Peter Zijlstra wrote:
> > On Tue, Nov 21, 2023 at 09:40:24AM -0500, Mathieu Desnoyers wrote:
> > > On 2023-11-21 09:36, Peter Zijlstra wrote:
> > > > On Tue, Nov 21, 2023 at 09:06:18AM -0500, Mathieu Desnoyers wrote:
> > > > > Task trace RCU fits a niche that has the following set of requirements/tradeoffs:
> > > > >
> > > > > - Allow page faults within RCU read-side (like SRCU),
> > > > > - Has a low-overhead read lock-unlock (without the memory barrier overhead of SRCU),
> > > > > - The tradeoff: Has a rather slow synchronize_rcu(), but tracers should not care about
> > > > >   that. Hence, this is not meant to be a generic replacement for SRCU.
> > > > >
> > > > > Based on my reading of https://lwn.net/Articles/253651/ , preemptible RCU is not a good
> > > > > fit for the following reasons:
> > > > >
> > > > > - It disallows blocking within a RCU read-side on non-CONFIG_PREEMPT kernels,
> > > >
> > > > Your counter points are confused, we simply don't build preemptible RCU
> > > > unless PREEMPT=y, but that could surely be fixed and exposed as a
> > > > separate flavour.
> > > >
> > > > > - AFAIU the mmap_sem used within the page fault handler does not have priority inheritance.
> > > >
> > > > What's that got to do with anything?

Preemptible RCU allows blocking/preemption only in those cases where
priority inheritance applies.  As Mathieu says below, this prevents
indefinite postponement of a global grace period.  Such postponement
is especially problematic in kernels built with PREEMPT_RCU=y.  For but
one example, consider a situation where someone maps a file served by
NFS.  We can debate the wisdom of creating such a map, but having the
kernel OOM would be a completely unacceptable "error message".

> > > > Still utterly confused about what task-tracing rcu is and how it is
> > > > different from preemptible rcu.

Tasks Trace RCU allows general blocking, which it tolerates by stringent
restrictions on what exactly it is used for (tracing in cases where the
memory to be included in the tracing might page fault).  Preemptible RCU
does not permit general blocking.

Tasks Trace RCU is a very specialized tool for a very specific use case.

> > > In addition to taking the mmap_sem, the page fault handler need to block
> > > until its requested pages are faulted in, which may depend on disk I/O.
> > > Is it acceptable to wait for I/O while holding preemptible RCU read-side?
> >
> > I don't know, preemptible rcu already needs to track task state anyway,
> > it needs to ensure all tasks have passed through a safe spot etc.. vs regular
> > RCU which only needs to ensure all CPUs have passed through schedule().
> >
> > Why is this such a hard question?

It is not a hard question.  Nor is the answer, which is that preemptible
RCU is not a good fit for this task for all the reasons that Mathieu has
laid out.

> Personally what I am looking for is a clear documentation of preemptible rcu
> with respect to whether it is possible to block on I/O (take a page fault,
> call schedule() explicitly) from within a preemptible rcu critical section.
> I guess this is a hard question because there is no clear statement to that
> effect in the kernel documentation.
>
> If it is allowed (which I doubt), then I wonder about the effect of those
> long readers on grace period delays. Things like expedited grace periods may
> suffer.
>
> Based on Documentation/RCU/rcu.rst:
>
>   Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the
>   same effect, but require that the readers manipulate CPU-local
>   counters.  These counters allow limited types of blocking within
>   RCU read-side critical sections.  SRCU also uses CPU-local
>   counters, and permits general blocking within RCU read-side
>   critical sections.  These variants of RCU detect grace periods
>   by sampling these counters.
>
> Then we just have to find a definition of "limited types of blocking"
> vs "general blocking".

The key point is that you are not allowed to place any source code in a
preemptible RCU reader that would not be legal in a non-preemptible
RCU reader.  The rationale again is that the cases in which preemptible
RCU readers call schedule() are cases to which priority boosting applies.

It is quite possible that the documentation needs upgrading.  ;-)

							Thanx, Paul
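
The "limited types of blocking" versus "general blocking" distinction
discussed above can be tabulated in a small userspace sketch.  This is
only a model of the rules as stated in this thread, not kernel code:
the enum names, reader_may() and check_thread_examples() are all
hypothetical, and the table classifies run-time events rather than
source code.  It also assumes that a PREEMPT_RCU=n kernel never
preempts a reader and that a lock wait covered by priority inheritance
only sleeps in preemptible configurations.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only, NOT kernel code.  Every name here is
 * hypothetical and exists just to tabulate the rules from the thread.
 */
enum rcu_flavor {
	FLAVOR_RCU_NONPREEMPT,	/* built with CONFIG_PREEMPT_RCU=n */
	FLAVOR_RCU_PREEMPT,	/* built with CONFIG_PREEMPT_RCU=y */
	FLAVOR_SRCU,
	FLAVOR_TASKS_TRACE,
};

enum reader_event {
	EV_PREEMPTION,		/* reader is involuntarily preempted */
	EV_PI_LOCK_WAIT,	/* sleep covered by priority inheritance */
	EV_PAGE_FAULT_IO,	/* page fault that may wait on disk/NFS I/O */
	EV_EXPLICIT_SCHEDULE,	/* reader voluntarily calls schedule() */
};

/* Return true if a reader of this flavor may experience this event. */
bool reader_may(enum rcu_flavor flavor, enum reader_event ev)
{
	switch (flavor) {
	case FLAVOR_RCU_NONPREEMPT:
		/* Assumption: readers never leave the CPU in this model. */
		return false;
	case FLAVOR_RCU_PREEMPT:
		/* "Limited" blocking: only where priority boosting applies. */
		return ev == EV_PREEMPTION || ev == EV_PI_LOCK_WAIT;
	case FLAVOR_SRCU:
	case FLAVOR_TASKS_TRACE:
		/* "General" blocking is permitted. */
		return true;
	}
	return false;
}

/* The NFS-backed mapping example from the thread, as assertions. */
void check_thread_examples(void)
{
	/* A fault waiting on I/O must not sit in a preemptible RCU reader. */
	assert(!reader_may(FLAVOR_RCU_PREEMPT, EV_PAGE_FAULT_IO));
	/* The same fault is tolerated by a Tasks Trace RCU reader. */
	assert(reader_may(FLAVOR_TASKS_TRACE, EV_PAGE_FAULT_IO));
}
```

Calling check_thread_examples() from a test program exercises the two
cases that motivate faultable tracepoints.  The table deliberately says
nothing about grace-period latency, which is the price SRCU and Tasks
Trace RCU pay for permitting general blocking.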