Date: Fri, 11 Mar 2022 09:28:44 -0800
From: "Paul E. McKenney"
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki, Joel Fernandes
Subject: Re: [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states
Message-ID: <20220311172844.GJ4285@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20220302154810.42308-1-frederic@kernel.org>
	<20220302154810.42308-19-frederic@kernel.org>
	<20220310203222.GC4285@paulmck-ThinkPad-P17-Gen-1>
	<20220311163525.GF227945@lothringen>
In-Reply-To: <20220311163525.GF227945@lothringen>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 11, 2022 at 05:35:25PM +0100, Frederic Weisbecker wrote:
> On Thu, Mar 10, 2022 at 12:32:22PM -0800, Paul E.
> McKenney wrote:
> > On Wed, Mar 02, 2022 at 04:48:09PM +0100, Frederic Weisbecker wrote:
> > > Updating the context tracking state and the RCU dynticks counter
> > > atomically in a single operation is a first step towards improving CPU
> > > isolation. This makes the context tracking state updates fully ordered
> > > and therefore allow for later enhancements such as postponing some work
> > > while a task is running isolated in userspace until it ever comes back
> > > to the kernel.
> > >
> > > The state field becomes divided in two parts:
> > >
> > > 1) Lower bits for context tracking state:
> > >
> > > 	CONTEXT_IDLE = 1,
> > > 	CONTEXT_USER = 2,
> > > 	CONTEXT_GUEST = 4,
> >
> > And the CONTEXT_DISABLED value of -1 works because you can have only
> > one of the above three bits set at a time?
> >
> > Except that RCU needs this to unconditionally at least distinguish
> > between kernel and idle, given the prevalence of CONFIG_NO_HZ_IDLE=y.
> > So does the CONTEXT_DISABLED really happen anymore?
> >
> > A few more questions interspersed below.
>
> The value of CONTEXT_DISABLED is never stored in the ct->state. It is just
> returned as is when CONTEXT_TRACKING is disabled. So this shouldn't conflict
> with RCU.

Whew!  ;-)

> > > @@ -452,15 +453,16 @@ void noinstr __ct_user_exit(enum ctx_state state)
> > >  			 * Exit RCU idle mode while entering the kernel because it can
> > >  			 * run a RCU read side critical section anytime.
> > >  			 */
> > > -			rcu_eqs_exit(true);
> > > +			ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
> > >  			if (state == CONTEXT_USER) {
> > >  				instrumentation_begin();
> > >  				vtime_user_exit(current);
> > >  				trace_user_exit(0);
> > >  				instrumentation_end();
> > >  			}
> > > +		} else {
> > > +			atomic_sub(state, &ct->state);
> >
> > OK, atomic_sub() got my attention.  What is going on here?  ;-)
>
> Right :-)
>
> So that's when context tracking user is running but RCU doesn't
> track user. This is for example when NO_HZ_FULL=n but VIRT_CPU_ACCOUNTING_GEN=y.
>
> I might remove that standalone VIRT_CPU_ACCOUNTING_GEN=y one day but for now
> it's there.
>
> Anyway so in this case we only want to track KERNEL <-> USER from context
> tracking POV, but we don't need the RCU_DYNTICKS_IDX part, hence the spared
> ordering.
>
> But it still needs to be atomic because NMIs may increase RCU_DYNTICKS_IDX on
> the same field.

OK, so the idea is because NO_HZ_FULL=n, RCU doesn't care about user
space execution?

How about looking at it the other way?  Is there some reason that RCU
shouldn't take advantage of the userspace-execution information when
it exists?

For example, in the NO_HZ_FULL=n but VIRT_CPU_ACCOUNTING_GEN=y case,
is there some chance that RCU would be ignoring a non-noinstr function?

> > > @@ -548,7 +550,7 @@ EXPORT_SYMBOL_GPL(context_tracking);
> > >  void ct_idle_enter(void)
> > >  {
> > >  	lockdep_assert_irqs_disabled();
> > > -	rcu_eqs_enter(false);
> > > +	ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
> > >  }
> > >  EXPORT_SYMBOL_GPL(ct_idle_enter);
> > >
> > > @@ -566,7 +568,7 @@ void ct_idle_exit(void)
> > >  	unsigned long flags;
> > >
> > >  	local_irq_save(flags);
> > > -	rcu_eqs_exit(false);
> > > +	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
> >
> > Nice!  This works because all transitions must be either from or
> > to kernel context, correct?
>
> Exactly. There is no such thing as IDLE -> USER -> GUEST, etc...
> There has to be KERNEL in the middle of each. Because we never
> call rcu_idle_enter() -> rcu_user_enter() for example. There has to be
> rcu_idle_exit() in the middle.
>
> (famous last words).

Works for me, for the moment, anyway.  ;-)

> > >  /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
> > > @@ -321,8 +321,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
> > >  	int snap;
> > >
> > >  	// If not quiescent, force back to earlier extended quiescent state.
> > > -	snap = ct_dynticks_cpu(cpu) & ~0x1;
> > > -
> > > +	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
> >
> > Do we also need to get rid of the low-order bits?  Or is that happening
> > elsewhere?  Or is there some reason that they can stick around?
>
> Yep, ct_dynticks_cpu() clears the low order CONTEXT_* bits.

Whew!  ;-)

> > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > index 9bf5cc79d5eb..1ac48c804006 100644
> > > --- a/kernel/rcu/tree_stall.h
> > > +++ b/kernel/rcu/tree_stall.h
> > > @@ -459,7 +459,7 @@ static void print_cpu_stall_info(int cpu)
> > >  			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
> > >  				"!."[!delta],
> > >  	       ticks_value, ticks_title,
> > > -	       rcu_dynticks_snap(cpu) & 0xfff,
> > > +	       (rcu_dynticks_snap(cpu) >> RCU_DYNTICKS_SHIFT) & 0xfff ,
> >
> > Actually, the low-order several bits are useful when debugging, so
> > could you please not shift them away?  Maybe also go to 0xffff to allow
> > for more bits taken?
>
> Yeah that makes sense, I'll change that.
>
> Thanks a lot for the reviews!

Thank you for the series!

							Thanx, Paul
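
For readers following the thread, here is a minimal userspace sketch of the
single-field scheme discussed above: context bits in the low-order part, the
dynticks counter above them, and each transition done as one atomic add of
RCU_DYNTICKS_IDX plus or minus the context value. It is not the kernel code.
The shift of 3, CONTEXT_KERNEL = 0, CONTEXT_MASK, and every helper name ending
in _sketch are assumptions made for illustration, and C11 atomics stand in for
the kernel's atomic_t operations.

```c
#include <assert.h>
#include <stdatomic.h>

/* Context values as listed in the quoted patch description. */
enum ctx_state {
	CONTEXT_KERNEL = 0,	/* assumption: kernel is the all-zero state */
	CONTEXT_IDLE   = 1,
	CONTEXT_USER   = 2,
	CONTEXT_GUEST  = 4,
};

/* Assumption: the counter starts just above the three context bits. */
#define RCU_DYNTICKS_SHIFT	3
#define RCU_DYNTICKS_IDX	(1 << RCU_DYNTICKS_SHIFT)
#define CONTEXT_MASK		(RCU_DYNTICKS_IDX - 1)	/* hypothetical name */

/* Stand-in for ct->state; zero means kernel context, counter at 0. */
static atomic_int ct_state;

/* Leave kernel context: bump the counter and set the context bits in one add. */
static inline int ct_kernel_exit_sketch(enum ctx_state state)
{
	int offset = RCU_DYNTICKS_IDX + (int)state;

	return atomic_fetch_add(&ct_state, offset) + offset;
}

/* Return to kernel context: bump the counter and clear the context bits. */
static inline int ct_kernel_enter_sketch(enum ctx_state state)
{
	int offset = RCU_DYNTICKS_IDX - (int)state;

	return atomic_fetch_add(&ct_state, offset) + offset;
}

/* Split a snapshot of the field into its two parts. */
static inline int ct_context_sketch(int v)  { return v & CONTEXT_MASK; }
static inline int ct_dynticks_sketch(int v) { return v >> RCU_DYNTICKS_SHIFT; }
```

Because every transition has kernel context on one side, the context bits
are always zero before an exit and equal to the given state before an enter,
so the add never carries into or borrows from the counter bits. Starting
from zero, ct_kernel_exit_sketch(CONTEXT_IDLE) leaves context 1 and counter 1,
and the matching ct_kernel_enter_sketch(CONTEXT_IDLE) leaves context 0 and
counter 2.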