Received: by 2002:a5d:9c59:0:0:0:0:0 with SMTP id 25csp2599268iof; Wed, 8 Jun 2022 08:14:35 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzeNcd89K8ILHelNVRmxWbLu7Z3RXs8KiLBP9qb5gtAMHOfEMAB6+WZGfa3tkvw5bGQJ93n X-Received: by 2002:a17:90a:ca13:b0:1e2:fcf3:c7a6 with SMTP id x19-20020a17090aca1300b001e2fcf3c7a6mr54909227pjt.186.1654701275173; Wed, 08 Jun 2022 08:14:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654701275; cv=none; d=google.com; s=arc-20160816; b=KC0n2gHTKgXYKrTX/sJ7JYcljSTE8hz1PfcmwzNIAe3g/CPNWdY1Gxos1FSsie111t U3qHvkO8fFVnIRVi45WUyYkAR3Tftl19KyT82UTgkeghmAGSmCKFfeResWeG/uYOFODY fdQRcJscyihzoUwUcAiudeTRP7n3jqjxqZWe+nnPCu3wXWA1BnWbX7CsrncwlZlumuUO zmsbzaw9KC9hAhB3BCY+5UjezrVvHKsOhtWUCFmnAAMULjQGHT7hc65frXtmkqGlTfjK yokbexhyRjkU2IbhSeXkucxqg6EW4/4KmsBpaAkPqCgrdFLBTZNT/VFcYQlI/Lujv1GC APcw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=qoInS3ndh8OQXVSs25yRERedsyHw0RJy70EMrOyyu44=; b=WqORfOD3Fn/LXBJEPwKnf59NKywQzg2n/GnZjD6P1vGqvV5oL8nup+ME08hojMOYyc fkPAxqANp2XF/XbhMYHlf+Q4E1dhxy8N1C/LOc6oqnZwDIV4HPBuTuhEJwBycKLV7fAJ jOxz4xYHU7d+A00iq8AxXkxeAM6PH9c9rlYCtzimGSXVxGiU2hw3P8JZDtCpEvkffN// lE0nYRYpgEw1t/LSH6eVyzcZFbG8uFguOmzD7mDPXPZtMs0JyNB0y86cP522dKF/Bx0h Z/mOOSEq1ju1mDwLG8VbQbtt/YPfVQ4lkOgccOoJeW87tY7/CHr3iQRamAcgAzkofPjA 5Q6w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="O+/Ps8sx"; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id d9-20020a17090a628900b001d4e0d81848si29685396pjj.152.2022.06.08.08.14.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 08 Jun 2022 08:14:35 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="O+/Ps8sx"; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 53CC61D871B; Wed, 8 Jun 2022 07:44:25 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241697AbiFHOmv (ORCPT + 99 others); Wed, 8 Jun 2022 10:42:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242070AbiFHOmS (ORCPT ); Wed, 8 Jun 2022 10:42:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 47AFB2A24B for ; Wed, 8 Jun 2022 07:41:55 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9A1E961BB9 for ; Wed, 8 Jun 2022 14:41:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CD6B0C3411E; Wed, 8 Jun 2022 14:41:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1654699314; bh=HWCFka5weLXGq0bS4SsHRIPkvDGK8ohd+BW3CLeixpI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=O+/Ps8sxZ/sDoNZ+ZbALC8iZZtYhkuEigRzILqm9c5hbT4VhQVGDe7gbn+Btq++CQ Kg88zvIGNepwtqbonOYpRS+1lIV6/Ew7Ed/GxcR43wZSrw5me3JDJEoxYhTOk5aLBu EvQsCmnPSn+5inaixK4ECvAu/MNTR5TfJT5no6PvFp2a4FrUTGm80gh+WBdm3pq01G gyU+RL3zNflRoEh/QKSCj+9Q3SXZpCJpaodP4UJx5GuyI6a+VUwuCHGd8gWZXBpI48 RC8yyI1EiHjOwubUo4rUHWwrXq+ymOF36W361HYgYZeU5jK7be4VzAObSFuv7NNQUi FL1adr9w/RHiA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Peter Zijlstra , Phil Auld , Alex Belits , Nicolas Saenz Julienne , Xiongfeng Wang , Neeraj Upadhyay , Thomas Gleixner , Yu Liao , Boqun Feng , "Paul E . McKenney" , Marcelo Tosatti , Paul Gortmaker , Uladzislau Rezki , Joel Fernandes Subject: [PATCH 19/20] rcu/context_tracking: Merge dynticks counter and context tracking states Date: Wed, 8 Jun 2022 16:40:36 +0200 Message-Id: <20220608144037.1765000-20-frederic@kernel.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220608144037.1765000-1-frederic@kernel.org> References: <20220608144037.1765000-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Updating the context tracking state and the RCU dynticks counter atomically in a single operation is a first step towards improving CPU isolation. This makes the context tracking state updates fully ordered and therefore allow for later enhancements such as postponing some work while a task is running isolated in userspace until it ever comes back to the kernel. The state field becomes divided in two parts: 1) Two Lower bits for context tracking state: CONTEXT_KERNEL = 0 CONTEXT_IDLE = 1, CONTEXT_USER = 2, CONTEXT_GUEST = 3, 2) Higher bits for RCU eqs dynticks counting: RCU_DYNTICKS_IDX = 4 The dynticks counting is always incremented by this value. (state & RCU_DYNTICKS_IDX) means we are NOT in an extended quiescent state. This makes the chance for a collision more likely between two RCU dynticks snapshots but wrapping up 28 bits of eqs dynticks increments still takes some bad luck (also rdp.dynticks_snap could be converted from int to long?) Some RCU eqs functions have been renamed to better reflect their broader scope that now include context tracking state. Signed-off-by: Frederic Weisbecker Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Neeraj Upadhyay Cc: Uladzislau Rezki Cc: Joel Fernandes Cc: Boqun Feng Cc: Nicolas Saenz Julienne Cc: Marcelo Tosatti Cc: Xiongfeng Wang Cc: Yu Liao Cc: Phil Auld Cc: Paul Gortmaker Cc: Alex Belits --- include/linux/context_tracking.h | 8 +- include/linux/context_tracking_state.h | 35 ++++--- kernel/context_tracking.c | 132 ++++++++++++++++--------- kernel/rcu/tree.c | 13 ++- kernel/rcu/tree_stall.h | 4 +- 5 files changed, 121 insertions(+), 71 deletions(-) diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h index a8c1db0a3f65..fd354eaea510 100644 --- a/include/linux/context_tracking.h +++ b/include/linux/context_tracking.h @@ -118,16 +118,16 @@ extern void ct_idle_exit(void); */ static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void) { - return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1); + return !(arch_atomic_read(this_cpu_ptr(&context_tracking.state)) & RCU_DYNTICKS_IDX); } /* - * Increment the current CPU's context_tracking structure's ->dynticks field + * Increment the current CPU's context_tracking structure's ->state field * with ordering. Return the new value. */ -static __always_inline unsigned long rcu_dynticks_inc(int incby) +static __always_inline unsigned long ct_state_inc(int incby) { - return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks)); + return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.state)); } #else diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h index 1501df6d4cfa..580a525bfba7 100644 --- a/include/linux/context_tracking_state.h +++ b/include/linux/context_tracking_state.h @@ -10,12 +10,20 @@ #define DYNTICK_IRQ_NONIDLE ((LONG_MAX / 2) + 1) enum ctx_state { - CONTEXT_DISABLED = -1, /* returned by ct_state() if unknown */ - CONTEXT_KERNEL = 0, - CONTEXT_USER, - CONTEXT_GUEST, + CONTEXT_DISABLED = -1, /* returned by ct_state() if unknown */ + CONTEXT_KERNEL = 0, + CONTEXT_IDLE = 1, + CONTEXT_USER = 2, + CONTEXT_GUEST = 3, + CONTEXT_MAX = 4, }; +/* Even value for idle, else odd. */ +#define RCU_DYNTICKS_IDX CONTEXT_MAX + +#define CT_STATE_MASK (CONTEXT_MAX - 1) +#define CT_DYNTICKS_MASK (~CT_STATE_MASK) + struct context_tracking { #ifdef CONFIG_CONTEXT_TRACKING_USER /* @@ -26,10 +34,11 @@ struct context_tracking { */ bool active; int recursion; +#endif +#ifdef CONFIG_CONTEXT_TRACKING atomic_t state; #endif #ifdef CONFIG_CONTEXT_TRACKING_IDLE - atomic_t dynticks; /* Even value for idle, else odd. */ long dynticks_nesting; /* Track process nesting level. */ long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ #endif @@ -37,24 +46,29 @@ struct context_tracking { #ifdef CONFIG_CONTEXT_TRACKING DECLARE_PER_CPU(struct context_tracking, context_tracking); + +static __always_inline int __ct_state(void) +{ + return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK; +} #endif #ifdef CONFIG_CONTEXT_TRACKING_IDLE static __always_inline int ct_dynticks(void) { - return atomic_read(this_cpu_ptr(&context_tracking.dynticks)); + return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_DYNTICKS_MASK; } static __always_inline int ct_dynticks_cpu(int cpu) { struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu); - return atomic_read(&ct->dynticks); + return atomic_read(&ct->state) & CT_DYNTICKS_MASK; } static __always_inline int ct_dynticks_cpu_acquire(int cpu) { struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu); - return atomic_read_acquire(&ct->dynticks); + return atomic_read_acquire(&ct->state) & CT_DYNTICKS_MASK; } static __always_inline long ct_dynticks_nesting(void) @@ -98,11 +112,6 @@ static inline bool context_tracking_enabled_this_cpu(void) return context_tracking_enabled() && __this_cpu_read(context_tracking.active); } -static __always_inline int __ct_state(void) -{ - return atomic_read(this_cpu_ptr(&context_tracking.state)); -} - /** * ct_state() - return the current context tracking state if known * diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index 810bca217151..d50bc66f2b1c 100644 --- a/kernel/context_tracking.c +++ b/kernel/context_tracking.c @@ -28,8 +28,8 @@ DEFINE_PER_CPU(struct context_tracking, context_tracking) = { #ifdef CONFIG_CONTEXT_TRACKING_IDLE .dynticks_nesting = 1, .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, - .dynticks = ATOMIC_INIT(1), #endif + .state = ATOMIC_INIT(RCU_DYNTICKS_IDX), }; EXPORT_SYMBOL_GPL(context_tracking); @@ -76,7 +76,7 @@ static __always_inline void rcu_dynticks_task_trace_exit(void) * RCU is watching prior to the call to this function and is no longer * watching upon return. */ -static noinstr void rcu_dynticks_eqs_enter(void) +static noinstr void ct_kernel_exit_state(int offset) { int seq; @@ -86,9 +86,9 @@ static noinstr void rcu_dynticks_eqs_enter(void) * next idle sojourn. */ rcu_dynticks_task_trace_enter(); // Before ->dynticks update! - seq = rcu_dynticks_inc(1); + seq = ct_state_inc(offset); // RCU is no longer watching. Better be in extended quiescent state! - WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1)); + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICKS_IDX)); } /* @@ -96,7 +96,7 @@ static noinstr void rcu_dynticks_eqs_enter(void) * called from an extended quiescent state, that is, RCU is not watching * prior to the call to this function and is watching upon return. */ -static noinstr void rcu_dynticks_eqs_exit(void) +static noinstr void ct_kernel_enter_state(int offset) { int seq; @@ -105,10 +105,10 @@ static noinstr void rcu_dynticks_eqs_exit(void) * and we also must force ordering with the next RCU read-side * critical section. */ - seq = rcu_dynticks_inc(1); + seq = ct_state_inc(offset); // RCU is now watching. Better not be in an extended quiescent state! rcu_dynticks_task_trace_exit(); // After ->dynticks update! - WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1)); + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICKS_IDX)); } /* @@ -119,7 +119,7 @@ static noinstr void rcu_dynticks_eqs_exit(void) * the possibility of usermode upcalls having messed up our count * of interrupt nesting level during the prior busy period. */ -static void noinstr rcu_eqs_enter(bool user) +static void noinstr ct_kernel_exit(bool user, int offset) { struct context_tracking *ct = this_cpu_ptr(&context_tracking); @@ -139,13 +139,13 @@ static void noinstr rcu_eqs_enter(bool user) WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); rcu_preempt_deferred_qs(current); - // instrumentation for the noinstr rcu_dynticks_eqs_enter() - instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks)); + // instrumentation for the noinstr ct_kernel_exit_state() + instrument_atomic_write(&ct->state, sizeof(ct->state)); instrumentation_end(); WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */ // RCU is watching here ... - rcu_dynticks_eqs_enter(); + ct_kernel_exit_state(offset); // ... but is no longer watching here. rcu_dynticks_task_enter(); } @@ -158,7 +158,7 @@ static void noinstr rcu_eqs_enter(bool user) * allow for the possibility of usermode upcalls messing up our count of * interrupt nesting level during the busy period that is just now starting. */ -static void noinstr rcu_eqs_exit(bool user) +static void noinstr ct_kernel_enter(bool user, int offset) { struct context_tracking *ct = this_cpu_ptr(&context_tracking); long oldval; @@ -173,12 +173,12 @@ static void noinstr rcu_eqs_exit(bool user) } rcu_dynticks_task_exit(); // RCU is not watching here ... - rcu_dynticks_eqs_exit(); + ct_kernel_enter_state(offset); // ... but is watching here. instrumentation_begin(); - // instrumentation for the noinstr rcu_dynticks_eqs_exit() - instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks)); + // instrumentation for the noinstr ct_kernel_enter_state() + instrument_atomic_write(&ct->state, sizeof(ct->state)); trace_rcu_dyntick(TPS("End"), ct_dynticks_nesting(), 1, ct_dynticks()); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); @@ -192,7 +192,7 @@ static void noinstr rcu_eqs_exit(bool user) * ct_nmi_exit - inform RCU of exit from NMI context * * If we are returning from the outermost NMI handler that interrupted an - * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting + * RCU-idle period, update ct->state and ct->dynticks_nmi_nesting * to let the RCU grace-period handling know that the CPU is back to * being RCU-idle. * @@ -229,12 +229,12 @@ void noinstr ct_nmi_exit(void) trace_rcu_dyntick(TPS("Startirq"), ct_dynticks_nmi_nesting(), 0, ct_dynticks()); WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ - // instrumentation for the noinstr rcu_dynticks_eqs_enter() - instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks)); + // instrumentation for the noinstr ct_kernel_exit_state() + instrument_atomic_write(&ct->state, sizeof(ct->state)); instrumentation_end(); // RCU is watching here ... - rcu_dynticks_eqs_enter(); + ct_kernel_exit_state(RCU_DYNTICKS_IDX); // ... but is no longer watching here. if (!in_nmi()) @@ -244,7 +244,7 @@ void noinstr ct_nmi_exit(void) /** * ct_nmi_enter - inform RCU of entry to NMI context * - * If the CPU was idle from RCU's viewpoint, update ct->dynticks and + * If the CPU was idle from RCU's viewpoint, update ct->state and * ct->dynticks_nmi_nesting to let the RCU grace-period handling know * that the CPU is active. This implementation permits nested NMIs, as * long as the nesting level does not overflow an int. (You will probably @@ -275,14 +275,14 @@ void noinstr ct_nmi_enter(void) rcu_dynticks_task_exit(); // RCU is not watching here ... - rcu_dynticks_eqs_exit(); + ct_kernel_enter_state(RCU_DYNTICKS_IDX); // ... but is watching here. instrumentation_begin(); // instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs() - instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks)); - // instrumentation for the noinstr rcu_dynticks_eqs_exit() - instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks)); + instrument_atomic_read(&ct->state, sizeof(ct->state)); + // instrumentation for the noinstr ct_kernel_enter_state() + instrument_atomic_write(&ct->state, sizeof(ct->state)); incby = 1; } else if (!in_nmi()) { @@ -315,7 +315,7 @@ void noinstr ct_nmi_enter(void) void noinstr ct_idle_enter(void) { WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled()); - rcu_eqs_enter(false); + ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE); } EXPORT_SYMBOL_GPL(ct_idle_enter); @@ -333,7 +333,7 @@ void noinstr ct_idle_exit(void) unsigned long flags; raw_local_irq_save(flags); - rcu_eqs_exit(false); + ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE); raw_local_irq_restore(flags); } EXPORT_SYMBOL_GPL(ct_idle_exit); @@ -421,8 +421,8 @@ void ct_irq_exit_irqson(void) local_irq_restore(flags); } #else -static __always_inline void rcu_eqs_enter(bool user) { } -static __always_inline void rcu_eqs_exit(bool user) { } +static __always_inline void ct_kernel_exit(bool user, int offset) { } +static __always_inline void ct_kernel_enter(bool user, int offset) { } #endif /* #ifdef CONFIG_CONTEXT_TRACKING_IDLE */ #ifdef CONFIG_CONTEXT_TRACKING_USER @@ -493,28 +493,49 @@ void noinstr __ct_user_enter(enum ctx_state state) * that will fire and reschedule once we resume in user/guest mode. */ rcu_irq_work_resched(); + /* * Enter RCU idle mode right before resuming userspace. No use of RCU * is permitted between this call and rcu_eqs_exit(). This way the * CPU doesn't need to maintain the tick for RCU maintenance purposes * when the CPU runs in userspace. */ - rcu_eqs_enter(true); + ct_kernel_exit(true, RCU_DYNTICKS_IDX + state); + + /* + * Special case if we only track user <-> kernel transitions for tickless + * cputime accounting but we don't support RCU extended quiescent state. + * In this we case we don't care about any concurrency/ordering. + */ + if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) + atomic_set(&ct->state, state); + } else { + /* + * Even if context tracking is disabled on this CPU, because it's outside + * the full dynticks mask for example, we still have to keep track of the + * context transitions and states to prevent inconsistency on those of + * other CPUs. + * If a task triggers an exception in userspace, sleep on the exception + * handler and then migrate to another CPU, that new CPU must know where + * the exception returns by the time we call exception_exit(). + * This information can only be provided by the previous CPU when it called + * exception_enter(). + * OTOH we can spare the calls to vtime and RCU when context_tracking.active + * is false because we know that CPU is not tickless. + */ + if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) { + /* Tracking for vtime only, no concurrent RCU EQS accounting */ + atomic_set(&ct->state, state); + } else { + /* + * Tracking for vtime and RCU EQS. Make sure we don't race + * with NMIs. OTOH we don't care about ordering here since + * RCU only requires RCU_DYNTICKS_IDX increments to be fully + * ordered. + */ + atomic_add(state, &ct->state); + } } - /* - * Even if context tracking is disabled on this CPU, because it's outside - * the full dynticks mask for example, we still have to keep track of the - * context transitions and states to prevent inconsistency on those of - * other CPUs. - * If a task triggers an exception in userspace, sleep on the exception - * handler and then migrate to another CPU, that new CPU must know where - * the exception returns by the time we call exception_exit(). - * This information can only be provided by the previous CPU when it called - * exception_enter(). - * OTOH we can spare the calls to vtime and RCU when context_tracking.active - * is false because we know that CPU is not tickless. - */ - atomic_set(&ct->state, state); } context_tracking_recursion_exit(); } @@ -594,15 +615,36 @@ void noinstr __ct_user_exit(enum ctx_state state) * Exit RCU idle mode while entering the kernel because it can * run a RCU read side critical section anytime. */ - rcu_eqs_exit(true); + ct_kernel_enter(true, RCU_DYNTICKS_IDX - state); if (state == CONTEXT_USER) { instrumentation_begin(); vtime_user_exit(current); trace_user_exit(0); instrumentation_end(); } + + /* + * Special case if we only track user <-> kernel transitions for tickless + * cputime accounting but we don't support RCU extended quiescent state. + * In this we case we don't care about any concurrency/ordering. + */ + if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) + atomic_set(&ct->state, CONTEXT_KERNEL); + + } else { + if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) { + /* Tracking for vtime only, no concurrent RCU EQS accounting */ + atomic_set(&ct->state, CONTEXT_KERNEL); + } else { + /* + * Tracking for vtime and RCU EQS. Make sure we don't race + * with NMIs. OTOH we don't care about ordering here since + * RCU only requires RCU_DYNTICKS_IDX increments to be fully + * ordered. + */ + atomic_sub(state, &ct->state); + } } - atomic_set(&ct->state, CONTEXT_KERNEL); } context_tracking_recursion_exit(); } diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 155e8ce3d267..642622f2a6b4 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -276,9 +276,9 @@ void rcu_softirq_qs(void) */ static void rcu_dynticks_eqs_online(void) { - if (ct_dynticks() & 0x1) + if (ct_dynticks() & RCU_DYNTICKS_IDX) return; - rcu_dynticks_inc(1); + ct_state_inc(RCU_DYNTICKS_IDX); } /* @@ -297,7 +297,7 @@ static int rcu_dynticks_snap(int cpu) */ static bool rcu_dynticks_in_eqs(int snap) { - return !(snap & 0x1); + return !(snap & RCU_DYNTICKS_IDX); } /* Return true if the specified CPU is currently idle from an RCU viewpoint. */ @@ -325,8 +325,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp) int snap; // If not quiescent, force back to earlier extended quiescent state. - snap = ct_dynticks_cpu(cpu) & ~0x1; - + snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX; smp_rmb(); // Order ->dynticks and *vp reads. if (READ_ONCE(*vp)) return false; // Non-zero, so report failure; @@ -352,9 +351,9 @@ notrace void rcu_momentary_dyntick_idle(void) int seq; raw_cpu_write(rcu_data.rcu_need_heavy_qs, false); - seq = rcu_dynticks_inc(2); + seq = ct_state_inc(2 * RCU_DYNTICKS_IDX); /* It is illegal to call this from idle state. */ - WARN_ON_ONCE(!(seq & 0x1)); + WARN_ON_ONCE(!(seq & RCU_DYNTICKS_IDX)); rcu_preempt_deferred_qs(current); } EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 91e4fd4db12d..c3fbbcc09327 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -469,7 +469,7 @@ static void print_cpu_stall_info(int cpu) rcuc_starved = rcu_is_rcuc_kthread_starving(rdp, &j); if (rcuc_starved) sprintf(buf, " rcuc=%ld jiffies(starved)", j); - pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld%s%s\n", + pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%04x/%ld/%#lx softirq=%u/%u fqs=%ld%s%s\n", cpu, "O."[!!cpu_online(cpu)], "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)], @@ -478,7 +478,7 @@ static void print_cpu_stall_info(int cpu) rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' : "!."[!delta], ticks_value, ticks_title, - rcu_dynticks_snap(cpu) & 0xfff, + rcu_dynticks_snap(cpu) & 0xffff, ct_dynticks_nesting_cpu(cpu), ct_dynticks_nmi_nesting_cpu(cpu), rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, -- 2.25.1