From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Steven Rostedt, Sebastian Siewior,
    Giovanni Gherdovich, Lukasz Luba, "Gautham R. Shenoy",
    Srinivas Pandruvada, K Prateek Nayak, Anna-Maria Behnsen,
    Richard Cochran
Shenoy" , Srinivas Pandruvada , K Prateek Nayak , Anna-Maria Behnsen , Richard Cochran Subject: [PATCH v9 32/32] timers: Always queue timers on the local CPU Date: Fri, 1 Dec 2023 10:26:54 +0100 Message-Id: <20231201092654.34614-33-anna-maria@linutronix.de> In-Reply-To: <20231201092654.34614-1-anna-maria@linutronix.de> References: <20231201092654.34614-1-anna-maria@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Fri, 01 Dec 2023 01:29:25 -0800 (PST) The timer pull model is in place so we can remove the heuristics which try to guess the best target CPU at enqueue/modification time. All non pinned timers are queued on the local CPU in the separate storage and eventually pulled at expiry time to a remote CPU. Originally-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen --- v9: - Update to the changes of the preceding patches v6: - Update TIMER_PINNED flag description. v5: - Move WARN_ONCE() in add_timer_on() into a previous patch - Fold crystallball magic related hunks into this patch v4: Update comment about TIMER_PINNED flag (heristic is removed) --- include/linux/timer.h | 14 ++++--------- kernel/time/timer.c | 46 +++++++++++++++++++++---------------------- 2 files changed, 26 insertions(+), 34 deletions(-) diff --git a/include/linux/timer.h b/include/linux/timer.h index 404bb31a95c7..4dd59e4e5681 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -50,16 +50,10 @@ struct timer_list { * workqueue locking issues. It's not meant for executing random crap * with interrupts disabled. Abuse is monitored! * - * @TIMER_PINNED: A pinned timer will not be affected by any timer - * placement heuristics (like, NOHZ) and will always expire on the CPU - * on which the timer was enqueued. - * - * Note: Because enqueuing of timers can migrate the timer from one - * CPU to another, pinned timers are not guaranteed to stay on the - * initialy selected CPU. They move to the CPU on which the enqueue - * function is invoked via mod_timer() or add_timer(). If the timer - * should be placed on a particular CPU, then add_timer_on() has to be - * used. + * @TIMER_PINNED: A pinned timer will always expire on the CPU on which the + * timer was enqueued. When a particular CPU is required, add_timer_on() + * has to be used. Enqueue via mod_timer() and add_timer() is always done + * on the local CPU. */ #define TIMER_CPUMASK 0x0003FFFF #define TIMER_MIGRATING 0x00040000 diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ac3e888d053f..6e9e1d852438 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer) /* * We might have to IPI the remote CPU if the base is idle and the - * timer is not deferrable. If the other CPU is on the way to idle - * then it can't set base->is_idle as we hold the base lock: + * timer is pinned. If it is a non pinned timer, it is only queued + * on the remote CPU, when timer was running during queueing. Then + * everything is handled by remote CPU anyway. 
diff --git a/include/linux/timer.h b/include/linux/timer.h
index 404bb31a95c7..4dd59e4e5681 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -50,16 +50,10 @@ struct timer_list {
  * workqueue locking issues. It's not meant for executing random crap
  * with interrupts disabled. Abuse is monitored!
  *
- * @TIMER_PINNED: A pinned timer will not be affected by any timer
- * placement heuristics (like, NOHZ) and will always expire on the CPU
- * on which the timer was enqueued.
- *
- * Note: Because enqueuing of timers can migrate the timer from one
- * CPU to another, pinned timers are not guaranteed to stay on the
- * initialy selected CPU. They move to the CPU on which the enqueue
- * function is invoked via mod_timer() or add_timer(). If the timer
- * should be placed on a particular CPU, then add_timer_on() has to be
- * used.
+ * @TIMER_PINNED: A pinned timer will always expire on the CPU on which the
+ * timer was enqueued. When a particular CPU is required, add_timer_on()
+ * has to be used. Enqueue via mod_timer() and add_timer() is always done
+ * on the local CPU.
  */
 #define TIMER_CPUMASK		0x0003FFFF
 #define TIMER_MIGRATING		0x00040000
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index ac3e888d053f..6e9e1d852438 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
 
 	/*
 	 * We might have to IPI the remote CPU if the base is idle and the
-	 * timer is not deferrable. If the other CPU is on the way to idle
-	 * then it can't set base->is_idle as we hold the base lock:
+	 * timer is pinned. If it is a non-pinned timer, it is only queued
+	 * on the remote CPU when the timer was running during queueing.
+	 * Then everything is handled by the remote CPU anyway. If the
+	 * other CPU is on the way to idle then it can't set base->is_idle
+	 * as we hold the base lock:
 	 */
-	if (base->is_idle)
+	if (base->is_idle && timer->flags & TIMER_PINNED)
 		wake_up_nohz_cpu(base->cpu);
 }
 
@@ -941,17 +944,6 @@ static inline struct timer_base *get_timer_base(u32 tflags)
 	return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
 }
 
-static inline struct timer_base *
-get_target_base(struct timer_base *base, unsigned tflags)
-{
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
-	if (static_branch_likely(&timers_migration_enabled) &&
-	    !(tflags & TIMER_PINNED))
-		return get_timer_cpu_base(tflags, get_nohz_timer_target());
-#endif
-	return get_timer_this_cpu_base(tflags);
-}
-
 static inline void __forward_timer_base(struct timer_base *base,
 					unsigned long basej)
 {
@@ -1106,7 +1098,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option
 		if (!ret && (options & MOD_TIMER_PENDING_ONLY))
 			goto out_unlock;
 
-		new_base = get_target_base(base, timer->flags);
+		new_base = get_timer_this_cpu_base(timer->flags);
 
 		if (base != new_base) {
 			/*
@@ -2228,11 +2220,17 @@ static inline u64 __get_next_timer_interrupt(unsigned long basej, u64 basem,
 	 * BASE_GLOBAL base, deferrable timers may still see large
 	 * granularity skew (by design).
 	 */
-	if (!base_local->is_idle) {
-		bool is_idle = time_after(nextevt, basej + 1);
-		base_local->is_idle = base_global->is_idle = is_idle;
-	}
+	/*
+	 * base->is_idle information is required to wake up an idle CPU
+	 * when a new timer was enqueued. Only pinned timers can be
+	 * enqueued remotely into an idle base. Therefore maintain only
+	 * base_local->is_idle information and ignore
+	 * base_global->is_idle information.
+	 */
+	if (!base_local->is_idle)
+		base_local->is_idle = time_after(nextevt, basej + 1);
+
 	*idle = base_local->is_idle;
 
 	trace_timer_base_idle(base_local->is_idle, base_local->cpu);
@@ -2307,13 +2305,13 @@ bool timer_base_is_idle(void)
 void timer_clear_idle(void)
 {
 	/*
-	 * We do this unlocked. The worst outcome is a remote enqueue sending
-	 * a pointless IPI, but taking the lock would just make the window for
-	 * sending the IPI a few instructions smaller for the cost of taking
-	 * the lock in the exit from idle path.
+	 * We do this unlocked. The worst outcome is a remote pinned timer
+	 * enqueue sending a pointless IPI, but taking the lock would just
+	 * make the window for sending the IPI a few instructions smaller
+	 * for the cost of taking the lock in the exit from idle
+	 * path. Required for BASE_LOCAL only.
 	 */
 	__this_cpu_write(timer_bases[BASE_LOCAL].is_idle, false);
-	__this_cpu_write(timer_bases[BASE_GLOBAL].is_idle, false);
 
 	trace_timer_base_idle(0, smp_processor_id());
 }
-- 
2.39.2
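For readers following along outside the kernel tree, here is a standalone,
simplified model (all names hypothetical, plain userspace C, not part of
the patch) of the wake-up decision the trigger_dyntick_cpu() hunk above
establishes: since every non-pinned timer is now enqueued locally, an idle
remote base only needs an IPI for pinned timers.

/*
 * Standalone model (NOT kernel code) of the new wake-up condition:
 * kick an idle base via IPI only when the queued timer is pinned;
 * everything else is handled by the timer pull machinery at expiry.
 */
#include <stdbool.h>
#include <stdio.h>

#define TIMER_PINNED	0x00100000u	/* mirrors the flag bit in include/linux/timer.h */

struct model_timer_base {		/* hypothetical stand-in for struct timer_base */
	int cpu;
	bool is_idle;
};

struct model_timer {			/* hypothetical stand-in for struct timer_list */
	unsigned int flags;
};

/* Stand-in for wake_up_nohz_cpu(): here it just logs the "IPI". */
static void model_wake_up_nohz_cpu(int cpu)
{
	printf("IPI CPU %d to reprogram its tick\n", cpu);
}

/* Mirrors the changed condition: IPI only for pinned timers on an idle base. */
static void model_trigger_dyntick_cpu(struct model_timer_base *base,
				      struct model_timer *timer)
{
	if (base->is_idle && (timer->flags & TIMER_PINNED))
		model_wake_up_nohz_cpu(base->cpu);
}

int main(void)
{
	struct model_timer_base idle_base = { .cpu = 2, .is_idle = true };
	struct model_timer pinned = { .flags = TIMER_PINNED };
	struct model_timer movable = { .flags = 0 };

	model_trigger_dyntick_cpu(&idle_base, &pinned);   /* logs the "IPI" */
	model_trigger_dyntick_cpu(&idle_base, &movable);  /* no IPI needed */
	return 0;
}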