Received: by 2002:ab2:6857:0:b0:1ef:ffd0:ce49 with SMTP id l23csp3461329lqp; Tue, 26 Mar 2024 09:41:20 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXDTxJkZnER7QAMmH64GvD1LX2QmXcmiOBud9n+qAWpz740jMA68uvQQKYCM3BSItZ0H48spbEYN6k13st1V6WO5iPX+Ka8D2IJr5x7bg== X-Google-Smtp-Source: AGHT+IF8k64gDmWb3YLJtUzzLIeRiVWY26ydtuKbtGNy4YplMgdKluK6wk/cePVAGLerLLyHcJFm X-Received: by 2002:a17:906:b1d9:b0:a46:a1d0:8451 with SMTP id bv25-20020a170906b1d900b00a46a1d08451mr8270217ejb.16.1711471280646; Tue, 26 Mar 2024 09:41:20 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1711471280; cv=pass; d=google.com; s=arc-20160816; b=LkCTGGt+DbH+7+I/RzAftYNtqPl2KVVCqDj9HY8lBFMgrzn82QMSO9ySCFKwkGIhPp X/lhOdkbbdYRAlwTHuvNhUgIyR1zu1OLhYVcbFA6PXNGM8TEI1n2q99w+izpos6y8o29 HxF+EDqQQkxvRYVQNMZXeUV8uQZMeL7b8zs959TzUYJSPBtjp7NNZkk6/ZxWm1qr7NpI Ji4oHBRRIfQ5I/2RHM8by3z2Fc83RvIDvGwfrTvJ/dPjJcNxjgZDIOrl03NBkDeFW038 TwO9QRc8W8NIMSU8rKexZNiHOadebJnTBvFB5XKSN6vg4sWe+vhDELFX62mSzXfEMUCK Gixw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :message-id:date:references:in-reply-to:subject:cc:to:dkim-signature :dkim-signature:from; bh=G2tXaIE73etqiedBoIAmR9YYNuCBVN3eHa7m40AS61c=; fh=0v68XJejyVkBhMEDTXEVsIainyFR0IYK5JYLuk6T3QY=; b=ewXQYVUnPFulYh82ra7K+GHXuIOLjBEa8BKWhi9ILDFIXED6v0L6MsFyCYR8a+vPkN ojWjcG1CY3lMI6ssldkSsKOcO/d0qjkMRbV9GqGjQrKkdH5mk0xd1oluBIZ8sKdeKz8+ enGT4k65vLthIGt0KVoeKILoyo505SnoWzg/frKA2q+1fsHZXH3DOwNkkY7T8cn0Xhag yUGMFsxQGWaot5b+ovDi2USVORpUEbg7nYEnIxjVWrD6gvhZkjjvajdzZ0RkRE0fx2/p +dUb0rlFq48ltEb7tLqexK1aGWOouhyZnKw1nAI75gK0tkBx2n0WodEDY7xKbDPEINbS 3HQQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b="v/UOCWyZ"; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-119486-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-119486-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id lx25-20020a170906af1900b00a4dfb42ab95si252536ejb.362.2024.03.26.09.41.20 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Mar 2024 09:41:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-119486-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b="v/UOCWyZ"; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-119486-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-119486-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 166861F81FE6 for ; Tue, 26 Mar 2024 16:41:20 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C1A20171BB; Tue, 26 Mar 2024 16:41:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="v/UOCWyZ"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="GPx97lR1" Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B3C1C15B; Tue, 26 Mar 2024 16:41:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711471271; cv=none; b=ainzpzst9ZOvAwgU8pXDdzNm+PV1FTPOj98JXn2h2yFyyqLR8CpDBoxH71rX327tARhjiMHcZ4BbDEGIAmK597/YXQ8xas+62t5x3oZDq/izNO0D63SdFJKS9bsdo/5VaVSkWaEvPqigaT8T5RCN87AosARHGdNXQtatssXmedM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711471271; c=relaxed/simple; bh=95hW0FoazmcZRKcfjdOY3hRmZ+Ur6YJRDcmEPH4tum4=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=dNUMTnZjmZPdjs6VTHMpPa5fXKrBhZUh/TwJ+PMF1BYaTTZTBADalQffj4cTYK6AYAaCpfafXsjSfFcBrAVZaaVDethSpaKe3t4fiOuEt54hrCj8f6TQ6ndfCGKTFY/XcJzPCFi7B+3JjaVjmArj8VjZTt6P6DQ2UxPCuqBPVzw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=v/UOCWyZ; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=GPx97lR1; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: Anna-Maria Behnsen DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1711471261; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=G2tXaIE73etqiedBoIAmR9YYNuCBVN3eHa7m40AS61c=; b=v/UOCWyZjyBtCJugLnU6XIkrvVbSr/wD0iDXtlMLCPxNBrpyGND370VU7Rs9RaDfjoS1N0 lmKtUa7+GCzFFiutRECLO4zwmykKhRnAPbPMO0zClZcDYaGzhfaOL7rGbteyR2olbq3KTA ff3a5TH2eB2zZGBJJGhn3zbLoKUOOJP77viVThWxMPhWWKDj8SGWgESCxx8Tu3sAueDF6G GEaROCK3Jkh+gLNcFXD4spTFmjN5ps/K/28YMD/Q9Pht7oXYrHYhGCpkd+bC+Clnh9mtfT ND8jEloiOqwUW4W/Ht/WqG3I+BAABIFucoqNQCq6EwTMOSItkCmQzAIXi0JUgQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1711471261; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=G2tXaIE73etqiedBoIAmR9YYNuCBVN3eHa7m40AS61c=; b=GPx97lR1VZTO0c6FTnAnfNZ84zwihzJJDZ8T8ULPCjhyZ3eHzh3t67clZqBUSz3Eg//iL/ qWb4TDAF61VpsRBQ== To: Frederic Weisbecker , Boqun Feng , Florian Fainelli Cc: Thomas Gleixner , "Russell King (Oracle)" , Joel Fernandes , Linus Torvalds , linux-kernel@vger.kernel.org, kernel-team@meta.com, paulmck@kernel.org, mingo@kernel.org, rcu@vger.kernel.org, neeraj.upadhyay@amd.com, urezki@gmail.com, qiang.zhang1211@gmail.com, bigeasy@linutronix.de, chenzhongjin@huawei.com, yangjihong1@huawei.com, rostedt@goodmis.org, Justin Chen Subject: Re: [PATCH] timer/migration: Remove buggy early return on deactivation [was Re: Unexplained long boot delays [Was Re: [GIT PULL] RCU changes for v6.9]] In-Reply-To: References: <533151c9-afb5-453b-8014-9fbe7c3b26c2@gmail.com> <87v85olez3.ffs@tglx> <87sf0sldbi.ffs@tglx> Date: Tue, 26 Mar 2024 17:41:00 +0100 Message-ID: <87zfulrlnn.fsf@somnus> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain Hi Frederic, I'm sorry sending my concerns late, but I was on sick leave. Keep in mind, it is definitely possible that my brain is not yet in the timer migration hierarchy mood after the sick leave :) so please correct me whenever I'm wrong. Frederic Weisbecker writes: > On Thu, Mar 14, 2024 at 03:05:53PM -0700, Boqun Feng wrote: >> I notice CPU3 didn't have its own non-deferrable timer queued (local or >> global), so could the following happen? >> >> timer_base_try_to_set_idle(): >> __get_next_timer_interrupt(): >> fetch_next_timer_interrupt(): >> // nextevt_local == nextevt_global == basej + NEXT_TIMER_MAX_DELTA >> // tevt->local == tevt->gloabl = KTIME_MAX >> timer_use_tmigr(): >> tmigr_cpu_deactivate(): >> __tmigr_cpu_deactivate(): >> // tmc->cpuevt.ignore untouched still == true >> walk_groups(&tmigr_inactive_up, ...): >> tmigr_inactive_up(): >> data->remote = true; >> tmigr_update_events(): >> if (child) { // child is NULL >> ... >> } else { >> first_childevt = evt = data->evt; >> >> if (evt->ignore && !remote) >> return true; // no remote tick is picked. >> ... >> } > > Nice catch! Florian can you try the following? > > From b0e335371ed758f68bf4f501246298c98a615b04 Mon Sep 17 00:00:00 2001 > From: Frederic Weisbecker > Date: Fri, 15 Mar 2024 00:21:01 +0100 > Subject: [PATCH] timer/migration: Remove buggy early return on deactivation > > When a CPU enters into idle and deactivates itself from the timer > migration hierarchy without any global timer of its own to propagate, > the group event of that CPU is set to "ignore" and tmigr_update_events() > accordingly performs an early return without considering timers queued > by other CPUs. > > If the hierarchy has a single level, and the CPU is the last one to > enter idle, it will ignore others' global timers, as in the following > layout: > > [GRP0:0] > migrator = 0 > active = 0 > nextevt = T0i > / \ > 0 1 > active (T0i) idle (T1) > > 0) CPU 0 is active thus its event is ignored (the letter 'i') and so are > upper levels' events. CPU 1 is idle and has the timer T1 enqueued. > > [GRP0:0] > migrator = NONE > active = NONE > nextevt = T0i > / \ > 0 1 > idle (T0i) idle (T1) > > 1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is > pushed as its next expiry and its own event kept as "ignore". As a result > tmigr_update_events() ignores T1 and CPU 0 goes to idle with T1 > unhandled. This is broken - indeed. > This isn't proper to single level hierarchy though. A similar issue, > although slightly different, may arise on multi-level: > > [GRP1:0] > migrator = GRP0:0 > active = GRP0:0 > nextevt = T0:0i, T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = 0 migrator = NONE > active = 0 active = NONE > nextevt = T0i nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > idle idle idle idle > > 0) CPU 0 is active thus its event is ignored (the letter 'i') and so are > upper levels' events. CPU 1 is idle and has the timer T1 enqueued. > CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2 > > [GRP1:0] > migrator = GRP0:0 > active = GRP0:0 > nextevt = T0:0i, T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = NONE migrator = NONE > active = NONE active = NONE > nextevt = T0i nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > idle idle idle idle > > 1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is > pushed as its next expiry and its own event kept as "ignore". As a result > tmigr_update_events() ignores T1. The change only propagated up to 1st > level so far. Right. T0 doesn't has to be enqueued into the timer queue of GRP0:0 as this timer could be ignored. So nothing changes directly in GRP0:0. > [GRP1:0] > migrator = NONE > active = NONE > nextevt = T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = NONE migrator = NONE > active = NONE active = NONE > nextevt = T0i nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > idle idle idle idle > > 2) The change now propagates up to the top. tmigr_update_events() finds > that the child event is ignored and thus removes it. The top level next > event is now T2 which is returned to CPU 0 as its next effective expiry > to take account for as the global idle migrator. However T1 has been > ignored along the way, leaving it unhandled. Now propagation goes on as GRP0:0 is completely idle. When executing tmigr_update_events() in the next step of walking the hierarchy via tmigr_inactive_up(), the arguments for tmigr_update_events() are set in the following way: group = GRP1:0 child = GRP0:0 Then at the begin of tmigr_update_events() the group event of child is updated - so all ignored events are removed (T0i), and the child->groupevt and child->next_expiry is updated with T1. This reevaluated child->groupevt is then queued/updated in the GRP1:0 timerqueue. So T1 will be handled! As there is no parent, the top level group event is updated (see goto label "check_toplvl") and T1 will be still the first event. > Fix those issues with removing the buggy related early return. Ignored > child events must not prevent from evaluating the other events within > the same group. I would prefere to keep this early return but skip it, when there is !group->parent (only a single level in hierarchy). Then it would prevent taking the group lock and making some random event updates which are done nevertheless on the next iteration of the hierarchy walk. Thanks, Anna-Maria