Date: Mon, 3 Apr 2023 12:26:27 +0200
From: Frederic Weisbecker
To: Victor Hassan
Cc: fweisbec@gmail.com, tglx@linutronix.de, mingo@kernel.org, jindong.yue@nxp.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] tick/broadcast: Do not set oneshot_mask except was_periodic was true
References: <20230328063629.108510-1-victor@allwinnertech.com>
In-Reply-To: <20230328063629.108510-1-victor@allwinnertech.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 28, 2023 at 02:36:29PM +0800, Victor Hassan wrote:
> If a broadcast timer is registered after the system switched to oneshot
> mode, a hung_task error could occur like this:
>
> INFO: task kworker/u15:0:7 blocked for more than 120 seconds.
>       Tainted: G E 5.15.41-android13-8-00002-xxx #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u16:0 state:D stack: 9808 pid: 7 ppid: 2 flags:0x00000008
> Workqueue: events_unbound deferred_probe_work_func.cfi_jt
> Call trace:
>  __switch_to+0x240/0x490
>  __schedule+0x620/0xafc
>  schedule+0x110/0x204
>  schedule_hrtimeout_range_clock+0x9c/0x118
>  usleep_range_state+0x150/0x1ac
>  _regulator_do_enable+0x528/0x878
>  set_machine_constraints+0x6a0/0xf2c
>  regulator_register+0x3ac/0x7ac
>  devm_regulator_register+0xbc/0x120
>  pmu_ext_regulator_probe+0xb0/0x1b4 [pmu_ext_regulator]
>  platform_probe+0x70/0x194
>  really_probe+0x320/0x68c
>  __driver_probe_device+0x204/0x260
>  driver_probe_device+0x48/0x1e0
>
> When the new broadcast timer was registered after the system switched
> to oneshot mode, the broadcast timer was not used as periodic. If the
> oneshot mask was set incorrectly, all cores which did not enter cpu_idle
> state can't enter cpu_idle normally, causing the hrtimer mechanism to
> break.
>
> This patch fixes the issue by moving the update of the oneshot mask
> under a stricter condition. tick_broadcast_setup_oneshot() is called
> in two typical cases, and both will work.
>
> 1. tick_handle_periodic -> tick_broadcast_setup_oneshot
>
> The original broadcast was periodic, so it can set the oneshot_mask bits
> for those waiting for periodic broadcast and program the broadcast timer
> to fire.
>
> 2. tick_install_broadcast_device -> tick_broadcast_setup_oneshot
>
> The original broadcast was oneshot, so the cores which entered cpu_idle
> already use the oneshot_mask bits. It is unnecessary to update the
> oneshot_mask.
>
> Fixes: 9c336c9935cf ("tick/broadcast: Allow late registered device to enter oneshot mode")
>
> Signed-off-by: Victor Hassan
> ---
>  kernel/time/tick-broadcast.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 93bf2b4e47e5..fdbbba487978 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -1041,12 +1041,13 @@ static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
>  		 */
>  		cpumask_copy(tmpmask, tick_broadcast_mask);
>  		cpumask_clear_cpu(cpu, tmpmask);
> -		cpumask_or(tick_broadcast_oneshot_mask,
> -			   tick_broadcast_oneshot_mask, tmpmask);
>
>  		if (was_periodic && !cpumask_empty(tmpmask)) {
>  			ktime_t nextevt = tick_get_next_period();
>
> +			cpumask_or(tick_broadcast_oneshot_mask,
> +				   tick_broadcast_oneshot_mask, tmpmask);
> +

Good catch. One issue that can trigger here comes from the resulting
ignored calls to tick_broadcast_exit(). Indeed, if the CPU is already in
tick_broadcast_oneshot_mask, then cpuidle won't call the exit. This leads
to the following race:

* CPU 1 stops its tick, next event is in one hour
* CPU 0 registers a new broadcast device and sets CPU 1 in
  tick_broadcast_oneshot_mask
* CPU 1 runs into cpuidle_enter_state(), and tick_broadcast_enter()
  is ignored because the CPU is already in tick_broadcast_oneshot_mask
* CPU 1 goes to sleep
* CPU 0 runs the broadcast callback, sees that the next timer for
  CPU 1 is in one hour, and programs the broadcast to that deadline
* CPU 1 gets an interrupt that enqueues a new timer expiring in the
  next jiffy
* CPU 1 doesn't call tick_broadcast_exit() and thus doesn't remove
  itself from tick_broadcast_oneshot_mask
* CPU 1 re-enters cpuidle_enter_state(), tick_broadcast_enter() is
  again ignored so the new timer isn't propagated to the broadcast
* CPU 1 goes to sleep and won't be woken before one hour

Reviewed-by: Frederic Weisbecker

Thanks.

>  			clockevents_switch_state(bc, CLOCK_EVT_STATE_ONESHOT);
>  			tick_broadcast_init_next_event(tmpmask, nextevt);
>  			tick_broadcast_set_event(bc, cpu, nextevt);
> --
> 2.29.0
>
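
For illustration only, the race listed in the reply can be replayed in a
small user-space toy model. This is a sketch under stated assumptions, not
kernel code: every toy_* name is hypothetical, the bitmask merely stands in
for tick_broadcast_oneshot_mask, time is counted in arbitrary units, and
the enter/exit behaviour is simplified to what the reply describes (an
enter on a CPU already in the mask is ignored, and exit is only called
after an enter that took effect). The `patched` flag selects whether the
mask is updated only when was_periodic is true, as the patch proposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names are kernel APIs. */
#define TOY_NR_CPUS  2
#define TOY_ONE_HOUR 3600ULL      /* arbitrary time units */
#define TOY_NEVER    UINT64_MAX

struct toy_state {
	uint32_t oneshot_mask;            /* stands in for tick_broadcast_oneshot_mask */
	uint64_t next_event[TOY_NR_CPUS]; /* per-CPU next timer expiry */
	uint64_t bc_deadline;             /* currently programmed broadcast expiry */
};

/* Enter path: a CPU that is already in the mask is ignored. */
static bool toy_broadcast_enter(struct toy_state *s, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	if (s->oneshot_mask & (1u << cpu))
		return false;             /* next_event is NOT propagated */
	s->oneshot_mask |= 1u << cpu;
	if (s->next_event[cpu] < s->bc_deadline)
		s->bc_deadline = s->next_event[cpu];
	return true;
}

/* Exit path: remove the CPU from the mask. */
static void toy_broadcast_exit(struct toy_state *s, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	s->oneshot_mask &= ~(1u << cpu);
}

/* Late registration: unpatched ORs the mask unconditionally. */
static void toy_setup_oneshot(struct toy_state *s, unsigned int self,
			      bool was_periodic, bool patched,
			      uint32_t broadcast_mask)
{
	uint32_t tmpmask = broadcast_mask & ~(1u << self);

	if (!patched || was_periodic)
		s->oneshot_mask |= tmpmask;
}

/* Broadcast callback: program the earliest expiry among CPUs in the mask. */
static void toy_broadcast_handler(struct toy_state *s)
{
	uint64_t earliest = TOY_NEVER;

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if ((s->oneshot_mask & (1u << cpu)) && s->next_event[cpu] < earliest)
			earliest = s->next_event[cpu];
	s->bc_deadline = earliest;
}

/* Replays the sequence from the reply; returns the final broadcast deadline. */
uint64_t toy_simulate(bool patched)
{
	/* CPU 1 has stopped its tick; its next event is in one hour. */
	struct toy_state s = {
		.oneshot_mask = 0,
		.next_event = { TOY_NEVER, TOY_ONE_HOUR },
		.bc_deadline = TOY_NEVER,
	};
	bool entered;

	/* CPU 0 registers the broadcast device after the oneshot switch. */
	toy_setup_oneshot(&s, 0, false, patched, 1u << 1);

	entered = toy_broadcast_enter(&s, 1); /* CPU 1 goes idle */
	toy_broadcast_handler(&s);            /* CPU 0 programs the deadline */

	s.next_event[1] = 1;                  /* interrupt enqueues a near timer */
	if (entered)
		toy_broadcast_exit(&s, 1);

	toy_broadcast_enter(&s, 1);           /* CPU 1 goes idle again */
	return s.bc_deadline;
}
```

In this model, toy_simulate(false) leaves the broadcast programmed at
TOY_ONE_HOUR even though CPU 1 has a timer due at time 1, while
toy_simulate(true) ends with the deadline at 1, because the second enter
is no longer ignored.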