From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Thomas Gleixner, Yu Liao, Liu Tie, Sasha Levin, peterz@infradead.org
Subject: [PATCH AUTOSEL 6.6 03/17] hrtimers: Push pending hrtimers away from outgoing CPU earlier
Date: Wed, 22 Nov 2023 10:31:32 -0500
Message-ID: <20231122153212.852040-3-sashal@kernel.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231122153212.852040-1-sashal@kernel.org>
References: <20231122153212.852040-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.6.2
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Thomas Gleixner

[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ]

2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")
solved the straightforward CPU hotplug deadlock vs. the scheduler
bandwidth timer. Yu discovered a more involved variant where a task which
has a bandwidth timer started on the outgoing CPU holds a lock and then
gets throttled.
If the lock is required by one of the CPU hotplug callbacks, the hotplug
operation deadlocks because the unthrottling timer event is not handled on
the dying CPU and can only be recovered once the control CPU reaches the
hotplug state which pulls the pending hrtimers from the dead CPU.

Solve this by pushing the hrtimers away from the dying CPU in the dying
callbacks. Nothing can queue a hrtimer on the dying CPU at that point because
all other CPUs spin in stop_machine() with interrupts disabled and once the
operation is finished the CPU is marked offline.

Reported-by: Yu Liao
Signed-off-by: Thomas Gleixner
Tested-by: Liu Tie
Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx
Signed-off-by: Sasha Levin
---
 include/linux/cpuhotplug.h |  1 +
 include/linux/hrtimer.h    |  4 ++--
 kernel/cpu.c               |  8 +++++++-
 kernel/time/hrtimer.c      | 33 ++++++++++++---------------------
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 28c1d3d77b70f..624d4a38c358a 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -194,6 +194,7 @@ enum cpuhp_state {
 	CPUHP_AP_ARM_CORESIGHT_CTI_STARTING,
 	CPUHP_AP_ARM64_ISNDEP_STARTING,
 	CPUHP_AP_SMPCFD_DYING,
+	CPUHP_AP_HRTIMERS_DYING,
 	CPUHP_AP_X86_TBOOT_DYING,
 	CPUHP_AP_ARM_CACHE_B15_RAC_DYING,
 	CPUHP_AP_ONLINE,
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0ee140176f102..f2044d5a652b5 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -531,9 +531,9 @@ extern void sysrq_timer_list_show(void);
 
 int hrtimers_prepare_cpu(unsigned int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
-int hrtimers_dead_cpu(unsigned int cpu);
+int hrtimers_cpu_dying(unsigned int cpu);
 #else
-#define hrtimers_dead_cpu	NULL
+#define hrtimers_cpu_dying	NULL
 #endif
 
 #endif
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1a189da3bdac5..f6803b00157c0 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2106,7 +2106,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 	[CPUHP_HRTIMERS_PREPARE] = {
 		.name			= "hrtimers:prepare",
 		.startup.single		= hrtimers_prepare_cpu,
-		.teardown.single	= hrtimers_dead_cpu,
+		.teardown.single	= NULL,
 	},
 	[CPUHP_SMPCFD_PREPARE] = {
 		.name			= "smpcfd:prepare",
@@ -2198,6 +2198,12 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 		.startup.single		= NULL,
 		.teardown.single	= smpcfd_dying_cpu,
 	},
+	[CPUHP_AP_HRTIMERS_DYING] = {
+		.name			= "hrtimers:dying",
+		.startup.single		= NULL,
+		.teardown.single	= hrtimers_cpu_dying,
+	},
+
 	/* Entry state on starting. Interrupts enabled from here on. Transient
 	 * state for synchronsization */
 	[CPUHP_AP_ONLINE] = {
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 238262e4aba7e..760793998cdd7 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2219,29 +2219,22 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 	}
 }
 
-int hrtimers_dead_cpu(unsigned int scpu)
+int hrtimers_cpu_dying(unsigned int dying_cpu)
 {
 	struct hrtimer_cpu_base *old_base, *new_base;
-	int i;
+	int i, ncpu = cpumask_first(cpu_active_mask);
 
-	BUG_ON(cpu_online(scpu));
-	tick_cancel_sched_timer(scpu);
+	tick_cancel_sched_timer(dying_cpu);
+
+	old_base = this_cpu_ptr(&hrtimer_bases);
+	new_base = &per_cpu(hrtimer_bases, ncpu);
 
-	/*
-	 * this BH disable ensures that raise_softirq_irqoff() does
-	 * not wakeup ksoftirqd (and acquire the pi-lock) while
-	 * holding the cpu_base lock
-	 */
-	local_bh_disable();
-	local_irq_disable();
-	old_base = &per_cpu(hrtimer_bases, scpu);
-	new_base = this_cpu_ptr(&hrtimer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.
 	 */
-	raw_spin_lock(&new_base->lock);
-	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
+	raw_spin_lock(&old_base->lock);
+	raw_spin_lock_nested(&new_base->lock, SINGLE_DEPTH_NESTING);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		migrate_hrtimer_list(&old_base->clock_base[i],
@@ -2252,15 +2245,13 @@ int hrtimers_dead_cpu(unsigned int scpu)
 	 * The migration might have changed the first expiring softirq
 	 * timer on this CPU. Update it.
 	 */
-	hrtimer_update_softirq_timer(new_base, false);
+	__hrtimer_get_next_event(new_base, HRTIMER_ACTIVE_SOFT);
+	/* Tell the other CPU to retrigger the next event */
+	smp_call_function_single(ncpu, retrigger_next_event, NULL, 0);
 
-	raw_spin_unlock(&old_base->lock);
 	raw_spin_unlock(&new_base->lock);
+	raw_spin_unlock(&old_base->lock);
 
-	/* Check, if we got expired work to do */
-	__hrtimer_peek_ahead_timers();
-	local_irq_enable();
-	local_bh_enable();
 	return 0;
 }
 
-- 
2.42.0