From: Thomas Gleixner
To: Vincent Guittot, Xiongfeng Wang
Cc: vschneid@redhat.com, Phil Auld, vdonnefort@google.com,
    Linux Kernel Mailing List, Wei Li, "liaoyu (E)",
    zhangqiao22@huawei.com, Peter Zijlstra, Dietmar Eggemann, Ingo Molnar
Subject: Re: [Question] report a race condition between CPU hotplug state
    machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling
References: <8e785777-03aa-99e1-d20e-e956f5685be6@huawei.com>
    <87mt18it1y.ffs@tglx> <68baeac9-9fa7-5594-b5e7-4baf8ac86b77@huawei.com>
Date: Wed, 28 Jun 2023 23:09:56 +0200
Message-ID: <87zg4j2t0b.ffs@tglx>

On Wed, Jun 28 2023 at 15:30, Vincent Guittot wrote:
> On Tue, 27 Jun 2023 at 18:46, Vincent Guittot
>> > > +		struct hrtimer_clock_base *clock_base = cfs_b->period_timer.base;
>> > > +		int cpu = clock_base->cpu_base->cpu;
>> > > +		if (!cpu_active(cpu) && cpu != smp_processor_id())
>> > > +			hrtimer_start_expires(&cfs_b->period_timer,
>> > > 					      HRTIMER_MODE_ABS_PINNED);
>> > > 		return;
>> > > +	}

Can you please trim your replies?

>> I have been able to reproduce your problem and run your fix on top. I
>> still wonder if there is a
>
> Looks like I have been preempted and never finished the sentence. The
> full sentence is:
> I still wonder if there is a race condition where the hang can still
> happen but i haven't been able to find one so far

As I explained before. Assume the timer fires on the outgoing CPU and
the other CPU tries to rearm it concurrently. It will stay on the
outgoing CPU and not move over.

Thanks,

        tglx