From: Alison Chaiken
Date: Tue, 5 Apr 2022 09:21:16 -0700
Subject: Re: [RT][PATCH 2/2] tick: Fix timer storm since introduction of timersd
To: Frederic Weisbecker
Cc: LKML, linux-rt-users, Mel Gorman, Sebastian Andrzej Siewior, Thomas Gleixner, Glenn Elliott
References: <20220405010752.1347437-1-frederic@kernel.org> <20220405010752.1347437-2-frederic@kernel.org>
In-Reply-To: <20220405010752.1347437-2-frederic@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 4, 2022 at 9:33 PM Frederic Weisbecker wrote:
>
> If timers are pending
> while the tick is reprogrammed on nohz_mode, the
> next expiry is not armed to fire now, it is delayed one jiffy forward
> instead so as not to raise an inextinguishable timer storm with such
> scenario:
>
> 1) IRQ triggers and queue a timer
> 2) ksoftirqd() is woken up
> 3) IRQ tail: timer is reprogrammed to fire now
> 4) IRQ exit
> 5) TIMER interrupt
> 6) goto 3)
>
> ...all that until we finally reach ksoftirqd.
>
> Unfortunately we are checking the wrong softirq vector bitmask since
> timersd kthread has split from ksoftirqd. Timers now have their own
> vector state field that must be checked separately.

With kernel 5.15 and the timersd patch applied, we've observed that
x86_64 cores tend to enter deeper C-states even when there are pending
hrtimers. Presumably failure to check the right bits could also
explain that observation and, accordingly, the patch might fix it?

> As a result, the
> old timer storm is back. This shows up early on boot with extremely long
> initcalls:
>
>   [ 333.004807] initcall dquot_init+0x0/0x111 returned 0 after 323822879 usecs
>
> and the cause is uncovered with the right trace events showing just
> 10 microseconds between ticks (~100 000 Hz):
>
>   swapper/-1 1dn.h111 60818582us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415486608
>   swapper/-1 1dn.h111 60818592us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415496082
>   swapper/-1 1dn.h111 60818601us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415505550
>   swapper/-1 1dn.h111 60818611us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415515013
>   swapper/-1 1dn.h111 60818620us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415524483
>   swapper/-1 1dn.h111 60818630us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415533949
>   swapper/-1 1dn.h111 60818639us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415543426
>   swapper/-1 1dn.h111 60818649us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415553061
>   swapper/-1 1dn.h111 60818658us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415562511
>
> Fix this with checking the right timer vector state from the nohz code.
>
> Signed-off-by: Frederic Weisbecker
> Cc: Mel Gorman
> Cc: Sebastian Andrzej Siewior
> Cc: Thomas Gleixner
> ---
>  include/linux/interrupt.h | 12 ++++++++++++
>  kernel/softirq.c          |  7 +------
>  kernel/time/tick-sched.c  |  2 +-
>  3 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index e4b8a04e67ce..da248458f4d9 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -607,9 +607,16 @@ extern void raise_softirq(unsigned int nr);
>
>  #ifdef CONFIG_PREEMPT_RT
>  DECLARE_PER_CPU(struct task_struct *, timersd);
> +DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
> +
>  extern void raise_timer_softirq(void);
>  extern void raise_hrtimer_softirq(void);
>
> +static inline unsigned int local_pending_timers(void)
> +{
> +	return __this_cpu_read(pending_timer_softirq);
> +}
> +
>  #else
>  static inline void raise_timer_softirq(void)
>  {
> @@ -620,6 +627,11 @@ static inline void raise_hrtimer_softirq(void)
>  {
>  	raise_softirq_irqoff(HRTIMER_SOFTIRQ);
>  }
> +
> +static inline unsigned int local_pending_timers(void)
> +{
> +	return local_softirq_pending();
> +}
>  #endif
>
>  DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 89eb45614af6..c0aef5f760e5 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -625,12 +625,7 @@ static inline void tick_irq_exit(void)
>  }
>
>  DEFINE_PER_CPU(struct task_struct *, timersd);
> -static DEFINE_PER_CPU(unsigned long, pending_timer_softirq);
> -
> -static unsigned int local_pending_timers(void)
> -{
> -	return __this_cpu_read(pending_timer_softirq);
> -}
> +DEFINE_PER_CPU(unsigned long, pending_timer_softirq);
>
>  static void wake_timersd(void)
>  {
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 17a283ce2b20..7c359f029b97 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -763,7 +763,7 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
>
>  static inline bool local_timer_softirq_pending(void)
>  {
> -	return local_softirq_pending() & BIT(TIMER_SOFTIRQ);
> +	return local_pending_timers() & BIT(TIMER_SOFTIRQ);
>  }
>
>  static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
> --
> 2.25.1
>

Thanks,
Alison Chaiken
achaiken@aurora.tech
Aurora Innovation