Date: Thu, 6 Aug 2020 06:27:10 -0700
From: "Paul E. McKenney"
To: peterz@infradead.org
Cc: Thomas Gleixner, Valentin Schneider, Vladimir Oltean, Kurt Kanzenbach,
 Alison Wang, catalin.marinas@arm.com, will@kernel.org, mw@semihalf.com,
 leoyang.li@nxp.com, vladimir.oltean@nxp.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Anna-Maria Gleixner
Subject: Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
Message-ID: <20200806132710.GL4295@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <87lfiwm2bj.fsf@nanos.tec.linutronix.de>
 <20200803114112.mrcuupz4ir5uqlp6@skbuf>
 <87d047n4oh.fsf@nanos.tec.linutronix.de>
 <875z9zmt4i.fsf@nanos.tec.linutronix.de>
 <20200805134002.GQ2674@hirez.programming.kicks-ass.net>
 <20200805153120.GU2674@hirez.programming.kicks-ass.net>
 <874kpgi025.fsf@nanos.tec.linutronix.de>
 <20200806114545.GA2674@hirez.programming.kicks-ass.net>
In-Reply-To: <20200806114545.GA2674@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 06, 2020 at 01:45:45PM +0200, peterz@infradead.org wrote:
> On Thu, Aug 06, 2020 at 11:41:06AM +0200, Thomas Gleixner wrote:
> > peterz@infradead.org writes:
> > > On Wed, Aug 05, 2020 at 02:56:49PM +0100, Valentin Schneider wrote:
> > >
> > >> I've been tempted to say the test case is a bit bogus, but am not familiar
> > >> enough with the RT throttling details to stand that ground. That said, from
> > >> both looking at the execution and the stress-ng source code, it seems to
> > >> unconditionally spawn 32 FIFO-50 tasks (there's even an option to make
> > >> these FIFO-99!!!), which is quite a crowd on monoCPU systems.
> > >
> > > Oh, so it's a case of: we do stupid without tuning and the system falls
> > > over. I can live with that.
> >
> > It's not a question of whether you can live with that behaviour for a
> > particular silly test case.
> >
> > The same happens with a single RT runaway task with enough interrupt
> > load on a UP machine. Just validated that.
>
> Of course.
>
> > And that has nothing to do
> > with a silly test case. Sporadic runaways due to a bug in a once per
> > week code path simply can happen and having the safety net working
> > depending on a config option selected or not is just wrong.
>
> The safety thing is concerned with RT tasks. It doesn't pretend to help
> with runaway IRQs, never has, never will.

Getting into the time machine back to the 1990s...

DYNIX/ptx had a discretionary mechanism to deal with excessive interrupts.
There was a function that long-running interrupt handlers were supposed
to call periodically that would return false if the system felt that
the CPU had done enough interrupts for the time being. In that case,
the interrupt handler was supposed to schedule itself for a later time,
but leave the interrupt unacknowledged in order to prevent retriggering
in the meantime.

Of course, this mechanism would be rather less helpful in Linux.

For one, Linux has way more device drivers and way more oddball devices.
In contrast, the few devices that DYNIX/ptx supported were carefully
selected, and the selection criteria included being able to put up with
this sort of thing. Also, the fact that there was but a handful of
device drivers meant that changes like this could be more easily
propagated through all drivers.

Also, Linux supports way more workloads. In contrast, DYNIX/ptx could
pick a small percentage of each CPU that would be permitted to be used
by hardware interrupt handlers. As in there are probably Linux workloads
that run >90% of some poor CPU within hardware interrupt handlers.

But reminiscing anyway on the off-chance that this inspires someone to
come up with an idea that would work well in the Linux environment.
Thanx, Paul

> The further extreme is an interrupt storm, those have always taken a
> machine down.
>
> Accounting unrelated IRQ time to RT tasks is equally wrong, the task
> execution is unrelated to the IRQs. The config option at least offers
> insight into where time goes -- and it's a config option because doing
> time accounting on interrupts adds overhead :/
>
> This really is a no-win all round.
>
> The only 'sensible' option here is threaded IRQs, where the IRQ line
> gets disabled until the handler thread has run, that also helps with IRQ
> storms.