Date: Tue, 28 Sep 2021 11:26:04 +0100
From: Mark Rutland
To: Sven Schnelle
Cc: "Paul E. McKenney", Pingfan Liu, Thomas Gleixner,
    linux-arm-kernel@lists.infradead.org, Catalin Marinas, Will Deacon,
    Marc Zyngier, Joey Gouly, Sami Tolvanen, Julien Thierry, Yuichi Ito,
    linux-kernel@vger.kernel.org, Vasily Gorbik, Heiko Carstens
Subject: Re: [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of
Message-ID: <20210928102604.GE1924@C02TD0UTHF1T.local>
References: <20210924132837.45994-1-kernelfans@gmail.com>
    <20210924173615.GA42068@C02TD0UTHF1T.local>
    <20210924225954.GN880162@paulmck-ThinkPad-P17-Gen-1>
    <20210927092303.GC1131@C02TD0UTHF1T.local>
    <20210928000922.GY880162@paulmck-ThinkPad-P17-Gen-1>
    <20210928083222.GA1924@C02TD0UTHF1T.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 28, 2021 at 11:52:51AM +0200, Sven Schnelle wrote:
> Mark Rutland writes:
>
> > On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> >> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> >> > Sure; I didn't mean to suggest those weren't balanced! The problem here
> >> > is *nesting*. Due to the structure of our entry code and the core IRQ
> >> > code, when handling an IRQ we have a sequence:
> >> >
> >> >     irq_enter() // arch code
> >> >     irq_enter() // irq code
> >> >
> >> >     < irq handler here >
> >> >
> >> >     irq_exit() // irq code
> >> >     irq_exit() // arch code
> >> >
> >> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> >> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> >> > expected result because of the additional nesting, since
> >> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> >> > is only incremented once per exception entry, when it does:
> >> >
> >> >     /* Are we at first interrupt nesting level? */
> >> >     nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> >> >     if (nesting > 1)
> >> >         return false;
> >> >
> >> > What I'm trying to figure out is whether that expectation is legitimate,
> >> > and assuming so, where the entry/exit should happen.
> >>
> >> Oooh...
> >>
> >> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> >> be unable to detect a userspace quiescent state for a non-nohz_full
> >> CPU. That could result in RCU CPU stall warnings if a user task runs
> >> continuously on a given CPU for more than 21 seconds (60 seconds in
> >> some distros). And this can easily happen if the user has a CPU-bound
> >> thread that is the only runnable task on that CPU.
> >>
> >> So, yes, this does need some sort of resolution.
> >>
> >> The traditional approach is (as you surmise) to have only a single call
> >> to irq_enter() on exception entry and only a single call to irq_exit()
> >> on exception exit. If this is feasible, it is highly recommended.
> >
> > Cool; that's roughly what I was expecting / hoping to hear!
> >
> >> In theory, we could have that "1" in "nesting > 1" be a constant supplied
> >> by the architecture (you would want "3" if I remember correctly) but
> >> in practice could we please avoid this? For one thing, if there is
> >> some other path into the kernel for your architecture that does only a
> >> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> >> a chance. It would need to compare against a different value depending
> >> on what exception showed up. Even if that cannot happen, it would be
> >> better if your architecture could remain in blissful ignorance of the
> >> colorful details of ->dynticks_nmi_nesting manipulations.
> >
> > I completely agree. I think it's much harder to keep that in check than
> > to enforce a "once per architectural exception" policy in the arch code.
> >
> >> Another approach would be for the arch code to supply RCU a function that
> >> it calls. If there is such a function (or perhaps better, if some new
> >> Kconfig option is enabled), RCU invokes it. Otherwise, it compares to
> >> "1" as it does now. But you break it, you buy it! ;-)
> >
> > I guess we could look at the exception regs and inspect the original
> > context, but it sounds overkill...
> >
> > I think the cleanest thing is to leave this to arch code, and have the
> > common IRQ code stay well clear. Unfortunately most architectures
> > (including arch/arm) still need the common IRQ code to handle this, so
> > we'll have to make that conditional on Kconfig, something like the below
> > (build+boot tested only).
> >
> > If there are no objections, I'll go check who else needs the same
> > treatment (IIUC at least s390 will), and spin that as a real
> > patch/series.
>
> Hmm, s390 doesn't use handle_domain_irq() and doesn't have
> HANDLE_DOMAIN_IRQ set. So i don't think the patch below applies to s390.
> However, i'll follow the code to make sure we're not calling
> irq_enter/irq_exit twice.

I wasn't clear, but for s390, my concern was that in do_io_irq() and
do_ext_irq() you have the sequence:

    irqentry_enter()    // calls rcu_irq_enter()
    irq_enter();        // calls rcu_irq_enter() then irq_enter_rcu();

    < handler >

    irq_exit();         // calls __irq_exit_rcu then rcu_irq_exit();
    irqentry_exit();    // calls rcu_irq_exit()

... and so IIUC you call rcu_irq_enter() and rcu_irq_exit() twice,
getting the same double-increment of `dynticks_nmi_nesting` per
interrupt, and the same potential problem with
rcu_is_cpu_rrupt_from_idle().

Thanks,
Mark.