Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp445820rwd; Wed, 31 May 2023 00:03:57 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6X0fJC/1H/2bh/Uwy2JAqN9JupONORiuRttQAUzDRe8Fe4jNRb9pJREEv4aQE1nkmFH8fT X-Received: by 2002:a05:6808:298c:b0:397:ee99:a616 with SMTP id ex12-20020a056808298c00b00397ee99a616mr3074431oib.15.1685516637517; Wed, 31 May 2023 00:03:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685516637; cv=none; d=google.com; s=arc-20160816; b=zdgHTPrgdGyST3DtDjrtdRsOHJplgsRwUZwxKgTrbVDCEC1xtBfqTiVoKBxHSVPVgh D0Aq20ydud6aSPzE5y0f2s/PHdcfMVeJAaubBx/FA/BuPHkcQ5WUv7b/xD1kDudeAd4M +tn556vpbpPFcetXY6k+vS6UzeWhNP7jR/Yp/H8nhIsBQJTykb7Zzo4MYOQdhuPjBS4Z pCljnsCp6Y2W/DpyU8fkM1KRLz5ZVz4Q52hF1ObKHh7TFXEuBv44C5QJV33TWIZq5Dz0 ChUZZG62JJ37i5VtucbPqY0NZVjutRdgHsE0CXdG5W/N62y+761S0gQm0DSnc2zffCca kfAw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent:references:in-reply-to :subject:cc:to:from:message-id:date:dkim-signature; bh=zXb6uBQeR7OTs03ALwGzBRdWSwiDGAaygGjubFxYkpw=; b=HU3gCAQc0cTLpeNJ1HM2uH9EDLzLux453Zb32zrB+wV4yhQ8Rfq2p9b+B+M9O2n092 uiIJ6p15IRTxElcK/gpQy0WtCxDLeCK5YQc6Y0lGRdo4VCNG3mZamqE+bHROMHmDclkC Cicn04nYACc0GmrXy6j5n2BUlUcb+bEjb/EB1OuBlSK8nyvMFwCP6FZnHDm+Ucl1TXAL tu5da/BJ6Eipj+ZyAp6C6I3ZJhLAPzmTyqFi5cTZBDwx3hZ/13HjQN1W9ZmT8BxWXuCT eaHZqnn3nzIuGilmaG8/Qqh9XHdnK7loZq3Q6jMRKGsX4JCY349WN05Nbvgm6u0nJLIs 6ppg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ip037B3P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b22-20020a637156000000b0053481057d3asi506538pgn.75.2023.05.31.00.03.39; Wed, 31 May 2023 00:03:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ip037B3P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234518AbjEaHA6 (ORCPT + 99 others); Wed, 31 May 2023 03:00:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234308AbjEaHAr (ORCPT ); Wed, 31 May 2023 03:00:47 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F39F186 for ; Wed, 31 May 2023 00:00:41 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C96F61290 for ; Wed, 31 May 2023 07:00:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 858ACC433D2; Wed, 31 May 2023 07:00:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1685516440; bh=ZiKHj++BwH3B7HG2lnoOWuj+ZH79sYmATg6sHn5lT9A=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=ip037B3Pa7i51xmV2/KgC4EKvtZj5TvdZXpNqNa5882jeEUQsGAPGaZOVjHT6ltY3 L4qilchlXCvV5PxXza3JZdkuvH+/6CCJgCxiZhmMaIk8afGh2nWi1xiHgiTKEi2f81 y4/j18LcgskrE4d5HaU74LAX0eQGvv44wQnUc0owOj3EKxVmqdZ4hcf9BvPyDTuK8A Pfw8fYxOcK62l/Lj7e9JEH3eXsAcA+1fEYq9GzmDI/9gjES8G9VuoYbmX+VKyOPtax FG3bSDH283vPav8V7g6xC5UEFFsK+XMm1qTjjF4eAZEp/QlyrJ0c8QNCUJBJePIebo pnnnTLLiz+7FA== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1q4Foz-001XQs-MM; Wed, 31 May 2023 08:00:37 +0100 Date: Wed, 31 May 2023 08:00:37 +0100 Message-ID: <867cspc7e2.wl-maz@kernel.org> From: Marc Zyngier To: James Gowans Cc: Thomas Gleixner , , Liao Chang , KarimAllah Raslan , Yipeng Zou , Zhang Jianhua Subject: Re: [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke In-Reply-To: <20230530213848.3273006-2-jgowans@amazon.com> References: <20230530213848.3273006-1-jgowans@amazon.com> <20230530213848.3273006-2-jgowans@amazon.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: jgowans@amazon.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, liaochang1@huawei.com, karahmed@amazon.com, zouyipeng@huawei.com, chris.zjh@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 30 May 2023 22:38:48 +0100, James Gowans wrote: > > Update the generic handle_fasteoi_irq to cater for the case when the > next interrupt comes in while the previous handler is still running. > Currently when that happens the irq_may_run() early out causes the next > IRQ to be lost. Change the behaviour to mark the interrupt as pending > and re-send the interrupt when handle_fasteoi_irq sees that the pending > flag has been set. This is largely inspired by handle_edge_irq. > > Generally it should not be possible for the next interrupt to arrive > while the previous handler is still running: the next interrupt should > only arrive after the EOI message has been sent and the previous handler > has returned. There is no such message with LPIs. I pointed that out previously. > However, there is a race where if the interrupt affinity > is changed while the previous handler is running, then the next > interrupt can arrive at a different CPU while the previous handler is > still running. In that case there will be a concurrent invoke and the > early out will be taken. > > For example: > > CPU 0 | CPU 1 > -----------------------------|----------------------------- > interrupt start | > handle_fasteoi_irq | set_affinity(CPU 1) > handler | > ... | interrupt start > ... | handle_fasteoi_irq -> early out > handle_fasteoi_irq return | interrupt end > interrupt end | > > This issue was observed specifically on an arm64 system with a GIC-v3 > handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is > that the global ITS is responsible for affinity but does not know > whether interrupts are pending/running, only the CPU-local redistributor > handles the EOI. Hence when the affinity is changed in the ITS, the new > CPU's redistributor does not know that the original CPU is still running > the handler. Similar to your previous patch, you don't explain *why* the interrupt gets delivered when it is an LPI, and not for any of the other GICv3 interrupt types. That's an important point. > > Implementation notes: > > It is believed that it's NOT necessary to mask the interrupt in > handle_fasteoi_irq() the way that handle_edge_irq() does. This is > because handle_edge_irq() caters for controllers which are too simple to > gate interrupts from the same source, so the kernel explicitly masks the > interrupt if it re-occurs [0]. > > [0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@huawei.com/ > > Suggested-by: Liao Chang > Signed-off-by: James Gowans > Cc: Thomas Gleixner > Cc: Marc Zyngier > Cc: KarimAllah Raslan > Cc: Yipeng Zou > Cc: Zhang Jianhua > --- > kernel/irq/chip.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c > index 49e7bc871fec..42f33e77c16b 100644 > --- a/kernel/irq/chip.c > +++ b/kernel/irq/chip.c > @@ -692,8 +692,15 @@ void handle_fasteoi_irq(struct irq_desc *desc) > > raw_spin_lock(&desc->lock); > > - if (!irq_may_run(desc)) > + /* > + * When an affinity change races with IRQ delivery, the next interrupt > + * can arrive on the new CPU before the original CPU has completed > + * handling the previous one. Mark it as pending and return EOI. > + */ > + if (!irq_may_run(desc)) { > + desc->istate |= IRQS_PENDING; > goto out; > + } > > desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING); > > @@ -715,6 +722,12 @@ void handle_fasteoi_irq(struct irq_desc *desc) > > cond_unmask_eoi_irq(desc, chip); > > + /* > + * When the race descibed above happens, this will resend the interrupt. > + */ > + if (unlikely(desc->istate & IRQS_PENDING)) > + check_irq_resend(desc, false); > + > raw_spin_unlock(&desc->lock); > return; > out: While I'm glad that you eventually decided to use the resend mechanism instead of spinning on the "old" CPU, I still think imposing this behaviour on all users without any discrimination is wrong. Look at what it does if an interrupt is a wake-up source. You'd pointlessly requeue the interrupt (bonus points if the irqchip doesn't provide a HW-based retrigger mechanism). I still maintain that this change should only be applied for the particular interrupts that *require* it, and not as a blanket change affecting everything under the sun. I have proposed such a change in the past, feel free to use it or roll your own. In the meantime, I strongly oppose this change as proposed. Thanks, M. -- Without deviation from the norm, progress is not possible.