Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp260457pxy; Wed, 21 Apr 2021 02:16:10 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy+x8hAJ9nDnqOKmzFLOb0q81eFf318OQ1mI32MDwPzj3KXgOklbndaFTS9CKzTgkCz4U1t X-Received: by 2002:a17:906:1749:: with SMTP id d9mr30756130eje.12.1618996570284; Wed, 21 Apr 2021 02:16:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1618996570; cv=none; d=google.com; s=arc-20160816; b=jskevvbmCsRiZp81qn4PXzUGbcuN/yI51M1wZti4Fykr3RgqgnTIS8vIG6WnB6yprr aD7UvxXOgTGf/RJoumy31Jlj8UawGeBLXETcKeskbwMmmq5YzeF4wjtdpZ+Mj5WgkcH2 VLwMETRR2zk+3H/UmfwTPuqU0H034BaCUmB+GlDqHQIshEiG7fM+tTwYB39y6t0I7ii1 XkF3rY/OHXp5xdhnotQL6Xh8Lv26zWKJziEDXC/ts2csl9hNemQRIkmAPzeQfLMcgGT2 i5tmh6hiSlwB1gO5rRz4zWLtRTAc1sdR5KER5+greZJSJ2rq6eix26O+Y+KcMqO95Zvl bp7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :mime-version:user-agent:date:message-id:references:cc:to:from :subject; bh=u782JeDn6LzQ/KMt1qhCR8eVcxhYP9hhSND86TG1flg=; b=BoLF4T+p5THK3vdRsrAQnIhkATvz15RCTJ3n4asWPG4IJFiU5w7txMH4SIE06OKOPU TRG7SqERDcwiqhJ66H8gKvuw9q/0hoxIZINxOxCXkiKGBN1JUwVzc2q2CqGiZs1OoTKr ZS4iN18txDf5LbVP1s+lgzVgS23dt5LWgR2/MLQRKfEM5EUyzWL+MrGZr00DytMv3p3Z GOpoIW2kZhAQZ3/jkP652eDu3CE/1V6bVmss1mWuIuq8Q6lQTrlhFv3RI3/ZqUBi8W8w fqWrUzeureTwGlOI3Ir3aoHl3ycVqbPy4Ugjtdj/K8qHB34gDMNUO+4wk5Zy/laWhbQG 4jkA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ko25si1314304ejc.451.2021.04.21.02.15.45; Wed, 21 Apr 2021 02:16:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237972AbhDUJOT (ORCPT + 99 others); Wed, 21 Apr 2021 05:14:19 -0400 Received: from szxga04-in.huawei.com ([45.249.212.190]:16147 "EHLO szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237666AbhDUJOR (ORCPT ); Wed, 21 Apr 2021 05:14:17 -0400 Received: from DGGEMS402-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4FQFC644tNzpYlJ; Wed, 21 Apr 2021 17:10:42 +0800 (CST) Received: from [10.67.110.136] (10.67.110.136) by DGGEMS402-HUB.china.huawei.com (10.3.19.202) with Microsoft SMTP Server id 14.3.498.0; Wed, 21 Apr 2021 17:13:39 +0800 Subject: Re: [RFC PATCH] irqchip/gic-v3: Do not enable irqs when handling spurious interrups From: He Ying To: Marc Zyngier CC: , , , , , Mark Rutland References: <20210416062217.25157-1-heying24@huawei.com> <87y2dis4d7.wl-maz@kernel.org> Message-ID: <0bdfe2c8-1ebe-1f6c-3522-da821398d820@huawei.com> Date: Wed, 21 Apr 2021 17:13:40 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.110.136] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello Marc, Any ideas? Thanks. 在 2021/4/17 10:01, He Ying 写道: > Hello Marc, > > > 在 2021/4/16 22:15, Marc Zyngier 写道: >> [+ Mark] >> >> On Fri, 16 Apr 2021 07:22:17 +0100, >> He Ying wrote: >>> We found this problem in our kernel src tree: >>> >>> [   14.816231] ------------[ cut here ]------------ >>> [   14.816231] kernel BUG at irq.c:99! >>> [   14.816232] Internal error: Oops - BUG: 0 [#1] SMP >>> [   14.816232] Process swapper/0 (pid: 0, stack limit = >>> 0x(____ptrval____)) >>> [   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           >>> O      4.19.95-1.h1.AOS2.0.aarch64 #14 >>> [   14.816233] Hardware name: evb (DT) >>> [   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO) >>> [   14.816234] pc : asm_nmi_enter+0x94/0x98 >>> [   14.816235] lr : asm_nmi_enter+0x18/0x98 >>> [   14.816235] sp : ffff000008003c50 >>> [   14.816235] pmr_save: 00000070 >>> [   14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0 >>> [   14.816238] x27: 0000000000000000 x26: ffff000008004000 >>> [   14.816239] x25: 00000000015e0000 x24: ffff8008fb916000 >>> [   14.816240] x23: 0000000020400005 x22: ffff0000080817cc >>> [   14.816241] x21: ffff000008003da0 x20: 0000000000000060 >>> [   14.816242] x19: 00000000000003ff x18: ffffffffffffffff >>> [   14.816243] x17: 0000000000000008 x16: 003d090000000000 >>> [   14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40 >>> [   14.816244] x13: ffff8008fff58b9d x12: 0000000000000000 >>> [   14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5 >>> [   14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f >>> [   14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e >>> [   14.816248] x5 : 0000000000000000 x4 : 0000000080000000 >>> [   14.816249] x3 : 0000000000000000 x2 : 0000000080000000 >>> [   14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0 >>> [   14.816251] Call trace: >>> [   14.816251]  asm_nmi_enter+0x94/0x98 >>> [   14.816251]  el1_irq+0x8c/0x180 >>> [   14.816252]  gic_handle_irq+0xbc/0x2e4 >>> [   14.816252]  el1_irq+0xcc/0x180 >>> [   14.816253]  arch_timer_handler_virt+0x38/0x58 >>> [   14.816253]  handle_percpu_devid_irq+0x90/0x240 >>> [   14.816253]  generic_handle_irq+0x34/0x50 >>> [   14.816254]  __handle_domain_irq+0x68/0xc0 >>> [   14.816254]  gic_handle_irq+0xf8/0x2e4 >>> [   14.816255]  el1_irq+0xcc/0x180 >>> [   14.816255]  arch_cpu_idle+0x34/0x1c8 >>> [   14.816255]  default_idle_call+0x24/0x44 >>> [   14.816256]  do_idle+0x1d0/0x2c8 >>> [   14.816256]  cpu_startup_entry+0x28/0x30 >>> [   14.816256]  rest_init+0xb8/0xc8 >>> [   14.816257]  start_kernel+0x4c8/0x4f4 >>> [   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000) >>> [   14.816258] Modules linked in: start_dp(O) smeth(O) >>> [   15.103092] ---[ end trace 701753956cb14aa8 ]--- >>> [   15.103093] Kernel panic - not syncing: Fatal exception in interrupt >>> [   15.103099] SMP: stopping secondary CPUs >>> [   15.103100] Kernel Offset: disabled >>> [   15.103100] CPU features: 0x36,a2400218 >>> [   15.103100] Memory Limit: none >> Urgh... >> >>> Our kernel src tree is based on 4.19.95 and backports arm64 pseudo-NMI >>> patches but doesn't support nested NMI. Its top relative commit is >>> commit 17ce302f3117 ("arm64: Fix interrupt tracing in the presence >>> of NMIs"). >> Can you please reproduce it with mainline and without any backport? >> It is hard to reason about something that isn't a vanilla kernel. > > I think our kernel is quite like v5.3 mainline. Reproducing it in v5.3 > mainline may > > be a little difficult for us because our product needs some more self > developed > > patches to work. > >> >>> I look into this issue and find that it's caused by 'BUG_ON(in_nmi())' >>> in nmi_enter(). From the call trace, we find two 'el1_irqs' which >>> means an interrupt preempts the other one and the new one is an NMI. >>> Furthermore, by adding some prints, we find the first irq also calls >>> nmi_enter(), but its priority is not GICD_INT_NMI_PRI and its irq >>> number >>> is 1023. It enables irq by calling gic_arch_enable_irqs() in >>> gic_handle_irq(). At this moment, the second irq preempts the first irq >>> and it's an NMI but current context is already in nmi. So that may be >>> the problem. >> I'm not sure I get it. From the stack trace, I see this: >> >> [   14.816251]  asm_nmi_enter+0x94/0x98 >> [   14.816251]  el1_irq+0x8c/0x180            (C) >> [   14.816252]  gic_handle_irq+0xbc/0x2e4 >> [   14.816252]  el1_irq+0xcc/0x180            (B) >> [   14.816253]  arch_timer_handler_virt+0x38/0x58 >> [   14.816253]  handle_percpu_devid_irq+0x90/0x240 >> [   14.816253]  generic_handle_irq+0x34/0x50 >> [   14.816254]  __handle_domain_irq+0x68/0xc0 >> [   14.816254]  gic_handle_irq+0xf8/0x2e4 >> [   14.816255]  el1_irq+0xcc/0x180            (A) >> >> which indicates that we preempted a timer interrupt (A) with another >> IRQ (B), itself immediately preempted by another IRQ (C)? That's >> indeed at least one too many. >> >> Can you please describe for each of (A), (B) and (C) whether they are >> spurious or not, what their priorities are if they aren't spurious? > > Yes. I ignored interrupt (A). (B) is spurious and its priority is 0xa0 > and PMR is 0x70. > > (C) is an NMI and its priority is 0x20. Note that GIC_PRIO_IRQON is 0xe0, > > GIC_PRIO_IRQOFF is 0x60, GICD_INT_DEF_PRI is 0xa0 and GICD_INT_NMI_PRI is > > 0x20 in our kernel. > >>> In my opinion, when handling spurious interrupts, we shouldn't >>> enable irqs. >>> My reason is that for spurious interrupts we may enter nmi context in >>> el1_irq() because current PMR may be GIC_PRIO_IRQOFF. If we enable irqs >>> at this time, another NMI may happen and preempt this spurious >>> interrupt >>> but the context is already in nmi. That causes a bug on if nested >>> NMI is >>> not supported. Even for nested nmi, I think it's not a normal scenario. >> I would tend to agree that this isn't great. Actually, I'd probably >> move the check for a spurious interrupt right after the read of >> ICC_IAR1_EL1, because there is no real need to do anything else at >> that point. > > So, we don't need to check NMI for spurious interrupts? Do you mean > that spurious > > interrupts' can't be NMIs? Or even spurious interrups are NMIs, we > shouldn't do > > anything for them? If so, I will move the check after the read of > ICC_IAR1_EL1 and > > send a V2. > >> >> However, upstream is quite different from 4.19 in that respect, and >> I'm not sure if what I am looking at is what you are seeing with your >> older kernel. > > I know that. And I look into all patches about arm64 pseudo NMIs. As I > said before, > > our kernel is very quite like v5.3 mainline. I think we are talking > about the same thing. > > > In my opinion, since commit 17ce302f3117 ("arm64: Fix interrupt > tracing in the presence of NMIs"), > > spurious interrups can enter nmi context in interrupt entry because > PMR can be GIC_PRIO_IRQOFF > > for spurious interrupts. That means test_irqs_unmasked is not 0 and > asm_nmi_enter is called. > >    (some el1_irq entry code from v5.3) > >    test_irqs_unmasked  res=x0, pmr=x20 > >    cbz x0, 1f > >    bl asm_nmi_enter > > > And it then calls gic_handle_irq(). It doesn't call gic_handle_nmi() > because its priority is not GICD_INT_NMI_PRI. > > Then it enables irqs. If at that point another NMI comes and preempts > it, which means NMI occurs in nmi > > context. That may cause a bug on if nested NMI is not supported. > >    (some gic_handle_irq code from v5.3) > >    irqnr = gic_read_iar(); > >    if (gic_supports_nmi() && >        unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { >            gic_handle_nmi(irqnr, > regs);                                   (C) >            return; >    } > >    if (gic_prio_masking_enabled()) { >          gic_pmr_mask_irqs(); > gic_arch_enable_irqs(); (D) >    } > >> >> Thanks, >> >>     M. >> >>> Fixes: 17ce302f3117 ("arm64: Fix interrupt tracing in the presence >>> of NMIs") >>> Signed-off-by: He Ying >>> --- >>>   drivers/irqchip/irq-gic-v3.c | 8 ++++---- >>>   1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/irqchip/irq-gic-v3.c >>> b/drivers/irqchip/irq-gic-v3.c >>> index 94b89258d045..d3b52734a2c5 100644 >>> --- a/drivers/irqchip/irq-gic-v3.c >>> +++ b/drivers/irqchip/irq-gic-v3.c >>> @@ -654,15 +654,15 @@ static asmlinkage void __exception_irq_entry >>> gic_handle_irq(struct pt_regs *regs >>>           return; >>>       } >>>   +    /* Check for special IDs first */ >>> +    if ((irqnr >= 1020 && irqnr <= 1023)) >>> +        return; >>> + >>>       if (gic_prio_masking_enabled()) { >>>           gic_pmr_mask_irqs(); >>>           gic_arch_enable_irqs(); >>>       } >>>   -    /* Check for special IDs first */ >>> -    if ((irqnr >= 1020 && irqnr <= 1023)) >>> -        return; >>> - >>>       if (static_branch_likely(&supports_deactivate_key)) >>>           gic_write_eoir(irqnr); >>>       else >>> -- >>> 2.17.1 >>> >>> > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel