Date: Tue, 14 Dec 2021 09:26:08 +0000
Message-ID: <87h7bbk05r.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Jay Chen <jkchen@linux.alibaba.com>
Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org,
    zhangliguang@linux.alibaba.com,
    Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Subject: Re: [RFC PATCH] irqchip/gic-v4.1:fix the kdump GIC ITS RAS error for ITS BASER2
In-Reply-To: <20211214064716.21407-1-jkchen@linux.alibaba.com>
References: <20211214064716.21407-1-jkchen@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

[+ Lorenzo, just in case...]

Hi Jay,

Thanks for this.

On Tue, 14 Dec 2021 06:47:16 +0000,
Jay Chen wrote:
>
> We encounter a GIC RAS Error in below flow:
> (1) Configure ITS related register (including
>     GITS_BASER2, GITS_BASER2.valid = 1'b1)
> (2) Configure GICR related register (including
>     GICR_VPROPBASER, GICR_VPROPBASER.valid = 1'b1)
>     The common settings in above 2 register are the same
>     and currently everything is OK
> (3) Kernel panic and os start the kdump flow.And then os
>     reconfigure ITS related register (including GITS_BASER2,
>     GITS_BASER2.valid = 1'b1). But at this time, gicr_vpropbaser
>     is not initialized, so it is still an old value. At this point,
>     the new value of its_baser2 and the old value of gicr_vpropbaser is
>     different, resulting in its RAS error.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=215327

I'm sorry, but I don't have any access to this. Please add all the
relevant details to the commit message and drop the link.

Could you please detail what HW this is on? The architecture
specification for GICv4.1 doesn't make any mention of RAS error
conditions, so this must be implementation specific. A reference to
the TRM of the IP would certainly help.

Now, I think you have identified something interesting, but I'm not
convinced by the implementation, see below.

>
> Signed-off-by: Jay Chen <jkchen@linux.alibaba.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index eb0882d15366..c340bbf4427b 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2623,6 +2623,12 @@ static int its_alloc_tables(struct its_node *its)
>  			return err;
>  		}
>
> +		if ((i == 2) && is_kdump_kernel() && is_v4_1(its)) {
> +			val = its_read_baser(its, baser);
> +			val &= ~GITS_BASER_VALID;
> +			its_write_baser(its, baser, val);
> +		}

This looks like a very odd way to address the issue. You are silently
disabling the Base Register containing the VPE table, and carrying on
as if nothing happened. What happens if someone starts a guest using
direct injection at this point? A kdump kernel is still a fully
fledged kernel, and I don't expect it to behave differently.

If we are to make this work, we need to either disable the v4.1
extension altogether or sanitise the offending registers so that we
don't leave things in a bad state. My preference is of course the
latter.

Could you please give this patch a go and let me know if it helps?

Thanks,

	M.

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index daec3309b014..cb339ace5046 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -920,6 +920,15 @@ static int __gic_update_rdist_properties(struct redist_region *region,
 {
 	u64 typer = gic_read_typer(ptr + GICR_TYPER);
 
+	/* Boot-time cleanup */
+	if ((typer & GICR_TYPER_VLPIS) && (typer & GICR_TYPER_RVPEID)) {
+		u64 val;
+
+		val = gicr_read_vpropbaser(ptr + SZ_128K + GICR_VPROPBASER);
+		val &= ~GICR_VPROPBASER_4_1_VALID;
+		gicr_write_vpropbaser(val, ptr + SZ_128K + GICR_VPROPBASER);
+	}
+
 	gic_data.rdists.has_vlpis &= !!(typer & GICR_TYPER_VLPIS);
 
 	/* RVPEID implies some form of DirectLPI, no matter what the doc says... :-/ */

-- 
Without deviation from the norm, progress is not possible.
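
For readers following the thread, the sequence under discussion can be
sketched as a small user-space model. This is only an illustration under
stated assumptions, not kernel code and not a description of the real
hardware: the struct, the model_* functions, the MODEL_* bit layout and
the table addresses are all hypothetical. In particular the "mismatch"
rule is an assumed stand-in for the implementation-specific RAS condition
reported above, which the architecture specification does not define.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model only. The bit layout and the mismatch rule below are
 * assumptions made for illustration; they are not taken from the GIC
 * architecture specification or from any TRM.
 */
#define MODEL_VALID	(1ULL << 63)
#define MODEL_ADDR_MASK	0x0000FFFFFFFFF000ULL

struct gic_model {
	uint64_t its_baser2;		/* stands in for GITS_BASER2 */
	uint64_t gicr_vpropbaser;	/* stands in for GICR_VPROPBASER */
};

/* Assumed error rule: both registers valid, but pointing at different tables. */
static bool model_mismatch(const struct gic_model *g)
{
	bool both_valid = (g->its_baser2 & MODEL_VALID) &&
			  (g->gicr_vpropbaser & MODEL_VALID);

	return both_valid &&
	       ((g->its_baser2 ^ g->gicr_vpropbaser) & MODEL_ADDR_MASK) != 0;
}

/* Boot-time cleanup: read-modify-write that clears only the valid bit. */
static void model_sanitise_vpropbaser(struct gic_model *g)
{
	g->gicr_vpropbaser &= ~MODEL_VALID;
}

/* Replays the reported flow; returns true if the model flags a mismatch. */
static bool model_kdump_boot(bool sanitise_first)
{
	struct gic_model g = { 0 };

	/* First kernel, steps (1) and (2): both registers agree. */
	g.its_baser2 = MODEL_VALID | 0x80000000ULL;
	g.gicr_vpropbaser = MODEL_VALID | 0x80000000ULL;

	/* kdump kernel: the registers still hold the old values. */
	if (sanitise_first)
		model_sanitise_vpropbaser(&g);

	/* Step (3): the ITS is reprogrammed with a newly allocated table. */
	g.its_baser2 = MODEL_VALID | 0x90000000ULL;

	return model_mismatch(&g);
}
```

In this model, calling model_kdump_boot(false) reproduces the stale-register
situation from step (3), while model_kdump_boot(true) clears the valid bit
on the redistributor side before the ITS is reprogrammed, which is the idea
behind the boot-time cleanup in the patch above.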