Received: by 2002:a89:48b:0:b0:1f5:f2ab:c469 with SMTP id a11csp1174477lqd; Thu, 25 Apr 2024 08:00:39 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVFIAkXmcLRXpN+I+Nkwi9W/0yroKcKdmThhtwHOu4Vr/WouA3fG08zrl6Pb1msk4vj3LjwJF58DvOL6XQNZdQcSsGZzvP0LS5zGX3gaQ== X-Google-Smtp-Source: AGHT+IGiRVX6BsS+uYm2TPqzGQaFCea0+7Z1HyDuD0GLlMJqwzwvGZvAbPaZf4Z5kE4seNWR8WOO X-Received: by 2002:a05:6a20:7286:b0:1a7:242a:cb69 with SMTP id o6-20020a056a20728600b001a7242acb69mr6846124pzk.40.1714057238696; Thu, 25 Apr 2024 08:00:38 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1714057238; cv=pass; d=google.com; s=arc-20160816; b=l3nA5rqPFrJuaMvtL7Yg9BeDWiQm/pYobzgpnAREX71jW5iDS+QAcsxM1kmY9clJUe V2VoHecD0/1yvqsaC7FrZY58GasM+todFNhxfb1MWiRsHC9n/xkEU9+G8hG9Reybvi9C iRhtFSqHpzEbmaJIFlgsHLp/3rK8cOsonSvCLFbPoNCZ5+d4fCegxSSWPxJw6Y1bkC6f 0/V3c9BgqG2VJP29Hs3v3pJamgFuDxEX8HauJvi5h1+ZvbqLWd3AtHlo44mqkddBs6fH b0bcq+ugwXotatAI3tQ+bN7Z2kDUbWTg18PMXLvoi5TvodnrsdFO6GmuhAjI7/JU+Rcg BNCQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:organization:references :in-reply-to:message-id:subject:cc:to:from:date; bh=EJocEBv2TcQoEiYQvh5CF/Pn0JvnUGQ8fj4RYscSPtg=; fh=uMCVkZb0QdIDswU2jAV0PfB1+Rt7eki7rJfvDRFFsB8=; b=om2TOrqYft/xutOc805XJcq9Ja171cTpYDMFwQJxzEPRt/7bPNSh+/wR9fzVbixUpk gItmTMEtfR+7xyaVRCv/W9OeLfTB034rpiEvaUXzrn2fth6bPhSbK3vVQ4AYsb3q9NuN KDU0JuV/Oc6bb8npDe7/v08rCC40V57KasgQsn3IAzrGvlXNFT/Ab6hlRqi5CKE+PafS 4MXdPhRijDjzDbl1mvxq8bMlTb5WY9szE3owsRH0HfLfuNOTyibbLIc5tU9yPJEFVM8Q RoeGUiCOpTocQf03S+IJxJhbbH8e9lzA7kC5p2B8DqUwMdVWu07V6Cj1JVT8kMbnDY3x MYzg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huawei.com dmarc=pass fromdomain=huawei.com); spf=pass (google.com: domain of linux-kernel+bounces-158751-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-158751-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id m20-20020a638c14000000b005f8071cbc1dsi11093795pgd.176.2024.04.25.08.00.38 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 25 Apr 2024 08:00:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-158751-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huawei.com dmarc=pass fromdomain=huawei.com); spf=pass (google.com: domain of linux-kernel+bounces-158751-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-158751-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id BC7A2281B84 for ; Thu, 25 Apr 2024 15:00:37 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B9DD114A622; Thu, 25 Apr 2024 15:00:28 +0000 (UTC) Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C78E1494BC; Thu, 25 Apr 2024 15:00:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714057228; cv=none; b=iUe0G5u4yM6kkt80L2mZUZhF5/1cn9/087qkXfXySYmmyBGO+FsDJ9LxqTPGbof0x6i11D/k4Ss1/ia89AniK2bx9Vem4/ft0aGqUXShBGqe3qb3r9yFCDqnoAf4Dnrh1kcJfDK1fLy5BOdn8rFJzcCZyK8vm2oF4/1XxwBZJS4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714057228; c=relaxed/simple; bh=HvN8k9aDJPOql2/EPOWvZFEF30jxyxQy9xYUGmACbj4=; h=Date:From:To:CC:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=EYZWjEQlvKj+aGQIc+6GELaKzm9lL3wyIbj6z5r+c8tjSoQ6vs4EBysZgAMfFF3IalRa8fnviVH0nP2AWQzdpysVJY2cHk64f8K+a1nGER61NYJrYUYjnbHPFiViyGrQ3Wb2jwT92U3w/rC8FjVDEi8Psmkr7NfvqcYJB5btanA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=Huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=Huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.216]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4VQJtd31g7z6DBbS; Thu, 25 Apr 2024 23:00:09 +0800 (CST) Received: from lhrpeml500005.china.huawei.com (unknown [7.191.163.240]) by mail.maildlp.com (Postfix) with ESMTPS id 917A4140736; Thu, 25 Apr 2024 23:00:19 +0800 (CST) Received: from localhost (10.202.227.76) by lhrpeml500005.china.huawei.com (7.191.163.240) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Thu, 25 Apr 2024 16:00:18 +0100 Date: Thu, 25 Apr 2024 16:00:17 +0100 From: Jonathan Cameron To: Marc Zyngier , CC: Thomas Gleixner , Peter Zijlstra , , , , , , , , , Russell King , "Rafael J . Wysocki" , Miguel Luis , "James Morse" , Salil Mehta , Jean-Philippe Brucker , Catalin Marinas , Will Deacon , Ingo Molnar , Borislav Petkov , Dave Hansen , , Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs Message-ID: <20240425155726.000063f7@huawei.com> In-Reply-To: <20240425133150.000009fa@Huawei.com> References: <20240418135412.14730-1-Jonathan.Cameron@huawei.com> <20240418135412.14730-12-Jonathan.Cameron@huawei.com> <20240422114020.0000294f@Huawei.com> <87plugthim.wl-maz@kernel.org> <20240424135438.00001ffc@huawei.com> <86il06rd19.wl-maz@kernel.org> <20240425133150.000009fa@Huawei.com> Organization: Huawei Technologies Research and Development (UK) Ltd. X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-ClientProxiedBy: lhrpeml100006.china.huawei.com (7.191.160.224) To lhrpeml500005.china.huawei.com (7.191.163.240) On Thu, 25 Apr 2024 13:31:50 +0100 Jonathan Cameron wrote: > On Wed, 24 Apr 2024 16:33:22 +0100 > Marc Zyngier wrote: > > > On Wed, 24 Apr 2024 13:54:38 +0100, > > Jonathan Cameron wrote: > > > > > > On Tue, 23 Apr 2024 13:01:21 +0100 > > > Marc Zyngier wrote: > > > > > > > On Mon, 22 Apr 2024 11:40:20 +0100, > > > > Jonathan Cameron wrote: > > > > > > > > > > On Thu, 18 Apr 2024 14:54:07 +0100 > > > > > Jonathan Cameron wrote: > > > > [...] > > > > > > > > > > > > > + /* > > > > > > + * Capable but disabled CPUs can be brought online later. What about > > > > > > + * the redistributor? ACPI doesn't want to say! > > > > > > + * Virtual hotplug systems can use the MADT's "always-on" GICR entries. > > > > > > + * Otherwise, prevent such CPUs from being brought online. > > > > > > + */ > > > > > > + if (!(gicc->flags & ACPI_MADT_ENABLED)) { > > > > > > + pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu); > > > > > > + set_cpu_present(cpu, false); > > > > > > + set_cpu_possible(cpu, false); > > > > > > + return 0; > > > > > > + } > > > > > > > > It seems dangerous to clear those this late in the game, given how > > > > disconnected from the architecture code this is. Are we sure that > > > > nothing has sampled these cpumasks beforehand? > > > > > > Hi Marc, > > > > > > Any firmware that does this is being considered as buggy already > > > but given it is firmware and the spec doesn't say much about this, > > > there is always the possibility. > > > > There is no shortage of broken firmware out there, and I expect this > > trend to progress. > > > > > Not much happens between the point where these are setup and > > > the point where the the gic inits and this code runs, but even if careful > > > review showed it was fine today, it will be fragile to future changes. > > > > > > I'm not sure there is a huge disadvantage for such broken firmware in > > > clearing these masks from the point of view of what is used throughout > > > the rest of the kernel. Here I think we are just looking to prevent the CPU > > > being onlined later. > > > > I totally agree on the goal, I simply question the way you get to it. > > > > > > > > We could add a set_cpu_broken() with appropriate mask. > > > Given this is very arm64 specific I'm not sure Rafael will be keen on > > > us checking such a mask in the generic ACPI code, but we could check it in > > > arch_register_cpu() and just not register the cpu if it matches. > > > That will cover the vCPU hotplug case. > > > > > > Does that sounds sensible, or would you prefer something else? > > > > > > Such a 'broken_rdists' mask is exactly what I have in mind, just > > keeping it private to the GIC driver, and not expose it anywhere else. > > You can then fail the hotplug event early, and avoid changing the > > global masks from within the GIC driver. At least, we don't mess with > > the internals of the kernel, and the CPU is properly marked as dead > > (that mechanism should already work). > > > > I'd expect the handling side to look like this (will not compile, but > > you'll get the idea): > Hi Marc, > > In general this looks good - but... > > I haven't gotten to the bottom of why yet (and it might be a side > effect of how I hacked the test by lying in minimal fashion and > just frigging the MADT read functions) but the hotplug flow is only getting > as far as calling __cpu_up() before it seems to enter an infinite loop. > That is it never gets far enough to fail this test. > > Getting stuck in a psci cpu_on call. I'm guessing something that > we didn't get to in the earlier gicv3 calls before bailing out is blocking that? > Looks like it gets to > SMCCC smc > and is never seen again. > > Any ideas on where to look? The one advantage so far of the higher level > approach is we never tried the hotplug callbacks at all so avoided hitting > that call. One (little bit horrible) solution that might avoid this would > be to add another cpuhp state very early on and fail at that stage. > I'm not keen on doing that without a better explanation than I have so far! Whilst it still doesn't work I suspect I'm loosing ability to print to the console between that point and somewhat later and real problem is elsewhere. Jonathan > > Thanks, > > J > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > index 6fb276504bcc..e8f02bfd0e21 100644 > > --- a/drivers/irqchip/irq-gic-v3.c > > +++ b/drivers/irqchip/irq-gic-v3.c > > @@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr) > > u64 typer; > > u32 aff; > > > > + if (cpumask_test_cpu(smp_processor_id(), &broken_rdists)) > > + return 1; > > + > > /* > > * Convert affinity to a 32bit value that can be matched to > > * GICR_TYPER bits [63:32]. > > @@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void) > > !gicv3_nolpi); > > } > > > > -static void gic_cpu_init(void) > > +static int gic_cpu_init(void) > > { > > void __iomem *rbase; > > - int i; > > + int ret, i; > > > > /* Register ourselves with the rest of the world */ > > - if (gic_populate_rdist()) > > - return; > > + ret = gic_populate_rdist(); > > + if (ret) > > + return ret; > > > > gic_enable_redist(true); > > > > @@ -1286,6 +1290,8 @@ static void gic_cpu_init(void) > > > > /* initialise system registers */ > > gic_cpu_sys_reg_init(); > > + > > + return 0; > > } > > > > #ifdef CONFIG_SMP > > @@ -1295,7 +1301,11 @@ static void gic_cpu_init(void) > > > > static int gic_starting_cpu(unsigned int cpu) > > { > > - gic_cpu_init(); > > + int ret; > > + > > + ret = gic_cpu_init(); > > + if (ret) > > + return ret; > > > > if (gic_dist_supports_lpis()) > > its_cpu_init(); > > > > But the question is: do you rely on these masks having been > > "corrected" anywhere else? > > > > Thanks, > > > > M. > > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel