Date: Wed, 13 Apr 2022 18:26:33 +0100
Message-ID: <878rs8c2t2.wl-maz@kernel.org>
From: Marc Zyngier
To: Marek Szyprowski
Cc: linux-kernel, 'Linux Samsung SOC', Thomas Gleixner, John Garry,
	Xiongfeng Wang, David Decotigny, Krzysztof Kozlowski
Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs
In-Reply-To: <4b7fc13c-887b-a664-26e8-45aed13f048a@samsung.com>
References: <20220405185040.206297-1-maz@kernel.org>
	<20220405185040.206297-3-maz@kernel.org>
	<4b7fc13c-887b-a664-26e8-45aed13f048a@samsung.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Marek,

On Wed, 13 Apr 2022 15:59:21 +0100,
Marek Szyprowski wrote:
>
> Hi Marc,
>
> On 05.04.2022 20:50, Marc Zyngier wrote:
> > When booting with maxcpus= (or even loading a driver
> > while most CPUs are offline), it is pretty easy to observe managed
> > affinities containing a mix of online and offline CPUs being passed
> > to the irqchip driver.
> >
> > This means that the irqchip cannot trust the affinity passed down
> > from the core code, which is a bit annoying and requires (at least
> > in theory) all drivers to implement some sort of affinity narrowing.
> >
> > In order to address this, always limit the cpumask to the set of
> > online CPUs.
> >
> > Signed-off-by: Marc Zyngier
>
> This patch landed in linux next-20220413 as commit 33de0aa4bae9
> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
> breaks booting of most ARM 32bit Samsung Exynos based boards.
>
> I don't see anything specific in the log, though. Booting just hangs at
> some point. The only Samsung Exynos boards that boot properly are those
> Exynos4412 based.
>
> I assume that this is related to the Multi Core Timer IRQ configuration
> specific for that SoCs. Exynos4412 uses PPI interrupts, while all other
> Exynos SoCs have separate IRQ lines for each CPU.
>
> Let me know how I can help debugging this issue.

Thanks for the heads up. Can you pick the last working kernel, enable
CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
entries for the timer IRQs?

Also, see below.

>
> > ---
> >  kernel/irq/manage.c | 25 +++++++++++++++++--------
> >  1 file changed, 17 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index c03f71d5ec10..f71ecc100545 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >  {
> >  	struct irq_desc *desc = irq_data_to_desc(data);
> >  	struct irq_chip *chip = irq_data_get_irq_chip(data);
> > +	const struct cpumask *prog_mask;
> >  	int ret;
> >
> > +	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > +	static struct cpumask tmp_mask;
> > +
> >  	if (!chip || !chip->irq_set_affinity)
> >  		return -EINVAL;
> >
> > +	raw_spin_lock(&tmp_mask_lock);
> >  	/*
> >  	 * If this is a managed interrupt and housekeeping is enabled on
> >  	 * it check whether the requested affinity mask intersects with
> > @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >  	 */
> >  	if (irqd_affinity_is_managed(data) &&
> >  	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
> > -		const struct cpumask *hk_mask, *prog_mask;
> > -
> > -		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > -		static struct cpumask tmp_mask;
> > +		const struct cpumask *hk_mask;
> >
> >  		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
> >
> > -		raw_spin_lock(&tmp_mask_lock);
> >  		cpumask_and(&tmp_mask, mask, hk_mask);
> >  		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
> >  			prog_mask = mask;
> >  		else
> >  			prog_mask = &tmp_mask;
> > -		ret = chip->irq_set_affinity(data, prog_mask, force);
> > -		raw_spin_unlock(&tmp_mask_lock);
> >  	} else {
> > -		ret = chip->irq_set_affinity(data, mask, force);
> > +		prog_mask = mask;
> >  	}
> > +
> > +	/* Make sure we only provide online CPUs to the irqchip */
> > +	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
> > +	if (!cpumask_empty(&tmp_mask))
> > +		ret = chip->irq_set_affinity(data, &tmp_mask, force);
> > +	else
> > +		ret = -EINVAL;

Can you also check that with the patch applied, it is this path that
is taken and that it is the timer interrupts that get rejected? If
that's the case, can you put a dump_stack() here and give me that
stack trace? The use of irq_force_affinity() in the driver looks
suspicious...

Finally, is there a QEMU emulation of one of these failing boards?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
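
For reference, the narrowing step in the quoted hunk can be modelled in user space. This is only a sketch under stated assumptions: cpumask_model_t and model_set_affinity() are hypothetical names, the mask is a plain 32-bit word rather than a struct cpumask, and neither the locking nor the chip->irq_set_affinity() callback is modelled. It only shows why a request that names nothing but offline CPUs would come back as -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct cpumask: one bit per CPU. */
typedef uint32_t cpumask_model_t;

/*
 * Models the narrowing step added by the patch: AND the requested mask
 * with the online mask and refuse to program an empty result.
 * On success, *effective holds the mask that would reach the irqchip.
 */
static int model_set_affinity(cpumask_model_t requested,
			      cpumask_model_t online,
			      cpumask_model_t *effective)
{
	cpumask_model_t narrowed = requested & online;

	assert(effective != NULL);

	/* No online CPU left in the request: same outcome as the new else branch. */
	if (!narrowed)
		return -EINVAL;

	*effective = narrowed;
	return 0;
}
```

With only CPU0 online (online = 0x1), a request for CPU1 alone (requested = 0x2) is rejected, while a request for CPU0 and CPU1 (0x3) is narrowed to 0x1. Whether the Exynos timer setup actually ends up in the rejecting branch is precisely what still needs to be confirmed on the failing boards.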