Date: Sat, 05 Mar 2022 15:40:25 +0000
Message-ID: <87a6e4tnkm.wl-maz@kernel.org>
From: Marc Zyngier
To: John Garry
Cc: Thomas Gleixner, chenxiang, Shameer Kolothum, "linux-kernel@vger.kernel.org", "liuqi (BA)", David Decotigny
Subject: Re: PCI MSI issue for maxcpus=1
In-Reply-To: <1cbe7daa-8003-562b-06fa-5a50f7ee6ed2@huawei.com>

[+ David, who was chasing something similar]

Hi John,

On Fri, 04 Mar 2022 12:53:31 +0000,
John Garry wrote:
> 
> > ...
> 
> > 
> > [    7.961007]  valid_col+0x14/0x24
> > [    7.964223]  its_send_single_command+0x4c/0x150
> > [    7.968741]  its_irq_domain_activate+0xc8/0x104
> > [    7.973259]  __irq_domain_activate_irq+0x5c/0xac
> > [    7.977865]  __irq_domain_activate_irq+0x38/0xac
> > [    7.982471]  irq_domain_activate_irq+0x3c/0x64
> > [    7.986902]  __msi_domain_alloc_irqs+0x1a8/0x2f4
> > [    7.991507]  msi_domain_alloc_irqs+0x20/0x2c
> > [    7.995764]  __pci_enable_msi_range+0x2ec/0x590
> > [    8.000284]  pci_alloc_irq_vectors_affinity+0xe0/0x140
> > [    8.005410]  hisi_sas_v3_probe+0x300/0xbe0
> > [    8.009494]  local_pci_probe+0x44/0xb0
> > [    8.013232]  work_for_cpu_fn+0x20/0x34
> > [    8.016969]  process_one_work+0x1d0/0x354
> > [    8.020966]  worker_thread+0x2c0/0x470
> > [    8.024703]  kthread+0x17c/0x190
> > [    8.027920]  ret_from_fork+0x10/0x20
> > [    8.031485] ---[ end trace bb67cfc7eded7361 ]---
> > 
> 
> ...
> 
> > Ah, of course. the CPU hasn't booted yet, so its collection isn't
> > mapped. I was hoping that the core code would keep the interrupt in
> > shutdown state, but it doesn't seem to be the case...
> > 
> > > Apart from this, I assume that if another cpu comes online later in
> > > the affinity mask I would figure that we want to target the irq to
> > > that cpu (which I think we would not do here).
> > 
> > That's probably also something that should come from core code, as
> > we're not really in a position to decide this in the ITS driver.
> > .
> 
> 
> Hi Marc,
> 
> Have you had a chance to consider this issue further?
> 
> So I think that x86 avoids this issue as it uses matrix.c, which
> handles CPUs being offline when selecting target CPUs for managed
> interrupts.
> 
> So is your idea still that core code should keep the interrupt in
> shutdown state (for no CPUs online in affinity mask)?

Yup. I came up with this:

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 2bdfce5edafd..97e9eb9aecc6 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -823,6 +823,19 @@ static int msi_init_virq(struct irq_domain *domain, int virq, unsigned int vflag
 	if (!(vflags & VIRQ_ACTIVATE))
 		return 0;
 
+	if (!(vflags & VIRQ_CAN_RESERVE)) {
+		/*
+		 * If the interrupt is managed but no CPU is available
+		 * to service it, shut it down until better times.
+		 */
+		if (irqd_affinity_is_managed(irqd) &&
+		    !cpumask_intersects(irq_data_get_affinity_mask(irqd),
+					cpu_online_mask)) {
+			irqd_set_managed_shutdown(irqd);
+			return 0;
+		}
+	}
+
 	ret = irq_domain_activate_irq(irqd, vflags & VIRQ_CAN_RESERVE);
 	if (ret)
 		return ret;

With this in place, I get the following results (VM booted with 4
vcpus and maxcpus=1, the virtio device is using managed interrupts):

root@debian:~# cat /proc/interrupts
           CPU0
 10:       2298     GICv3  27 Level     arch_timer
 12:         84     GICv3  33 Level     uart-pl011
 49:          0     GICv3  41 Edge      ACPI:Ged
 50:          0   ITS-MSI 16384 Edge      virtio0-config
 51:       2088   ITS-MSI 16385 Edge      virtio0-req.0
 52:          0   ITS-MSI 16386 Edge      virtio0-req.1
 53:          0   ITS-MSI 16387 Edge      virtio0-req.2
 54:          0   ITS-MSI 16388 Edge      virtio0-req.3
 55:      11641   ITS-MSI 32768 Edge      xhci_hcd
 56:          0   ITS-MSI 32769 Edge      xhci_hcd
IPI0:         0       Rescheduling interrupts
IPI1:         0       Function call interrupts
IPI2:         0       CPU stop interrupts
IPI3:         0       CPU stop (for crash dump) interrupts
IPI4:         0       Timer broadcast interrupts
IPI5:         0       IRQ work interrupts
IPI6:         0       CPU wake-up interrupts
Err:          0
root@debian:~# echo 1 >/sys/devices/system/cpu/cpu2/online
root@debian:~# cat /proc/interrupts
           CPU0       CPU2
 10:       2530         90     GICv3  27 Level     arch_timer
 12:        103          0     GICv3  33 Level     uart-pl011
 49:          0          0     GICv3  41 Edge      ACPI:Ged
 50:          0          0   ITS-MSI 16384 Edge      virtio0-config
 51:       2097          0   ITS-MSI 16385 Edge      virtio0-req.0
 52:          0          0   ITS-MSI 16386 Edge      virtio0-req.1
 53:          0         12   ITS-MSI 16387 Edge      virtio0-req.2
 54:          0          0   ITS-MSI 16388 Edge      virtio0-req.3
 55:      13487          0   ITS-MSI 32768 Edge      xhci_hcd
 56:          0          0   ITS-MSI 32769 Edge      xhci_hcd
IPI0:        38         45       Rescheduling interrupts
IPI1:         3          3       Function call interrupts
IPI2:         0          0       CPU stop interrupts
IPI3:         0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0       Timer broadcast interrupts
IPI5:         0          0       IRQ work interrupts
IPI6:         0          0       CPU wake-up interrupts
Err:          0

Would this solve your problem?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
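
For anyone who wants to poke at the decision made by the msi_init_virq()
hunk without booting a kernel, here is a rough user-space model of it.
This is only a sketch and not kernel code: cpu_mask_t, mask_of(),
enum irq_action and decide() are made-up names for illustration, a
64-bit word stands in for a cpumask, and "activate" simply means
"carry on to irq_domain_activate_irq()".

```c
/* User-space model of the managed-shutdown check; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t cpu_mask_t;

/* Hypothetical outcome of the check, for illustration only. */
enum irq_action {
	IRQ_ACTIVATE,		/* proceed to activation as before */
	IRQ_MANAGED_SHUTDOWN,	/* leave the interrupt shut down for now */
};

/* Build a mask containing a single CPU. */
static cpu_mask_t mask_of(unsigned int cpu)
{
	assert(cpu < 64);
	return (cpu_mask_t)1 << cpu;
}

/*
 * Mirrors the condition in the patch: only when reservation is not
 * possible, the interrupt is managed, and none of the CPUs in its
 * affinity mask is online do we skip activation.
 */
static enum irq_action decide(bool can_reserve, bool managed,
			      cpu_mask_t affinity, cpu_mask_t online)
{
	if (!can_reserve && managed && !(affinity & online))
		return IRQ_MANAGED_SHUTDOWN;

	return IRQ_ACTIVATE;
}
```

With only CPU0 online, decide(false, true, mask_of(2), mask_of(0)) gives
IRQ_MANAGED_SHUTDOWN. Once CPU2 is added to the online mask, the same
call gives IRQ_ACTIVATE. Non-managed interrupts, and interrupts that can
be reserved, are never shut down by this check.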