Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp2792948rwb; Fri, 16 Dec 2022 06:36:56 -0800 (PST) X-Google-Smtp-Source: AA0mqf4yrm8BsJ7V0xpvvmKRjRzsq3SFFbBKc+HqscqjynOeYNdL+gEVwSMIWzYlvqLtOkdS6zNp X-Received: by 2002:a17:90a:ac07:b0:219:aa58:77ba with SMTP id o7-20020a17090aac0700b00219aa5877bamr33490884pjq.25.1671201416671; Fri, 16 Dec 2022 06:36:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671201416; cv=none; d=google.com; s=arc-20160816; b=GfhtShPzxW3EZCpEhPmYRf7Dc615lldqhsHdY8NlvZ6hR7aFD6JEST7KNG0Yhp28cX Xr38N5357YJT8Y5TR0WPpnRohWKAowNjl1Nat/v7NfqK2/6YmzQkYWaADTOYwl/CwFMG n+rcNQeAmy3QQX/5wlCH0V+9TknGIPdD8AT9vF3Uw36hOnQGcD8qgpK02Miq2Dth2Yd4 adrROYmQDq7CDnP6xSSPR6vNA4b664iyEDPxjF6qfn/wntMpqFsYtm1YMWhfXt5PyCcg jF6RV0JjiOLLNu3tRG8m8LyJlRssW5uh3QQc7qDwIBy2xDdlJhvd3HAfqQR3fJNU+ATK 2K3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent:references:in-reply-to :subject:cc:to:from:message-id:date:dkim-signature; bh=QDuHIcT92kIMsgck5MbeJQXgTyCn54wsgvphIVC1bEc=; b=uBrMZs2/70woBITR9gxWQc0QyZX7apzFAoVOe5TXd4r5jT0DCD34Hq53kwicoqkQJK 5HJSPqHyEMRzDkwPxFIw4n23j3ZtNK8VWK6crxrxAQmji4mmMauIKdG9I4S4Y3fGdsjQ gQ9wx5FDYEfrrtT9zmagYWVKKIqCeQd3AZCsv7isTt9rNF8EB0qVc0LCbh5tRkWxcWkP 4127pK1LBPlzfS9U0VQokUL4XRApdaAmoD8SFPMhLD2OMpx/LOyUe4MoBNZndun27WDf scEqkf+jpJnpPxwSYAZEB8FooKim9a+QOiLl8bVPaQU0DPgPz8rQaYxPfPwPRzlfRFeE 9WBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Hhl9Uize; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id rj1-20020a17090b3e8100b00218e1e7534bsi8288129pjb.180.2022.12.16.06.36.47; Fri, 16 Dec 2022 06:36:56 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Hhl9Uize; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230448AbiLPN7I (ORCPT + 68 others); Fri, 16 Dec 2022 08:59:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229547AbiLPN7G (ORCPT ); Fri, 16 Dec 2022 08:59:06 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B20720BE1; Fri, 16 Dec 2022 05:59:05 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C5CB56211D; Fri, 16 Dec 2022 13:59:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 098AAC433EF; Fri, 16 Dec 2022 13:59:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1671199144; bh=oOuNDBUsumxC7rkoyfkCw2WT0lUMyMyT5vyUHvQIq1U=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=Hhl9UizeNUTtfUINpUH0qTih2139NJoVnFhVsTJecf7DfKeKzgi2+HwJGg3NQ1D5q u1brphMrKWsX/vVVGh0R3ZoYnIhVS0xslWp+YJgWqz+to6WLaTwOcs8ztoz1DnPJuT oeBzL40cG3oy3OBW2mbDq4/01D0UR/X2eG7wsbCYpyCQtgKXfMgwWN1Wh5ygAGPWZx gXYuo+znFVWHi/33V2+ODtdwcK6+rEfwl93ScrG5s2maFO4MplpTrsc2nC8q+qlsd0 wOOasPKjnH4w76xZmfz+nBo/ngOWao9I3LE0QNeHwYmMtfnhTt61hTfrVAsPbmYci+ YJH+PSAsZbc4A== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1p6BEr-00D5QM-IL; Fri, 16 Dec 2022 13:59:01 +0000 Date: Fri, 16 Dec 2022 13:58:59 +0000 Message-ID: <86v8mbphzw.wl-maz@kernel.org> From: Marc Zyngier To: Matthew Rosato Cc: Thomas Gleixner , Niklas Schnelle , Guenter Roeck , LKML , x86@kernel.org, Joerg Roedel , Will Deacon , linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Greg Kroah-Hartman , Jason Gunthorpe , Dave Jiang , Alex Williamson , Kevin Tian , Dan Williams , Logan Gunthorpe , Ashok Raj , Jon Mason , Allen Hubbe Subject: Re: [patch V3 09/33] genirq/msi: Add range checking to msi_insert_desc() In-Reply-To: <0acb8c63-7f6c-6df6-cb40-66b265a6e6ce@linux.ibm.com> References: <20221124230505.073418677@linutronix.de> <20221124232325.798556374@linutronix.de> <20221213190425.GA3943240@roeck-us.net> <4e0a129855490febb1c57e7e979bcfb579d39054.camel@linux.ibm.com> <87fsdgzpqs.ffs@tglx> <86wn6rptdu.wl-maz@kernel.org> <0acb8c63-7f6c-6df6-cb40-66b265a6e6ce@linux.ibm.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: mjrosato@linux.ibm.com, tglx@linutronix.de, schnelle@linux.ibm.com, linux@roeck-us.net, linux-kernel@vger.kernel.org, x86@kernel.org, joro@8bytes.org, will@kernel.org, linux-pci@vger.kernel.org, bhelgaas@google.com, lorenzo.pieralisi@arm.com, gregkh@linuxfoundation.org, jgg@mellanox.com, dave.jiang@intel.com, alex.williamson@redhat.com, kevin.tian@intel.com, dan.j.williams@intel.com, logang@deltatee.com, ashok.raj@intel.com, jdmason@kudzu.us, allenbh@gmail.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 16 Dec 2022 13:50:59 +0000, Matthew Rosato wrote: > > On 12/16/22 4:53 AM, Marc Zyngier wrote: > > On Thu, 15 Dec 2022 16:23:20 +0000, > > Matthew Rosato wrote: > >> > >> On 12/15/22 9:49 AM, Thomas Gleixner wrote: > >>> On Wed, Dec 14 2022 at 10:42, Niklas Schnelle wrote: > >>>> On Tue, 2022-12-13 at 11:04 -0800, Guenter Roeck wrote: > >>>>> This patch results in various s390 qemu test failures. > >>>>> There is a warning backtrace > >>>>> > >>>>> 12.674858] WARNING: CPU: 0 PID: 1 at kernel/irq/msi.c:167 msi_ctrl_valid+0x2a/0xb0 > >>>>> > >>>>> followed by > >>>>> > >>>>> [ 12.684333] virtio_net: probe of virtio0 failed with error -34 > >>>>> > >>>>> and Ethernet interfaces don't instantiate. > >>>> As far as I'm aware so far he tracked this down to code calling > >>>> msi_domain_get_hwsize() which in turn calls msi_get_device_domain() > >>>> which then returns NULL leading to msi_domain_get_hwsize() returning 0. > >>>> I think this is related to the fact that we currently don't use IRQ > >>>> domains. > >>> > >>> Correct and for some stupid reason I thought 0 is a good return value > >>> here :) > >>> > >>> > >>> > >>> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c > >>> index bd4d4dd626b4..8fb10f216dc0 100644 > >>> --- a/kernel/irq/msi.c > >>> +++ b/kernel/irq/msi.c > >>> @@ -609,8 +609,8 @@ static unsigned int msi_domain_get_hwsize(struct device *dev, unsigned int domid > >>> info = domain->host_data; > >>> return info->hwsize; > >>> } > >>> - /* No domain, no size... */ > >>> - return 0; > >>> + /* No domain, default to MSI_MAX_INDEX */ > >>> + return MSI_MAX_INDEX; > >>> } > >>> > >>> static inline void irq_chip_write_msi_msg(struct irq_data *data, > >> > >> Ah, that makes sense... So, with that diff applied, that fixes most of the issues I'm seeing incl. the virtio one that Guenter mentioned. But it looks like NVMe devices are still broken on s390 with a different backtrace -- the bisect for that one points to another patch in part2 of this work and looks like another issue with missing irq domain: > >> > >> 40742716f294 genirq/msi: Make msi_add_simple_msi_descs() device domain aware > >> > >> > >> [ 4.308861] ------------[ cut here ]------------ > >> [ 4.308865] WARNING: CPU: 7 PID: 9 at kernel/irq/msi.c:167 msi_domain_free_msi_descs_range+0x3c/0xd0 > >> [ 4.308877] Modules linked in: mlx5_core aes_s390 nvme des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 nvme_core sha1_s390 sha_common pkey zcrypt rng_core autofs4 > >> [ 4.308896] CPU: 7 PID: 9 Comm: kworker/u20:0 Not tainted 6.1.0 #179 > >> [ 4.308898] Hardware name: IBM 3931 A01 782 (KVM/Linux) > >> [ 4.308900] Workqueue: events_unbound async_run_entry_fn > >> [ 4.308905] Krnl PSW : 0704c00180000000 00000000b6426b78 (msi_domain_free_msi_descs_range+0x40/0xd0) > >> [ 4.308909] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 > >> [ 4.308911] Krnl GPRS: 0700000080eda0a0 0000000000000000 0000000080eda0a0 0000000000000000 > >> [ 4.308913] 0000000000000000 0000000000000000 0000000000000cc0 0000000080eda000 > >> [ 4.308914] 00000000b7ddc000 0000000091934aa8 000000000000ffff 0000000000000000 > >> [ 4.308915] 0000000080344200 0000000080f2b1c0 0000037fffb5b918 0000037fffb5b8c8 > >> [ 4.308924] Krnl Code: 00000000b6426b68: e54cf0ac0000 mvhi 172(%r15),0 > >> [ 4.308924] 00000000b6426b6e: ec3c000b017f clij %r3,1,12,00000000b6426b84 > >> [ 4.308924] #00000000b6426b74: af000000 mc 0,0 > >> [ 4.308924] >00000000b6426b78: eb9ff0b00004 lmg %r9,%r15,176(%r15) > >> [ 4.308924] 00000000b6426b7e: 07fe bcr 15,%r14 > >> [ 4.308924] 00000000b6426b80: 47000700 bc 0,1792 > >> [ 4.308924] 00000000b6426b84: b90400a5 lgr %r10,%r5 > >> [ 4.308924] 00000000b6426b88: b9040013 lgr %r1,%r3 > >> [ 4.308935] Call Trace: > >> [ 4.308938] [<00000000b6426b78>] msi_domain_free_msi_descs_range+0x40/0xd0 > >> [ 4.308945] [<00000000b6bb126e>] pci_free_msi_irqs+0x26/0x48 > >> [ 4.308950] [<00000000b6baf4d4>] pci_disable_msix+0x6c/0x80 > >> [ 4.308954] [<00000000b6baf716>] pci_free_irq_vectors+0x26/0x88 > >> [ 4.308956] [<000003ff7fdfa8f4>] nvme_setup_io_queues+0x18c/0x398 [nvme] > >> [ 4.308968] [<000003ff7fdfbf1e>] nvme_probe+0x2e6/0x3b0 [nvme] > >> [ 4.308972] [<00000000b6ba44cc>] local_pci_probe+0x44/0x80 > >> [ 4.308974] [<00000000b6ba46d8>] pci_call_probe+0x50/0x180 > >> [ 4.308976] [<00000000b6ba5166>] pci_device_probe+0xae/0x110 > >> [ 4.308978] [<00000000b6c0a19a>] really_probe+0xd2/0x480 > >> [ 4.308982] [<00000000b6c0a6f8>] driver_probe_device+0x40/0xf0 > >> [ 4.308984] [<00000000b6c0a80e>] __driver_attach_async_helper+0x66/0xf0 > >> [ 4.308986] [<00000000b63cfb72>] async_run_entry_fn+0x4a/0x1b0 > >> [ 4.308987] [<00000000b63c1368>] process_one_work+0x200/0x458 > >> [ 4.308991] [<00000000b63c1aee>] worker_thread+0x66/0x480 > >> [ 4.308993] [<00000000b63caa00>] kthread+0x108/0x110 > >> [ 4.308996] [<00000000b634f2dc>] __ret_from_fork+0x3c/0x58 > >> [ 4.308999] [<00000000b6f8da2a>] ret_from_fork+0xa/0x40 > >> [ 4.309006] Last Breaking-Event-Address: > >> [ 4.309007] [<00000000b6426ba8>] msi_domain_free_msi_descs_range+0x70/0xd0 > >> [ 4.309009] ---[ end trace 0000000000000000 ]--- > >> [ 8.957187] nvme: probe of 0003:00:00.0 failed with error -22 > >> [ 8.957216] ------------[ cut here ]------------ > >> [ 8.957217] WARNING: CPU: 5 PID: 9 at kernel/irq/msi.c:275 msi_device_data_release+0x76/0xa0 > >> [ 8.957229] Modules linked in: mlx5_core aes_s390 nvme des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 nvme_core sha1_s390 sha_common pkey zcrypt rng_core autofs4 > >> [ 8.957248] CPU: 5 PID: 9 Comm: kworker/u20:0 Tainted: G W 6.1.0 #179 > >> [ 8.957252] Hardware name: IBM 3931 A01 782 (KVM/Linux) > >> [ 8.957254] Workqueue: events_unbound async_run_entry_fn > >> [ 8.957259] Krnl PSW : 0704e00180000000 00000000b642729a (msi_device_data_release+0x7a/0xa0) > >> [ 8.957262] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 > >> [ 8.957265] Krnl GPRS: a813fdc020800000 00000000928b6840 0000000091934ab8 0000000080344200 > >> [ 8.957267] 0000000000000000 0000000091934a80 0000000080d79988 0000000000000000 > >> [ 8.957268] 0000000080eda0a0 0000037fffb5bc10 0000000080eda0a0 0000000091934aa8 > >> [ 8.957270] 0000000080344200 00000000800e0402 00000000b642724e 0000037fffb5bad0 > >> [ 8.957279] Krnl Code: 00000000b642728c: f0a0000407fe srp 4(11,%r0),2046,0 > >> [ 8.957279] 00000000b6427292: 47000700 bc 0,1792 > >> [ 8.957279] #00000000b6427296: af000000 mc 0,0 > >> [ 8.957279] >00000000b642729a: a7f4ffdf brc 15,00000000b6427258 > >> [ 8.957279] 00000000b642729e: af000000 mc 0,0 > >> [ 8.957279] 00000000b64272a2: 4120b048 la %r2,72(%r11) > >> [ 8.957279] 00000000b64272a6: c0e5005a0c4d brasl %r14,00000000b6f68b40 > >> [ 8.957279] 00000000b64272ac: e548a1180000 mvghi 280(%r10),0 > >> [ 8.957290] Call Trace: > >> [ 8.957292] [<00000000b642729a>] msi_device_data_release+0x7a/0xa0 > >> [ 8.957295] ([<00000000b642724e>] msi_device_data_release+0x2e/0xa0) > >> [ 8.957298] [<00000000b6c0f608>] release_nodes+0x50/0xd8 > >> [ 8.957305] [<00000000b6c111aa>] devres_release_all+0xaa/0xf0 > >> [ 8.957308] [<00000000b6c0a2f2>] really_probe+0x22a/0x480 > >> [ 8.957310] [<00000000b6c0a6f8>] driver_probe_device+0x40/0xf0 > >> [ 8.957312] [<00000000b6c0a80e>] __driver_attach_async_helper+0x66/0xf0 > >> [ 8.957314] [<00000000b63cfb72>] async_run_entry_fn+0x4a/0x1b0 > >> [ 8.957315] [<00000000b63c1368>] process_one_work+0x200/0x458 > >> [ 8.957320] [<00000000b63c1aee>] worker_thread+0x66/0x480 > >> [ 8.957322] [<00000000b63caa00>] kthread+0x108/0x110 > >> [ 8.957325] [<00000000b634f2dc>] __ret_from_fork+0x3c/0x58 > >> [ 8.957328] [<00000000b6f8da2a>] ret_from_fork+0xa/0x40 > >> [ 8.957336] Last Breaking-Event-Address: > >> [ 8.957337] [<00000000b6427254>] msi_device_data_release+0x34/0xa0 > >> [ 8.957339] ---[ end trace 0000000000000000 ]--- > >> > >> The line number for the first warning points to the WARN_ON check in msi_ctrl_valid -- specifically it's the !dev->msi.data->__domains[ctrl->domid].domain check that is failing. > >> > >> The second warning is the WARN_ON_ONCE(!xa_empty(&md->__domains[i].store)) check in msi_device_data_release, presumably a victim of backing out after the first error. > >> > > > > Yeah, the non-irqdomain legacy path definitely wounds up here, and we > > end-up leaking descriptors. If the following hack works for you, I'll > > ferry the two fixes to Linus asap. > > > > Thanks, > > > > M. > > > > diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c > > index bd4d4dd626b4..9921dc57f1b4 100644 > > --- a/kernel/irq/msi.c > > +++ b/kernel/irq/msi.c > > @@ -165,7 +165,8 @@ static bool msi_ctrl_valid(struct device *dev, struct msi_ctrl *ctrl) > > unsigned int hwsize; > > > > if (WARN_ON_ONCE(ctrl->domid >= MSI_MAX_DEVICE_IRQDOMAINS || > > - !dev->msi.data->__domains[ctrl->domid].domain)) > > + (dev->msi.domain && > > + !dev->msi.data->__domains[ctrl->domid].domain)) > > return false; > > > > hwsize = msi_domain_get_hwsize(dev, ctrl->domid); > > > > Close, but I had to add an extra ) at the end that was missing :) Hey, I never said I tried to compile the thing! ;-) > > With both these fixes applied, it actually then leads to the very > next WARN_ON failing in msi_ctrl_valid... Because ctrl->last == > hwsize. I think Thomas' initial fix for msi_domain_get_hwsize has > an off-by-1 error, I think we should return MSI_XA_DOMAIN_SIZE for > msi_domain_get_hwsize instead. Yes, that's a good point, and that's consistent with what __msi_create_irq_domain() does already, assuming MSI_XA_DOMAIN_SIZE when info->hwsize is 0. No reason to do something else here. I'll update Thomas' patch. Once Guenter confirms that PPC is OK, I'll send it out. M. -- Without deviation from the norm, progress is not possible.