Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp21341847rwd; Thu, 29 Jun 2023 14:47:22 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5FpG8Xyk35UQ5BOidRuSRNa8OQcE+rIkk84NlH3ZDuAWeXaN1Rnfoe38qrlJAW2uF2YyMR X-Received: by 2002:a05:6a20:8f0c:b0:117:a2f3:3c93 with SMTP id b12-20020a056a208f0c00b00117a2f33c93mr1051832pzk.2.1688075242539; Thu, 29 Jun 2023 14:47:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688075242; cv=none; d=google.com; s=arc-20160816; b=jBy5VyyfQhMHP/d5v/BfxeEdT8PI6S+aQ0YcaagrO77a9KOLYGtiEHp+kUkhA/vg/h sByrv3LxIm3sseO/7Qv97Yg4ayIjUNtdPsLcvzw34ldpDGFnzdbXTlLxZnoSRRiW6Xx4 wOGpB1GadNy3pcOWk5bDjnQR3ajQwzKFGzIlJYr929PkhBixaMjAvMkgRW1ebrrc70RP XeQdZVMUuhhjwlypNQhkvoywYoGhzUhMfjRnIS7h5nSCcT3u22QX3Valx3mFi/WrZu9h glqMf+0SKpoj13Y8K3NaFEF0HdBry8png4Y5BfxhcOmZ9HilGaKKeSeVR7LZuIcZrebv CmgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent :content-transfer-encoding:references:in-reply-to:date:cc:to:from :subject:message-id:dkim-signature; bh=FtDP1Q6klKq+vrLWjMDjwV4PiAup+z6NgGkS2LHe5Nc=; fh=Z2KY4kvz9+2/iKdL6idvvjhZS77jj5gIZrMPvGOYajo=; b=VZ6ORz1wzjI6Kpm0DCWCAUXAzpzXeG6kFkH5m9eagukHNgjmEl6LeHkEgAZQqB0vFR IQiiBXtdbQk45mjcW1ry36Rkn4/sB4hGkEu6ZQjGvVW7scA2WVlt8FompxcDhAas//vm 0MuBFyZuehV891Mlie3GIr/nrWfxe2MaF9zc6VBKDRDZsCPDplGEobCPLL7CD34fevK0 GWyGFQiu11vO8mkF6tXDw6moAkcMnIJERjKYGkVMAVbqy9prCFDla/gzPUv3q1H++t4N ZD4LlZVOSSXmIYN/0CL0qKT3aoPT7FEJxBAUiFKqFWXtM/x0tsy1V9t7KMfRPBpoiJyj 0B+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dwsTtL1e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w19-20020a170902d71300b001a6ef92d441si10978882ply.599.2023.06.29.14.47.10; Thu, 29 Jun 2023 14:47:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=dwsTtL1e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230012AbjF2V2w (ORCPT + 99 others); Thu, 29 Jun 2023 17:28:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbjF2V2u (ORCPT ); Thu, 29 Jun 2023 17:28:50 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 173612D69 for ; Thu, 29 Jun 2023 14:28:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1688074079; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FtDP1Q6klKq+vrLWjMDjwV4PiAup+z6NgGkS2LHe5Nc=; b=dwsTtL1eUPZfTKkKfk3t8VGMaQpnD4Ush+0AzR1ZHAh/Pm2f6QBL8P8xdBEcwUk+iQFgF4 3zd6jJ0VIqSLmdCs6twL5h87MzYc+c5Tim1l7rmglcUJPEgQ/YHvHv/McPzCoOjdYtRk4V hKBhZ/u90dmqwyBgEz7vDHtN5hXBZOw= Received: from mail-oo1-f72.google.com (mail-oo1-f72.google.com [209.85.161.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-441-viF12TKQPm6a8lKSds960w-1; Thu, 29 Jun 2023 17:27:57 -0400 X-MC-Unique: viF12TKQPm6a8lKSds960w-1 Received: by mail-oo1-f72.google.com with SMTP id 006d021491bc7-560ce5f7646so1265566eaf.3 for ; Thu, 29 Jun 2023 14:27:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688074077; x=1690666077; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FtDP1Q6klKq+vrLWjMDjwV4PiAup+z6NgGkS2LHe5Nc=; b=BsJCOqNTo6vSMBB6UIbIJXZq77XHRVc3GcE9vCegpeOarzhQkd6FswzVAh2i6QaafM vTNCdsE7ii/Wy7qSKB/p9gF0QAWMFyr7hWlnMG/MIFNuUAsP00wMmQrdZuN+c4zlMnH6 mmYlzQp48kIYDGyHrcDZV+HFup6qUzc94HmVD37nUdkP9ZF1EoeRE1UAJzZSERC9Rgz/ Kep1YNWHE5Fv2ijoGoFjYk+yItvs1iWCNW36e/ljnr4nawQOZkVp1/mFVGwpFx+1hFrQ fVWPqBLZbpTKT69PjNLlo+7CuGGSVmZKG/eDfm5yCBwGr9Ikv3IGPMh6wCfS6QU/ES1h 2ouQ== X-Gm-Message-State: ABy/qLbKNB6YAWYtndqGi4ONjB2CtCcuwaGn8fNKdJ+s+khaU8+Zh7dr fLWZqo4EyKdeojPwVtGoiykmbmCszmZhvj+AMERyqVMpx2nceKpUahw9UauRy+pUs/gx/8r74yK 6gWxUxYNH4JiDRCiAQv7m5wJq X-Received: by 2002:a05:6359:a2f:b0:130:afe8:43de with SMTP id el47-20020a0563590a2f00b00130afe843demr693808rwb.30.1688074076884; Thu, 29 Jun 2023 14:27:56 -0700 (PDT) X-Received: by 2002:a05:6359:a2f:b0:130:afe8:43de with SMTP id el47-20020a0563590a2f00b00130afe843demr693797rwb.30.1688074076589; Thu, 29 Jun 2023 14:27:56 -0700 (PDT) Received: from thinkpad-p1.localdomain (cpe00fc8d79db03-cm00fc8d79db00.cpe.net.fido.ca. [72.137.118.218]) by smtp.gmail.com with ESMTPSA id k28-20020a05620a143c00b0075785052e97sm6571404qkj.95.2023.06.29.14.27.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jun 2023 14:27:56 -0700 (PDT) Message-ID: <3a6ce0da4d37ed2ba9673a60062e109503adca78.camel@redhat.com> Subject: Re: [PATCH 1/1] PCI: dwc: Use regular interrupt instead of chained From: Radu Rendec To: Bjorn Helgaas Cc: Jingoo Han , Gustavo Pimentel , Lorenzo Pieralisi , Krzysztof Wilczynski , Rob Herring , Bjorn Helgaas , Marc Zyngier , Thomas Gleixner , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Date: Thu, 29 Jun 2023 17:27:54 -0400 In-Reply-To: <20230629205830.GA447734@bhelgaas> References: <20230629205830.GA447734@bhelgaas> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.46.4 (3.46.4-1.fc37) MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 2023-06-29 at 15:58 -0500, Bjorn Helgaas wrote: > On Thu, Jun 29, 2023 at 04:42:07PM -0400, Radu Rendec wrote: > > On Thu, 2023-06-29 at 14:57 -0500, Bjorn Helgaas wrote: > > > On Thu, Jun 29, 2023 at 02:30:19PM -0400, Radu Rendec wrote: > > > > The DesignWare PCIe host driver uses a chained interrupt to demulti= plex > > > > the downstream MSI interrupts. On Qualcomm SA8540P Ride, enabling b= oth > > > > pcie2a and pcie3a at the same time can create an interrupt storm wh= ere > > > > the parent interrupt fires continuously, even though reading the PC= Ie > > > > host registers doesn't identify any child MSI interrupt source. Thi= s > > > > effectively locks up CPU0, which spends all the time servicing thes= e > > > > interrupts. > > > >=20 > > > > This is a clear example of how bypassing the interrupt core by usin= g > > > > chained interrupts can be very dangerous if the hardware misbehaves= . > > > >=20 > > > > Convert the driver to use a regular interrupt for the demultiplex > > > > handler. This allows the interrupt storm detector to detect the fau= lty > > > > interrupt and disable it, allowing the system to run normally. > > >=20 > > > There are many other users of irq_set_chained_handler_and_data() in > > > drivers/pci/controller/.=C2=A0 Should they be similarly converted?=C2= =A0 If not, > > > how do we decide which need to use irq_set_chained_handler_and_data() > > > and which do not? > >=20 > > According to Thomas Gleixner, yes. Obviously I don't want to put words > > in his mouth, but I think that's the gist of what he said in a reply to > > an RFC patch that I sent a few weeks ago: > > https://lore.kernel.org/all/877csohcll.ffs@tglx/ >=20 > Is it's a bug in pcie-designware-host.c, and it's also a bug in the > other callers, we should fix them all. I wouldn't call it a bug, because the driver works as long as the hardware behaves. But if the hardware misbehaves and generates an interrupt storm for whatever reason, you end up with an apparently frozen system and no clue as to what's going on. By contrast, using regular interrupts allows the interrupt storm detector to kick in. Then the system works fine, and there's also a useful log message pointing to that particular interrupt. > But you do imply that there's some DesignWare hardware issue involved, > too, so I guess it's possible the other drivers don't have an issue > and/or actually require the chained IRQs.=C2=A0 That's why I asked how we > should decide. That makes sense. I don't know if it's a DesignWare hardware issue or something else down the PCIe bus. Other folks in my team are investigating the root cause. This thread has the details: https://lore.kernel.org/all/pmodcoakbs25z2a7mlo5gpuz63zluh35vbgb5itn6k5aqhj= nny@jvphbpvahtse/ I want to point out that this patch doesn't *fix* the root problem (the interrupt storm). It just makes the kernel handle it (much) better if it occurs. I can't see why any driver would _require_ chained IRQs. As I see it, chained interrupts are just a "shortcut" that circumvents the IRQ core for (debatable) performance reasons. Other than that, they should work the same as regular interrupts. There are other benefits associated with using regular interrupts. One of them is the ability to control the SMP affinity of the parent interrupt. But that's a different topic. Radu > > > > -static void dw_pcie_free_msi(struct dw_pcie_rp *pp) > > > > +static void __dw_pcie_free_msi(struct dw_pcie_rp *pp, u32 num_ctrl= s) > > > > =C2=A0{ > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0u32 ctrl; > > > > =C2=A0 > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0for (ctrl =3D 0; ctrl < = MAX_MSI_CTRLS; ctrl++) { > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0for (ctrl =3D 0; ctrl < = num_ctrls; ctrl++) { > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (pp->msi_irq[ctrl] > 0) > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0irq= _set_chained_handler_and_data(pp->msi_irq[ctrl], > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 NULL, NULL); > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0fre= e_irq(pp->msi_irq[ctrl], pp); > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > > > > =C2=A0 > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0irq_domain_remove(p= p->msi_domain); > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0irq_domain_remove(p= p->irq_domain); > > > > =C2=A0} > > > > =C2=A0 > > > > +#define dw_pcie_free_msi(pp) __dw_pcie_free_msi(pp, MAX_MSI_CTRLS) > > >=20 > > > What is the benefit of the dw_pcie_free_msi() macro? > >=20 > > It allows me to add the num_ctrls parameter to the corresponding > > function (now renamed to __dw_pcie_free_msi()) without forcing all the > > existing call sites to send MAX_MSI_CTRLS explicitly. > >=20 > > I needed that extra parameter to avoid duplicating the tear down code > > on the (new) error path in dw_pcie_msi_init() - see below. > >=20 > > > > =C2=A0static void dw_pcie_msi_init(struct dw_pcie_rp *pp) > > > > =C2=A0{ > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct dw_pcie *pci= =3D to_dw_pcie_from_pp(pp); > > > > @@ -361,9 +353,16 @@ static int dw_pcie_msi_host_init(struct dw_pci= e_rp *pp) > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return ret; > > > > =C2=A0 > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0for (ctrl =3D 0; ct= rl < num_ctrls; ctrl++) { > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0if (pp->msi_irq[ctrl] > 0) > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0irq= _set_chained_handler_and_data(pp->msi_irq[ctrl], > > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 dw_chained_msi_isr, pp); > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0if (pp->msi_irq[ctrl] > 0) { > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ret= =3D request_irq(pp->msi_irq[ctrl], dw_pcie_msi_isr, 0, > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 dev_name(dev), pp); > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if = (ret) { > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0dev_err(dev, "Failed to reques= t irq %d: %d\n", > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0pp->msi_irq[ctrl], ret); > > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__dw_pcie_free_msi(pp, ctrl); > >=20 > > This is where I'm using the extra parameter. If we fail to request an > > interrupt, we need to free all the other interrupts that we have > > requested so far, to leave everything in a clean state. But we can't > > use MAX_MSI_CTRLS with __dw_pcie_free_msi() and rely on the check there > > because there may be extra interrupts that we haven't requested *yet* > > and we would attempt to free them. >=20 > Makes sense, thanks. >=20 > Bjorn >=20