Date: Tue, 6 Mar 2018 15:12:47 +0000
From: Lorenzo Pieralisi
To: Vignesh R
Cc: Jingoo Han, Joao Pinto, KISHON VIJAY ABRAHAM, Bjorn Helgaas,
    Niklas Cassel, "linux-omap@vger.kernel.org",
    "linux-pci@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH 2/3] PCI: dwc: pci-dra7xx: Improve MSI IRQ handling
Message-ID: <20180306151247.GA7378@red-moon>
References: <20180209120415.17590-1-vigneshr@ti.com>
    <20180209120415.17590-3-vigneshr@ti.com>
    <20180212175801.GA29070@e107981-ln.cambridge.arm.com>
    <7cfa73af-09fa-d298-aaab-3e74ee7e1dd5@ti.com>
In-Reply-To: <7cfa73af-09fa-d298-aaab-3e74ee7e1dd5@ti.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 15, 2018 at 09:59:21AM +0530, Vignesh R wrote:
> Hi,
>
> On Monday 12 February 2018 11:28 PM, Lorenzo Pieralisi wrote:
> > On Fri, Feb 09, 2018 at 05:34:14PM +0530, Vignesh R wrote:
> >> We need to ensure that there are no pending MSI IRQ vector set (i.e
> >> PCIE_MSI_INTR0_STATUS reads 0 at least once) before exiting
> >> dra7xx_pcie_msi_irq_handler(). Else, the dra7xx PCIe wrapper will not
> >> register new MSI IRQs even though PCIE_MSI_INTR0_STATUS shows IRQs are
> >> pending. Therefore, keep calling dra7xx_pcie_msi_irq_handler() until it
> >> returns IRQ_NONE, which suggests that PCIE_MSI_INTR0_STATUS is 0.
> >>
> >> This fixes a bug, where PCIe wifi cards with 4 DMA queues like Intel
> >> 8260 used to throw following error and stall during ping/iperf3 tests.
> >>
> >> [   97.776310] iwlwifi 0000:01:00.0: Queue 9 stuck for 2500 ms.
> >>
> >> Signed-off-by: Vignesh R
> >> ---
> >>  drivers/pci/dwc/pci-dra7xx.c | 21 ++++++++++++++++++---
> >>  1 file changed, 18 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
> >> index ed8558d638e5..3420cbf7b60a 100644
> >> --- a/drivers/pci/dwc/pci-dra7xx.c
> >> +++ b/drivers/pci/dwc/pci-dra7xx.c
> >> @@ -254,14 +254,31 @@ static irqreturn_t dra7xx_pcie_msi_irq_handler(int irq, void *arg)
> >>  	struct dra7xx_pcie *dra7xx = arg;
> >>  	struct dw_pcie *pci = dra7xx->pci;
> >>  	struct pcie_port *pp = &pci->pp;
> >> +	int count = 0;
> >>  	unsigned long reg;
> >>  	u32 virq, bit;
> >>
> >>  	reg = dra7xx_pcie_readl(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI);
> >> +	dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI, reg);
> >>
> >>  	switch (reg) {
> >>  	case MSI:
> >> -		dw_handle_msi_irq(pp);
> >> +		/*
> >> +		 * Need to make sure no MSI IRQs are pending before
> >> +		 * exiting handler, else the wrapper will not catch new
> >> +		 * IRQs. So loop around till dw_handle_msi_irq() returns
> >> +		 * IRQ_NONE
> >> +		 */
> >> +		while (dw_handle_msi_irq(pp) != IRQ_NONE && count < 1000)
> >> +			count++;
> >> +
> >> +		if (count == 1000) {
> >> +			dev_err(pci->dev, "too much work in msi irq\n");
> >> +			dra7xx_pcie_writel(dra7xx,
> >> +					   PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI,
> >> +					   reg);
> >> +			return IRQ_HANDLED;
> >
> > I am not merging any code patching this IRQ handling routine anymore
> > unless you thoroughly explain to me how this CONF_IRQSTATUS_MSI register
> > works (and how it is related to DW registers) and why this specific host
> > controller needs handling that is not required by any other host
> > controller relying on dw_handle_msi_irq().
>
> Unlike other DW PCIe controllers, TI implementation has a wrapper on top
> of DW core. This wrapper latches the DW core level MSI and legacy
> interrupts and then propagates it to GIC.
> PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI register is present in this TI
> wrapper which aggregates all the MSI IRQs(PCIE_MSI_INTR0_STATUS) of DW
> level. They are mapped on the MSI interrupt line of PCIe controller,
> using a single status bit in the PCIECTRL_TI_CONF_IRQSTATUS_MSI register.
>
> So, the irq handler, dra7xx_pcie_msi_irq_handler(), first needs to look
> at PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI[4] to know that its MSI IRQ and
> then call dw_handle_msi_irq() to handle individual MSI vectors.
> Driver has to make sure there are no pending vectors in DW core MSI

How can it make *sure*? And what makes the wrapper latch MSI IRQs
again?

> status register before exiting handler.
> Otherwise next MSI IRQ will not
> be latched by the wrapper.

I am sorry but I do not understand how this works - what is the
condition that makes the wrapper latch IRQs again?

This is at least racy, if not outright broken.

That count == 1000 is a symptom that something is broken in how this
driver handles IRQs, and I have the impression that we are applying
plasters on top of plasters to make it less broken than it actually is.

> > I suspect there is a code design flaw with the way this host handles
> > IRQs and we are going to find it and fix it the way it should, not with
> > any plaster like this patch.
> >
>
> I agree there has been some churn wrt this wrapper level IRQ handler.
> But, that was because hardware documentation/TRM did not match
> actual behavior and so it took some time to understand how the
> hardware is working.

How does the HW work :)? Please explain in detail how this works in HW,
then we will get to the code.

Thanks,
Lorenzo

> I have extensively tested this series on multiple problematic PCIe USB
> cards and PCIe WiFi cards over week long stress tests. And also had
> some agreement with internal hardware designers. Hardware
> documentations will also be updated.
> >
> > Lorenzo
> >
> >> +		}
> >>  		break;
> >>  	case INTA:
> >>  	case INTB:
> >> @@ -275,8 +292,6 @@ static irqreturn_t dra7xx_pcie_msi_irq_handler(int irq, void *arg)
> >>  		break;
> >>  	}
> >>
> >> -	dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI, reg);
> >> -
> >>  	return IRQ_HANDLED;
> >>  }
> >>
> >> --
> >> 2.16.1
> >>
>
> --
> Regards
> Vignesh
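
The latching behaviour questioned in this thread can be made concrete with a toy model in plain C. It is only a sketch under a stated assumption, not confirmed hardware behaviour: it assumes the wrapper sets its single MSI status bit only when the aggregated DW status goes from zero to non-zero, which is one possible reading of Vignesh's description. All names here (struct model, raise_msi(), handle_once(), stuck_after_*()) are hypothetical and exist only for this illustration; none of them are driver or kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only. Field names are stand-ins, not real registers. */
struct model {
	uint32_t dw_status;	/* stands in for PCIE_MSI_INTR0_STATUS */
	bool wrapper_msi;	/* stands in for the MSI bit in CONF_IRQSTATUS_MSI */
};

/*
 * A device raises MSI vector 'vec'.
 * ASSUMPTION: the wrapper latches only on a zero -> non-zero transition
 * of the aggregated DW status.
 */
static void raise_msi(struct model *m, unsigned int vec)
{
	bool was_idle = (m->dw_status == 0);

	assert(vec < 32);
	m->dw_status |= UINT32_C(1) << vec;
	if (was_idle)
		m->wrapper_msi = true;
}

/*
 * One pass over the DW status: snapshot it, optionally let a new vector
 * arrive while the snapshot is being serviced, then clear the serviced
 * bits. Returns false when nothing was pending (the IRQ_NONE case).
 */
static bool handle_once(struct model *m, int arriving_vec)
{
	uint32_t snapshot = m->dw_status;

	if (snapshot == 0)
		return false;
	if (arriving_vec >= 0)
		raise_msi(m, (unsigned int)arriving_vec); /* status != 0: no edge */
	m->dw_status &= ~snapshot;
	return true;
}

/* A vector is pending but the wrapper bit that would raise the IRQ is clear. */
static bool stuck(const struct model *m)
{
	return m->dw_status != 0 && !m->wrapper_msi;
}

/* Handler that acks the wrapper and makes a single pass over DW status. */
bool stuck_after_single_pass(void)
{
	struct model m = { 0, false };

	raise_msi(&m, 0);
	m.wrapper_msi = false;		/* ack the wrapper status bit */
	handle_once(&m, 1);		/* vector 1 arrives mid-handling */
	return stuck(&m);
}

/* Handler that acks the wrapper and loops until a pass finds nothing. */
bool stuck_after_drain_loop(void)
{
	struct model m = { 0, false };
	int arriving = 1;

	raise_msi(&m, 0);
	m.wrapper_msi = false;		/* ack the wrapper status bit */
	while (handle_once(&m, arriving))
		arriving = -1;		/* only the first pass sees a new vector */
	return stuck(&m);
}
```

In this model a single pass leaves vector 1 pending with the wrapper bit clear, while draining until a pass finds nothing pending does not. Whether the real wrapper behaves like the assumed zero to non-zero latch is exactly the open question in the thread, so the model says nothing about races on actual hardware.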