Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp3634192rwb; Fri, 30 Sep 2022 06:23:17 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4I34uN4podIB8lRQkHF9/xPgKatHmIGA/5RKWaCXXph0iCb0AmMh5FKKBjlLLgHt+A6iFz X-Received: by 2002:a17:90b:1d8c:b0:202:abf5:4b21 with SMTP id pf12-20020a17090b1d8c00b00202abf54b21mr22674190pjb.162.1664544197447; Fri, 30 Sep 2022 06:23:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664544197; cv=none; d=google.com; s=arc-20160816; b=KjW+/jSh+mMUMiS1QH0AAcDvJmcB2ads3aH7Cy80FYjAN4BcWj4HnNKNlJIRd3AZwI CoqNcngAm2qBBs9odPo6jVEZNoEjwcTzgoDiQIeyT3miJRIDkeJTN6ItJFQUeZy1ZXAE YPREQojPD7e5s32lRd+XhAT/Tl8k41BJo1UgMfAnMsXcWHfgoliVbiKUHvhJCT5LAGxJ +S26Dgvf7G8N4NfR9Qmlk1cmcvUenNxMNgW3jPV/Ub4iexqNBOrhkTI9fsypBRnJfHx+ U618lJJQfRu5U0/jd/44uuNJh/DcCC+P1WEKNQ+NhV5bOIvnPqukC0QhzCTQsZu80tXd EwPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=xurImKGcvSyufer6C0f4fkhZtri5xnvHv13d6F9fFl8=; b=CqRLdFY/hNRkevv+UZxRhxvwAaCnjaWrUnfTxhJuHnImvCCE1G1w9isyM9Cg5SB/5n CADhmKXijyjrwJWjGRiJ1eOSPdwDEJVMoLuukFofokGoizT2bA3Ke1Aj2g4tFPiHj2Og psPldiWum5Q0NL1ox1WZpr/q+XjcVQQ4KPezqDicICRDtkuuYuAlqqQFp40r2GP+fX8N mjrXDzht9r8P3WXJlThoHHrUghiD2zWujCWbNHk9rE+Z3mmqjqip2td5KCUSe3lJkcDW 60WY/ITjudzZ8ByfMEHRd9Z0oHTvT3+z8vzfHDdvgBs9RtODlSLo1zkKWirJ51ReD4Ny N+aw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=dqazimzW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i12-20020a17090332cc00b00176953fee67si3199919plr.86.2022.09.30.06.23.05; Fri, 30 Sep 2022 06:23:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=dqazimzW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230363AbiI3M5p (ORCPT + 99 others); Fri, 30 Sep 2022 08:57:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229694AbiI3M5n (ORCPT ); Fri, 30 Sep 2022 08:57:43 -0400 Received: from mail-lf1-x12f.google.com (mail-lf1-x12f.google.com [IPv6:2a00:1450:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0BCEE1739E5; Fri, 30 Sep 2022 05:57:41 -0700 (PDT) Received: by mail-lf1-x12f.google.com with SMTP id s6so6733832lfo.7; Fri, 30 Sep 2022 05:57:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date; bh=xurImKGcvSyufer6C0f4fkhZtri5xnvHv13d6F9fFl8=; b=dqazimzWjongcSnF3P70uk5+yYl5luYnGcP+TCo6NiydkrtRltGTy6hXcHZVdE0pQr 2hvtVS1n1S4ziYmLk2dulA4FPvUoOKWo4DR6VfQSthdDlsHUF1LWTbDemDmn0KbaYPma HbNwDSkymFM6bjKr6FmzTw3/y+eXMxQBHRYc4q90aLVRhOSswrL9/chCUxJcR9b91jU0 DrNbzde1Jd3EooSPJ1HqoirHUyEqvsIwvK1l0k4r6pq3x8rcTUVD7mDWbYFp5AFgieqe C5hJtDdAiI+dYpktYrcVPqOEqoGrH3IacHQgIgFT1rJWeX60ATHbkacLqA4sKFNFnanq 06QQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date; bh=xurImKGcvSyufer6C0f4fkhZtri5xnvHv13d6F9fFl8=; b=yw9s3O44NbH1sy0pgm+w4t8J1nysYKPVSsRKOJ8EJqYAjRsuZMs02d6oc8Sn6cnqjL mPX0NncF7kehgCkZRAFYwxeFstvBjKXfZx87HRrv9q5NwvU6E66qYBBpk16wuTsL06xP BolEMAjOHbxOli/m0M7yHyCUw8tQqAeIHiGW/2Uu4GZncoOTlFSpdvXkIRIfNCFYFSUd X5rGZbylMUzcsmLnoOYSvPSJtmZidtYBO9murilRu2+RrSLC3ULJ/l3na3IWel45kw7y sEt57swZimVnZZ51K77VCW0U3eYKUECKoys/vbzt+KR18a1q3sICkPXn5HNRYj6Ag+uR Y67w== X-Gm-Message-State: ACrzQf3iJHh4BLpJHvriq12n8O+Etry1U3OkaN+XrrYs9hDCQNb3oBlx BZJur9Fkejb2cpOZFMXKXso= X-Received: by 2002:a05:6512:b1c:b0:4a1:baad:1e6a with SMTP id w28-20020a0565120b1c00b004a1baad1e6amr3138267lfu.466.1664542659059; Fri, 30 Sep 2022 05:57:39 -0700 (PDT) Received: from mobilestation ([95.79.133.202]) by smtp.gmail.com with ESMTPSA id p18-20020a2eb992000000b0026c47426cd0sm142667ljp.140.2022.09.30.05.57.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 30 Sep 2022 05:57:38 -0700 (PDT) Date: Fri, 30 Sep 2022 15:57:36 +0300 From: Serge Semin To: Robin Murphy Cc: Will McVicker , Jingoo Han , Gustavo Pimentel , Lorenzo Pieralisi , Rob Herring , Krzysztof =?utf-8?Q?Wilczy=C5=84ski?= , Bjorn Helgaas , kernel-team@android.com, Vidya Sagar , Christoph Hellwig , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, "Isaac J . Manjarres" Subject: Re: [PATCH v5 1/2] PCI: dwc: Drop dependency on ZONE_DMA32 Message-ID: <20220930125736.rovbmgilxv3bzvzc@mobilestation> References: <20220825185026.3816331-1-willmcvicker@google.com> <20220825185026.3816331-2-willmcvicker@google.com> <20220928114136.4yvtfnrcril3jkgg@mobilestation> <4dc31a63-00b1-f379-c5ac-7dc9425937f4@arm.com> <20220929193241.pdjj5ifm7vgpff42@mobilestation> <6b919ea9-2f87-65ca-8286-5b4baa6e1c3c@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6b919ea9-2f87-65ca-8286-5b4baa6e1c3c@arm.com> X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Sep 30, 2022 at 12:01:58PM +0100, Robin Murphy wrote: > On 2022-09-29 20:32, Serge Semin wrote: > > On Thu, Sep 29, 2022 at 07:25:03PM +0100, Robin Murphy wrote: > > > On 2022-09-28 12:41, Serge Semin wrote: > > > > On Thu, Aug 25, 2022 at 06:50:24PM +0000, Will McVicker wrote: > > > > > Re-work the msi_msg DMA allocation logic to use dmam_alloc_coherent() which > > > > > uses the coherent DMA mask to try to return an allocation within the DMA > > > > > mask limits. With that, we now can drop the msi_page parameter in struct > > > > > dw_pcie_rp. This allows kernel configurations that disable ZONE_DMA32 to > > > > > continue supporting a 32-bit DMA mask. Without this patch, the PCIe host > > > > > device will fail to probe when ZONE_DMA32 is disabled. > > > > > > > > As Rob already said here > > > > https://lore.kernel.org/all/CAL_JsqJh=d-B51b6yPBRq0tOwbChN=AFPr-a19U1QdQZAE7c1A@mail.gmail.com/ > > > > and I mentioned in this thread > > > > https://lore.kernel.org/linux-pci/20220912000211.ct6asuhhmnatje5e@mobilestation/ > > > > DW PCIe MSI doesn't cause any DMA due to the way the iMSI-RX engine is > > > > designed. So reserving any real system memory is a waste of one in > > > > this case. Reserving DMA-coherent even more inappropriate since it > > > > can be expensive on some platforms (see note in Part Ia of > > > > Documentation/core-api/dma-api.rst). For instance on MIPS32 with > > > > non-corehent common DMA. > > > > > > > > This has been discussed before - in general it is difficult to pick an > > > arbitrary MSI address that is *guaranteed* not to overlap any valid DMA > > > address that somebody may try to use later. However there is a very easy way > > > to guarantee that the DMA API won't give anyone a particular DMA address, > > > which is to get an address directly from the DMA API and keep it. Yes, that > > > can technically be done with a streaming mapping *if* you already have some > > > memory allocated in a suitable physical location, but coherent allocations > > > are even more foolproof, simpler to clean up (particularly with devres), and > > > unlikely to be an issue on relevant platforms (do any MIPS32 systems use > > > this driver?) > > > > My patchset adds the DW PCIe RP controller support on MIPS32 arch: > > https://lore.kernel.org/linux-pci/20220822184701.25246-21-Sergey.Semin@baikalelectronics.ru/ > > > > > > > > > > Fixes: 35797e672ff0 ("PCI: dwc: Fix MSI msi_msg DMA mapping") > > > > > Reported-by: Isaac J. Manjarres > > > > > Signed-off-by: Will McVicker > > > > > Acked-by: Jingoo Han > > > > > Reviewed-by: Rob Herring > > > > > --- > > > > > .../pci/controller/dwc/pcie-designware-host.c | 28 +++++-------------- > > > > > drivers/pci/controller/dwc/pcie-designware.h | 1 - > > > > > 2 files changed, 7 insertions(+), 22 deletions(-) > > > > > > > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c > > > > > index 7746f94a715f..39f3b37d4033 100644 > > > > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c > > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c > > > > > @@ -267,15 +267,6 @@ static void dw_pcie_free_msi(struct dw_pcie_rp *pp) > > > > > irq_domain_remove(pp->msi_domain); > > > > > irq_domain_remove(pp->irq_domain); > > > > > - > > > > > - if (pp->msi_data) { > > > > > - struct dw_pcie *pci = to_dw_pcie_from_pp(pp); > > > > > - struct device *dev = pci->dev; > > > > > - > > > > > - dma_unmap_page(dev, pp->msi_data, PAGE_SIZE, DMA_FROM_DEVICE); > > > > > - if (pp->msi_page) > > > > > - __free_page(pp->msi_page); > > > > > - } > > > > > } > > > > > static void dw_pcie_msi_init(struct dw_pcie_rp *pp) > > > > > @@ -336,6 +327,7 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp) > > > > > struct dw_pcie *pci = to_dw_pcie_from_pp(pp); > > > > > struct device *dev = pci->dev; > > > > > struct platform_device *pdev = to_platform_device(dev); > > > > > + u64 *msi_vaddr; > > > > > int ret; > > > > > u32 ctrl, num_ctrls; > > > > > @@ -375,22 +367,16 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp) > > > > > dw_chained_msi_isr, pp); > > > > > } > > > > > > > > > - ret = dma_set_mask(dev, DMA_BIT_MASK(32)); > > > > > + ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); > > > > > > > > This has been redundant in the first place since none of the DW PCIe > > > > low-level drivers update the mask, and it's of 32-bits wide by default > > > > anyway: > > > > https://elixir.bootlin.com/linux/latest/source/drivers/of/platform.c#L167 > > > > > > > > No, in general drivers should always explicitly set their mask(s) and check > > > the return value to make sure DMA is possible at all before trying any other > > > DMA API calls. There's no guarantee that the default mask is usable (e.g. > > > some systems don't have any 32-bit addressable RAM), or that it's even > > > always 32 bits (due to crufty reasons of something of_dma_configure() tried > > > to do a long time ago). > > > > Suppose you are right and DMA-mask should be always set before any > > mapping. What do you suggest to do in this case? (1) The code above > > overrides the real DMA-mask which could be set by the platform > > drivers, which in its turn are normally aware of the device DMA > > capabilities. > > I am right. Appropriate DMA API usage as defined by the DMA API maintainers > is not a matter of supposition. I literally just explained right there why > drivers can't blindly assume the default mask is usable on modern systems > (yes, it was different 20 years ago when system topologies were simpler). > > However, having now gone and looked at the whole driver rather than unclear > fragments of patch context, the code here *is* technically wrong. I've been > mistakenly thinking all along that this was operating on the PCI device > because I know that's what it *should* be doing, and seeing misleading > things like "dev = pci->dev" falsely affirmed that assumption that it would > be correct because it's been around for ages. > AFAIU the correct PCI device > won't actually exist until we've got far enough through pci_host_probe(), so > I'm not sure how to easily solve this :/ Right. The code affected by the subject patch has nothing to do with the real PCI devices. The DMA-mask is set to the DW PCIe Host controller platform device in order to force a page being allocated within 32-bit address space. That's it. Here is a log of the related changes: 0. Initially a GFP_KERNEL-based page was allocated for the MSI buffer. It may cause having the DMA/PCIe-address above 4GB, which wouldn't work for the PCIe peripherals with only 32-bit MSI capability. Though nobody bothered back then. 1. 07940c369a6b ("PCI: dwc: Fix MSI page leakage in suspend/resume") After this commit nothing really has changed, but instead of allocating the MSI-message pseudo-buffer turned to be embedded into the private data. It could be allocated at any base address with no actual limitation (because private data is kmalloc'ed). 2. 660c486590aa ("PCI: dwc: Set 32-bit DMA mask for MSI target address allocation") Someone found out that some devices failed to deliver MSI to the address above 4GB of PCIe address space and fixed the problem by force-setting the DMA-mask to being 32-bit before mapping the MSI buffer. It indeed fixed the problem, but the actual buffer still left being allocated from any address space. Instead, the mapping procedure just bounced the buffer to 4GB space. So basically the solution was very clumsy since turns a bounce buffer being reserved forever. 3. 35797e672ff0 PCI: dwc: Fix MSI msi_msg DMA mapping @William basically got things back to (0) but instead of GFP_KERNEL the page was allocated from GFP_DMA32. At this stage he should have dropped the DMA-mask setting too since the buffer was already guaranteed to be within 4GB space, but he didn't. So now we have what we have. The DMA-mask is pointlessly changed for something not really implying any DMA. That's why I insisted on dropping it at the very least. Another reason I thought was also appropriate was the default DMA-mask being set to 32bits anyway. But you said we shouldn't rely on the default DMA-mask setting. So ok, it doesn't count then. But it doesn't change the uselessness of the DMA-mask change in the current driver. > AFAIU the correct PCI device > won't actually exist until we've got far enough through pci_host_probe(), so > I'm not sure how to easily solve this :/ Walk over all PCIe devices detected on the PCIe-bus. Check their MSI-capability flags. If any of them have no 64-bit MSI flag set, then make sure the MSI-base address is allocated within 4GB memory region. It isn't that easy to implement though... > > Of course *this* patch doesn't change any of that either, so it's no worse > than the existing code and I don't see that dropping it helps you at all; > the current driver is already trampling your 64-bit mask back to 32 bits Yes, and by this pathset @William intend to fix the DMA-mask-override behaviour by using the dma_alloc_coherent() method. So any platform-specific DMA-mask setting will be overwritten, and the DMA-mask setting won't be able to be moved/dropped due to the dma_alloc_coherent() method usage. I have added a DW eDMA-engine support to the DW PCIe driver (you've already seen my patches) and the engine initialization is supposed to be performed after any basic initializations like CSRs mapping, data allocations, MSI, etc. Since DMA is performed by the controller itself it's required to have a correct DMA-mask set to the DW PCIe platform device otherwise any consequent mapping will be bounce buffered to the lowest 4GB even though the corresponding platform can support more than 4GB of memory (even our MIPS32 can) with DW eDMA easily reaching that memory. What would help me in this case if the MSI-buffer allocation procedure wouldn't change the device DMA-mask. As an alternative to completely dropping the DMA-mask setting, the DMA-mask setup process could be moved to the low-level platform device drivers. It would be even more justified since the platform-specific device capabilities (like DW PCIe AXI-interface address-bus width) are unknown in the generic code. As another alternative I could override the DMA-mask within the DW eDMA probe procedure. But that would make things more complicated than relying on the low-level platform drivers doing that. > and > reserving the doorbell address in the wrong DMA address space (modulo the > other dma-ranges bug which also took far too long to figure out). Actually current driver (without William patch) reserve the doorbell address in the correct DMA address space (if we don't take the dma-ranges settings into account). It works as expected in case if the PCIe<->CPU space has one-on-one mapping (which is true in the most of the cases). The only thing which is wrong is the pointless DMA-mask update. I could have easily dropped it in my patchset. But after the @William patchset is applied I won't be able to do that due to using the dma_alloc_coherent() here. > At this > point I'd rather keep it since getting rid of the __GFP_DMA32 abuse is > objectively good. If losing one page of coherent memory is a measurably > significant problem for T1 once the other issues are worked out and that > series lands, then you're welcome to propose a change on top (but I would > prefer that all the drivers using this trick are changed consistently). Regarding DMA-coherent allocation. I am not happy with losing a whole page of the dma-coherent memory, but we can live with that. What give additional difficulty for our eDMA-patches is the DMA-mask override. If you still insist on preserving the @William patchset as it is, where do you suggest for me to update the DMA-mask if the low-level driver won't be suitable for that anymore? -Sergey > > Thanks, > Robin. > > > But in this case due to override afterwards any buffers > > above 4GB mapping will cause using the bounce buffers. (2) It's set > > here for something which isn't actual DMA. So to speak on one side is > > this patchset which overrides the mask for something which isn't DMA, > > and there are another patchsets: > > https://lore.kernel.org/linux-pci/20220822184701.25246-1-Sergey.Semin@baikalelectronics.ru/ > > and > > https://lore.kernel.org/linux-pci/20220728142841.12305-1-Sergey.Semin@baikalelectronics.ru/ > > which add the real DMA support to DW PCIe driver and for which setting > > the real DMA-mask is crucial. What do you suggest? Setting the mask > > twice: before allocating MSI-buffer and afterwards for the sake of > > eDMA buffers mapping? Moving DMA-mask setting from the generic DW PCIe > > code to the platform drivers? > > > > -Sergey > > > > > > > > Thanks, > > > Robin. > > > > > > > > if (ret) > > > > > dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n"); > > > > > - pp->msi_page = alloc_page(GFP_DMA32); > > > > > - pp->msi_data = dma_map_page(dev, pp->msi_page, 0, > > > > > - PAGE_SIZE, DMA_FROM_DEVICE); > > > > > - ret = dma_mapping_error(dev, pp->msi_data); > > > > > - if (ret) { > > > > > - dev_err(pci->dev, "Failed to map MSI data\n"); > > > > > - __free_page(pp->msi_page); > > > > > - pp->msi_page = NULL; > > > > > - pp->msi_data = 0; > > > > > + msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data, > > > > > + GFP_KERNEL); > > > > > > > > Changing the whole device DMA-mask due to something that doesn't > > > > perform seems inappropriate. I'd suggest to preserve the ZONE_DMA32 > > > > here until there is something like suggested by @Robin > > > > https://lore.kernel.org/linux-pci/1e63a581-14ae-b4b5-a5bf-ca8f09c33af6@arm.com/ > > > > in the last paragraph is implemented. Especially seeing there still > > > > common drivers in kernel which still rely on that zone. > > > > > > > > -Sergey > > > > > > > > > + if (!msi_vaddr) { > > > > > + dev_err(dev, "Failed to alloc and map MSI data\n"); > > > > > dw_pcie_free_msi(pp); > > > > > - > > > > > - return ret; > > > > > + return -ENOMEM; > > > > > } > > > > > return 0; > > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h > > > > > index 09b887093a84..a871ae7eb59e 100644 > > > > > --- a/drivers/pci/controller/dwc/pcie-designware.h > > > > > +++ b/drivers/pci/controller/dwc/pcie-designware.h > > > > > @@ -243,7 +243,6 @@ struct dw_pcie_rp { > > > > > struct irq_domain *irq_domain; > > > > > struct irq_domain *msi_domain; > > > > > dma_addr_t msi_data; > > > > > - struct page *msi_page; > > > > > struct irq_chip *msi_irq_chip; > > > > > u32 num_vectors; > > > > > u32 irq_mask[MAX_MSI_CTRLS]; > > > > > -- > > > > > 2.37.2.672.g94769d06f0-goog > > > > > > > > > >