Date: Fri, 30 Mar 2018 15:45:19 -0400
From: Jerome Glisse <jglisse@redhat.com>
To: Logan Gunthorpe
Cc: Christian König, Christoph Hellwig, Will Davis, Joerg Roedel,
    linaro-mm-sig@lists.linaro.org, amd-gfx@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
    linux-media@vger.kernel.org, Bjorn Helgaas
Subject: Re: [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()
Message-ID: <20180330194519.GC3198@redhat.com>
On Fri, Mar 30, 2018 at 12:46:42PM -0600, Logan Gunthorpe wrote:
>
>
> On 29/03/18 07:58 PM, Jerome Glisse wrote:
> > On Thu, Mar 29, 2018 at 10:25:52AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >> On 29/03/18 10:10 AM, Christian König wrote:
> >>> Why not? I mean the dma_map_resource() function is for P2P while other
> >>> dma_map_* functions are only for system memory.
> >>
> >> Oh, hmm, I wasn't aware dma_map_resource was exclusively for mapping
> >> P2P. Though it's a bit odd seeing we've been working under the
> >> assumption that PCI P2P is different as it has to translate the PCI bus
> >> address. Whereas P2P for devices on other buses is a big unknown.
> >
> > dma_map_resource() is the right API (though its current implementation
> > is filled with x86 assumptions). So I would argue that an arch can decide
> > to implement it or simply return a DMA error address, which triggers the
> > fallback path in the caller (at least for GPU drivers). An SG variant can
> > be added on top.
> >
> > dma_map_resource() is the right API because it has all the necessary
> > information. It uses the CPU physical address as the common address
> > "language": with the CPU physical address of the PCIe BAR to map to
> > another device, you can find the corresponding bus address from the
> > IOMMU code (a NOP on x86). So dma_map_resource() knows both the source
> > device which exports its PCIe BAR and the destination device.
>
> Well, as it is today, it doesn't look very sane to me. The default is to
> just return the physical address if the architecture doesn't support it.
> So if someone tries this on an arch that hasn't added itself to return
> an error, they're very likely going to just end up DMAing to an invalid
> address and losing the data or causing a machine check.

Looking at the upstream code, it seems that the x86 bits never made it
upstream, and thus what is upstream now is only for ARM. See [1] for the
x86 code. Dunno what happened; I was convinced it got merged. So yes, the
current code is broken on x86. Ccing Joerg, maybe he remembers what
happened there.

[1] https://lwn.net/Articles/646605/
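To make the calling convention described above concrete, here is a rough,
untested sketch of the importer side. map_peer_bar()/unmap_peer_bar() and
their parameters are made-up names for illustration; dma_map_resource(),
dma_unmap_resource(), dma_mapping_error() and pci_resource_start() are the
existing kernel APIs:

#include <linux/pci.h>
#include <linux/dma-mapping.h>

/*
 * Map the exporter's PCIe BAR, identified by its CPU physical address,
 * for DMA by dma_dev.  If the arch cannot express the peer mapping it
 * reports a mapping error, and the caller takes its fallback path
 * instead of DMAing to a bogus address.
 */
static int map_peer_bar(struct device *dma_dev, struct pci_dev *exporter,
			int bar, size_t size, dma_addr_t *dma_addr)
{
	phys_addr_t phys = pci_resource_start(exporter, bar);

	*dma_addr = dma_map_resource(dma_dev, phys, size,
				     DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(dma_dev, *dma_addr))
		return -EIO;

	return 0;
}

static void unmap_peer_bar(struct device *dma_dev, dma_addr_t dma_addr,
			   size_t size)
{
	dma_unmap_resource(dma_dev, dma_addr, size, DMA_BIDIRECTIONAL, 0);
}

The point is simply that an arch which cannot do the peer mapping returns
a mapping error, and the caller (e.g. a GPU driver) falls back to whatever
it already does today, such as bouncing through system memory.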
>
> Furthermore, the API does not have all the information it needs to do
> sane things. A phys_addr_t doesn't really tell you anything about the
> memory behind it or what needs to be done with it. For example, on some
> arm64 implementations, if the physical address points to a PCI BAR and
> that BAR is behind a switch with the DMA device, then the address must be
> converted to the PCI bus address. On the other hand, if it's a physical
> address of a device in an SoC, it might need to be handled in a
> completely different way. And right now all the implementations I can
> find seem to just assume that phys_addr_t points to regular memory and
> can be treated as such.

Given it is currently only used by ARM folks, it appears to at least work
for them (tm) :) Note that Christian is doing this in a PCIe-only context,
and again dma_map_resource() can easily figure out whether the address is
PCIe or something else. Note that the exporter exports the CPU bus address.
So again, dma_map_resource() has all the information it will ever need; if
the peer-to-peer is fundamentally un-doable it can return a DMA error, and
it is up to the caller to handle this, just like GPU code does. Do you
claim that dma_map_resource() is missing any information?

>
> This is one of the reasons that, based on feedback, our work went from
> being general P2P with any device to being restricted to only P2P with
> PCI devices. The dream that we can just grab the physical address of any
> device and use it in a DMA request is simply not realistic.

I agree and understand that, but on platforms where such a feature makes
sense this will work. For me it is PowerPC and x86, and given PowerPC has
CAPI, which has far more advanced features when it comes to peer-to-peer,
I don't see something more basic not working. On x86, Intel is a bit of a
lone wolf; dunno if they're gonna support this use case pro-actively. AMD
definitely will.

If you feel that dma_map_resource() can be interpreted too broadly,
stricter phrasing/wording can be added to it so people better understand
its limitations and gotchas.

Cheers,
Jérôme