Subject: Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF
To: Jason Gunthorpe, Christian König
Cc: Oded Gabbay, Gal Pressman, sleybo@amazon.com, linux-rdma,
 Oded Gabbay, Christoph Hellwig, Linux Kernel Mailing List, dri-devel,
 "moderated list:DMA BUFFER SHARING FRAMEWORK", Doug Ledford,
 Tomer Tayar, amd-gfx list, Greg KH, Alex Deucher, Leon Romanovsky,
 "open list:DMA BUFFER SHARING FRAMEWORK"
References: <20210621232912.GK1096940@ziepe.ca>
 <20210622120142.GL1096940@ziepe.ca>
 <20210622152343.GO1096940@ziepe.ca>
 <3fabe8b7-7174-bf49-5ffe-26db30968a27@amd.com>
 <20210622154027.GS1096940@ziepe.ca>
 <09df4a03-d99c-3949-05b2-8b49c71a109e@amd.com>
 <20210622160538.GT1096940@ziepe.ca>
From: Christian König
Date: Wed, 23 Jun 2021 10:57:35 +0200
In-Reply-To: <20210622160538.GT1096940@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On 22.06.21 at 18:05, Jason Gunthorpe wrote:
> On Tue, Jun 22, 2021 at 05:48:10PM +0200, Christian König wrote:
>> On 22.06.21 at 17:40, Jason Gunthorpe wrote:
>>> On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:
>>>> [SNIP]
>>>> No absolutely not. NVidia GPUs work exactly the same way.
>>>>
>>>> And you have tons of similar cases in embedded and SoC systems where
>>>> intermediate memory between devices isn't directly addressable with the CPU.
>>> None of that is PCI P2P.
>>>
>>> It is all some specialty direct transfer.
>>>
>>> You can't reasonably call dma_map_resource() on non CPU mapped memory
>>> for instance, what address would you pass?
>>>
>>> Do not confuse "I am doing transfers between two HW blocks" with PCI
>>> Peer to Peer DMA transfers - the latter is a very narrow subcase.
>>>
>>>> No, just using the dma_map_resource() interface.
>>> Ik, but yes that does "work". Logan's series is better.
>> No it isn't. It makes devices depend on allocating struct pages for their
>> BARs which is not necessary nor desired.
> Which dramatically reduces the cost of establishing DMA mappings, a
> loop of dma_map_resource() is very expensive.

Yeah, but that is perfectly ok. Our BAR allocations are either in chunks
of at least 2MiB or only a single 4KiB page.

Oded might run into more performance problems, but those DMA-buf
mappings are usually set up only once.

>> How do you prevent direct I/O on those pages for example?
> GUP fails.

At least that is reassuring.

>> Allocating a struct pages has their use case, for example for exposing VRAM
>> as memory for HMM. But that is something very specific and should not limit
>> PCIe P2P DMA in general.
> Sure, but that is an ideal we are far from obtaining, and nobody wants
> to work on it prefering to do hacky hacky like this.
>
> If you believe in this then remove the scatter list from dmabuf, add a
> new set of dma_map* APIs to work on physical addresses and all the
> other stuff needed.

Yeah, that is something I totally agree on. And I actually hoped that the
new P2P work for PCIe would go in that direction, but that didn't
materialize.

But allocating struct pages for PCIe BARs, which are essentially registers
and not memory, is much more hacky than the dma_map_resource() approach.

To reiterate why I think that having struct pages for those BARs is a bad
idea: our doorbells on AMD GPUs are write and read pointers for ring
buffers.

When you write to the BAR you essentially tell the firmware that you have
either filled the ring buffer or read a bunch of it. This in turn triggers
an interrupt in the hardware/firmware, which may have been asleep.

By using PCIe P2P we want to avoid the round trip to the CPU when one
device has filled the ring buffer and another device must be woken up to
process it.

Think of it as MSI-X in reverse. Allocating struct pages for those BARs
just to work around the shortcomings of the DMA API makes no sense at all
to me.

We also do have the VRAM BAR, and for HMM we do allocate struct pages for
the address range exposed there. But this is a different use case.

Regards,
Christian.

>
> Otherwise, we have what we have and drivers don't get to opt out. This
> is why the stuff in AMDGPU was NAK'd.
>
> Jason
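
For readers unfamiliar with the doorbell mechanism described above, here is a
minimal userspace sketch of the idea: a producer fills ring entries and then
performs one "doorbell" write that publishes the new write pointer and wakes
the consumer. This is only an illustration under stated assumptions. The
struct layout, the names (struct ring, ring_doorbell_write, ring_produce,
ring_consume) and RING_SIZE are all hypothetical, it is not the amdgpu
doorbell layout, and a plain struct field stands in for the register in the
PCIe BAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; a power of two */

/* Hypothetical model of a device ring, not a real hardware layout. */
struct ring {
	uint32_t entries[RING_SIZE];
	uint32_t wptr_doorbell; /* stands in for the write-pointer doorbell in the BAR */
	uint32_t rptr;          /* consumer's read pointer */
	bool awake;             /* set when a doorbell write "interrupts" the consumer */
};

/* Model of a doorbell write: publish the write pointer and wake the consumer. */
static void ring_doorbell_write(struct ring *r, uint32_t wptr)
{
	r->wptr_doorbell = wptr;
	r->awake = true;
}

/* Producer: queue up to n entries, then ring the doorbell once.
 * Returns the number of entries actually queued. */
static uint32_t ring_produce(struct ring *r, const uint32_t *data, uint32_t n)
{
	uint32_t wptr = r->wptr_doorbell;
	uint32_t queued = 0;

	/* Unsigned subtraction gives the fill level even after wraparound. */
	while (queued < n && wptr - r->rptr < RING_SIZE) {
		r->entries[wptr % RING_SIZE] = data[queued++];
		wptr++;
	}
	if (queued)
		ring_doorbell_write(r, wptr);
	return queued;
}

/* Consumer: does nothing while asleep; once woken by a doorbell write it
 * drains the ring and goes back to sleep. Returns the sum of the entries. */
static uint64_t ring_consume(struct ring *r)
{
	uint64_t sum = 0;

	if (!r->awake)
		return 0;
	assert(r->wptr_doorbell - r->rptr <= RING_SIZE);
	while (r->rptr != r->wptr_doorbell) {
		sum += r->entries[r->rptr % RING_SIZE];
		r->rptr++;
	}
	r->awake = false;
	return sum;
}
```

In the P2P case discussed in this thread, the call to ring_doorbell_write()
would correspond to one device writing directly into the other device's BAR,
so no CPU round trip is needed to wake the consumer.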