Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp225445imu; Mon, 19 Nov 2018 21:18:44 -0800 (PST) X-Google-Smtp-Source: AJdET5cmBlX+6XDnbZvk2LRRQWpncJ4F7tEJj5h9VrJKumXZrdSLxpANG3Y++m9/y+16hQZE6YI3 X-Received: by 2002:a62:4251:: with SMTP id p78-v6mr758499pfa.72.1542691124837; Mon, 19 Nov 2018 21:18:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542691124; cv=none; d=google.com; s=arc-20160816; b=yWMLcHZqiE8L4FX0S14PB50bnM35a+kFSZUxhHquhFITzfZXF/GZJ30KG01RcJbC5m vD7727bfjF2w4laHw5o98h01gB1rVDWXv4hB9audYIvbclcf+uuv8/mTwC+Ni4qRjYR0 4xLfnaDDd6Ip71zUEX4PgBcEpXyI2pY2D+c1gk4QA111/cfqSlyaeV7KvkEjotEewcgE lv1JrKr58WDzUE6Xl0HyrkSisT2m3tfU8rNB/bUvCsuFU083H/weQP6llqIgNdguQ8W2 CkoTQgpGKCs8cdH9NW8+p0s5obdQxZe1U29WN+b3a/A6gIqXF0FsZvDFWJlNzhihiIk3 i7mw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=i8cip6iFOsrgbSLvZ0oqqTRA/Swknu+QPJ/MEIcmu+M=; b=b0NY6H153sbZcAvWRZwEaI2Xn3B8W+NA5zsXKCLC+37/x74nz2WzK6FxbNLsgIZObG L4+SY3FMGNdGrN2dRVtuLgOBiPc6aBVjS3bUGZezP9ZJRJyJslyHFhz0s+LG5xybaJv5 9Kqxl0aNODbhRBECVbWLJh8FVoULixdyXHI9Er8qq2JpXNJXGnQuSHofCKO+skvYqvAx mSEmTWeO5OPXC6g7J7YyXZnqxGYsGi7GpTILbGgmks80Xb/BBBfXCK3sM7dmw6ybra3T oxR01hapI1cTe0ZYhB+8shd2LwPAjkzmvCn7oPIoQyb4estU1GH7gccQvGQ4m51ql+5d XSGA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=QOITQS20; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f22si31343166pgm.81.2018.11.19.21.18.29; Mon, 19 Nov 2018 21:18:44 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=QOITQS20; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731472AbeKTPpE (ORCPT + 99 others); Tue, 20 Nov 2018 10:45:04 -0500 Received: from mail.kernel.org ([198.145.29.99]:47816 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726902AbeKTPpE (ORCPT ); Tue, 20 Nov 2018 10:45:04 -0500 Received: from localhost (unknown [77.138.135.184]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id ACB292075B; Tue, 20 Nov 2018 05:17:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1542691069; bh=sSs4NBjtggHh+00w/aU3U3/f5Gi3zhcKsoyFM8wyybk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=QOITQS201IVvPCnW/caBs/hcoL4+5OEgGc4gQZ/moeI4CsUUV28665KMEKBTpb/qz HY+3f83NEMwRTx6VQXXUSgBiofsm2OlmXuQhO+AXYT4ewbyPfBZp963UsaGIhAOCBo quw5a/lTTIlZy40ds80+jC1EKQBXiFGnVlz2yIRw= Date: Tue, 20 Nov 2018 07:17:44 +0200 From: Leon Romanovsky To: Kenneth Lee Cc: Jason Gunthorpe , Kenneth Lee , Tim Sell , linux-doc@vger.kernel.org, Alexander Shishkin , Zaibo Xu , zhangfei.gao@foxmail.com, linuxarm@huawei.com, haojian.zhuang@linaro.org, Christoph Lameter , Hao Fang , Gavin Schenk , RDMA mailing list , Zhou Wang , Doug Ledford , Uwe =?iso-8859-1?Q?Kleine-K=F6nig?= , David Kershner , Johan Hovold , Cyrille Pitchen , Sagar Dharia , Jens Axboe , guodong.xu@linaro.org, linux-netdev , Randy Dunlap , linux-kernel@vger.kernel.org, Vinod Koul , linux-crypto@vger.kernel.org, Philippe Ombredanne , Sanyog Kale , "David S. Miller" , linux-accelerators@lists.ozlabs.org Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce Message-ID: <20181120051743.GD25178@mtr-leonro.mtl.com> References: <20181112075807.9291-1-nek.in.cn@gmail.com> <20181112075807.9291-2-nek.in.cn@gmail.com> <20181113002354.GO3695@mtr-leonro.mtl.com> <95310df4-b32c-42f0-c750-3ad5eb89b3dd@gmail.com> <20181114160017.GI3759@mtr-leonro.mtl.com> <20181115085109.GD157308@Turing-Arch-b> <20181115145455.GN3759@mtr-leonro.mtl.com> <20181119091405.GE157308@Turing-Arch-b> <20181119184954.GB4890@ziepe.ca> <20181120030702.GH157308@Turing-Arch-b> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="g7w8+K/95kPelPD2" Content-Disposition: inline In-Reply-To: <20181120030702.GH157308@Turing-Arch-b> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --g7w8+K/95kPelPD2 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote: > On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote: > > Date: Mon, 19 Nov 2018 11:49:54 -0700 > > From: Jason Gunthorpe > > To: Kenneth Lee > > CC: Leon Romanovsky , Kenneth Lee , > > Tim Sell , linux-doc@vger.kernel.org, Alexand= er > > Shishkin , Zaibo Xu > > , zhangfei.gao@foxmail.com, linuxarm@huawei.com, > > haojian.zhuang@linaro.org, Christoph Lameter , Hao Fang > > , Gavin Schenk , RDMA mai= ling > > list , Zhou Wang , > > Doug Ledford , Uwe Kleine-K=F6nig > > , David Kershner > > , Johan Hovold , Cyrille > > Pitchen , Sagar Dharia > > , Jens Axboe , > > guodong.xu@linaro.org, linux-netdev , Randy Du= nlap > > , linux-kernel@vger.kernel.org, Vinod Koul > > , linux-crypto@vger.kernel.org, Philippe Ombredanne > > , Sanyog Kale , "David = S. > > Miller" , linux-accelerators@lists.ozlabs.org > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce > > User-Agent: Mutt/1.9.4 (2018-02-28) > > Message-ID: <20181119184954.GB4890@ziepe.ca> > > > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote: > > > > > If the hardware cannot share page table with the CPU, we then need to= have > > > some way to change the device page table. This is what happen in ODP.= It > > > invalidates the page table in device upon mmu_notifier call back. But= this cannot > > > solve the COW problem: if the user process A share a page P with devi= ce, and A > > > forks a new process B, and it continue to write to the page. By COW, = the > > > process B will keep the page P, while A will get a new page P'. But y= ou have > > > no way to let the device know it should use P' rather than P. > > > > Is this true? I thought mmu_notifiers covered all these cases. > > > > The mm_notifier for A should fire if B causes the physical address of > > A's pages to change via COW. > > > > And this causes the device page tables to re-synchronize. > > I don't see such code. The current do_cow_fault() implemenation has nothi= ng to > do with mm_notifer. > > > > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it = support > > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you= don't need > > > to write any code for that. Because it has been done by IOMMU framewo= rk. If it > > > > Looks like the IOMMU code uses mmu_notifier, so it is identical to > > IB's ODP. The only difference is that IB tends to have the IOMMU page > > table in the device, not in the CPU. > > > > The only case I know if that is different is the new-fangled CAPI > > stuff where the IOMMU can directly use the CPU's page table and the > > IOMMU page table (in device or CPU) is eliminated. > > > > Yes. We are not focusing on the current implementation. As mentioned in t= he > cover letter. We are expecting Jean Philips' SVA patch: > git://linux-arm.org/linux-jpb. > > > Anyhow, I don't think a single instance of hardware should justify an > > entire new subsystem. Subsystems are hard to make and without multiple > > hardware examples there is no way to expect that it would cover any > > future use cases. > > Yes. That's our first expectation. We can keep it with our driver. But be= cause > there is no user driver support for any accelerator in mainline kernel. E= ven the > well known QuickAssit has to be maintained out of tree. So we try to see = if > people is interested in working together to solve the problem. > > > > > If all your driver needs is to mmap some PCI bar space, route > > interrupts and do DMA mapping then mediated VFIO is probably a good > > choice. > > Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opini= on and > try not to add complexity to the mm subsystem. > > > > > If it needs to do a bunch of other stuff, not related to PCI bar > > space, interrupts and DMA mapping (ie special code for compression, > > crypto, AI, whatever) then you should probably do what Jerome said and > > make a drivers/char/hisillicon_foo_bar.c that exposes just what your > > hardware does. > > Yes. If no other accelerator driver writer is interested. That is the > expectation:) > > But we really like to have a public solution here. Consider this scenario: > > You create some connections (queues) to NIC, RSA, and AI engine. Then you= got > data direct from the NIC and pass the pointer to RSA engine for decryptio= n. The > CPU then finish some data taking or operation and then pass through to th= e AI > engine for CNN calculation....This will need a place to maintain the same > address space by some means. You are using NIC terminology, in the documentation, you wrote that it is n= eeded for DPDK use and I don't really understand, why do we need another shiny new interface for DPDK. > > It is not complex, but it is helpful. > > > > > If you have networking involved in here then consider RDMA, > > particularly if this functionality is already part of the same > > hardware that the hns infiniband driver is servicing. > > > > 'computational MRs' are a reasonable approach to a side-car offload of > > already existing RDMA support. > > OK. Thanks. I will spend some time on it. But personally, I really don't = like > RDMA's complexity. I cannot even try one single function without a...some > expensive hardwares and complexity connection in the lab. This is not lik= e a > open source way. It is not very accurate. We have RXE driver which is virtual RDMA device which is implemented purely in SW. It struggles from bad performance and sporadic failures, but it is enough to try RDMA on your laptop in VM. Thanks > > > > > Jason > --g7w8+K/95kPelPD2 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBAgAGBQJb85j3AAoJEORje4g2clinCzcQALTEmtC5tWhaxj2FyJvJYyOv S1bTcT7nO5VNULTjooacNWtFUJpuoSfUCQOEguiB/37F2G27Qd8uR8tBnVheBCeh Gho5oBNDOJNfgGGJ09SWbTZVOycfUyWAl4cMeLCSNJ6lb3iAmu5EAE0dySZf2dJW PMtnzuQM9hiqSeqWnSna9BJRtQI8Y8oWmoaT0aYVkTwsnB1Blw+g/r1FsQI1gTt3 3lYbmOf1EgdU3WHsd3FTRrvkmkkkCDPl/6cCwo9ilNbMmAzFf7H/Fi+yNwYzhAs5 3BN5EKv4ZdImI654067p2z9K5CjvEC24mGSGNBWPTESK3UYR9r+sdSzeSJNesDMi mVpk2dmqsvnn/JX78whbhsuvvFWyOk0xEfawzPQio8Ya2tnrJD8nW+6i58Lok1EF /FKd2XkvDrE2Y21UVUtDyPFHYdAsiBZCL/VX/Vo/eBYsYOzkZW69npCI9dnSNFbO yGrnpTpsc2iG8YwwKzPbVtGxjOGmEOauPJo4O5aj+xcnsrg7RKJ9sncw94ltEtoJ U4cKaWXB/okFbBi7WHWfvYGo6dsa3NV2ULX5V1vF9iEkCDU+ECQSSrtJeAGwdyd6 KMJZHHUKiF816glIcJCzNI3wqDBOe5GxZqHPnlLOkL/N114Ka+g0X+AlFmoXj3bZ JtoStRrcGOTsZsGDKwdx =gp0U -----END PGP SIGNATURE----- --g7w8+K/95kPelPD2--