Date: Tue, 4 Sep 2018 11:00:19 -0400
From: Jerome Glisse
To: Kenneth Lee
Cc: Jonathan Corbet, Herbert Xu, "David S. Miller", Joerg Roedel,
    Alex Williamson, Kenneth Lee, Hao Fang, Zhou Wang, Zaibo Xu,
    Philippe Ombredanne, Greg Kroah-Hartman, Thomas Gleixner,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-crypto@vger.kernel.org, iommu@lists.linux-foundation.org,
    kvm@vger.kernel.org, linux-accelerators@lists.ozlabs.org,
    Lu Baolu, Sanjay Kumar, linuxarm@huawei.com
Subject: Re: [RFCv2 PATCH 0/7] A General Accelerator Framework, WarpDrive
Message-ID: <20180904150019.GA4024@redhat.com>
In-Reply-To: <20180903005204.26041-1-nek.in.cn@gmail.com>

On Mon, Sep 03, 2018 at 08:51:57AM +0800, Kenneth Lee wrote:
> From: Kenneth Lee
>
> WarpDrive is an accelerator framework to expose the hardware capabilities
> directly to the user space. It makes use of the exist vfio and vfio-mdev
> facilities. So the user application can send request and DMA to the
> hardware without interaction with the kernel. This removes the latency
> of syscall.
>
> WarpDrive is the name for the whole framework. The component in kernel
> is called SDMDEV, Share Domain Mediated Device. Driver driver exposes its
> hardware resource by registering to SDMDEV as a VFIO-Mdev. So the user
> library of WarpDrive can access it via VFIO interface.
>
> The patchset contains document for the detail. Please refer to it for more
> information.
>
> This patchset is intended to be used with Jean Philippe Brucker's SVA
> patch [1], which enables not only IO side page fault, but also PASID
> support to IOMMU and VFIO.
>
> With these features, WarpDrive can support non-pinned memory and
> multi-process in the same accelerator device. We tested it in our SoC
> integrated Accelerator (board ID: D06, Chip ID: HIP08). A reference work
> tree can be found here: [2].
>
> But it is not mandatory. This patchset is tested in the latest mainline
> kernel without the SVA patches. So it supports only one process for each
> accelerator.
>
> We have noticed the IOMMU aware mdev RFC announced recently [3].
>
> The IOMMU aware mdev has similar idea but different intention comparing to
> WarpDrive. It intends to dedicate part of the hardware resource to a VM.
> And the design is supposed to be used with Scalable I/O Virtualization.
> While sdmdev is intended to share the hardware resource with a big amount
> of processes. It just requires the hardware supporting address
> translation per process (PCIE's PASID or ARM SMMU's substream ID).
>
> But we don't see serious confliction on both design. We believe they can be
> normalized as one.
>

So once again I do not understand why you are trying to do things this
way. The kernel already has tons of examples of everything you want to
do without a new framework. Moreover, I believe you are confused by
VFIO. To me VFIO is for VMs, not for creating a general device driver
framework.

So here is your use case as I understand it. You have a device with a
limited number of command queues (can be just one), and in some cases
it can support SVA/SVM (when the hardware supports it and it is not
disabled). The final requirement is being able to schedule cmds from
userspace without ioctl. All of this already exists upstream in a few
device drivers.

So here is how everybody else is doing it.
Please explain why this does not work.

  1 Userspace opens the device file. The kernel device driver creates
    a context and associates it with that open. This context can be
    unique to the process and can bind hardware resources (like a
    command queue) to the process.
  2 Userspace binds/acquires a command queue and initializes it with
    an ioctl on the device file. Through that ioctl userspace can be
    informed whether SVA/SVM works for the device. If SVA/SVM works,
    the kernel device driver binds the process to the device as part
    of this ioctl.
  3 If SVM/SVA does not work, userspace does an ioctl to create a dma
    buffer or something that does exactly the same thing.
  4 Userspace mmaps the command queue (mmap of the device file, using
    the information gathered at step 2).
  5 Userspace can write commands into the queue it mapped.
  6 When userspace closes the device file, all resources are released,
    just like in any existing device driver.

Now if you want to create a device driver framework that exposes a
device file with a generic API for all of the above steps, fine. But it
does not need to be part of VFIO whatsoever, or else explain why it
does.

Note that if the IOMMU is fully disabled you probably want to block
userspace from directly scheduling commands onto the hardware, as that
would allow userspace to DMA anywhere and thus open the kernel to easy
exploits. In this case you can still keep the same API as above and use
page fault tricks to validate commands written by userspace into a fake
command ring. This will be as slow as, or maybe even slower than, ioctl,
but at least it allows you to validate commands.

Cheers,
Jérôme
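
For illustration, the userspace side of steps 1 and 4 to 6 above can be sketched as follows. This is only a rough sketch under loud assumptions, not code for any real driver: a temporary file stands in for the device file, the ring layout (a 4-byte tail index followed by 16-byte command slots), the opcode values, and the names `submit` and `run_demo` are all made up for the example, and the ioctls of steps 2 and 3 are left out because their request codes would be driver-specific. A real device would also need a doorbell or memory-ordering rule so the hardware notices new commands.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical ring layout (an assumption, not taken from any driver):
# a 4-byte little-endian tail index, then fixed-size command slots.
HEADER = struct.Struct("<I")
SLOT = struct.Struct("<IIQ")  # opcode, length, buffer address
NUM_SLOTS = 8
RING_BYTES = mmap.PAGESIZE


def submit(ring, opcode, length, addr):
    """Step 5: write one command into the mapped queue, advance the tail."""
    (tail,) = HEADER.unpack_from(ring, 0)
    offset = HEADER.size + (tail % NUM_SLOTS) * SLOT.size
    SLOT.pack_into(ring, offset, opcode, length, addr)
    HEADER.pack_into(ring, 0, tail + 1)
    return tail


def run_demo():
    # Step 1: open the "device file". A temporary file stands in for it.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, RING_BYTES)
        # Steps 2-3 would be ioctl() calls on fd (omitted: driver-specific).
        # Step 4: mmap the command queue through the file descriptor.
        ring = mmap.mmap(fd, RING_BYTES)
        try:
            first = submit(ring, 1, 4096, 0x1000)
            second = submit(ring, 2, 512, 0x2000)
            (tail,) = HEADER.unpack_from(ring, 0)
            slot0 = SLOT.unpack_from(ring, HEADER.size)
        finally:
            ring.close()
    finally:
        # Step 6: closing the file is what lets the driver release resources.
        os.close(fd)
        os.unlink(path)
    return first, second, tail, slot0
```

Calling `run_demo()` submits two commands without any syscall between them; only the open, the mmap, and the close cross into the kernel, which is the property the steps above are after.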