Date: Fri, 3 Aug 2018 10:55:26 -0400
From: Jerome Glisse <jglisse@redhat.com>
To: Alan Cox
Cc: "Tian, Kevin", Kenneth Lee, Hao Fang, Herbert Xu, kvm@vger.kernel.org,
	Jonathan Corbet, Greg Kroah-Hartman, linux-doc@vger.kernel.org,
	"Kumar, Sanjay K", iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linuxarm@huawei.com, Alex Williamson,
	Thomas Gleixner, linux-crypto@vger.kernel.org, Philippe Ombredanne,
	Zaibo Xu, "David S. Miller", Ross Zwisler
Subject: Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive
Message-ID: <20180803145526.GB4079@redhat.com>
References: <20180801102221.5308-1-nek.in.cn@gmail.com> <20180801165644.GA3820@redhat.com> <20180802111000.4649d9ed@alans-desktop> <20180802144627.GB3481@redhat.com> <20180803152043.40f88947@alans-desktop>
In-Reply-To: <20180803152043.40f88947@alans-desktop>

On Fri, Aug 03, 2018 at 03:20:43PM +0100, Alan Cox wrote:
> > If we are going to have any kind of general purpose accelerator API then
> > > it has to be able to implement things like
> >
> > Why is the existing driver model not good enough? So you want
> > a device with function X, you look into /dev/X (for instance
> > for a GPU you look in /dev/dri).
>
> Except when my GPU is in an FPGA, in which case it might be somewhere
> else, or it's a general purpose accelerator that happens to be usable
> as a GPU. Unusual today in big computer space, but you'll find it in
> microcontrollers.

You do need a specific userspace driver for each of those devices,
correct? You definitely do for a GPU, and I do not see that going away
any time soon. I doubt Xilinx and Altera will ever compile down to the
same bitstream format either.
So the userspace application is bound to use some kind of library that
implements the userspace side of the driver, and that library can
easily provide helpers to enumerate all the devices it supports. For
instance, that is what OpenCL allows for both GPU and FPGA: one single
API and multiple different kinds of hardware you can target from it.

> > Each of those devices needs a userspace driver, and thus this
> > userspace driver can easily know where to look. I do not
> > expect that every application will reimplement those drivers
> > but instead use some kind of library that provides a high
> > level API for each of those devices.
>
> Think about it from the user level. You have a pipeline of things you
> wish to execute, you need to get the right accelerator combinations,
> and they need to fit together to meet system constraints like the
> number of IOMMU ids the accelerator supports, and where they are
> connected.

Creating a pipe of devices, i.e. one device consuming the work of the
previous one, is a problem of its own, and it should be solved
separately, not inside VFIO. GPUs (on ARM) already have this pipe
thing, because the IP block that does overlay, the IP block that pushes
pixels to the screen, and the IP block that does 3D rendering all come
from different companies. I do not see the value of having all the
devices enumerated through VFIO to address this problem. I can
definitely understand having a specific kernel mechanism to expose to
userspace what is doable, but I believe this should be its own thing
that allows any device (a VFIO one, a network one, a GPU, an FPGA, ...)
to be used in "pipe" mode.

> > Now you have a hierarchy of memory for the CPU (HBM, local
> > node main memory aka your DDR DIMMs, persistent memory), each
>
> It's not a hierarchy, it's a graph. There's no fundamental reason two
> accelerators can't be close to two different CPU cores but have shared
> HBM that is far from each processor. There are physical reasons it
> tends to look more like a hierarchy today.
Yes, you are right, I used the wrong word.

> > > Anyway I think finding devices and finding relations between
> > > devices and memory are 2 separate problems and as such should
> > > be handled separately.
>
> At a certain level they are deeply intertwined because you need a
> common API. It's not good if I want a particular accelerator and need
> to then see which API it's under on this machine and which interface
> I have to use, and maybe have a mix of FPGA, WarpDrive and Google
> ASIC interfaces, all different.
>
> The job of the kernel is to impose some kind of sanity and unity on
> this lot.
>
> All of it in the end comes down to
>
> 'Somehow glue some chunk of memory into my address space and find any
> supporting driver I need'
>
> plus virtualization of the above.
>
> That bit's easy - but making it usable is a different story.

My point is that for all those devices you will have userspace
drivers, and thus all the complexity of finding the best memory for a
given combination of hardware can be pushed to userspace. The kernel
only has to expose the topology of the different memories and their
relation with each of the devices, very much like the NUMA nodes it
already exposes for CPUs.

I would rather see the kernel having one API to expose topology and
one API to expose device mixing capabilities (i.e. how you can pipe
devices together, if at all) than have to update every single existing
upstream driver that wants to participate in the above to become a
VFIO driver.

I have nothing against VFIO ;) It just seems to me that this is all
reinventing a new device driver infrastructure underneath it, while
the existing one is good enough and can evolve to support all the
cases discussed here.

Cheers,
Jérôme