From: Ronan KERYELL
To: Dave Airlie
Cc: Sonal Santan, Daniel Vetter, "dri-devel@lists.freedesktop.org",
 "gregkh@linuxfoundation.org", Cyril Chemparathy,
 "linux-kernel@vger.kernel.org", Lizhi Hou, Michal Simek,
 "airlied@redhat.com", linux-fpga@vger.kernel.org, Ralph Wittig,
 Ronan Keryell
Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
References: <20190319215401.6562-1-sonal.santan@xilinx.com>
 <20190325202810.GG2665@phenom.ffwll.local>
 <20190327141137.GK2665@phenom.ffwll.local>
Date: Fri, 29 Mar 2019 18:09:18 -0700
In-Reply-To: (Dave Airlie's message of "Fri, 29 Mar 2019 14:56:17 +1000")
Message-ID: <871s2pw4ld.fsf@fisel.enstb.org>
Sender:
 linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

I am adding linux-fpga@vger.kernel.org, since this is why I missed this
thread in the first place...

>>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie said:

Hi Dave!

    Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan wrote:

    >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]

[...]

    >>> Note: There's no expectation for the fully optimizing compiler,
    >>> and we're totally ok if there's an optimizing proprietary
    >>> compiler and a basic open one (amd, and bunch of other
    >>> companies all have such dual stacks running on top of drm
    >>> kernel drivers). But a basic compiler that can convert basic
    >>> kernels into machine code is expected.

    >> Although the compiler is not open source the compilation flow
    >> lets users examine output from various stages. For example if you
    >> write your kernel in OpenCL/C/C++ you can view the RTL
    >> (Verilog/VHDL) output produced by first stage of compilation.
    >> Note that the compiler is really generating a custom circuit
    >> given a high level input which in the last phase gets synthesized
    >> into bitstream. Expert hardware designers can handcraft a circuit
    >> in RTL and feed it to the compiler. Our FPGA tools let you view
    >> the generated hardware design, the register map, etc. You can get
    >> more information about a compiled design by running XRT tool like
    >> xclbinutil on the generated file.

    >> In essence compiling for FPGAs is quite different than compiling
    >> for GPU/CPU/DSP. Interestingly FPGA compilers can run anywhere
    >> from 30 mins to a few hours to compile a testcase.

    Dave> So is there any open source userspace generator for what this
    Dave> interface provides? Is the bitstream format that gets fed into
    Dave> the FPGA proprietary and is it signed?

Short answer:

- a bitstream is opaque content, similar to the various firmware blobs
  handled by Linux: EFI capsules, x86 microcode, WiFi modems, etc.;
- there is no open-source generator for what the interface consumes;
- I do not know if it is signed;
- it is probably similar to what the Intel FPGA (not GPU) drivers
  already provide inside the Linux kernel, and I guess there is no pure
  open-source way to generate their bitstream either.

Long answer:

- processors, GPUs and other digital circuits are designed from a lot of
  elementary transistors, wires, capacitors, resistors... using some
  very complex (and expensive) tools from some EDA companies but at the
  end, after months of work, they often come with a "simple" public
  interface, the... instruction set! So it is rather "easy" at the end
  to generate some instructions with a compiler such as LLVM from a
  description of this ISA or some reverse engineering. Note that even if
  the ISA is public, it is very difficult to make another efficient
  processor from scratch just from this ISA, so there is often no
  concern about making this ISA public to develop the ecosystem;

- FPGAs are field-programmable gate arrays, also made from a lot of
  elementary transistors, wires, capacitors, resistors... but organized
  in billions of very low-level elementary gates, memory elements, DSP
  blocks, I/O blocks, clock generators, specific accelerators...
  directly exposed to the user, which can be programmed through a
  configuration memory (the bitstream) that details how to connect each
  part and each routing element and how to configure each elementary
  piece of hardware. So instead of just writing instructions like on a
  CPU or a GPU, you need to configure each bit of the architecture in
  such a way that it does something interesting for you. Concretely, you
  write some programs in RTL languages (Verilog, VHDL) or higher-level
  ones (C/C++, OpenCL, SYCL...) and you use some very complex (and
  expensive) tools from some EDA companies to generate the bitstream
  implementing an equivalent circuit with the same semantics.
  Since the architecture is so low level, there is a direct mapping
  between the configuration memory (bitstream) and the hardware
  architecture itself, so if the bitstream format is public then it is
  easy to duplicate the FPGA itself and to start a new FPGA company.
  That is unfortunately something the existing FPGA companies do not
  want... ;-)

To summarize:

- on a CPU or a GPU, the vendor has already used the expensive EDA tools
  once for you and provides the simpler ISA interface;

- on an FPGA, you have access to a pile of low-level hardware and it is
  up to you to go through the lengthy process of building your own
  computing architecture, using the heavy, expensive, very subtle EDA
  tools that will run for hours or days to generate some good-enough
  placement for your pleasure.

There is some public documentation on-line:
https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html#documentation

To have an idea of the elementary architecture:
https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
https://www.xilinx.com/support/documentation/user_guides/ug579-ultrascale-dsp.pdf
https://www.xilinx.com/support/documentation/user_guides/ug573-ultrascale-memory-resources.pdf

There is even some documentation on the configuration and the file
format, but without any detailed semantics:
https://www.xilinx.com/support/documentation/user_guides/ug570-ultrascale-configuration.pdf

The Xilinx compiler xocc, which takes for example some LLVM IR and
generates a bitstream, is not open-source and will probably never be,
for the reasons above...
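To make the ISA-versus-bitstream contrast above a bit more concrete, here
is a toy sketch in Python of a single lookup table (LUT), the kind of
elementary element described in the CLB guide. It only illustrates the
idea that the same hardware computes a different function depending on
the configuration bits loaded into it. The names Lut, synthesize and
config_bits are hypothetical and made up for this illustration; this is
not the Xilinx bitstream format, which is not publicly documented at
this level.

```python
# Toy model only: real FPGA bitstreams are proprietary and far more complex.
# Lut, synthesize and config_bits are hypothetical names for illustration.
from itertools import product


class Lut:
    """A k-input lookup table whose function is set by 2**k config bits."""

    def __init__(self, num_inputs, config_bits):
        if len(config_bits) != 2 ** num_inputs:
            raise ValueError("need exactly 2**num_inputs configuration bits")
        self.num_inputs = num_inputs
        self.config_bits = list(config_bits)

    def evaluate(self, *inputs):
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        # The input values form an address into the configuration memory.
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.config_bits[address]


def synthesize(function, num_inputs):
    """Stand-in for the EDA flow: turn a function into configuration bits."""
    return [function(*bits) & 1 for bits in product((0, 1), repeat=num_inputs)]


# The same "hardware" becomes an AND gate or an XOR gate depending only on
# the configuration bits loaded into it.
and_gate = Lut(2, synthesize(lambda a, b: a & b, 2))
xor_gate = Lut(2, synthesize(lambda a, b: a ^ b, 2))

print(and_gate.config_bits)     # [0, 0, 0, 1]
print(xor_gate.config_bits)     # [0, 1, 1, 0]
print(xor_gate.evaluate(1, 0))  # 1
```

A real device has a huge number of such elements plus the routing between
them, all described by the bitstream, which is why the real tools need
hours for placement and routing while this toy "synthesis" is a one-line
truth-table enumeration.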
:-(

Xilinx is open-sourcing everything that can reasonably be open-sourced:

- the user-level and system run-time, including the OpenCL runtime:
  https://github.com/Xilinx/XRT
  to handle the bitstreams generated by some closed-source tools;

- the kernel device drivers, which are already in
  https://github.com/Xilinx/XRT
  but which we want to upstream into the Linux kernel to make life
  easier (this is the subject of this e-mail thread);

- to generate some real code in the most (modern and) open-source way,
  there is an open-source framework to compile some SYCL C++, including
  some Xilinx FPGA-specific extensions, down to SPIR LLVM IR using
  Clang/LLVM and to feed the closed-source xocc tool with it:
  https://github.com/triSYCL/triSYCL

  Starting from
  https://github.com/triSYCL/triSYCL/blob/master/tests/Makefile#L322
  you can see how to start from C++ code, generate some SPIR LLVM IR,
  feed it to xocc and build a fat binary that will use the XRT runtime.

  There is some documentation in
  https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst

  There are other, more official ways to generate a bitstream (they are
  called products instead of research projects like triSYCL :-) ).

  We are also working on another open-source SYCL compiler with Intel to
  have a better common implementation
  https://github.com/intel/llvm/wiki
  and to upstream this into Clang/LLVM.

So for Xilinx FPGA, you can see the LLVM IR as the equivalent of PTX for
nVidia. But xocc is closed-source for some more fundamental reasons: it
would expose all the details of the FPGA. I guess this is exactly the
same for Xilinx FPGA.

Note that most of the tool chains used to generate the low-level
firmware for the various CPUs (microcode), GPUs, etc. are probably also
closed-source.

See you,
-- 
Ronan KERYELL, Xilinx Research Labs / San José, California.