From: "Wuzongyong (Cordius Wu, Euler Dept)"
To: Jerome Glisse
CC: "iommu@lists.linux-foundation.org", "linux-kernel@vger.kernel.org", "oded.gabbay@amd.com", "Wanzongshun (Vincent)", "Lifei (Louis)"
Subject: Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
Date: Tue, 13 Jun 2017 12:36:16 +0000
Message-ID: <9BD73EA91F8E404F851CF3F519B14AA8CE7BDD@SZXEMI503-MBS.china.huawei.com>
References: <9BD73EA91F8E404F851CF3F519B14AA8CE753F@SZXEMI503-MBS.china.huawei.com> <20170612184413.GA5924@gmail.com>
In-Reply-To: <20170612184413.GA5924@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

That's the thing I wanna know! Thanks for your explanation.

Thanks,
Zongyong Wu

-----Original Message-----
From: Jerome Glisse [mailto:j.glisse@gmail.com]
Sent: June 13, 2017 2:44
To: Wuzongyong (Cordius Wu, Euler Dept)
Cc: iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org; oded.gabbay@amd.com; Wanzongshun (Vincent)
Subject: Re: What differences and relations between SVM, HSA, HMM and Unified Memory?

On Sat, Jun 10, 2017 at 04:06:28AM +0000, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> Hi,
>
> Could someone explain the differences and relations between SVM
> (Shared Virtual Memory, by Intel), HSA (Heterogeneous System
> Architecture, by AMD), HMM (Heterogeneous Memory Management, by Glisse)
> and UM (Unified Memory, by NVIDIA)? Are they substitutes for one
> another?
>
> As I understand it, these aim to solve the same thing: sharing
> pointers between CPU and GPU (implemented with ATS/PASID/PRI/IOMMU
> support). So far, SVM and HSA can only be used by integrated GPUs.
> And Intel states that the root ports do not have the required
> TLP prefix support, so SVM can't be used by discrete devices.
> So could someone tell me what the required TLP prefix means
> specifically?
>
> With HMM, we can use an allocator like malloc to manage host and device
> memory. Does this mean that there is no need to use SVM and HSA with
> HMM, or is HMM the basis for SVM and HSA to implement the fine-grained
> system SVM defined in the OpenCL spec?

So the aim of all these technologies is to share an address space between a
device and the CPU. There are 3 ways to do it:

A) All in hardware, like CAPI or CCIX, where device memory is cache
   coherent from the CPU's point of view and system memory is also
   accessible by the device in a cache-coherent way with the CPU. So cache
   coherency goes both ways: from the CPU to device memory and from the
   device to system memory.

B) Partially in hardware, with ATS/PASID (the same technology behind
   both HSA and SVM).
   Here it is only a one-way solution: you have cache-coherent access
   from the device to system memory but not the other way around.
   Moreover, you share the CPU page table with the device, so you do not
   need to program the IOMMU. Here you cannot use the device memory
   transparently, at least not without software help like HMM.

C) All in software. Here the device can access system memory with cache
   coherency, but it does not share the CPU page table. Each device has
   its own page table, and thus you need to synchronize them.

HMM provides helpers that address all 3 solutions:

A) For the all-hardware solution, HMM provides new helpers to help with
   migration of process memory to device memory.

B) For the partial-hardware solution, you can mix in HMM to again provide
   helpers for migration to device memory. This assumes your device can
   mix and match its local device page table with an ATS/PASID region.

C) For the full software solution, you use all the features of HMM:
   everything is done in software and HMM just does the heavy lifting on
   behalf of the device driver.

In all of the above we are talking about fine-grained system SVM as in the
OpenCL specification. So you can malloc() memory and use it directly from
the GPU.

Hope this clarifies things.

Cheers,
Jérôme
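
To make case C a bit more concrete, here is a minimal userspace sketch of the "each device has its own page table, so you need to synchronize them" idea. It is only a toy model under stated assumptions: every name in it (toy_page_table, toy_cpu_map, toy_cpu_unmap, toy_device_access, toy_demo) is hypothetical and made up for illustration, none of it is the HMM API or kernel code, and real page tables, faults and invalidations are far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8   /* hypothetical size of the toy address space, in pages */

/* Toy page table: virtual page number -> frame number, plus a valid bit. */
struct toy_page_table {
    int  frame[TOY_NPAGES];
    bool valid[TOY_NPAGES];
};

static struct toy_page_table cpu_pt;  /* authoritative CPU page table */
static struct toy_page_table dev_pt;  /* device-private mirror of cpu_pt */

/* CPU maps (or remaps) a page. Any stale mirror entry is invalidated,
 * so the device can never keep using an old translation. */
static void toy_cpu_map(size_t vpn, int frame)
{
    if (vpn >= TOY_NPAGES)
        return;
    dev_pt.valid[vpn] = false;        /* invalidate the mirror first */
    cpu_pt.frame[vpn] = frame;
    cpu_pt.valid[vpn] = true;
}

/* CPU unmaps a page: invalidate the mirror, then the CPU entry. */
static void toy_cpu_unmap(size_t vpn)
{
    if (vpn >= TOY_NPAGES)
        return;
    dev_pt.valid[vpn] = false;
    cpu_pt.valid[vpn] = false;
}

/* Device access: on a mirror miss, "fault" and copy the entry from the
 * CPU page table. Returns the frame number, or -1 if nothing is mapped. */
static int toy_device_access(size_t vpn)
{
    if (vpn >= TOY_NPAGES)
        return -1;
    if (!dev_pt.valid[vpn]) {
        if (!cpu_pt.valid[vpn])
            return -1;                /* not mapped on the CPU side either */
        dev_pt.frame[vpn] = cpu_pt.frame[vpn];
        dev_pt.valid[vpn] = true;
    }
    return dev_pt.frame[vpn];
}

/* Small walk-through: map, access, remap, access again, unmap. */
int toy_demo(void)
{
    toy_cpu_map(2, 7);
    assert(toy_device_access(2) == 7);   /* mirror miss, filled from cpu_pt */
    toy_cpu_map(2, 9);                   /* CPU moved the page elsewhere */
    assert(toy_device_access(2) == 9);   /* stale entry was invalidated */
    toy_cpu_unmap(2);
    assert(toy_device_access(2) == -1);  /* device sees the unmap too */
    return 0;
}
```

Calling toy_demo() from a main() walks through the three steps. The point of the sketch is the ordering: the mirror entry is dropped before the CPU entry changes, and the device refills it only on its next access.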