Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
To: Jerome Glisse, Bob Liu
References: <20170719022537.GA6911@redhat.com> <20170720150305.GA2767@redhat.com> <20170721014106.GB25991@redhat.com> <20170905193644.GD19397@redhat.com> <20170911233649.GA4892@redhat.com> <20170926161635.GA3216@redhat.com>
CC: Dan Williams, "linux-kernel@vger.kernel.org", Linux MM, John Hubbard, David Nellans, Balbir Singh, Michal Hocko, Andrew Morton
From: Bob Liu
Message-ID: <0d7273c3-181c-6d68-3c5f-fa518e782374@huawei.com>
Date: Sat, 30 Sep 2017 10:57:38 +0800
In-Reply-To: <20170926161635.GA3216@redhat.com>

On 2017/9/27 0:16, Jerome Glisse wrote:
> On Tue, Sep 26, 2017 at 05:56:26PM +0800, Bob Liu wrote:
>> On Tue, Sep 12, 2017 at 7:36 AM, Jerome Glisse wrote:
>>> On Sun, Sep 10, 2017 at 07:22:58AM +0800, Bob Liu wrote:
>>>> On Wed, Sep 6, 2017 at 3:36 AM, Jerome Glisse wrote:
>>>>> On Thu, Jul 20, 2017 at 08:48:20PM -0700, Dan Williams wrote:
>>>>>> On Thu, Jul 20, 2017 at 6:41 PM, Jerome Glisse wrote:
[...]
>>>>> So i pushed a branch with WIP for nouveau to use HMM:
>>>>>
>>>>> https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-nouveau
>>>>>
>>>>
>>>> Nice to see that.
>>>> Btw, do you have any plan for a CDM-HMM driver? CPU can write to
>>>> Device memory directly without extra copy.
>>>
>>> Yes nouveau CDM support on PPC (which is the only CDM platform commercialy
>>> available today) is on the TODO list. Note that the driver changes for CDM
>>> are minimal (probably less than 100 lines of code). From the driver point
>>> of view this is memory and it doesn't matter if it is CDM or not.
>>>
>>
>> It seems have to migrate/copy memory between system-memory and
>> device-memory even in HMM-CDM solution.
>> Because device-memory is not added into buddy system, the page fault
>> for normal malloc() always allocate memory from system-memory!!
>> If the device then access the same virtual address, the data is copied
>> to device-memory.
>>
>> Correct me if I misunderstand something.
>> @Balbir, how do you plan to make zero-copy work if using HMM-CDM?
>
> Device can access system memory so copy to device is _not_ mandatory. Copying
> data to device is for performance only ie the device driver take hint from
> userspace and monitor device activity to decide which memory should be migrated
> to device memory to maximize performance.
>
> Moreover in some previous version of the HMM patchset we had an helper that

Could you point me to the version that had it? I'd like to have a look.

> allowed to directly allocate device memory on device page fault. I intend to
> post this helper again. With that helper you can have zero copy when device
> is the first to access the memory.
>
> Plan is to get what we have today work properly with the open source driver
> and make it perform well. Once we get some experience with real workload we
> might look into allowing CPU page fault to be directed to device memory but
> at this time i don't think we need this.
>

For us, this feature is needed: CPU page faults should be able to be directed
to device memory, so that we don't need to copy data from system memory to
device memory.
Do you have any suggestions on the implementation?
I'll try to make a prototype patch.

--
Thanks,
Bob