From: Bob Liu
Date: Tue, 26 Sep 2017 17:56:26 +0800
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
To: Jerome Glisse
Cc: Dan Williams, Bob Liu, "linux-kernel@vger.kernel.org", Linux MM, John Hubbard, David Nellans, Balbir Singh, Michal Hocko, Andrew Morton
In-Reply-To: <20170911233649.GA4892@redhat.com>
References: <20170718153816.GA3135@redhat.com> <20170719022537.GA6911@redhat.com> <20170720150305.GA2767@redhat.com> <20170721014106.GB25991@redhat.com> <20170905193644.GD19397@redhat.com> <20170911233649.GA4892@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 12, 2017 at 7:36 AM, Jerome Glisse wrote:
> On Sun, Sep 10, 2017 at 07:22:58AM +0800, Bob Liu wrote:
>> On Wed, Sep 6, 2017 at 3:36 AM, Jerome Glisse wrote:
>> > On Thu, Jul 20, 2017 at 08:48:20PM -0700, Dan Williams wrote:
>> >> On Thu, Jul 20, 2017 at 6:41 PM, Jerome Glisse wrote:
>> >> > On Fri, Jul 21, 2017 at 09:15:29AM +0800, Bob Liu wrote:
>> >> >> On 2017/7/20 23:03, Jerome Glisse wrote:
>> >> >> > On Wed, Jul 19, 2017 at 05:09:04PM +0800, Bob Liu wrote:
>> >> >> >> On 2017/7/19 10:25, Jerome Glisse wrote:
>> >> >> >>> On Wed, Jul 19, 2017 at 09:46:10AM +0800, Bob Liu wrote:
>> >> >> >>>> On 2017/7/18 23:38, Jerome Glisse wrote:
>> >> >> >>>>> On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
>> >> >> >>>>>> On 2017/7/14 5:15, Jérôme Glisse wrote:
>> >
>> > [...]
>> >
>> >> >> > Second, device drivers are not integrated that closely within mm and the
>> >> >> > scheduler kernel code to allow efficiently plugging in device access
>> >> >> > notification to page (i.e. to update struct page so that the numa worker
>> >> >> > thread can migrate memory based on accurate information).
>> >> >> >
>> >> >> > Third, it can be hard to decide who wins between CPU and device access
>> >> >> > when it comes to updating things like last CPU id.
>> >> >> >
>> >> >> > Fourth, there is no such thing as a device id, i.e. an equivalent of CPU id.
>> >> >> > If we were to add something, the CPU id field in flags of struct page
>> >> >> > would not be big enough, so this can have repercussions on struct page
>> >> >> > size. This is not an easy sell.
>> >> >> >
>> >> >> > There are other issues I can't think of right now. I think for now it
>> >> >>
>> >> >> My opinion is most of the issues are the same no matter whether we use CDM or HMM-CDM.
>> >> >> I just care about a more complete solution, no matter CDM, HMM-CDM or other ways.
>> >> >> HMM or HMM-CDM depends on a device driver, but I haven't seen a public/full driver to
>> >> >> demonstrate that the whole solution works fine.
>> >> >
>> >> > I am working with the NVidia closed-source driver team to make sure that it works
>> >> > well for them. I am also working on the nouveau open source driver for the same NVidia
>> >> > hardware, though it will be of less use as what is missing there is a solid
>> >> > open source userspace to leverage this. Nonetheless open source drivers are in
>> >> > the works.
>> >>
>> >> Can you point to the nouveau patches? I still find these HMM patches
>> >> un-reviewable without an upstream consumer.
>> >
>> > So I pushed a branch with WIP for nouveau to use HMM:
>> >
>> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-nouveau
>> >
>>
>> Nice to see that.
>> Btw, do you have any plan for a CDM-HMM driver? The CPU can write to
>> device memory directly without an extra copy.
>
> Yes, nouveau CDM support on PPC (which is the only CDM platform commercially
> available today) is on the TODO list. Note that the driver changes for CDM
> are minimal (probably less than 100 lines of code). From the driver's point
> of view this is memory and it doesn't matter if it is CDM or not.
>

It seems we still have to migrate/copy memory between system memory and
device memory even in the HMM-CDM solution.
Because device memory is not added to the buddy system, a page fault
on normal malloc() memory always allocates from system memory!
If the device then accesses the same virtual address, the data is copied
to device memory.

Correct me if I misunderstand something.
@Balbir, how do you plan to make zero-copy work if using HMM-CDM?

--
Thanks,
Bob