Date: Thu, 7 Sep 2017 13:27:31 -0400 (EDT)
From: Jerome Glisse
To: Bob Liu
Cc: Ross Zwisler, akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, John Hubbard, Dan Williams, David Nellans,
	Balbir Singh, majiuyue, "xieyisheng (A)"
Subject: Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3
Message-ID: <1726639990.10465990.1504805251676.JavaMail.zimbra@redhat.com>
In-Reply-To: <4f4a2196-228d-5d54-5386-72c3ffb1481b@huawei.com>
References: <20170817000548.32038-1-jglisse@redhat.com>
	<20170904155123.GA3161@redhat.com>
	<7026dfda-9fd0-2661-5efc-66063dfdf6bc@huawei.com>
	<20170905023826.GA4836@redhat.com>
	<20170905185414.GB24073@linux.intel.com>
	<0bc5047d-d27c-65b6-acab-921263e715c8@huawei.com>
	<20170906021216.GA23436@redhat.com>
	<4f4a2196-228d-5d54-5386-72c3ffb1481b@huawei.com>

> On 2017/9/6 10:12, Jerome Glisse wrote:
> > On Wed, Sep 06, 2017 at 09:25:36AM +0800, Bob Liu wrote:
> >> On 2017/9/6 2:54, Ross Zwisler wrote:
> >>> On Mon, Sep 04, 2017 at 10:38:27PM -0400, Jerome Glisse wrote:
> >>>> On Tue, Sep 05, 2017 at 09:13:24AM +0800, Bob Liu wrote:
> >>>>> On 2017/9/4 23:51, Jerome Glisse wrote:
> >>>>>> On Mon, Sep 04, 2017 at 11:09:14AM +0800, Bob Liu wrote:
> >>>>>>> On 2017/8/17 8:05, Jérôme Glisse wrote:

[...]

> > For HMM each process give hint (somewhat similar to mbind) for range of
> > virtual address to the device kernel driver (through some API like OpenCL
> > or CUDA for GPU for instance). All this being device driver specific ioctl.
> >
> > The kernel device driver have an overall view of all the process that use
> > the device and each of the memory advise they gave. From that informations
> > the kernel device driver decide what part of each process address space to
> > migrate to device memory.
>
> Oh, I mean CDM-HMM. I'm fine with HMM.

They are one and the same, really. In both cases HMM is just a set of
helpers for device drivers.

> > This obviously dynamic and likely to change over the process lifetime.
> >
> > My understanding is that HMAT want similar API to allow process to give
> > direction on where each range of virtual address should be allocated. It
> > is expected that most
>
> Right, but not clear who should manage the physical memory allocation and
> setup the pagetable mapping. A new driver or the kernel?
Physical device memory is managed by the kernel device driver, as it is
today and as it will be tomorrow. HMM does not change that, nor does it
require any change to that. Migrating process memory to or from the device
is done by the kernel through the regular page migration mechanism; HMM
provides new helpers for the device driver to initiate such a migration.
There is no mechanism like automatic NUMA migration, for the reasons I
explained previously. The kernel device driver uses all the knowledge it
has to decide what to migrate to device memory. Nothing new here either:
this is what happens today for specially allocated device objects, and it
will happen just the same for regular mmap memory (private anonymous
memory or an mmap of a regular file on a filesystem).

So every low-level thing happens in the kernel. Userspace only provides
directives to the kernel device driver through a device-specific API, and
the kernel device driver can ignore or override those directives.

> > software can easily infer what part of its address will need more
> > bandwidth, smaller latency versus what part is sparsely accessed ...
> >
> > For HMAT i think first target is HBM and persistent memory and device
> > memory might be added latter if that make sense.
> >
>
> Okay, so there are two potential ways for CPU-addressable cache-coherent
> device memory (or cpu-less numa memory or "target domain" memory in ACPI
> spec)?
> 1. CDM-HMM
> 2. HMAT

No, these are two orthogonal things; they do not conflict with each other,
quite the contrary. HMM (the CDM part is no different) is a set of helpers,
see it as a toolbox, for device drivers. HMAT is a way for firmware to
report memory resources with more information than just a range of
physical addresses; it is specific to platforms that rely on ACPI. HMAT
does not provide any helpers to manage this memory.

So a device driver can get information about device memory from HMAT and
then use HMM to help manage and use this memory.

Jérôme
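
To make the driver-initiated migration flow described above more concrete,
here is a minimal, illustrative sketch of how a device driver could invoke
the migrate_vma() helper added by this series once it has decided (from
hints received through its device-specific ioctl) which range of a process
to move. The migrate_vma() / struct migrate_vma_ops interface is taken from
the HMM-v25 patches as merged around v4.14; all mydev_*() names, the struct
mydev type, and the window size are hypothetical driver-side details, not
part of HMM, and locking/refcount details are simplified.

	#include <linux/migrate.h>
	#include <linux/mm.h>

	struct mydev;						/* hypothetical driver state */
	struct page *mydev_alloc_device_page(struct mydev *mdev);	/* hypothetical */
	void mydev_copy_to_device(struct mydev *mdev,
				  struct page *spage, struct page *dpage);	/* hypothetical */

	#define MYDEV_MIGRATE_NPAGES	32	/* hypothetical per-call migration window */

	/* Pick a destination device page for every system page that can be migrated. */
	static void mydev_alloc_and_copy(struct vm_area_struct *vma,
					 const unsigned long *src, unsigned long *dst,
					 unsigned long start, unsigned long end,
					 void *private)
	{
		struct mydev *mdev = private;
		unsigned long addr, i;

		for (i = 0, addr = start; addr < end; addr += PAGE_SIZE, i++) {
			struct page *dpage;

			/* Core marks pages it could isolate with MIGRATE_PFN_MIGRATE. */
			if (!(src[i] & MIGRATE_PFN_MIGRATE))
				continue;

			dpage = mydev_alloc_device_page(mdev);
			if (!dpage)
				continue;	/* leaving dst[i] at 0 skips this page */

			mydev_copy_to_device(mdev, migrate_pfn_to_page(src[i]), dpage);

			lock_page(dpage);
			dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
		}
	}

	/* Update device page tables / bookkeeping for the pages that did migrate. */
	static void mydev_finalize_and_map(struct vm_area_struct *vma,
					   const unsigned long *src,
					   const unsigned long *dst,
					   unsigned long start, unsigned long end,
					   void *private)
	{
		/* Device-specific: map the new device pages into the device MMU. */
	}

	static const struct migrate_vma_ops mydev_migrate_ops = {
		.alloc_and_copy		= mydev_alloc_and_copy,
		.finalize_and_map	= mydev_finalize_and_map,
	};

	/*
	 * Called from a device-specific ioctl path, with the mmap_sem held, once
	 * the driver has decided to move [start, end) of this vma to the device.
	 */
	static int mydev_migrate_range(struct mydev *mdev, struct vm_area_struct *vma,
				       unsigned long start, unsigned long end)
	{
		unsigned long src[MYDEV_MIGRATE_NPAGES] = {};
		unsigned long dst[MYDEV_MIGRATE_NPAGES] = {};

		if ((end - start) >> PAGE_SHIFT > MYDEV_MIGRATE_NPAGES)
			return -EINVAL;	/* a real driver would loop over larger ranges */

		return migrate_vma(&mydev_migrate_ops, vma, start, end, src, dst, mdev);
	}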
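
And for the last point, a similarly hedged sketch of how the two pieces
could be combined: the driver learns the physical range of its
cache-coherent device memory from the platform (for instance from
ACPI/HMAT) or from the device itself, and registers it through
hmm_devmem_add_resource(), the helper added by this patch. The
hmm_devmem_ops layout follows the HMM-v25 series; the mydev_*() names and
the probe-time flow are hypothetical.

	#include <linux/hmm.h>
	#include <linux/ioport.h>
	#include <linux/mm.h>

	/* Release a device page back to the driver's allocator when it is freed. */
	static void mydev_devmem_free(struct hmm_devmem *devmem, struct page *page)
	{
		/* Device-specific: mark the page free in the driver's pool. */
	}

	/*
	 * CPU fault on a device page. For cache-coherent (CDM) memory the CPU maps
	 * the pages directly, so this is not expected to trigger; for un-addressable
	 * device memory this is where the driver would migrate the page back to
	 * system memory.
	 */
	static int mydev_devmem_fault(struct hmm_devmem *devmem,
				      struct vm_area_struct *vma,
				      unsigned long addr,
				      const struct page *page,
				      unsigned int flags,
				      pmd_t *pmdp)
	{
		return VM_FAULT_SIGBUS;
	}

	static const struct hmm_devmem_ops mydev_devmem_ops = {
		.free	= mydev_devmem_free,
		.fault	= mydev_devmem_fault,
	};

	/*
	 * Called at probe time, once the driver knows the physical range of its
	 * coherent device memory (from the device, or from firmware tables such as
	 * ACPI/HMAT on platforms that provide them).
	 */
	static struct hmm_devmem *mydev_register_cdm(struct device *dev,
						     struct resource *cdm_res)
	{
		return hmm_devmem_add_resource(&mydev_devmem_ops, dev, cdm_res);
	}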