From: Dan Williams
Date: Fri, 8 Sep 2017 13:43:05 -0700
Subject: Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3
To: Bob Liu
Cc: Jerome Glisse, Ross Zwisler, Andrew Morton, "linux-kernel@vger.kernel.org", Linux MM, John Hubbard, David Nellans, Balbir Singh, majiuyue, "xieyisheng (A)"
In-Reply-To: <863afc77-ed84-fed5-ebb8-d88e636816a3@huawei.com>

On Thu, Sep 7, 2017 at 6:59 PM, Bob Liu wrote:
> On 2017/9/8 1:27, Jerome Glisse wrote:
[..]
>> No this are 2 orthogonal thing, they do not conflict with each others quite
>> the contrary. HMM (the CDM part is no different) is a set of helpers, see
>> it as a toolbox, for device driver.
>>
>> HMAT is a way for firmware to report memory resources with more informations
>> that just range of physical address. HMAT is specific to platform that rely
>> on ACPI. HMAT does not provide any helpers to manage these memory.
>>
>> So a device driver can get informations about device memory from HMAT and then
>> use HMM to help in managing and using this memory.
>>
>
> Yes, but as Balbir mentioned requires :
> 1. Don't online the memory as a NUMA node
> 2. Use the HMM-CDM API's to map the memory to ZONE DEVICE via the driver
>
> And I'm not sure whether Intel going to use this HMM-CDM based method for their "target domain" memory ?
> Or they prefer to NUMA approach? Ross? Dan?

The starting / strawman proposal for performance differentiated memory
ranges is to get platform firmware to mark them reserved by default.
Then, after we parse the HMAT, make them available via the device-dax
mechanism so that applications that need 100% guaranteed access to
these potentially high-value / limited-capacity ranges can be sure to
get them by default, i.e. before any random kernel objects are placed
in them.

Otherwise, if there are no dedicated users for the memory ranges via
device-dax, or they don't need the total capacity, we want to hotplug
that memory into the general purpose memory allocator with a numa node
number so typical numactl and memory-management flows are enabled.
Ideally this would not be specific to HMAT and any agent that knows
differentiated performance characteristics of a memory range could use
this scheme.
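
The flow proposed above can be modelled roughly as follows. This is only
an illustrative sketch in Python, not kernel code: MemRange, plan_range
and every field name are hypothetical, and splitting a range at byte
granularity between device-dax and hotplug is an assumption made for
the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemRange:
    """Hypothetical model of a firmware-reserved, performance-differentiated range."""
    name: str
    size: int         # total capacity in bytes
    dax_claimed: int  # bytes requested by dedicated device-dax users
    node: int         # NUMA node number to use if memory is hotplugged


@dataclass
class Plan:
    device_dax_bytes: int      # stays out of the page allocator
    hotplug_bytes: int         # handed to the general purpose allocator
    numa_node: Optional[int]   # None when nothing is hotplugged


def plan_range(r: MemRange) -> Plan:
    """Dedicated device-dax users are served first; the rest is hotplugged."""
    dax = min(r.dax_claimed, r.size)
    hotplug = r.size - dax
    return Plan(dax, hotplug, r.node if hotplug > 0 else None)


GIB = 1 << 30

# No dedicated users: the whole range becomes ordinary NUMA memory.
unclaimed = plan_range(MemRange("hbm0", 16 * GIB, 0, node=1))

# Partial claim: 4 GiB stays with device-dax, 12 GiB is hotplugged.
partial = plan_range(MemRange("hbm1", 16 * GIB, 4 * GIB, node=2))

# Fully claimed: nothing reaches the page allocator.
full = plan_range(MemRange("hbm2", 16 * GIB, 16 * GIB, node=3))
```

The point of the ordering is that the range starts out reserved, so
nothing lands in it until the device-dax users have had their pick.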