Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
To: Jérôme Glisse
CC: John Hubbard, David Nellans, Dan Williams, Balbir Singh, Michal Hocko
From: Bob Liu
Message-ID: <2d534afc-28c5-4c81-c452-7e4c013ab4d0@huawei.com>
In-Reply-To: <20170713211532.970-1-jglisse@redhat.com>
Date: Tue, 18 Jul 2017 11:26:51 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/7/14 5:15, Jérôme Glisse wrote:
> Sorry i made horrible mistake on names in v4, i completly miss-
> understood the suggestion. So here i repost with proper naming.
> This is the only change since v3. Again sorry about the noise
> with v4.
>
> Changes since v4:
> - s/DEVICE_HOST/DEVICE_PUBLIC
>
> Git tree:
> https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-cdm-v5
>
>
> Cache coherent device memory apply to architecture with system bus
> like CAPI or CCIX. Device connected to such system bus can expose
> their memory to the system and allow cache coherent access to it
> from the CPU.
>
> Even if for all intent and purposes device memory behave like regular
> memory, we still want to manage it in isolation from regular memory.
> Several reasons for that, first and foremost this memory is less
> reliable than regular memory if the device hangs because of invalid
> commands we can loose access to device memory. Second CPU access to
> this memory is expected to be slower than to regular memory. Third
> having random memory into device means that some of the bus bandwith
> wouldn't be available to the device but would be use by CPU access.
>
> This is why we want to manage such memory in isolation from regular
> memory. Kernel should not try to use this memory even as last resort
> when running out of memory, at least for now.
>

I think setting a very large node distance for "Cache Coherent Device
Memory" may be an easier way to address these concerns.

--
Regards,
Bob Liu

> This patchset add a new type of ZONE_DEVICE memory (DEVICE_HOST)
> that is use to represent CDM memory. This patchset build on top of
> the HMM patchset that already introduce a new type of ZONE_DEVICE
> memory for private device memory (see HMM patchset).
>
> The end result is that with this patchset if a device is in use in
> a process you might have private anonymous memory or file back
> page memory using ZONE_DEVICE (DEVICE_HOST). Thus care must be
> taken to not overwritte lru fields of such pages.
>
> Hence all core mm changes are done to address assumption that any
> process memory is back by a regular struct page that is part of
> the lru. ZONE_DEVICE page are not on the lru and the lru pointer
> of struct page are use to store device specific informations.
>
> Thus this patchset update all code path that would make assumptions
> about lruness of a process page.
>
> patch 01 - rename DEVICE_PUBLIC to DEVICE_HOST to free DEVICE_PUBLIC name
> patch 02 - add DEVICE_PUBLIC type to ZONE_DEVICE (all core mm changes)
> patch 03 - add an helper to HMM for hotplug of CDM memory
> patch 04 - preparatory patch for memory controller changes (memch)
> patch 05 - update memory controller to properly handle
>            ZONE_DEVICE pages when uncharging
> patch 06 - documentation patch
>
> Previous posting:
> v1 https://lkml.org/lkml/2017/4/7/638
> v2 https://lwn.net/Articles/725412/
> v3 https://lwn.net/Articles/727114/
> v4 https://lwn.net/Articles/727692/
>
> Jérôme Glisse (6):
>   mm/zone-device: rename DEVICE_PUBLIC to DEVICE_HOST
>   mm/device-public-memory: device memory cache coherent with CPU v4
>   mm/hmm: add new helper to hotplug CDM memory region v3
>   mm/memcontrol: allow to uncharge page without using page->lru field
>   mm/memcontrol: support MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_PUBLIC
>     v3
>   mm/hmm: documents how device memory is accounted in rss and memcg
>
>  Documentation/vm/hmm.txt |  40 ++++++++
>  fs/proc/task_mmu.c       |   2 +-
>  include/linux/hmm.h      |   7 +-
>  include/linux/ioport.h   |   1 +
>  include/linux/memremap.h |  25 ++++-
>  include/linux/mm.h       |  20 ++--
>  kernel/memremap.c        |  19 ++--
>  mm/Kconfig               |  11 +++
>  mm/gup.c                 |   7 ++
>  mm/hmm.c                 |  89 ++++++++++++++++--
>  mm/madvise.c             |   2 +-
>  mm/memcontrol.c          | 231 ++++++++++++++++++++++++++++++-----------------
>  mm/memory.c              |  46 +++++++++-
>  mm/migrate.c             |  57 +++++++-----
>  mm/swap.c                |  11 +++
>  15 files changed, 434 insertions(+), 134 deletions(-)
>
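
To make the node-distance suggestion in the reply concrete, here is a
minimal user-space sketch, not kernel code. Everything in it is an
assumption made for illustration: the three-node layout, the table
values, and the names node_distance_table, CDM_DISTANCE and
build_fallback_order are hypothetical and do not come from the patchset.
It only shows how ordering nodes by distance pushes a far-away CDM node
to the end of an allocation fallback list.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES     3    /* hypothetical: nodes 0 and 1 are DRAM, node 2 is CDM */
#define CDM_DISTANCE 255  /* hypothetical "very large" distance for the CDM node */

/* Hypothetical distance table; 10 is used for "local" in this sketch. */
static const int node_distance_table[NR_NODES][NR_NODES] = {
    { 10,           20,           CDM_DISTANCE },
    { 20,           10,           CDM_DISTANCE },
    { CDM_DISTANCE, CDM_DISTANCE, 10           },
};

/*
 * Fill order[] with every node id, sorted by increasing distance from
 * node 'from'. Uses an insertion sort; on equal distance the lower
 * node id stays first.
 */
static void build_fallback_order(int from, int order[NR_NODES])
{
    assert(from >= 0 && from < NR_NODES);

    for (int i = 0; i < NR_NODES; i++) {
        int node = i;
        int j = i;

        /* Shift farther nodes right until 'node' fits. */
        while (j > 0 &&
               node_distance_table[from][order[j - 1]] >
               node_distance_table[from][node]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = node;
    }
}
```

In this sketch, calling build_fallback_order(0, order) places the CDM
node last. Note that it is still present in the list, so a large
distance alone makes the memory a last resort rather than excluding it,
which is the case the quoted cover letter says it wants to avoid "at
least for now".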