From: Jérôme Glisse <jglisse@redhat.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: John Hubbard, Naoya Horiguchi, David Nellans, Jérôme Glisse,
    Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
Subject: [HMM 05/16] mm/ZONE_DEVICE/x86: add support for un-addressable device memory
Date: Thu, 16 Mar 2017 12:05:24 -0400
Message-Id: <1489680335-6594-6-git-send-email-jglisse@redhat.com>
In-Reply-To: <1489680335-6594-1-git-send-email-jglisse@redhat.com>
References: <1489680335-6594-1-git-send-email-jglisse@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

It does not need much: just skip populating the kernel linear mapping
for the range of un-addressable device memory (the range is picked so
that no physical memory resource overlaps it). All the logic is in
shared mm code.

Only x86-64 is supported, as this feature does not make much sense
with the constrained virtual address space of 32-bit architectures.

Signed-off-by: Jérôme Glisse
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
---
 arch/x86/mm/init_64.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0098dc9..7c8c91c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -644,7 +644,8 @@ static void update_end_of_memory_vars(u64 start, u64 size)
 int arch_add_memory(int nid, u64 start, u64 size, int flags)
 {
 	const int supported_flags = MEMORY_DEVICE |
-				    MEMORY_DEVICE_ALLOW_MIGRATE;
+				    MEMORY_DEVICE_ALLOW_MIGRATE |
+				    MEMORY_DEVICE_UNADDRESSABLE;
 	struct pglist_data *pgdat = NODE_DATA(nid);
 	struct zone *zone = pgdat->node_zones +
 		zone_for_memory(nid, start, size, ZONE_NORMAL,
@@ -659,7 +660,17 @@ int arch_add_memory(int nid, u64 start, u64 size, int flags)
 		return -EINVAL;
 	}
 
-	init_memory_mapping(start, start + size);
+	/*
+	 * We get un-addressable memory when someone is adding a ZONE_DEVICE
+	 * to have struct pages for device memory which is not accessible by
+	 * the CPU, so it is pointless to have a linear kernel mapping of
+	 * such memory.
+	 *
+	 * Core mm should make sure it never sets a pte pointing to such a
+	 * fake physical range.
+	 */
+	if (!(flags & MEMORY_DEVICE_UNADDRESSABLE))
+		init_memory_mapping(start, start + size);
 
 	ret = __add_pages(nid, zone, start_pfn, nr_pages);
 	WARN_ON_ONCE(ret);
@@ -958,7 +969,8 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end)
 int __ref arch_remove_memory(u64 start, u64 size, int flags)
 {
 	const int supported_flags = MEMORY_DEVICE |
-				    MEMORY_DEVICE_ALLOW_MIGRATE;
+				    MEMORY_DEVICE_ALLOW_MIGRATE |
+				    MEMORY_DEVICE_UNADDRESSABLE;
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct page *page = pfn_to_page(start_pfn);
@@ -979,7 +991,9 @@ int __ref arch_remove_memory(u64 start, u64 size, int flags)
 	zone = page_zone(page);
 	ret = __remove_pages(zone, start_pfn, nr_pages);
 	WARN_ON_ONCE(ret);
-	kernel_physical_mapping_remove(start, start + size);
+
+	if (!(flags & MEMORY_DEVICE_UNADDRESSABLE))
+		kernel_physical_mapping_remove(start, start + size);
 
 	return ret;
 }
-- 
2.4.11
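
For readers who want to see how the new flag is meant to be consumed, below
is a minimal illustrative sketch (not part of the patch) of how a ZONE_DEVICE
user hot-plugging un-addressable device memory might call arch_add_memory()
with the flags this series introduces. The helper name and the way the fake
physical range is obtained are assumptions made purely for illustration.

	/*
	 * Illustrative sketch only; not part of the patch.  A hypothetical
	 * ZONE_DEVICE user that wants struct pages for CPU-inaccessible
	 * device memory passes MEMORY_DEVICE_UNADDRESSABLE so that the
	 * x86-64 code above skips the kernel linear mapping for the range.
	 */
	static int example_hotplug_unaddressable_memory(int nid, u64 start,
							u64 size)
	{
		/* 'start' is a fake physical address overlapping no real RAM. */
		const int flags = MEMORY_DEVICE |
				  MEMORY_DEVICE_ALLOW_MIGRATE |
				  MEMORY_DEVICE_UNADDRESSABLE;

		/*
		 * With MEMORY_DEVICE_UNADDRESSABLE set, arch_add_memory()
		 * still creates struct pages via __add_pages() but does not
		 * call init_memory_mapping() for the range.
		 */
		return arch_add_memory(nid, start, size, flags);
	}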