Date: Tue, 22 Feb 2022 21:21:46 -0800
In-Reply-To: <20220223052223.1202152-1-junaids@google.com>
Message-Id: <20220223052223.1202152-11-junaids@google.com>
Mime-Version: 1.0
References: <20220223052223.1202152-1-junaids@google.com>
X-Mailer: git-send-email 2.35.1.473.g83b2b277ed-goog
Subject: [RFC PATCH 10/47] mm: asi: Support for global non-sensitive direct map allocations
From: Junaid Shahid <junaids@google.com>
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, jmattson@google.com,
	pjt@google.com, oweisse@google.com, alexandre.chartre@oracle.com,
	rppt@linux.ibm.com, dave.hansen@linux.intel.com, peterz@infradead.org,
	tglx@linutronix.de, luto@kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="UTF-8"

A new GFP flag is added to specify that an allocation should be
considered globally non-sensitive. The pages will be mapped into the
ASI global non-sensitive pseudo-PGD, which is shared between all
standard ASI instances. A new page flag is also added so that when
these pages are freed, they can also be unmapped from the ASI page
tables.

Signed-off-by: Junaid Shahid <junaids@google.com>
---
 include/linux/gfp.h            |  10 ++-
 include/linux/mm_types.h       |   5 ++
 include/linux/page-flags.h    |   9 ++
 include/trace/events/mmflags.h |  12 ++-
 mm/page_alloc.c                | 145 ++++++++++++++++++++++++++++++++-
 tools/perf/builtin-kmem.c      |   1 +
 6 files changed, 178 insertions(+), 4 deletions(-)
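
[ Reviewer aid, not part of the patch: a minimal sketch of the intended
  caller-side usage once this series is applied. The two helpers are
  hypothetical; only __GFP_GLOBAL_NONSENSITIVE and the page-allocator
  behavior come from the patch itself. ]

  #include <linux/gfp.h>
  #include <linux/mm.h>

  /*
   * Allocate one page mapped into the ASI global non-sensitive address
   * space. The page allocator maps it via asi_map_alloced_pages(), sets
   * PG_global_nonsensitive on it, and (when ASI is enabled) ORs in
   * __GFP_ZERO, presumably so stale, potentially sensitive contents are
   * never exposed through the shared mapping.
   */
  static void *alloc_global_nonsensitive_page(void)
  {
  	struct page *page;

  	page = alloc_pages(GFP_KERNEL | __GFP_GLOBAL_NONSENSITIVE, 0);
  	return page ? page_to_virt(page) : NULL;
  }

  /*
   * Freeing needs no special flag: free_the_page() checks
   * PG_global_nonsensitive and unmaps the range from the ASI page
   * tables before the page is returned to the allocator.
   */
  static void free_global_nonsensitive_page(void *va)
  {
  	free_pages((unsigned long)va, 0);
  }
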
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 8fcc38467af6..07a99a463a34 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -60,6 +60,11 @@ struct vm_area_struct;
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+#define ___GFP_GLOBAL_NONSENSITIVE	0x4000000u
+#else
+#define ___GFP_GLOBAL_NONSENSITIVE	0
+#endif
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */
 
 /*
@@ -248,8 +253,11 @@ struct vm_area_struct;
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
+/* Allocate non-sensitive memory */
+#define __GFP_GLOBAL_NONSENSITIVE ((__force gfp_t)___GFP_GLOBAL_NONSENSITIVE)
+
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT 27
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3de1afa57289..5b8028fcfe67 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -191,6 +191,11 @@ struct page {
 
 			/** @rcu_head: You can use this to free a page by RCU. */
 			struct rcu_head rcu_head;
+
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+			/* Links the pages_to_free_async list */
+			struct llist_node async_free_node;
+#endif
 		};
 
 		union {		/* This union is 4 bytes in size. */
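
[ Reviewer aid, not part of the patch: the bit arithmetic behind the
  gfp.h hunk. ___GFP_GLOBAL_NONSENSITIVE is 0x4000000u, i.e. bit 26,
  the next bit above ___GFP_NOLOCKDEP. The old formula
  (25 + IS_ENABLED(CONFIG_LOCKDEP)) tops out at 26 bits, so
  __GFP_BITS_MASK would always have masked the new flag off; hence the
  flat 27. A compile-time guard along these lines could catch future
  collisions (sketch only; the assertions are only meaningful when
  CONFIG_ADDRESS_SPACE_ISOLATION is set, since the flag is 0 otherwise): ]

  #include <linux/build_bug.h>

  #ifdef CONFIG_ADDRESS_SPACE_ISOLATION
  /* Bit 26 must lie below the GFP mask boundary. */
  static_assert(___GFP_GLOBAL_NONSENSITIVE == (1u << 26));
  static_assert(__GFP_BITS_SHIFT >= 27);
  #endif
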
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b90a17e9796d..a07434cc679c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -140,6 +140,9 @@ enum pageflags {
 #endif
 #ifdef CONFIG_KASAN_HW_TAGS
 	PG_skip_kasan_poison,
+#endif
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+	PG_global_nonsensitive,
 #endif
 	__NR_PAGEFLAGS,
 
@@ -542,6 +545,12 @@ TESTCLEARFLAG(Young, young, PF_ANY)
 PAGEFLAG(Idle, idle, PF_ANY)
 #endif
 
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+__PAGEFLAG(GlobalNonSensitive, global_nonsensitive, PF_ANY);
+#else
+__PAGEFLAG_FALSE(GlobalNonSensitive, global_nonsensitive);
+#endif
+
 #ifdef CONFIG_KASAN_HW_TAGS
 PAGEFLAG(SkipKASanPoison, skip_kasan_poison, PF_HEAD)
 #else
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 116ed4d5d0f8..73a49197ef54 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -50,7 +50,8 @@
 	{(unsigned long)__GFP_DIRECT_RECLAIM,	"__GFP_DIRECT_RECLAIM"},\
 	{(unsigned long)__GFP_KSWAPD_RECLAIM,	"__GFP_KSWAPD_RECLAIM"},\
 	{(unsigned long)__GFP_ZEROTAGS,		"__GFP_ZEROTAGS"},	\
-	{(unsigned long)__GFP_SKIP_KASAN_POISON,"__GFP_SKIP_KASAN_POISON"}\
+	{(unsigned long)__GFP_SKIP_KASAN_POISON,"__GFP_SKIP_KASAN_POISON"},\
+	{(unsigned long)__GFP_GLOBAL_NONSENSITIVE, "__GFP_GLOBAL_NONSENSITIVE"}\
 
 #define show_gfp_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
@@ -93,6 +94,12 @@
 #define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string)
 #endif
 
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+#define IF_HAVE_ASI(flag, string) ,{1UL << flag, string}
+#else
+#define IF_HAVE_ASI(flag, string)
+#endif
+
 #define __def_pageflag_names						\
 	{1UL << PG_locked,		"locked"	},		\
 	{1UL << PG_waiters,		"waiters"	},		\
@@ -121,7 +128,8 @@ IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	)		\
 IF_HAVE_PG_IDLE(PG_young,		"young"		)		\
 IF_HAVE_PG_IDLE(PG_idle,		"idle"		)		\
 IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)		\
-IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
+IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison") \
+IF_HAVE_ASI(PG_global_nonsensitive, "global_nonsensitive")
 
 #define show_page_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c5952749ad40..a4048fa1868a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -697,7 +697,7 @@ static inline bool pcp_allowed_order(unsigned int order)
 	return false;
 }
 
-static inline void free_the_page(struct page *page, unsigned int order)
+static inline void __free_the_page(struct page *page, unsigned int order)
 {
 	if (pcp_allowed_order(order))		/* Via pcp? */
 		free_unref_page(page, order);
@@ -705,6 +705,14 @@ static inline void free_the_page(struct page *page, unsigned int order)
 		__free_pages_ok(page, order, FPI_NONE);
 }
 
+static bool asi_unmap_freed_pages(struct page *page, unsigned int order);
+
+static inline void free_the_page(struct page *page, unsigned int order)
+{
+	if (asi_unmap_freed_pages(page, order))
+		__free_the_page(page, order);
+}
+
 /*
  * Higher-order pages are called "compound pages".  They are structured thusly:
  *
@@ -5162,6 +5170,129 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 	return true;
 }
 
+#ifdef CONFIG_ADDRESS_SPACE_ISOLATION
+
+static DEFINE_PER_CPU(struct work_struct, async_free_work);
+static DEFINE_PER_CPU(struct llist_head, pages_to_free_async);
+static bool async_free_work_initialized;
+
+static void __free_the_page(struct page *page, unsigned int order);
+
+static void async_free_work_fn(struct work_struct *work)
+{
+	struct page *page, *tmp;
+	struct llist_node *pages_to_free;
+	void *va;
+	size_t len;
+	uint order;
+
+	pages_to_free = llist_del_all(this_cpu_ptr(&pages_to_free_async));
+
+	/* A later patch will do a more optimized TLB flush. */
+
+	llist_for_each_entry_safe(page, tmp, pages_to_free, async_free_node) {
+		va = page_to_virt(page);
+		order = page->private;
+		len = PAGE_SIZE * (1 << order);
+
+		asi_flush_tlb_range(ASI_GLOBAL_NONSENSITIVE, va, len);
+		__free_the_page(page, order);
+	}
+}
+
+static int __init asi_page_alloc_init(void)
+{
+	int cpu;
+
+	if (!static_asi_enabled())
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		INIT_WORK(per_cpu_ptr(&async_free_work, cpu),
+			  async_free_work_fn);
+
+	/*
+	 * This function is called before SMP is initialized, so we can assume
+	 * that this is the only running CPU at this point.
+	 */
+
+	barrier();
+	async_free_work_initialized = true;
+	barrier();
+
+	if (!llist_empty(this_cpu_ptr(&pages_to_free_async)))
+		queue_work_on(smp_processor_id(), mm_percpu_wq,
+			      this_cpu_ptr(&async_free_work));
+
+	return 0;
+}
+early_initcall(asi_page_alloc_init);
+
+static int asi_map_alloced_pages(struct page *page, uint order, gfp_t gfp_mask)
+{
+	uint i;
+
+	if (!static_asi_enabled())
+		return 0;
+
+	if (gfp_mask & __GFP_GLOBAL_NONSENSITIVE) {
+		for (i = 0; i < (1 << order); i++)
+			__SetPageGlobalNonSensitive(page + i);
+
+		return asi_map_gfp(ASI_GLOBAL_NONSENSITIVE, page_to_virt(page),
+				   PAGE_SIZE * (1 << order), gfp_mask);
+	}
+
+	return 0;
+}
+
+static bool asi_unmap_freed_pages(struct page *page, unsigned int order)
+{
+	void *va;
+	size_t len;
+	bool async_flush_needed;
+
+	if (!static_asi_enabled())
+		return true;
+
+	if (!PageGlobalNonSensitive(page))
+		return true;
+
+	va = page_to_virt(page);
+	len = PAGE_SIZE * (1 << order);
+	async_flush_needed = irqs_disabled() || in_interrupt();
+
+	asi_unmap(ASI_GLOBAL_NONSENSITIVE, va, len, !async_flush_needed);
+
+	if (!async_flush_needed)
+		return true;
+
+	page->private = order;
+	llist_add(&page->async_free_node, this_cpu_ptr(&pages_to_free_async));
+
+	if (async_free_work_initialized)
+		queue_work_on(smp_processor_id(), mm_percpu_wq,
+			      this_cpu_ptr(&async_free_work));
+
+	return false;
+}
+
+#else /* CONFIG_ADDRESS_SPACE_ISOLATION */
+
+static inline
+int asi_map_alloced_pages(struct page *pages, uint order, gfp_t gfp_mask)
+{
+	return 0;
+}
+
+static inline
+bool asi_unmap_freed_pages(struct page *page, unsigned int order)
+{
+	return true;
+}
+
+#endif
+
 /*
  * __alloc_pages_bulk - Allocate a number of order-0 pages to a list or array
  * @gfp: GFP flags for the allocation
@@ -5345,6 +5476,9 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 		return NULL;
 	}
 
+	if (static_asi_enabled() && (gfp & __GFP_GLOBAL_NONSENSITIVE))
+		gfp |= __GFP_ZERO;
+
 	gfp &= gfp_allowed_mask;
 	/*
 	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
@@ -5388,6 +5522,15 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 		page = NULL;
 	}
 
+	if (page) {
+		int err = asi_map_alloced_pages(page, order, gfp);
+
+		if (unlikely(err)) {
+			__free_pages(page, order);
+			page = NULL;
+		}
+	}
+
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 
 	return page;
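
[ Reviewer aid, not part of the patch: the subtle part of the
  page_alloc.c hunk is the free path. asi_unmap() wants to flush the
  TLB for the unmapped range, which is not safe to do synchronously
  with IRQs disabled or from interrupt context (presumably because a
  remote flush is IPI-based), so the page is parked on a per-CPU
  lock-free list and freed later from a workqueue, after
  asi_flush_tlb_range(). Condensed, the general shape of that pattern
  is the following; all names here are illustrative. ]

  #include <linux/init.h>
  #include <linux/llist.h>
  #include <linux/percpu-defs.h>
  #include <linux/smp.h>
  #include <linux/workqueue.h>

  struct deferred_item {
  	struct llist_node node;
  };

  static DEFINE_PER_CPU(struct llist_head, deferred_items);
  static DEFINE_PER_CPU(struct work_struct, deferred_work);

  /* Worker: runs in process context, so sleepable cleanup is safe. */
  static void deferred_work_fn(struct work_struct *work)
  {
  	struct llist_node *list = llist_del_all(this_cpu_ptr(&deferred_items));
  	struct deferred_item *item, *tmp;

  	llist_for_each_entry_safe(item, tmp, list, node) {
  		/* TLB flush + real free would go here. */
  	}
  }

  static int __init deferred_init(void)
  {
  	int cpu;

  	for_each_possible_cpu(cpu)
  		INIT_WORK(per_cpu_ptr(&deferred_work, cpu), deferred_work_fn);
  	return 0;
  }
  early_initcall(deferred_init);

  /*
   * Atomic-context side: llist_add() is lock-free and IRQ-safe, and the
   * caller is assumed to be in atomic context (as in the patch), so
   * smp_processor_id() is stable here.
   */
  static void defer_free(struct deferred_item *item)
  {
  	llist_add(&item->node, this_cpu_ptr(&deferred_items));
  	queue_work_on(smp_processor_id(), system_wq,
  		      this_cpu_ptr(&deferred_work));
  }

[ The patch avoids a separate allocation by reusing struct page itself
  as the list node (async_free_node) and stashing the order in
  page->private, and it covers the boot-time window before the work
  items exist by re-checking the list at the end of
  asi_page_alloc_init(). ]
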
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index da03a341c63c..5857953cd5c1 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -660,6 +660,7 @@ static const struct {
 	{ "__GFP_RECLAIM",		"R" },
 	{ "__GFP_DIRECT_RECLAIM",	"DR" },
 	{ "__GFP_KSWAPD_RECLAIM",	"KR" },
+	{ "__GFP_GLOBAL_NONSENSITIVE",	"GNS" },
 };
 
 static size_t max_gfp_len;
-- 
2.35.1.473.g83b2b277ed-goog