Subject: [PATCH 11/11] x86/pti: leave kernel text global for !PCID
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Dave Hansen, aarcange@redhat.com, luto@kernel.org, torvalds@linux-foundation.org, keescook@google.com, hughd@google.com, jgross@suse.com, x86@kernel.org, namit@vmware.com
From: Dave Hansen
Date: Fri, 06 Apr 2018 13:55:18 -0700
References: <20180406205501.24A1A4E7@viggo.jf.intel.com>
In-Reply-To: <20180406205501.24A1A4E7@viggo.jf.intel.com>
Message-Id: <20180406205518.E3D989EB@viggo.jf.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Note: This has changed since the last
version. It now clones the kernel text PMDs at a much later point and
also disables this functionality on AMD K8 processors. Details in the
patch.

--

I'm sticking this at the end of the series because it's a bit weird.
It can be dropped and the rest of the series is still useful
without it.

Global pages are bad for hardening because they potentially let an
exploit read the kernel image via a Meltdown-style attack, which
makes it easier to find gadgets.

But, global pages are good for performance because they reduce TLB
misses when making user/kernel transitions, especially when PCIDs
are not available, such as on older hardware, or where a hypervisor
has disabled them for some reason.

This patch implements a basic, sane policy: If you have PCIDs, you
only map a minimal amount of kernel text global. If you do not have
PCIDs, you map all kernel text global.

This policy effectively makes PCIDs something that not only adds
performance but a little bit of hardening as well.

I ran a simple "lseek" microbenchmark[1] to test the benefit on
a modern Atom microserver. Most of the benefit comes from applying
the series before this patch ("entry only"), but there is still a
significant benefit from this patch.

  No Global Lines (baseline  ): 6077741 lseeks/sec
  88 Global Lines (entry only): 7528609 lseeks/sec (+23.9%)
  94 Global Lines (this patch): 8433111 lseeks/sec (+38.8%)

1.
https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c

Signed-off-by: Dave Hansen
Cc: Andrea Arcangeli
Cc: Andy Lutomirski
Cc: Linus Torvalds
Cc: Kees Cook
Cc: Hugh Dickins
Cc: Juergen Gross
Cc: x86@kernel.org
Cc: Nadav Amit
---

 b/arch/x86/include/asm/pti.h |    2 +
 b/arch/x86/mm/init_64.c      |    6 +++
 b/arch/x86/mm/pti.c          |   78 ++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 82 insertions(+), 4 deletions(-)

diff -puN arch/x86/include/asm/pti.h~kpti-global-text-option arch/x86/include/asm/pti.h
--- a/arch/x86/include/asm/pti.h~kpti-global-text-option	2018-04-06 10:47:59.393796116 -0700
+++ b/arch/x86/include/asm/pti.h	2018-04-06 10:47:59.400796116 -0700
@@ -6,8 +6,10 @@
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 extern void pti_init(void);
 extern void pti_check_boottime_disable(void);
+extern void pti_clone_kernel_text(void);
 #else
 static inline void pti_check_boottime_disable(void) { }
+static inline void pti_clone_kernel_text(void) { }
 #endif
 
 #endif /* __ASSEMBLY__ */
diff -puN arch/x86/mm/init_64.c~kpti-global-text-option arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c~kpti-global-text-option	2018-04-06 10:47:59.395796116 -0700
+++ b/arch/x86/mm/init_64.c	2018-04-06 10:47:59.400796116 -0700
@@ -1294,6 +1294,12 @@ void mark_rodata_ro(void)
 			(unsigned long) __va(__pa_symbol(_sdata)));
 
 	debug_checkwx();
+
+	/*
+	 * Do this after all of the manipulation of the
+	 * kernel text page tables are complete.
+	 */
+	pti_clone_kernel_text();
 }
 
 int kern_addr_valid(unsigned long addr)
diff -puN arch/x86/mm/pti.c~kpti-global-text-option arch/x86/mm/pti.c
--- a/arch/x86/mm/pti.c~kpti-global-text-option	2018-04-06 10:47:59.397796116 -0700
+++ b/arch/x86/mm/pti.c	2018-04-06 10:47:59.401796116 -0700
@@ -66,12 +66,22 @@ static void __init pti_print_if_secure(c
 		pr_info("%s\n", reason);
 }
 
+enum pti_mode {
+	PTI_AUTO = 0,
+	PTI_FORCE_OFF,
+	PTI_FORCE_ON
+} pti_mode;
+
 void __init pti_check_boottime_disable(void)
 {
 	char arg[5];
 	int ret;
 
+	/* Assume mode is auto unless overridden. */
+	pti_mode = PTI_AUTO;
+
 	if (hypervisor_is_type(X86_HYPER_XEN_PV)) {
+		pti_mode = PTI_FORCE_OFF;
 		pti_print_if_insecure("disabled on XEN PV.");
 		return;
 	}
@@ -79,18 +89,23 @@ void __init pti_check_boottime_disable(v
 	ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
 	if (ret > 0) {
 		if (ret == 3 && !strncmp(arg, "off", 3)) {
+			pti_mode = PTI_FORCE_OFF;
 			pti_print_if_insecure("disabled on command line.");
 			return;
 		}
 		if (ret == 2 && !strncmp(arg, "on", 2)) {
+			pti_mode = PTI_FORCE_ON;
 			pti_print_if_secure("force enabled on command line.");
 			goto enable;
 		}
-		if (ret == 4 && !strncmp(arg, "auto", 4))
+		if (ret == 4 && !strncmp(arg, "auto", 4)) {
+			pti_mode = PTI_AUTO;
 			goto autosel;
+		}
 	}
 	if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+		pti_mode = PTI_FORCE_OFF;
 		pti_print_if_insecure("disabled on command line.");
 		return;
 	}
 
@@ -149,7 +164,7 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pg
  *
  * Returns a pointer to a P4D on success, or NULL on failure.
  */
-static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
+static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
 {
 	pgd_t *pgd = kernel_to_user_pgdp(pgd_offset_k(address));
 	gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
@@ -177,7 +192,7 @@ static __init p4d_t *pti_user_pagetable_
  *
  * Returns a pointer to a PMD on success, or NULL on failure.
  */
-static __init pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
+static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
 {
 	gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
 	p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
@@ -267,7 +282,7 @@ static void __init pti_setup_vsyscall(vo
 static void __init pti_setup_vsyscall(void) { }
 #endif
 
-static void __init
+static void
 pti_clone_pmds(unsigned long start, unsigned long end, pmdval_t clear)
 {
 	unsigned long addr;
@@ -373,6 +388,58 @@ static void __init pti_clone_entry_text(
 }
 
 /*
+ * Global pages and PCIDs are both ways to make kernel TLB entries
+ * live longer, reduce TLB misses and improve kernel performance.
+ * But, leaving all kernel text Global makes it potentially accessible
+ * to Meltdown-style attacks which make it trivial to find gadgets or
+ * defeat KASLR.
+ *
+ * Only use global pages when it is really worth it.
+ */
+static inline bool pti_kernel_image_global_ok(void)
+{
+	/*
+	 * Systems with PCIDs get little benefit from global
+	 * kernel text and are not worth the downsides.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_PCID))
+		return false;
+
+	/*
+	 * Only do global kernel image for pti=auto. Do the most
+	 * secure thing (not global) if pti=on specified.
+	 */
+	if (pti_mode != PTI_AUTO)
+		return false;
+
+	/*
+	 * K8 may not tolerate the cleared _PAGE_RW on the userspace
+	 * global kernel image pages. Do the safe thing (disable
+	 * global kernel image). This is unlikely to ever be
+	 * noticed because PTI is disabled by default on AMD CPUs.
+	 */
+	if (boot_cpu_has(X86_FEATURE_K8))
+		return false;
+
+	return true;
+}
+
+/*
+ * For some configurations, map all of kernel text into the user page
+ * tables. This reduces TLB misses, especially on non-PCID systems.
+ */
+void pti_clone_kernel_text(void)
+{
+	unsigned long start = PFN_ALIGN(_text);
+	unsigned long end = ALIGN((unsigned long)_end, PMD_PAGE_SIZE);
+
+	if (!pti_kernel_image_global_ok())
+		return;
+
+	pti_clone_pmds(start, end, _PAGE_RW);
+}
+
+/*
  * This is the only user for it and it is not arch-generic like
  * the other set_memory.h functions. Just extern it.
  */
@@ -388,6 +455,9 @@ void pti_set_kernel_image_nonglobal(void
 	unsigned long start = PFN_ALIGN(_text);
 	unsigned long end = ALIGN((unsigned long)_end, PMD_PAGE_SIZE);
 
+	if (pti_kernel_image_global_ok())
+		return;
+
 	pr_debug("set kernel image non-global\n");
 
 	set_memory_nonglobal(start, (end - start) >> PAGE_SHIFT);
_
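
For anyone who wants to poke at the policy outside the kernel, the three
checks in pti_kernel_image_global_ok() can be mimicked in user space.
This is only a sketch and not part of the patch: struct cpu_sketch,
enum pti_mode_sketch and kernel_image_global_ok() are made-up names, and
the two booleans are assumed stand-ins for
cpu_feature_enabled(X86_FEATURE_PCID) and boot_cpu_has(X86_FEATURE_K8).

```c
#include <stdbool.h>

/* Hypothetical mirror of the patch's enum pti_mode (not kernel code). */
enum pti_mode_sketch {
	SKETCH_PTI_AUTO = 0,
	SKETCH_PTI_FORCE_OFF,
	SKETCH_PTI_FORCE_ON
};

/* Hypothetical stand-ins for the kernel's CPU feature queries. */
struct cpu_sketch {
	bool has_pcid;	/* assumed: cpu_feature_enabled(X86_FEATURE_PCID) */
	bool is_k8;	/* assumed: boot_cpu_has(X86_FEATURE_K8) */
};

/*
 * Same order of checks as pti_kernel_image_global_ok() in the patch:
 * returns true only when mapping all kernel text global is allowed.
 */
static bool kernel_image_global_ok(const struct cpu_sketch *cpu,
				   enum pti_mode_sketch mode)
{
	/* PCID systems gain little from global kernel text. */
	if (cpu->has_pcid)
		return false;

	/* Only pti=auto may trade hardening for performance. */
	if (mode != SKETCH_PTI_AUTO)
		return false;

	/* K8 may not tolerate the cleared _PAGE_RW; stay safe. */
	if (cpu->is_k8)
		return false;

	return true;
}
```

With this sketch, a non-PCID, non-K8 CPU under pti=auto is the only
combination that yields true; any of PCID, pti=on/off, or K8 yields
false.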