Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755574AbdCKKO3 (ORCPT ); Sat, 11 Mar 2017 05:14:29 -0500 Received: from terminus.zytor.com ([65.50.211.136]:33026 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751788AbdCKKOU (ORCPT ); Sat, 11 Mar 2017 05:14:20 -0500 Date: Sat, 11 Mar 2017 02:13:38 -0800 From: tip-bot for Daniel Borkmann Message-ID: Cc: torvalds@linux-foundation.org, ast@kernel.org, davem@davemloft.net, daniel@iogearbox.net, keescook@chromium.org, rusty@rustcorp.com.au, fengguang.wu@intel.com, linux-kernel@vger.kernel.org, labbott@redhat.com, hpa@zytor.com, mingo@kernel.org, tglx@linutronix.de Reply-To: davem@davemloft.net, daniel@iogearbox.net, keescook@chromium.org, torvalds@linux-foundation.org, ast@kernel.org, mingo@kernel.org, tglx@linutronix.de, rusty@rustcorp.com.au, fengguang.wu@intel.com, linux-kernel@vger.kernel.org, labbott@redhat.com, hpa@zytor.com In-Reply-To: <25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net> References: <25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net> To: linux-tip-commits@vger.kernel.org Subject: [tip:x86/urgent] x86/tlb: Fix tlb flushing when lguest clears PGE Git-Commit-ID: 9b3e557f1238135f6ff405e760001a8a40139214 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4643 Lines: 93 Commit-ID: 9b3e557f1238135f6ff405e760001a8a40139214 Gitweb: http://git.kernel.org/tip/9b3e557f1238135f6ff405e760001a8a40139214 Author: Daniel Borkmann AuthorDate: Sat, 11 Mar 2017 01:31:19 +0100 Committer: Thomas Gleixner CommitDate: Sat, 11 Mar 2017 11:10:02 +0100 x86/tlb: Fix tlb flushing when lguest clears PGE Fengguang reported random corruptions from various locations on x86-32 after commits d2852a224050 ("arch: add ARCH_HAS_SET_MEMORY config") and 9d876e79df6a ("bpf: fix unlocking of jited image when module ronx not set") that uses the former. While x86-32 doesn't have a JIT like x86_64, the bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module support built in and therefore never had the DEBUG_SET_MODULE_RONX setting enabled. After investigating the crashes further, it turned out that using set_memory_ro() and set_memory_rw() didn't have the desired effect, for example, setting the pages as read-only on x86-32 would still let probe_kernel_write() succeed without error. This behavior would manifest itself in situations where the vmalloc'ed buffer was accessed prior to set_memory_*() such as in case of bpf_prog_alloc(). In cases where it wasn't, the page attribute changes seemed to have taken effect, leading to the conclusion that a TLB invalidate didn't happen. Moreover, it turned out that this issue reproduced with qemu in "-cpu kvm64" mode, but not for "-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(), though. There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush (depends on X86_FEATURE_PGE), and cr3 based flush. For "-cpu host" case in my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually worked fine, and further investigating the cr4 one turned out that X86_CR4_PGE bit was not set in cr4 register, meaning the __native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same value instead of clearing X86_CR4_PGE in the first write to trigger the flush. It turned out that X86_CR4_PGE was cleared from cr4 during init from lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also cleared from there due to concerns of using PGE in guest kernel that can lead to hard to trace bugs (see bff672e630a0 ("lguest: documentation V: Host") in init()). The CPU feature bits are cleared in dynamic boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB flushing to use, meaning they still used the old setting of the host kernel. Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate to static_cpu_has() checks is too late at this point as sections have been patched already, so for now, it seems reasonable to switch back to boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf95992b ("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via cr3 as originally intended, properly makes the new page attributes visible and thus fixes the crashes seen by Fengguang. Fixes: c109bf95992b ("x86/cpufeature: Remove cpu_has_pge") Reported-by: Fengguang Wu Signed-off-by: Daniel Borkmann Cc: bp@suse.de Cc: Kees Cook Cc: "David S. Miller" Cc: netdev@vger.kernel.org Cc: Rusty Russell Cc: Alexei Starovoitov Cc: Linus Torvalds Cc: lkp@01.org Cc: Laura Abbott Link: http://lkml.kernrl.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/tlbflush.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 6fa8594..fc5abff 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -188,7 +188,7 @@ static inline void __native_flush_tlb_single(unsigned long addr) static inline void __flush_tlb_all(void) { - if (static_cpu_has(X86_FEATURE_PGE)) + if (boot_cpu_has(X86_FEATURE_PGE)) __flush_tlb_global(); else __flush_tlb();