From: Arvind Sankar
To: Linus Torvalds
Cc: Miguel Ojeda, Sedat Dilek, Segher Boessenkool, Thomas Gleixner,
	Nick Desaulniers, "Paul E. McKenney", Ingo Molnar, Arnd Bergmann,
	Borislav Petkov, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)",
	"H. Peter Anvin", "Kirill A. Shutemov", Kees Cook, Peter Zijlstra,
	Juergen Gross, Andy Lutomirski, Andrew Cooper, LKML,
	clang-built-linux, Will Deacon, nadav.amit@gmail.com,
	Nathan Chancellor
Subject: [PATCH v2] x86/asm: Replace __force_order with memory clobber
Date: Wed, 2 Sep 2020 11:33:46 -0400
Message-Id: <20200902153346.3296117-1-nivedita@alum.mit.edu>
In-Reply-To: <20200823212550.3377591-1-nivedita@alum.mit.edu>
References: <20200823212550.3377591-1-nivedita@alum.mit.edu>

The CRn accessor functions use __force_order as a dummy operand to
prevent the compiler from reordering the inline asm.

The fact that the asm is volatile should be enough to prevent this
already; however, older versions of GCC had a bug that could sometimes
result in reordering. This was fixed in 8.1, 7.3 and 6.5. Versions
prior to these, including 5.x and 4.9.x, may reorder volatile asm.

There are some issues with __force_order as implemented:
- It is used only as an input operand for the write functions, and
  hence doesn't do anything additional to prevent reordering writes.
- It allows memory accesses to be cached/reordered across write
  functions, but CRn writes affect the semantics of memory accesses, so
  this could be dangerous.
- __force_order is not actually defined in the kernel proper, but the
  LLVM toolchain can in some cases require a definition: LLVM (as well
  as GCC 4.9) requires it for PIE code, which is why the compressed
  kernel has a definition; in addition, the clang integrated assembler
  may consider the address of __force_order to be significant,
  resulting in a reference that requires a definition.

Fix this by:
- Using a memory clobber for the write functions to additionally
  prevent caching/reordering memory accesses across CRn writes.
- Using a dummy input operand with an arbitrary constant address for
  the read functions, instead of a global variable. This will prevent
  reads from being reordered across writes, while allowing memory loads
  to be cached/reordered across CRn reads, which should be safe.

Tested-by: Nathan Chancellor
Tested-by: Sedat Dilek
Signed-off-by: Arvind Sankar
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602
Link: https://lore.kernel.org/lkml/20200527135329.1172644-1-arnd@arndb.de/
---
Changes from v1:
- Add lore link to email thread and mention state of 5.x/4.9.x in
  commit log

A minimal userspace sketch illustrating the memory clobber vs. dummy
operand distinction is appended after the patch.

 arch/x86/boot/compressed/pgtable_64.c |  9 ---------
 arch/x86/include/asm/special_insns.h  | 27 ++++++++++++++-------------
 arch/x86/kernel/cpu/common.c          |  4 ++--
 3 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index c8862696a47b..7d0394f4ebf9 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -5,15 +5,6 @@
 #include "pgtable.h"
 #include "../string.h"
 
-/*
- * __force_order is used by special_insns.h asm code to force instruction
- * serialization.
- *
- * It is not referenced from the code, but GCC < 5 with -fPIE would fail
- * due to an undefined symbol. Define it to make these ancient GCCs work.
- */
-unsigned long __force_order;
-
 #define BIOS_START_MIN		0x20000U	/* 128K, less than this is insane */
 #define BIOS_START_MAX		0x9f000U	/* 640K, absolute maximum */
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 59a3e13204c3..8f7791217ef4 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -11,45 +11,46 @@
 #include <asm/nops.h>
 
 /*
- * Volatile isn't enough to prevent the compiler from reordering the
- * read/write functions for the control registers and messing everything up.
- * A memory clobber would solve the problem, but would prevent reordering of
- * all loads stores around it, which can hurt performance. Solution is to
- * use a variable and mimic reads and writes to it to enforce serialization
+ * The compiler should not reorder volatile asm, however older versions of GCC
+ * had a bug (which was fixed in 8.1, 7.3 and 6.5) where they could sometimes
+ * reorder volatile asm. The write functions are not a problem since they have
+ * memory clobbers preventing reordering. To prevent reads from being reordered
+ * with respect to writes, use a dummy memory operand.
  */
-extern unsigned long __force_order;
+
+#define __FORCE_ORDER "m"(*(unsigned int *)0x1000UL)
 
 void native_write_cr0(unsigned long val);
 
 static inline unsigned long native_read_cr0(void)
 {
 	unsigned long val;
-	asm volatile("mov %%cr0,%0\n\t" : "=r" (val), "=m" (__force_order));
+	asm volatile("mov %%cr0,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
 static __always_inline unsigned long native_read_cr2(void)
 {
 	unsigned long val;
-	asm volatile("mov %%cr2,%0\n\t" : "=r" (val), "=m" (__force_order));
+	asm volatile("mov %%cr2,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
 static __always_inline void native_write_cr2(unsigned long val)
 {
-	asm volatile("mov %0,%%cr2": : "r" (val), "m" (__force_order));
+	asm volatile("mov %0,%%cr2": : "r" (val) : "memory");
 }
 
 static inline unsigned long __native_read_cr3(void)
 {
 	unsigned long val;
-	asm volatile("mov %%cr3,%0\n\t" : "=r" (val), "=m" (__force_order));
+	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
 static inline void native_write_cr3(unsigned long val)
 {
-	asm volatile("mov %0,%%cr3": : "r" (val), "m" (__force_order));
+	asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
 }
 
 static inline unsigned long native_read_cr4(void)
@@ -64,10 +65,10 @@ static inline unsigned long native_read_cr4(void)
 	asm volatile("1: mov %%cr4, %0\n"
 		     "2:\n"
 		     _ASM_EXTABLE(1b, 2b)
-		     : "=r" (val), "=m" (__force_order) : "0" (0));
+		     : "=r" (val) : "0" (0), __FORCE_ORDER);
 #else
 	/* CR4 always exists on x86_64. */
-	asm volatile("mov %%cr4,%0\n\t" : "=r" (val), "=m" (__force_order));
+	asm volatile("mov %%cr4,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 #endif
 	return val;
 }
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c5d6f17d9b9d..178499f90366 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -359,7 +359,7 @@ void native_write_cr0(unsigned long val)
 	unsigned long bits_missing = 0;
 
 set_register:
-	asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
+	asm volatile("mov %0,%%cr0": "+r" (val) : : "memory");
 
 	if (static_branch_likely(&cr_pinning)) {
 		if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
@@ -378,7 +378,7 @@ void native_write_cr4(unsigned long val)
 	unsigned long bits_changed = 0;
 
 set_register:
-	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
+	asm volatile("mov %0,%%cr4": "+r" (val) : : "memory");
 
 	if (static_branch_likely(&cr_pinning)) {
 		if (unlikely((val & cr4_pinned_mask) != cr4_pinned_bits)) {
-- 
2.26.2
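
For illustration only, not part of the patch: a minimal, self-contained
userspace sketch of the two ordering techniques used above. The file
and function names, the stand-in instructions (reading %rsp and
clobbering %rax instead of the privileged CRn moves) and the 0x1000
dummy address are assumptions made for the example; it should build
with GCC or Clang on x86-64, e.g. "cc -O2 -o sketch force_order_sketch.c".

/* force_order_sketch.c */
#include <stdio.h>

/*
 * Dummy memory input operand at an arbitrary constant address. It is
 * never dereferenced; it only gives the optimizer an artificial
 * dependency, so the "reads" below cannot be reordered past the
 * memory-clobbering "writes".
 */
#define FORCE_ORDER "m"(*(unsigned int *)0x1000UL)

static unsigned long sketch_read(void)
{
	unsigned long val;

	/*
	 * Stand-in for "mov %%crN,%0": ordered against the writes by the
	 * dummy operand, while unrelated loads may still be cached or
	 * reordered across it.
	 */
	asm volatile("mov %%rsp,%0" : "=r" (val) : FORCE_ORDER);
	return val;
}

static void sketch_write(unsigned long val)
{
	/*
	 * Stand-in for "mov %0,%%crN": the "memory" clobber keeps all
	 * memory accesses on their side of the asm, since a real CRn
	 * write can change how memory accesses behave.
	 */
	asm volatile("mov %0,%%rax" : : "r" (val) : "rax", "memory");
}

int main(void)
{
	unsigned long v = sketch_read();

	sketch_write(v);
	printf("read back: %#lx\n", v);
	return 0;
}

The asymmetry mirrors the commit message: the write side gets the
heavyweight "memory" clobber, while the read side only carries the
artificial dependency, so the compiler can still cache unrelated loads
across CRn reads.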