Date: Wed, 10 Jul 2019 12:59:41 -0700
From: Kees Cook
To: Thomas Gleixner
Cc: Xi Ruoyao, Peter Zijlstra, Jiri Kosina, Linus Torvalds, Ingo Molnar,
	Linux List Kernel Mailing, Borislav Petkov, Len Brown, Andrew Morton,
	"Rafael J. Wysocki", Tony Luck, Bob Moore, Erik Schmauss,
	Josh Poimboeuf, Daniel Bristot de Oliveira, Juergen Gross
Subject: Re: [PATCH] x86/asm: Move native_write_cr0/3() out of line
Message-ID: <201907101258.FE97AEC86@keescook>
References: <768463eb26a2feb0fcc374fd7f9cc28b96976917.camel@mengyan1223.wang>
 <20190710134433.GN3402@hirez.programming.kicks-ass.net>
 <20190710142653.GJ3419@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 10, 2019 at 09:42:46PM +0200, Thomas Gleixner wrote:
> The pinning of sensitive CR0 and CR4 bits caused a boot crash when loading
> the kvm_intel module on a kernel compiled with CONFIG_PARAVIRT=n.
>
> The reason is that the static key which controls the pinning is marked RO
> after init. The kvm_intel module contains a CR4 write which requires
> updating the static key entry list. That obviously does not work when the
> key is in an RO section.
>
> With CONFIG_PARAVIRT enabled this does not happen, because the CR4 write
> uses the paravirt indirection and the actual write function is built in.
>
> As the key is intended to be immutable after init, move
> native_write_cr0/3() out of line.
>
> While at it, consolidate the update of the cr4 shadow variable and store
> the value right away when the pinning is initialized on a booting CPU.
> There is no point in reading it back 20 instructions later. This allows
> the static key and the pinning variable to be confined to cpu/common.c
> and marked static.
>
> Fixes: 8dbec27a242c ("x86/asm: Pin sensitive CR0 bits")
> Fixes: 873d50d58f67 ("x86/asm: Pin sensitive CR4 bits")
> Reported-by: Linus Torvalds
> Reported-by: Xi Ruoyao
> Signed-off-by: Thomas Gleixner
> Tested-by: Xi Ruoyao

Thank you for tracking this down and solving it!

Nit: it should be "cr0/4()" in the Subject and in paragraph 4.

Acked-by: Kees Cook

-Kees
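To make the failure mode concrete: below is a minimal userspace sketch of
the underlying problem, illustrative only (this is not the kernel's
jump-label machinery, and all names here are made up). Memory that is made
read-only after initialization can still be read, but a later attempt to
update it, which is essentially what a module load must do to hook its
static-branch site into the key's entry list, faults.

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);

	/* Stand-in for the static key and its jump-entry list. */
	long *key = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (key == MAP_FAILED)
		return 1;

	key[0] = 1;			/* boot: pinning gets enabled */

	/* "Mark RO after init", as __ro_after_init does for the key. */
	if (mprotect(key, pagesz, PROT_READ))
		return 1;

	printf("key = %ld\n", key[0]);	/* reading still works */
	key[1] = 42;			/* "module load": this write faults */
	return 0;
}

With CONFIG_PARAVIRT=y the equivalent update never happens at module load
time, because the CR4 write goes through the paravirt indirection to a
write function that was built in, and thus registered, at boot.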
> ---
>  arch/x86/include/asm/processor.h     |  1 
>  arch/x86/include/asm/special_insns.h | 41 -------------------
>  arch/x86/kernel/cpu/common.c         | 72 +++++++++++++++++++++++++++--------
>  arch/x86/kernel/smpboot.c            | 14 ------
>  arch/x86/xen/smp_pv.c                |  1 
>  5 files changed, 61 insertions(+), 68 deletions(-)
>
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -741,6 +741,7 @@ extern void load_direct_gdt(int);
>  extern void load_fixmap_gdt(int);
>  extern void load_percpu_segment(int);
>  extern void cpu_init(void);
> +extern void cr4_init(void);
>  
>  static inline unsigned long get_debugctlmsr(void)
>  {
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -18,9 +18,7 @@
>   */
>  extern unsigned long __force_order;
>  
> -/* Starts false and gets enabled once CPU feature detection is done. */
> -DECLARE_STATIC_KEY_FALSE(cr_pinning);
> -extern unsigned long cr4_pinned_bits;
> +void native_write_cr0(unsigned long val);
>  
>  static inline unsigned long native_read_cr0(void)
>  {
> @@ -29,24 +27,6 @@ static inline unsigned long native_read_
>  	return val;
>  }
>  
> -static inline void native_write_cr0(unsigned long val)
> -{
> -	unsigned long bits_missing = 0;
> -
> -set_register:
> -	asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
> -
> -	if (static_branch_likely(&cr_pinning)) {
> -		if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
> -			bits_missing = X86_CR0_WP;
> -			val |= bits_missing;
> -			goto set_register;
> -		}
> -		/* Warn after we've set the missing bits. */
> -		WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
> -	}
> -}
> -
>  static inline unsigned long native_read_cr2(void)
>  {
>  	unsigned long val;
> @@ -91,24 +71,7 @@ static inline unsigned long native_read_
>  	return val;
>  }
>  
> -static inline void native_write_cr4(unsigned long val)
> -{
> -	unsigned long bits_missing = 0;
> -
> -set_register:
> -	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> -
> -	if (static_branch_likely(&cr_pinning)) {
> -		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> -			bits_missing = ~val & cr4_pinned_bits;
> -			val |= bits_missing;
> -			goto set_register;
> -		}
> -		/* Warn after we've set the missing bits. */
> -		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> -			  bits_missing);
> -	}
> -}
> +void native_write_cr4(unsigned long val);
>  
>  #ifdef CONFIG_X86_64
>  static inline unsigned long native_read_cr8(void)
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -366,10 +366,62 @@ static __always_inline void setup_umip(s
>  	cr4_clear_bits(X86_CR4_UMIP);
>  }
>  
> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> -EXPORT_SYMBOL(cr_pinning);
> -unsigned long cr4_pinned_bits __ro_after_init;
> -EXPORT_SYMBOL(cr4_pinned_bits);
> +static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> +static unsigned long cr4_pinned_bits __ro_after_init;
> +
> +void native_write_cr0(unsigned long val)
> +{
> +	unsigned long bits_missing = 0;
> +
> +set_register:
> +	asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
> +
> +	if (static_branch_likely(&cr_pinning)) {
> +		if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
> +			bits_missing = X86_CR0_WP;
> +			val |= bits_missing;
> +			goto set_register;
> +		}
> +		/* Warn after we've set the missing bits. */
> +		WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
> +	}
> +}
> +EXPORT_SYMBOL(native_write_cr0);
> +
> +void native_write_cr4(unsigned long val)
> +{
> +	unsigned long bits_missing = 0;
> +
> +set_register:
> +	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> +
> +	if (static_branch_likely(&cr_pinning)) {
> +		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> +			bits_missing = ~val & cr4_pinned_bits;
> +			val |= bits_missing;
> +			goto set_register;
> +		}
> +		/* Warn after we've set the missing bits. */
> +		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> +			  bits_missing);
> +	}
> +}
> +EXPORT_SYMBOL(native_write_cr4);
> +
> +void cr4_init(void)
> +{
> +	unsigned long cr4 = __read_cr4();
> +
> +	if (boot_cpu_has(X86_FEATURE_PCID))
> +		cr4 |= X86_CR4_PCIDE;
> +	if (static_branch_likely(&cr_pinning))
> +		cr4 |= cr4_pinned_bits;
> +
> +	__write_cr4(cr4);
> +
> +	/* Initialize cr4 shadow for this CPU. */
> +	this_cpu_write(cpu_tlbstate.cr4, cr4);
> +}
>  
>  /*
>   * Once CPU feature detection is finished (and boot params have been
> @@ -1723,12 +1775,6 @@ void cpu_init(void)
>  
>  	wait_for_master_cpu(cpu);
>  
> -	/*
> -	 * Initialize the CR4 shadow before doing anything that could
> -	 * try to read it.
> -	 */
> -	cr4_init_shadow();
> -
>  	if (cpu)
>  		load_ucode_ap();
>  
> @@ -1823,12 +1869,6 @@ void cpu_init(void)
>  
>  	wait_for_master_cpu(cpu);
>  
> -	/*
> -	 * Initialize the CR4 shadow before doing anything that could
> -	 * try to read it.
> -	 */
> -	cr4_init_shadow();
> -
>  	show_ucode_info_early();
>  
>  	pr_info("Initializing CPU#%d\n", cpu);
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -210,28 +210,16 @@ static int enable_start_cpu0;
>   */
>  static void notrace start_secondary(void *unused)
>  {
> -	unsigned long cr4 = __read_cr4();
> -
>  	/*
>  	 * Don't put *anything* except direct CPU state initialization
>  	 * before cpu_init(), SMP booting is too fragile that we want to
>  	 * limit the things done here to the most necessary things.
>  	 */
> -	if (boot_cpu_has(X86_FEATURE_PCID))
> -		cr4 |= X86_CR4_PCIDE;
> -	if (static_branch_likely(&cr_pinning))
> -		cr4 |= cr4_pinned_bits;
> -
> -	__write_cr4(cr4);
> +	cr4_init();
>  
>  #ifdef CONFIG_X86_32
>  	/* switch away from the initial page table */
>  	load_cr3(swapper_pg_dir);
> -	/*
> -	 * Initialize the CR4 shadow before doing anything that could
> -	 * try to read it.
> -	 */
> -	cr4_init_shadow();
>  	__flush_tlb_all();
>  #endif
>  	load_current_idt();
> --- a/arch/x86/xen/smp_pv.c
> +++ b/arch/x86/xen/smp_pv.c
> @@ -58,6 +58,7 @@ static void cpu_bringup(void)
>  {
>  	int cpu;
>  
> +	cr4_init();
>  	cpu_init();
>  	touch_softlockup_watchdog();
>  	preempt_disable();

-- 
Kees Cook
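A footnote for readers skimming the diff above: native_write_cr0/4() keep
their "write first, check the pinned bits afterwards, re-set and warn"
shape; they only move out of line. A stripped-down userspace rendering of
that pattern, with hypothetical names and a plain variable standing in for
the control register, looks like this:

#include <stdio.h>

static unsigned long fake_cr4;			/* stands in for %cr4 */
static unsigned long pinned_bits = 0x5;		/* hypothetical pinned flags */

static void write_fake_cr4(unsigned long val)
{
	unsigned long bits_missing = 0;

set_register:
	fake_cr4 = val;				/* the "mov %0,%%cr4" step */

	if ((val & pinned_bits) != pinned_bits) {
		bits_missing = ~val & pinned_bits;
		val |= bits_missing;
		goto set_register;		/* put the pinned bits back */
	}
	/* Complain only after the missing bits have been restored. */
	if (bits_missing)
		fprintf(stderr, "bits went missing: %lx!?\n", bits_missing);
}

int main(void)
{
	write_fake_cr4(0x2);			/* drops both pinned bits */
	printf("fake_cr4 = %lx\n", fake_cr4);	/* prints 7: pins restored */
	return 0;
}

The ordering is the point: the register write happens first so the common
case stays a single move, and the warning fires only after the pinned bits
are back, keeping the window in which they are absent as short as possible.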