Received: by 10.223.164.202 with SMTP id h10csp1295697wrb; Wed, 15 Nov 2017 17:11:27 -0800 (PST) X-Google-Smtp-Source: AGs4zMbLKq3SUjSwGEN2p+0Go87eqnaY3kRfir+W5PgCdPCkBnm7okVNvV9lFbpf/DZUIpVYJl2N X-Received: by 10.84.231.193 with SMTP id g1mr17587750pln.407.1510794687584; Wed, 15 Nov 2017 17:11:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510794687; cv=none; d=google.com; s=arc-20160816; b=07pPG7JCg0pvOXAkGrqIdwmbv6w+MTgmVb9SLQWnr+ph4STbps+ypIfJ4laYpbQ745 Ka1EwjbkvvamGRkXPRgkqg+3vQHYu5XtM9uLPYhlbE5QaFvA2MSROcwyp8ki17dOwwNw WyQDMTnw0NgGvjEEjhYysJqtqlyKUf1/p3UDgTaTBA5vtVEr1wF3ma6lkVfhzT2TASks sl1xeHTl6HB8iZv1gDBV08oB68mg/KhWDI2wJVNm64Ok+yVFlbri88a85EV4IrkKq9fH doQi5raS7BNxNej6Y5eky34SxaaxGpXg0XxcQUaBga2O1rfoDHJ/XMza7+b1Hn0OLP8h TQfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:references:message-id :content-transfer-encoding:cc:date:in-reply-to:from:subject :mime-version:dkim-signature:arc-authentication-results; bh=t4NXPhxLzYcANKGHBUm2fkVVHT0GVVwHKJ8WkqCawcE=; b=ifjEgCvGGrIb7YOQRYbkVxsdDiQl5vCSitJh9eiQM1ECfOIV9wL9bSIckKrB7QZ9V6 SQWCDtHGpH4xr5wVeMiLMR5Qr9yy+myhUxebAikUgeKSCfyeHXWL5McMgDnaHfy2aA6S Pv3NUSQO31aUtB+3OO6eWG4298vyce9uyVacHtK2R7KS1MdreM26glpRKu7Q+ZPn9jeS CdXpSaomZJcbFsrQr7i3uZn05VBCCT88JwYqSAbEBQAxfQ39oyd2MDhVKY5rWMG9WLCQ HhxWkWtCoIymYomUvoew880olia7qm6VJoWpOhDcw+bXVWVsDXlWhp0j6hFwyqT+f2nd c2Hw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=l0JSm6q8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d18si7268307pls.617.2017.11.15.17.11.14; Wed, 15 Nov 2017 17:11:27 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=l0JSm6q8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933880AbdKPAew (ORCPT + 89 others); Wed, 15 Nov 2017 19:34:52 -0500 Received: from mail-pf0-f195.google.com ([209.85.192.195]:44088 "EHLO mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933866AbdKPAeo (ORCPT ); Wed, 15 Nov 2017 19:34:44 -0500 Received: by mail-pf0-f195.google.com with SMTP id x7so18283362pfa.1 for ; Wed, 15 Nov 2017 16:34:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=t4NXPhxLzYcANKGHBUm2fkVVHT0GVVwHKJ8WkqCawcE=; b=l0JSm6q8hvcwWkpnbqgg8grh+UY3wgG2sO/HSyOixRKCP1FFeI/KzpozA9efGkOiSD XuJUMEvUO5I58EXPFCLIRpWGG8DdH+IYZN/uXrHz0Y9lYwBlLB321ajjPrE1bpjwaG+6 x5+6qlE2j4x6Cbx4tC4z/5ulZwJRpp2A6jmkKaN4h8SdVGzMBUvscU95Mr92QxkQzpMm yWvTAmyBr1f0j/TgOsr0Ohw2IqixXriEYGDDO8FZ84g8aHQhSD2n/tMdPdO3GLRJKsS7 +SjmD/ouUwD8zgmONPnG1raE+O12Hd4+XkDiNWQ4mDSCCy9sXHxss5wPK50C3VUq6ifr pqyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=t4NXPhxLzYcANKGHBUm2fkVVHT0GVVwHKJ8WkqCawcE=; b=E+HWP9+fxoFphNWI9SIySO+7309cdg/w309mLln/qY8OmARHlyf6WsxzbCnrVSxIZK WEj0gZSgKQKKsugbu+Gwjd0LkFkWkVSscanVgIhjiGLFMCidtkMYfOIzORRWOfq+rBHI oWjwzOs54Wl6GeFQdlbJUAUwHevTY+3JuwKjfGlkKPRR8VsYM5lcBHLEa5X/0ZIYBvOk f0ljWNEvmoL+/W85NcuTYmSyAgDJeFC1ef1Th+yOAks0HUfTsGIj/RbucBGXrjt2LkWt iPPaVCnQZEq0qo8coTT4atzuMbMra6En0FiTdb5ZNguY10E5X7vSH1femA1A+4V74x8z nNAQ== X-Gm-Message-State: AJaThX6dSfEO1COJF4wt/PzKkNQOIckoub8ryuUwMKCGbDgorI74epN0 Mu+mrblPu5MMGY74AWqDgQ6c7tZN X-Received: by 10.101.92.129 with SMTP id a1mr17084538pgt.6.1510792483733; Wed, 15 Nov 2017 16:34:43 -0800 (PST) Received: from [10.2.44.15] ([208.91.2.2]) by smtp.gmail.com with ESMTPSA id c9sm48711735pfj.76.2017.11.15.16.34.40 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 15 Nov 2017 16:34:41 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: x86: CR4 update when IRQs are enabled From: Nadav Amit In-Reply-To: Date: Wed, 15 Nov 2017 16:34:39 -0800 Cc: Dave Hansen , Thomas Gleixner , Borislav Petkov , the arch/x86 maintainers , Andy Lutomirski , LKML Content-Transfer-Encoding: quoted-printable Message-Id: <87A1AFF9-A955-4A40-B378-384EB4F6442E@gmail.com> References: <2DE32618-CB44-43C7-BAFB-F6967248F97C@gmail.com> To: "H. Peter Anvin" X-Mailer: Apple Mail (2.3273) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org hpa@zytor.com wrote: > On November 15, 2017 3:31:50 PM PST, Nadav Amit = wrote: >> Ping? >>=20 >> Nadav Amit wrote: >>=20 >>> CC=E2=80=99ing more people, and adding a patch to clarify... >>>=20 >>> Nadav Amit wrote: >>>=20 >>>> I am puzzled by the comment in tlb_state.cr4 , which says: >>>>=20 >>>> /* >>>> * Access to this CR4 shadow and to H/W CR4 is protected by >>>> * disabling interrupts when modifying either one. >>>> */ >>>>=20 >>>> This does not seem to be true and adding a warning to CR4 writes >> when >>>> !irqs_disabled() immediately fires in identify_cpu() and >>>> __mcheck_cpu_init_generic(). While these two are called during = boot, >> I think >>>> there other CR4 changes with enabled IRQs, for example, PR_SET_TSC. >>>>=20 >>>> So my question(s): Is the comment correct? Is the current behavior >> correct? >>> So here is what I have in mind. I am not sure whether >> CONFIG_DEBUG_PREEMPT is >>> the right #ifdef. Let me know what you think. >>>=20 >>> -- >8 -- >>>=20 >>> Subject: [PATCH] x86: disable IRQs before changing CR4 >>>=20 >>> CR4 changes need to be performed while IRQs are disabled in order to >>> update the CR4 shadow and the actual register atomically. >>>=20 >>> Signed-off-by: Nadav Amit >>> --- >>> arch/x86/include/asm/tlbflush.h | 18 ++++++++++++------ >>> arch/x86/kernel/cpu/common.c | 13 ++++++++++++- >>> arch/x86/kernel/cpu/mcheck/mce.c | 3 +++ >>> arch/x86/kernel/cpu/mcheck/p5.c | 4 ++++ >>> arch/x86/kernel/cpu/mcheck/winchip.c | 3 +++ >>> arch/x86/kernel/process.c | 14 ++++++++++++-- >>> 6 files changed, 46 insertions(+), 9 deletions(-) >>>=20 >>> diff --git a/arch/x86/include/asm/tlbflush.h >> b/arch/x86/include/asm/tlbflush.h >>> index 50ea3482e1d1..bc70dd1cc7c6 100644 >>> --- a/arch/x86/include/asm/tlbflush.h >>> +++ b/arch/x86/include/asm/tlbflush.h >>> @@ -89,6 +89,15 @@ static inline void cr4_init_shadow(void) >>> this_cpu_write(cpu_tlbstate.cr4, __read_cr4()); >>> } >>>=20 >>> +static inline void update_cr4(unsigned long cr4) >>> +{ >>> +#ifdef CONFIG_DEBUG_PREEMPT >>> + WARN_ON_ONCE(!irqs_disabled()); >>> +#endif >>> + this_cpu_write(cpu_tlbstate.cr4, cr4); >>> + __write_cr4(cr4); >>> +} >>> + >>> /* Set in this cpu's CR4. */ >>> static inline void cr4_set_bits(unsigned long mask) >>> { >>> @@ -97,8 +106,7 @@ static inline void cr4_set_bits(unsigned long >> mask) >>> cr4 =3D this_cpu_read(cpu_tlbstate.cr4); >>> if ((cr4 | mask) !=3D cr4) { >>> cr4 |=3D mask; >>> - this_cpu_write(cpu_tlbstate.cr4, cr4); >>> - __write_cr4(cr4); >>> + update_cr4(cr4); >>> } >>> } >>>=20 >>> @@ -110,8 +118,7 @@ static inline void cr4_clear_bits(unsigned long >> mask) >>> cr4 =3D this_cpu_read(cpu_tlbstate.cr4); >>> if ((cr4 & ~mask) !=3D cr4) { >>> cr4 &=3D ~mask; >>> - this_cpu_write(cpu_tlbstate.cr4, cr4); >>> - __write_cr4(cr4); >>> + update_cr4(cr4); >>> } >>> } >>>=20 >>> @@ -121,8 +128,7 @@ static inline void cr4_toggle_bits(unsigned long >> mask) >>> cr4 =3D this_cpu_read(cpu_tlbstate.cr4); >>> cr4 ^=3D mask; >>> - this_cpu_write(cpu_tlbstate.cr4, cr4); >>> - __write_cr4(cr4); >>> + update_cr4(cr4); >>> } >>>=20 >>> /* Read the CR4 shadow. */ >>> diff --git a/arch/x86/kernel/cpu/common.c >> b/arch/x86/kernel/cpu/common.c >>> index c8b39870f33e..82e6b41fd5e9 100644 >>> --- a/arch/x86/kernel/cpu/common.c >>> +++ b/arch/x86/kernel/cpu/common.c >>> @@ -318,6 +318,8 @@ static bool pku_disabled; >>>=20 >>> static __always_inline void setup_pku(struct cpuinfo_x86 *c) >>> { >>> + unsigned long flags; >>> + >>> /* check the boot processor, plus compile options for PKU: */ >>> if (!cpu_feature_enabled(X86_FEATURE_PKU)) >>> return; >>> @@ -327,7 +329,10 @@ static __always_inline void setup_pku(struct >> cpuinfo_x86 *c) >>> if (pku_disabled) >>> return; >>>=20 >>> + local_irq_save(flags); >>> cr4_set_bits(X86_CR4_PKE); >>> + local_irq_restore(flags); >>> + >>> /* >>> * Seting X86_CR4_PKE will cause the X86_FEATURE_OSPKE >>> * cpuid bit to be set. We need to ensure that we >>> @@ -1069,6 +1074,7 @@ static void = validate_apic_and_package_id(struct >> cpuinfo_x86 *c) >>> */ >>> static void identify_cpu(struct cpuinfo_x86 *c) >>> { >>> + unsigned long flags; >>> int i; >>>=20 >>> c->loops_per_jiffy =3D loops_per_jiffy; >>> @@ -1121,9 +1127,14 @@ static void identify_cpu(struct cpuinfo_x86 >> *c) >>> /* Disable the PN if appropriate */ >>> squash_the_stupid_serial_number(c); >>>=20 >>> - /* Set up SMEP/SMAP */ >>> + /* >>> + * Set up SMEP/SMAP. Disable interrupts to prevent triggering a >> warning >>> + * as CR4 changes must be done with disabled interrupts. >>> + */ >>> + local_irq_save(flags); >>> setup_smep(c); >>> setup_smap(c); >>> + local_irq_restore(flags); >>>=20 >>> /* >>> * The vendor-specific functions might have changed features. >>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c >> b/arch/x86/kernel/cpu/mcheck/mce.c >>> index 3b413065c613..497c07e33c25 100644 >>> --- a/arch/x86/kernel/cpu/mcheck/mce.c >>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c >>> @@ -1508,6 +1508,7 @@ static void __mcheck_cpu_init_generic(void) >>> { >>> enum mcp_flags m_fl =3D 0; >>> mce_banks_t all_banks; >>> + unsigned long flags; >>> u64 cap; >>>=20 >>> if (!mca_cfg.bootlog) >>> @@ -1519,7 +1520,9 @@ static void __mcheck_cpu_init_generic(void) >>> bitmap_fill(all_banks, MAX_NR_BANKS); >>> machine_check_poll(MCP_UC | m_fl, &all_banks); >>>=20 >>> + local_irq_save(flags); >>> cr4_set_bits(X86_CR4_MCE); >>> + local_irq_restore(flags); >>>=20 >>> rdmsrl(MSR_IA32_MCG_CAP, cap); >>> if (cap & MCG_CTL_P) >>> diff --git a/arch/x86/kernel/cpu/mcheck/p5.c >> b/arch/x86/kernel/cpu/mcheck/p5.c >>> index 2a0717bf8033..d5d4963415e9 100644 >>> --- a/arch/x86/kernel/cpu/mcheck/p5.c >>> +++ b/arch/x86/kernel/cpu/mcheck/p5.c >>> @@ -42,6 +42,7 @@ static void pentium_machine_check(struct pt_regs >> *regs, long error_code) >>> /* Set up machine check reporting for processors with Intel style >> MCE: */ >>> void intel_p5_mcheck_init(struct cpuinfo_x86 *c) >>> { >>> + unsigned long flags; >>> u32 l, h; >>>=20 >>> /* Default P5 to off as its often misconnected: */ >>> @@ -62,7 +63,10 @@ void intel_p5_mcheck_init(struct cpuinfo_x86 *c) >>> pr_info("Intel old style machine check architecture = supported.\n"); >>>=20 >>> /* Enable MCE: */ >>> + local_irq_save(flags); >>> cr4_set_bits(X86_CR4_MCE); >>> + local_irq_restore(flags); >>> + >>> pr_info("Intel old style machine check reporting enabled on >> CPU#%d.\n", >>> smp_processor_id()); >>> } >>> diff --git a/arch/x86/kernel/cpu/mcheck/winchip.c >> b/arch/x86/kernel/cpu/mcheck/winchip.c >>> index c6a722e1d011..6dd985e3849d 100644 >>> --- a/arch/x86/kernel/cpu/mcheck/winchip.c >>> +++ b/arch/x86/kernel/cpu/mcheck/winchip.c >>> @@ -26,6 +26,7 @@ static void winchip_machine_check(struct pt_regs >> *regs, long error_code) >>> /* Set up machine check reporting on the Winchip C6 series */ >>> void winchip_mcheck_init(struct cpuinfo_x86 *c) >>> { >>> + unsigned long flags; >>> u32 lo, hi; >>>=20 >>> machine_check_vector =3D winchip_machine_check; >>> @@ -37,7 +38,9 @@ void winchip_mcheck_init(struct cpuinfo_x86 *c) >>> lo &=3D ~(1<<4); /* Enable MCE */ >>> wrmsr(MSR_IDT_FCR1, lo, hi); >>>=20 >>> + local_irq_save(flags); >>> cr4_set_bits(X86_CR4_MCE); >>> + local_irq_restore(flags); >>>=20 >>> pr_info("Winchip machine check reporting enabled on CPU#0.\n"); >>> } >>> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c >>> index 3ca198080ea9..09e44e784745 100644 >>> --- a/arch/x86/kernel/process.c >>> +++ b/arch/x86/kernel/process.c >>> @@ -127,25 +127,35 @@ void flush_thread(void) >>>=20 >>> void disable_TSC(void) >>> { >>> + unsigned long flags; >>> + >>> preempt_disable(); >>> - if (!test_and_set_thread_flag(TIF_NOTSC)) >>> + if (!test_and_set_thread_flag(TIF_NOTSC)) { >>> /* >>> * Must flip the CPU state synchronously with >>> * TIF_NOTSC in the current running context. >>> */ >>> + local_irq_save(flags); >>> cr4_set_bits(X86_CR4_TSD); >>> + local_irq_restore(flags); >>> + } >>> preempt_enable(); >>> } >>>=20 >>> static void enable_TSC(void) >>> { >>> + unsigned long flags; >>> + >>> preempt_disable(); >>> - if (test_and_clear_thread_flag(TIF_NOTSC)) >>> + if (test_and_clear_thread_flag(TIF_NOTSC)) { >>> /* >>> * Must flip the CPU state synchronously with >>> * TIF_NOTSC in the current running context. >>> */ >>> + local_irq_save(flags); >>> cr4_clear_bits(X86_CR4_TSD); >>> + local_irq_restore(flags); >>> + } >>> preempt_enable(); >>> } >=20 > This is wrong on at least two levels colon first of all, this should = not be wrapped around the abstracted operations but put inside them if = relevant. Second, I suspect that this is not at all a requirement but = rather that as long as the hardware register is written second, I think = we should always be safe. Thanks for your reply. Can you please explain your suspicion? Here is a simple execution that = may fail (assume cr4 is initially zero): CPU0 =3D=3D=3D=3D cr4_set_bits(1) =3D> cpu_tlbstate.cr4 =3D 1 =3D> interrupt cr4_set_bits(2) cpu_tlbstate.cr4 =3D 3 __write_cr4(3) =3D> __write_cr4(1) [ and should have been 3 ] Am I missing anything? Now, it may be a theoretical issue right now, since I did not find any change of CR4 bits that happens in an interrupt handler. Anyhow, if you want me to submit a fix, please let me know what other = levels of wrongness you had in mind. Regards, Nadav From 1584182499407506098@xxx Thu Nov 16 01:02:42 +0000 2017 X-GM-THRID: 1583536401885830151 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread