Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp8098886imu; Tue, 4 Dec 2018 03:00:10 -0800 (PST) X-Google-Smtp-Source: AFSGD/UoxiEZkzfCsiTwYd6rBJ+A+cBCO/uvjLmHdN6zix3PSsPplEPcI8EI1CQKXzH2lV5cAXI0 X-Received: by 2002:a17:902:a40f:: with SMTP id p15mr20023536plq.286.1543921210634; Tue, 04 Dec 2018 03:00:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543921210; cv=none; d=google.com; s=arc-20160816; b=whtyWo01yjbjw90VNBn8dtkRiCYTgpyl28fq1dtAUrUAM10vdy+QnKuwZi49oxXg44 rLyzI9xMoc8j6XCwsQjoxp93boSBxRISBA5T9P1jH1YoxdgMR/MRHrlBvSZJ8Grd9Nqf Z76NyhxIhVRnO0j4PR8xJ1EPVIAcpuPnBnJ9pMYsEoPonCyyz2GIGS69MNUAMFA2wLZU TkCpt5yFexd35EJe2iKTx6rInLjONsEYYLOoWtvn71ZfUoH9NPKbawQ9HxoDXwhVOXIk 77bM1/OLdJTJsq9SAu0ypSmcWIezXJnebpq/EEoiW9A/y55q0y9xP1C0odu0SyfsZO4Z 1RFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=aQVKWyMfrD4pAbxaG6yJ4EXk0Cxvxg42FWPWVfp+3gk=; b=ysOAYSwSu4EkMo4df9QcTwFSFpB15vVWZcyc4rRo4luUE1icSdziAY+1upWUWVwJye Lf8t5Q9wapSrXBvB6mrU7Riod++q3kV/zRE2tHIlhoF8pvkX5iAL6NMO584qj+QYi5Zp i+W8aVymXs+KBJxgF2JVPlfrslJ8+66ZtSPM8m+2it3YPuPaAtAXn7yXS0FgMeNWS2tD UWyyJTuLiceWSwRJ6u7WSwfnEsDAJn0e46y4bOLWlff338XisiSLtWx4DvxRUC/A9PJB sFT69zEqqFn9OvAOK27SVJhnVsntBby7yhJ9VWMme9lHiYt+C2lDCheBkFYz12kNlHHs HIwg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=2Cc5RY+3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k13si15659853pgb.343.2018.12.04.02.59.55; Tue, 04 Dec 2018 03:00:10 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=2Cc5RY+3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725979AbeLDK5X (ORCPT + 99 others); Tue, 4 Dec 2018 05:57:23 -0500 Received: from mail.kernel.org ([198.145.29.99]:41158 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726710AbeLDK5V (ORCPT ); Tue, 4 Dec 2018 05:57:21 -0500 Received: from localhost (5356596B.cm-6-7b.dynamic.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id E3F48214D9; Tue, 4 Dec 2018 10:57:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1543921039; bh=btSCoALJaOnApqgclvxuhqqt8TNJO793VlxGpS4LyeY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=2Cc5RY+3gVWYR3R5KmUw5wUvbCTGfaZJhvo/GdeNZRDRFMv44x4xyMad8JmDxqLGy qvKWfOyrWydTP+VNVelqgiYyoqKlqifPw/2v2a5N3selXyRKRzAr0Og6jBobnIXslk kBFwv51T88PJ2aqHicIp/HLapbG8VF+v6k8nt4gs= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Tim Chen , Thomas Gleixner , Peter Zijlstra , Andy Lutomirski , Linus Torvalds , Jiri Kosina , Tom Lendacky , Josh Poimboeuf , Andrea Arcangeli , David Woodhouse , Andi Kleen , Dave Hansen , Casey Schaufler , Asit Mallick , Arjan van de Ven , Jon Masters , Waiman Long , Dave Stewart , Kees Cook Subject: [PATCH 4.19 056/139] x86/speculation: Prevent stale SPEC_CTRL msr content Date: Tue, 4 Dec 2018 11:48:57 +0100 Message-Id: <20181204103652.277494765@linuxfoundation.org> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20181204103649.950154335@linuxfoundation.org> References: <20181204103649.950154335@linuxfoundation.org> User-Agent: quilt/0.65 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.19-stable review patch. If anyone has any objections, please let me know. ------------------ From: Thomas Gleixner tglx@linutronix.de commit 6d991ba509ebcfcc908e009d1db51972a4f7a064 upstream The seccomp speculation control operates on all tasks of a process, but only the current task of a process can update the MSR immediately. For the other threads the update is deferred to the next context switch. This creates the following situation with Process A and B: Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2 does not have the speculation control TIF bit set. Process B task 1 has the speculation control TIF bit set. CPU0 CPU1 MSR bit is set ProcB.T1 schedules out ProcA.T2 schedules in MSR bit is cleared ProcA.T1 seccomp_update() set TIF bit on ProcA.T2 ProcB.T1 schedules in MSR is not updated <-- FAIL This happens because the context switch code tries to avoid the MSR update if the speculation control TIF bits of the incoming and the outgoing task are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks scheduling back and forth on CPU1, which keeps the MSR stale forever. In theory this could be remedied by IPIs, but chasing the remote task which could be migrated is complex and full of races. The straight forward solution is to avoid the asychronous update of the TIF bit and defer it to the next context switch. The speculation control state is stored in task_struct::atomic_flags by the prctl and seccomp updates already. Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the atomic_flags. Check the bit on context switch and force a synchronous update of the speculation control if set. Use the same mechanism for updating the current task. Reported-by: Tim Chen Signed-off-by: Thomas Gleixner Cc: Peter Zijlstra Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Jiri Kosina Cc: Tom Lendacky Cc: Josh Poimboeuf Cc: Andrea Arcangeli Cc: David Woodhouse Cc: Tim Chen Cc: Andi Kleen Cc: Dave Hansen Cc: Casey Schaufler Cc: Asit Mallick Cc: Arjan van de Ven Cc: Jon Masters Cc: Waiman Long Cc: Greg KH Cc: Dave Stewart Cc: Kees Cook Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/spec-ctrl.h | 6 +----- arch/x86/include/asm/thread_info.h | 4 +++- arch/x86/kernel/cpu/bugs.c | 18 +++++++----------- arch/x86/kernel/process.c | 30 +++++++++++++++++++++++++++++- 4 files changed, 40 insertions(+), 18 deletions(-) --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -83,10 +83,6 @@ static inline void speculative_store_byp #endif extern void speculation_ctrl_update(unsigned long tif); - -static inline void speculation_ctrl_update_current(void) -{ - speculation_ctrl_update(current_thread_info()->flags); -} +extern void speculation_ctrl_update_current(void); #endif --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -84,6 +84,7 @@ struct thread_info { #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ #define TIF_SECCOMP 8 /* secure computing */ #define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */ +#define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */ #define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */ #define TIF_UPROBE 12 /* breakpointed or singlestepping */ #define TIF_PATCH_PENDING 13 /* pending live patching update */ @@ -112,6 +113,7 @@ struct thread_info { #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) #define _TIF_SECCOMP (1 << TIF_SECCOMP) #define _TIF_SPEC_IB (1 << TIF_SPEC_IB) +#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE) #define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY) #define _TIF_UPROBE (1 << TIF_UPROBE) #define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING) @@ -149,7 +151,7 @@ struct thread_info { /* flags to check in __switch_to() */ #define _TIF_WORK_CTXSW_BASE \ (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP| \ - _TIF_SSBD) + _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE) /* * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated. --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -700,14 +700,10 @@ static void ssb_select_mitigation(void) #undef pr_fmt #define pr_fmt(fmt) "Speculation prctl: " fmt -static void task_update_spec_tif(struct task_struct *tsk, int tifbit, bool on) +static void task_update_spec_tif(struct task_struct *tsk) { - bool update; - - if (on) - update = !test_and_set_tsk_thread_flag(tsk, tifbit); - else - update = test_and_clear_tsk_thread_flag(tsk, tifbit); + /* Force the update of the real TIF bits */ + set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE); /* * Immediately update the speculation control MSRs for the current @@ -717,7 +713,7 @@ static void task_update_spec_tif(struct * This can only happen for SECCOMP mitigation. For PRCTL it's * always the current task. */ - if (tsk == current && update) + if (tsk == current) speculation_ctrl_update_current(); } @@ -733,16 +729,16 @@ static int ssb_prctl_set(struct task_str if (task_spec_ssb_force_disable(task)) return -EPERM; task_clear_spec_ssb_disable(task); - task_update_spec_tif(task, TIF_SSBD, false); + task_update_spec_tif(task); break; case PR_SPEC_DISABLE: task_set_spec_ssb_disable(task); - task_update_spec_tif(task, TIF_SSBD, true); + task_update_spec_tif(task); break; case PR_SPEC_FORCE_DISABLE: task_set_spec_ssb_disable(task); task_set_spec_ssb_force_disable(task); - task_update_spec_tif(task, TIF_SSBD, true); + task_update_spec_tif(task); break; default: return -ERANGE; --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -443,6 +443,18 @@ static __always_inline void __speculatio wrmsrl(MSR_IA32_SPEC_CTRL, msr); } +static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk) +{ + if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) { + if (task_spec_ssb_disable(tsk)) + set_tsk_thread_flag(tsk, TIF_SSBD); + else + clear_tsk_thread_flag(tsk, TIF_SSBD); + } + /* Return the updated threadinfo flags*/ + return task_thread_info(tsk)->flags; +} + void speculation_ctrl_update(unsigned long tif) { /* Forced update. Make sure all relevant TIF flags are different */ @@ -451,6 +463,14 @@ void speculation_ctrl_update(unsigned lo preempt_enable(); } +/* Called from seccomp/prctl update */ +void speculation_ctrl_update_current(void) +{ + preempt_disable(); + speculation_ctrl_update(speculation_ctrl_update_tif(current)); + preempt_enable(); +} + void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p) { struct thread_struct *prev, *next; @@ -482,7 +502,15 @@ void __switch_to_xtra(struct task_struct if ((tifp ^ tifn) & _TIF_NOCPUID) set_cpuid_faulting(!!(tifn & _TIF_NOCPUID)); - __speculation_ctrl_update(tifp, tifn); + if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) { + __speculation_ctrl_update(tifp, tifn); + } else { + speculation_ctrl_update_tif(prev_p); + tifn = speculation_ctrl_update_tif(next_p); + + /* Enforce MSR update to ensure consistent state */ + __speculation_ctrl_update(~tifn, tifn); + } } /*