From: Yu-cheng Yu
To: linux-kernel@vger.kernel.org, x86@kernel.org, "H. Peter Anvin",
    Thomas Gleixner, Ingo Molnar, Dave Hansen, Tony Luck,
    Andy Lutomirski, Borislav Petkov, Rik van Riel, "Ravi V. Shankar",
Shankar" , Sebastian Andrzej Siewior , Fenghua Yu , Peter Zijlstra Cc: Yu-cheng Yu Subject: [PATCH v3 09/10] x86/fpu/xstate: Preserve supervisor states for slow path of __fpu__restore_sig() Date: Sat, 28 Mar 2020 09:43:06 -0700 Message-Id: <20200328164307.17497-10-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200328164307.17497-1-yu-cheng.yu@intel.com> References: <20200328164307.17497-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The signal return code is responsible for taking an XSAVE buffer present in user memory and loading it into the hardware registers. This operation only affects user XSAVE state and never affects supervisor state. The fast path through this code simply points XRSTOR directly at the user buffer. However, due to page faults, this XRSTOR can fail. If it fails, the signal return code falls back to a slow path which can tolerate page faults. That slow path copies the xfeatures one by one out of the user buffer into the task's fpu state area. However, by being in a context where it can handle page faults, the code can also schedule. That exposes us to the whims of the lazy-fpu-load code. That code currently can not comprehend valid fpregs (the supervisor state) mixed with the *not* valid user fpregs. To do that, we would need to track whether individual XSAVE components had valid fpregs or fpstate. If we left the current code in place, the context-switch code would think it has an up-to-date fpstate and would fail to save the supervisor state when scheduling the task out. When scheduling back in, it would likely restore stale supervisor state. To fix that, we need to save the supervisor state before the slow path. That way, the supervisor state is always up-to-date and the task can survive being scheduled. Modify copy_user_to_fpregs_zeroing() so that if it fails, fpregs are not zeroed, and there is no need for fpregs_deactivate() and supervisor states are preserved. Move set_thread_flag(TIF_NEED_FPU_LOAD) to the slow path. Without doing this, the fast path also needs supervisor states to be saved first. Signed-off-by: Yu-cheng Yu --- arch/x86/kernel/fpu/signal.c | 53 +++++++++++++++++++----------------- 1 file changed, 28 insertions(+), 25 deletions(-) diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index d09d72334a12..545ca4314096 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -262,19 +262,23 @@ sanitize_restored_user_xstate(union fpregs_state *state, static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only) { u64 init_bv; + int r; if (use_xsave()) { if (fx_only) { init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE; - copy_kernel_to_xregs(&init_fpstate.xsave, init_bv); - return copy_user_to_fxregs(buf); + r = copy_user_to_fxregs(buf); + if (!r) + copy_kernel_to_xregs(&init_fpstate.xsave, init_bv); + return r; } else { init_bv = xfeatures_mask_user() & ~xbv; - if (unlikely(init_bv)) + r = copy_user_to_xregs(buf, xbv); + if (!r && unlikely(init_bv)) copy_kernel_to_xregs(&init_fpstate.xsave, init_bv); - return copy_user_to_xregs(buf, xbv); + return r; } } else if (use_fxsr()) { return copy_user_to_fxregs(buf); @@ -327,28 +331,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) } } - /* - * The current state of the FPU registers does not matter. 
 arch/x86/kernel/fpu/signal.c | 53 +++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d09d72334a12..545ca4314096 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -262,19 +262,23 @@ sanitize_restored_user_xstate(union fpregs_state *state,
 static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 {
 	u64 init_bv;
+	int r;
 
 	if (use_xsave()) {
 		if (fx_only) {
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
 
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			return copy_user_to_fxregs(buf);
+			r = copy_user_to_fxregs(buf);
+			if (!r)
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			return r;
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
-			if (unlikely(init_bv))
+			r = copy_user_to_xregs(buf, xbv);
+			if (!r && unlikely(init_bv))
 				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			return copy_user_to_xregs(buf, xbv);
+			return r;
 		}
 	} else if (use_fxsr()) {
 		return copy_user_to_fxregs(buf);
@@ -327,28 +331,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 	}
 
-	/*
-	 * The current state of the FPU registers does not matter. By setting
-	 * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
-	 * is not modified on context switch and that the xstate is considered
-	 * to be loaded again on return to userland (overriding last_cpu avoids
-	 * the optimisation).
-	 */
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-	__fpu_invalidate_fpregs_state(fpu);
-
 	if ((unsigned long)buf_fx % 64)
 		fx_only = 1;
 
-	/*
-	 * For 32-bit frames with fxstate, copy the fxstate so it can be
-	 * reconstructed later.
-	 */
-	if (ia32_fxstate) {
-		ret = __copy_from_user(&env, buf, sizeof(env));
-		if (ret)
-			goto err_out;
-		envp = &env;
-	} else {
+	if (!ia32_fxstate) {
 		/*
 		 * Attempt to restore the FPU registers directly from user
 		 * memory. For that to succeed, the user access cannot cause
@@ -365,10 +351,27 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			fpregs_unlock();
 			return 0;
 		}
-		fpregs_deactivate(fpu);
 		fpregs_unlock();
+	} else {
+		/*
+		 * For 32-bit frames with fxstate, copy the fxstate so it can
+		 * be reconstructed later.
+		 */
+		ret = __copy_from_user(&env, buf, sizeof(env));
+		if (ret)
+			goto err_out;
+		envp = &env;
 	}
 
+	/*
+	 * The current state of the FPU registers does not matter. By setting
+	 * TIF_NEED_FPU_LOAD unconditionally it is ensured that our xstate
+	 * is not modified on context switch and that the xstate is considered
+	 * to be loaded again on return to userland (overriding last_cpu avoids
+	 * the optimisation).
+	 */
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	__fpu_invalidate_fpregs_state(fpu);
+
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-- 
2.21.0