From: Thomas Gleixner
To: Adamos Ttofari
Cc: abusse@amazon.de, dwmw@amazon.co.uk, hborghor@amazon.de,
 sironi@amazon.de, attofari@amazon.de, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Kyle Huey,
 "Chang S. Bae", Andrew Cooper, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] fpu: xstate: Keep xfd_state always in-sync with IA32_XFD MSR
In-Reply-To: <20230511152818.13839-1-attofari@amazon.de>
References: <20230511152818.13839-1-attofari@amazon.de>
Date: Thu, 11 May 2023 21:11:02 +0200
Message-ID: <877cted6pl.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 11 2023 at 15:28, Adamos Ttofari wrote:
> Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and
> commit 8bf26758ca96 ("x86/fpu: Add XFD state to fpstate") introduced a
> per_cpu variable xfd_state to keep the IA32_XFD MSR value cached. In
> order to avoid unnecessary writes to the MSR.
>
> xfd_state might not be always synced with the MSR. Eventually affecting
> MSR writes. xfd_state is initialized with 0, meanwhile the MSR is
> initialized with the XFEATURE_MASK_USER_DYNAMIC to make XFD fire. Then
> later on reschedule to a different CPU, when a process that uses extended
> xfeatures and handled the #NM (by allocating the additional space in task's
> fpstate for extended xfeatures) it will skip the MSR update in
> restore_fpregs_from_fpstate because the value might match to already cached
> xfd_state (meanwhile it is not the same with the MSR). Eventually calling a
> XRSTOR to set the new state (that caries extended xfeatures) and fire a #NM
> from kernel context. The XFD is expected to fire from user-space context,
> but not in this case and the kernel crashes.

I'm completely confused.

So after reading the patch I think I know what you are trying to
explain:

  On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which
  wipes out any stale state. But the per CPU cached xfd value is not
  reset, which brings them out of sync.

  As a consequence a subsequent xfd_update_state() might fail to update
  the MSR which in turn can result in XRSTOR raising a #NM in kernel
  space, which crashes the kernel.

Right?

> To address the issue mentioned initialize xfd_state with the current MSR
> value and update the XFD MSR always with xfd_update_state to avoid
> un-sync cases.
>
> Fixes: 672365477ae8 ("x86/fpu: Update XFD state where required")
>
> Signed-off-by: Adamos Ttofari
> ---
>  arch/x86/kernel/fpu/xstate.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 0bab497c9436..36ed27ac0ecd 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -179,8 +179,14 @@ void fpu__init_cpu_xstate(void)
>  	 * key as that does not work on the boot CPU. This also ensures
>  	 * that any stale state is wiped out from XFD.
>  	 */
> -	if (cpu_feature_enabled(X86_FEATURE_XFD))
> -		wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
> +	if (cpu_feature_enabled(X86_FEATURE_XFD)) {
> +		u64 xfd;
> +
> +		rdmsrl(MSR_IA32_XFD, xfd);
> +		__this_cpu_write(xfd_state, xfd);
> +
> +		xfd_update_state(&init_fpstate);
> +	}

This does not compile on 32bit. You want something like the uncompiled
below.

>  	/*
>  	 * XCR_XFEATURE_ENABLED_MASK (aka. XCR0) sets user features
> @@ -915,7 +921,7 @@ void fpu__resume_cpu(void)
>  	}
>
>  	if (fpu_state_size_dynamic())
> -		wrmsrl(MSR_IA32_XFD, current->thread.fpu.fpstate->xfd);
> +		xfd_update_state(&init_fpstate);

On suspend per CPU xfd_state == current->thread.fpu.fpstate->xfd so it's
correct to restore the exact state which was active _before_ suspend.
xfd_state can't be out of sync in that case, no?

Thanks,

        tglx
---
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..70785a722759 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -177,10 +177,11 @@ void fpu__init_cpu_xstate(void)
 	 * Must happen after CR4 setup and before xsetbv() to allow KVM
 	 * lazy passthrough. Write independent of the dynamic state static
 	 * key as that does not work on the boot CPU. This also ensures
-	 * that any stale state is wiped out from XFD.
+	 * that any stale state is wiped out from XFD. Reset the per CPU
+	 * xfd cache too.
 	 */
 	if (cpu_feature_enabled(X86_FEATURE_XFD))
-		wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
+		xfd_reset_state();
 
 	/*
 	 * XCR_XFEATURE_ENABLED_MASK (aka. XCR0) sets user features
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..6cfaf72228f4 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -159,9 +159,16 @@ static inline void xfd_update_state(struct fpstate *fpstate)
 	}
 }
 
+static inline void xfd_reset_state(void)
+{
+	wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
+	__this_cpu_write(xfd_state, init_fpstate.xfd);
+}
+
 extern int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu);
 #else
 static inline void xfd_update_state(struct fpstate *fpstate) { }
+static inline void xfd_reset_state(void) { }
 static inline int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu) {
 	return -EPERM;