Received: by 2002:a05:6a10:d5a5:0:0:0:0 with SMTP id gn37csp1425987pxb; Fri, 1 Oct 2021 10:17:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyBEJB9Yj5cfcPg5sdUkEj+x5VLdDhLeI6I5BHoLTCR8RIbiuduwGscYf1xAbmQUmeLnt20 X-Received: by 2002:aa7:870b:0:b0:44b:bcef:32b4 with SMTP id b11-20020aa7870b000000b0044bbcef32b4mr12249546pfo.41.1633108637374; Fri, 01 Oct 2021 10:17:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1633108637; cv=none; d=google.com; s=arc-20160816; b=jTu+BWIry7Xu78dosVQlFnQhghW7sEJLFN0RTUG+b29OFL8gV3Fv8XPEzRXdfwUQKY 0gMuKaO2d9lfkupUhnA0oetya+YMNpnL2rhTsJNkaD+pl4dl23TBvQG99nKfIC9Xhzq+ vzPt6uNd8x231HtYLBSxhTgvjWAF98Ino0pezozEKCw9m2eXUzRjgKF38kPraupUpwCz 345FPnjvUE9PwDqWA+QNNFCzBgv9Ft3VcpyG1ZH3jiYyY51D6n0+F+N1bytyGlg//OO9 sePVo0LLQmWtwI/ito6fs74l/bE4LsvCjed5z4yjV1HoCpAlkVCeekGbBErdrqaH/M2+ I5ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:dkim-signature:dkim-signature:from; bh=tkahMb00/2ZOB3OTBy1INbxnjaoRkIMrPGAeMzpZqq4=; b=Pz/89ScFwu4css1Q5t0BaQl6gKft45pmEbSdl8xXrQa8zVroQkQCsPc099FArMQ8xD fQ1iDfOABpIp9RrtW9tCVpNLv8xUFx+9vNZKTM8v1u5/FYWaCMO2qDNNzX/upYYpJAiT t0X60WKH+ozZrXltaQAEr9ijCrhCxWku0aVB6dgQYgq9Mn40CQNRDlJPXE7ZG4fdYHyu B31fiSzznVylLX98hDdvu+q+Fja3qoRqVoiOicnpmKOfPsZOvJKJlmhJjNDQf4w2PB+3 PPAvdnT9HrZfiSfiG9+F36qPChgABWMxRGA6HXMZgd9oMSmJeqzl88k/EwX9rdB7DFTr iaCA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=VvAbATfH; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id q3si8086516pgq.444.2021.10.01.10.17.04; Fri, 01 Oct 2021 10:17:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=VvAbATfH; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231903AbhJAPEh (ORCPT + 99 others); Fri, 1 Oct 2021 11:04:37 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:57930 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1354765AbhJAPEf (ORCPT ); Fri, 1 Oct 2021 11:04:35 -0400 From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1633100570; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=tkahMb00/2ZOB3OTBy1INbxnjaoRkIMrPGAeMzpZqq4=; b=VvAbATfHtKYOGYETTF4NyAZJH0nLm539yYltnLygMzbsW/tEjTosnqw19MbnXdR/rnK7YR 2gfMFdiGhpFaZ5tQC82CVyMj+qAI10owVUzsEUEEZyAL10FpSozE/JFriZBhwGiDmB8R8h hKkBEsJ8RK+UzTT2tMRGhAUMJsnqut+uN3bt7Ttb3X2YAZWvrH9/xuPYlTVo2KmQlMLJcz n53R48hp2BPxnaQLKBszUkVeTtGDdKofjtZixfvRV2GnLJLHmSZ6atwcZpXjXUcsGANWI5 Tqiwj49RQxsZBqLoWCHz2VOJ1ty98w6Pnl5AyyJVGmOH/IbdoZPNE0PToIAVjQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1633100570; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=tkahMb00/2ZOB3OTBy1INbxnjaoRkIMrPGAeMzpZqq4=; b=pcOb3Ej2c2nCvd9kpicGl6rnch6vwgCq0kClnm6XbkHUugUqvyUgR4DW0P1KvBr85CVRS7 71RB9rhBWBi3sYDw== To: "Chang S. Bae" , bp@suse.de, luto@kernel.org, mingo@kernel.org, x86@kernel.org Cc: len.brown@intel.com, lenb@kernel.org, dave.hansen@intel.com, thiago.macieira@intel.com, jing2.liu@intel.com, ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org, chang.seok.bae@intel.com Subject: Re: [PATCH v10 13/28] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state In-Reply-To: <20210825155413.19673-14-chang.seok.bae@intel.com> References: <20210825155413.19673-1-chang.seok.bae@intel.com> <20210825155413.19673-14-chang.seok.bae@intel.com> Date: Fri, 01 Oct 2021 17:02:49 +0200 Message-ID: <871r546b52.ffs@tglx> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote: > +/** > + * xfd_switch - Switches the MSR IA32_XFD context if needed. > + * @prev: The previous task's struct fpu pointer > + * @next: The next task's struct fpu pointer > + */ > +static inline void xfd_switch(struct fpu *prev, struct fpu *next) > +{ > + u64 prev_xfd_mask, next_xfd_mask; > + > + if (!cpu_feature_enabled(X86_FEATURE_XFD) || !xfeatures_mask_user_dynamic) > + return; This is context switch, so this wants to be a static key which is turned on during init when the CPU supports XFD and user dynamic features are available. > + > + prev_xfd_mask = prev->state_mask & xfeatures_mask_user_dynamic; > + next_xfd_mask = next->state_mask & xfeatures_mask_user_dynamic; > + > + if (unlikely(prev_xfd_mask != next_xfd_mask)) > + wrmsrl_safe(MSR_IA32_XFD, xfeatures_mask_user_dynamic ^ next_xfd_mask); > +} > + > /* > * Delay loading of the complete FPU state until the return to userland. > * PKRU is handled separately. > */ > -static inline void switch_fpu_finish(struct fpu *new_fpu) > +static inline void switch_fpu_finish(struct fpu *old_fpu, struct fpu *new_fpu) > { > - if (cpu_feature_enabled(X86_FEATURE_FPU)) > + if (cpu_feature_enabled(X86_FEATURE_FPU)) { > set_thread_flag(TIF_NEED_FPU_LOAD); > + xfd_switch(old_fpu, new_fpu); Why has this to be done on context switch? Zero explanation provided. Why can't this be done in exit_to_user() where the FPU state restore is handled? > } > + > + if (boot_cpu_has(X86_FEATURE_XFD)) s/boot_cpu_has/cpu_feature_enabled/g > + wrmsrl(MSR_IA32_XFD, xfeatures_mask_user_dynamic); > } > + > + if (cpu_feature_enabled(X86_FEATURE_XFD)) > + wrmsrl_safe(MSR_IA32_XFD, (current->thread.fpu.state_mask & > + xfeatures_mask_user_dynamic) ^ > + xfeatures_mask_user_dynamic); Lacks curly braces as it's not a single line of code. > } > > /** > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c > index 33f5d8d07367..6cd4fb098f8f 100644 > --- a/arch/x86/kernel/process.c > +++ b/arch/x86/kernel/process.c > @@ -97,6 +97,16 @@ void arch_thread_struct_whitelist(unsigned long *offset, unsigned long *size) > *size = fpu_buf_cfg.min_size; > } > > +void arch_release_task_struct(struct task_struct *task) > +{ > + if (!cpu_feature_enabled(X86_FEATURE_FPU)) > + return; > + > + /* Free up only the dynamically-allocated memory. */ > + if (task->thread.fpu.state != &task->thread.fpu.__default_state) Sigh. > + free_xstate_buffer(&task->thread.fpu); > > +static __always_inline bool handle_xfd_event(struct fpu *fpu, struct pt_regs *regs) > +{ > + bool handled = false; > + u64 xfd_err; > + > + if (!cpu_feature_enabled(X86_FEATURE_XFD)) > + return handled; > + > + rdmsrl_safe(MSR_IA32_XFD_ERR, &xfd_err); > + wrmsrl_safe(MSR_IA32_XFD_ERR, 0); > + > + if (xfd_err) { What's wrong with if (!xfd_err) return false; an spare the full indentation levels below > + u64 xfd_event = xfd_err & xfeatures_mask_user_dynamic; > + u64 value; > + > + if (WARN_ON(!xfd_event)) { > + /* > + * Unexpected event is raised. But update XFD state to > + * unblock the task. > + */ > + rdmsrl_safe(MSR_IA32_XFD, &value); > + wrmsrl_safe(MSR_IA32_XFD, value & ~xfd_err); Ditto. But returning false here will not unblock the task as exc_device_not_available() will simply reach "die()". > + } else { > + struct fpu *fpu = ¤t->thread.fpu; You need this because the fpu argument above is invalid? > + int err = -1; > + > + /* > + * Make sure not in interrupt context as handling a > + * trap from userspace. > + */ > + if (!WARN_ON(in_interrupt())) { Why would in_interrupt() be necessarily true when the trap comes from kernel space? The proper check is user_mode(regs) as done anywhere else. > + err = realloc_xstate_buffer(fpu, xfd_event); > + if (!err) > + wrmsrl_safe(MSR_IA32_XFD, (fpu->state_mask & > + xfeatures_mask_user_dynamic) ^ > + xfeatures_mask_user_dynamic); > + } > + > + /* Raise a signal when it failed to handle. */ > + if (err) > + force_sig_fault(SIGILL, ILL_ILLOPC, error_get_trap_addr(regs)); > + } > + handled = true; > + } > + return handled; > +} > + > DEFINE_IDTENTRY(exc_device_not_available) > { > unsigned long cr0 = read_cr0(); > + if (handle_xfd_event(¤t->thread.fpu, regs)) > + return; As I said before, this is wrong because at that point interrupts are disabled. Thanks, tglx