Message-ID: <20220501193102.588689270@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Filipe Manana, stable@vger.kernel.org
Subject: [patch 1/3] x86/fpu: Prevent FPU state corruption
References: <20220501192740.203963477@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Date: Sun, 1 May 2022 21:31:43 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

The FPU usage related to task FPU management is either protected by
disabling interrupts (switch_to, return to user) or via fpregs_lock() which
is a wrapper around local_bh_disable(). When kernel code wants to use the
FPU then it has to check whether it is possible by calling irq_fpu_usable().

But the condition in irq_fpu_usable() is wrong.
It allows FPU to be used when:

   !in_interrupt() || interrupted_user_mode() || interrupted_kernel_fpu_idle()

The last check tests whether some other context already uses the FPU in the
kernel, but if that's not the case then it allows FPU to be used
unconditionally even if the calling context interrupted a fpregs_lock()
critical region. If that happens then the FPU state of the interrupted
context becomes corrupted.

Allow in kernel FPU usage only when no other context has in kernel FPU
usage and either the calling context is not hard interrupt context or the
hard interrupt did not interrupt a local bottom half disabled region.

It's hard to find a proper Fixes tag as the condition was broken in one way
or the other for a very long time and the eager/lazy FPU changes caused a
lot of churn. Picked something remotely connected from the history.

This survived undetected for quite some time as FPU usage in interrupt
context is rare, but the recent changes to the random code unearthed it at
least on a kernel which had FPU debugging enabled. There is probably a
higher rate of silent corruption as not all issues can be detected by the
FPU debugging code. This will be addressed in a subsequent change.

Fixes: 5d2bd7009f30 ("x86, fpu: decouple non-lazy/eager fpu restore from xsave")
Reported-by: Filipe Manana
Signed-off-by: Thomas Gleixner
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/fpu/core.c |   67 +++++++++++++++++---------------------------
 1 file changed, 26 insertions(+), 41 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -41,17 +41,7 @@ struct fpu_state_config fpu_user_cfg __r
  */
 struct fpstate init_fpstate __ro_after_init;
 
-/*
- * Track whether the kernel is using the FPU state
- * currently.
- *
- * This flag is used:
- *
- *   - by IRQ context code to potentially use the FPU
- *     if it's unused.
- *
- *   - to debug kernel_fpu_begin()/end() correctness
- */
+/* Track in-kernel FPU usage */
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
 /*
@@ -59,42 +49,37 @@ static DEFINE_PER_CPU(bool, in_kernel_fp
  */
 DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
-static bool kernel_fpu_disabled(void)
-{
-	return this_cpu_read(in_kernel_fpu);
-}
-
-static bool interrupted_kernel_fpu_idle(void)
-{
-	return !kernel_fpu_disabled();
-}
-
-/*
- * Were we in user mode (or vm86 mode) when we were
- * interrupted?
- *
- * Doing kernel_fpu_begin/end() is ok if we are running
- * in an interrupt context from user mode - we'll just
- * save the FPU state as required.
- */
-static bool interrupted_user_mode(void)
-{
-	struct pt_regs *regs = get_irq_regs();
-	return regs && user_mode(regs);
-}
-
 /*
  * Can we use the FPU in kernel mode with the
  * whole "kernel_fpu_begin/end()" sequence?
- *
- * It's always ok in process context (ie "not interrupt")
- * but it is sometimes ok even from an irq.
  */
 bool irq_fpu_usable(void)
 {
-	return !in_interrupt() ||
-		interrupted_user_mode() ||
-		interrupted_kernel_fpu_idle();
+	if (WARN_ON_ONCE(in_nmi()))
+		return false;
+
+	/* In kernel FPU usage already active? */
+	if (this_cpu_read(in_kernel_fpu))
+		return false;
+
+	/*
+	 * When not in NMI or hard interrupt context, FPU can be used:
+	 *
+	 * - Task context is safe except from within fpregs_lock()'ed
+	 *   critical regions.
+	 *
+	 * - Soft interrupt processing context which cannot happen
+	 *   while in a fpregs_lock()'ed critical region.
+	 */
+	if (!in_hardirq())
+		return true;
+
+	/*
+	 * In hard interrupt context it's safe when soft interrupts
+	 * are enabled, which means the interrupt did not hit in
+	 * a fpregs_lock()'ed critical region.
+	 */
+	return !softirq_count();
 }
 EXPORT_SYMBOL(irq_fpu_usable);