From: Thomas Gleixner
To: Andy Lutomirski, Sohil Mehta, the arch/x86 maintainers
Cc: Tony Luck, Dave Hansen, Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
    Jens Axboe, Christian Brauner, "Peter Zijlstra (Intel)", Shuah Khan,
    Arnd Bergmann, Jonathan Corbet, Raj Ashok, Jacob Pan, Gayatri Kammela,
    Zeng Guang, "Williams, Dan J", Randy E Witt, "Shankar, Ravi V",
    Ramesh Thomas, Linux API, linux-arch@vger.kernel.org,
    Linux Kernel Mailing List, linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall
In-Reply-To: <0364c572-4bc2-4538-8d65-485dbfa81f0d@www.fastmail.com>
References: <20210913200132.3396598-1-sohil.mehta@intel.com>
    <20210913200132.3396598-12-sohil.mehta@intel.com>
    <877dex7tgj.ffs@tglx> <87tui162am.ffs@tglx>
    <25ba1e1f-c05b-4b67-b547-6b5dbc958a2f@www.fastmail.com>
    <87pmsp5aqx.ffs@tglx>
    <0364c572-4bc2-4538-8d65-485dbfa81f0d@www.fastmail.com>
Date: Fri, 01 Oct 2021 23:29:07 +0200
Message-ID: <875yug4eos.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 01 2021 at 08:13, Andy Lutomirski wrote:
> On Fri, Oct 1, 2021, at 2:56 AM, Thomas Gleixner wrote:
>> On Thu, Sep 30 2021 at 21:41, Andy Lutomirski wrote:
>>> On Thu, Sep 30, 2021, at 5:01 PM, Thomas Gleixner wrote:
>>
>>> Now that I read the docs some more, I'm seriously concerned about this
>>> XSAVE design. XSAVES with UINTR is destructive -- it clears UINV. If
>>> we actually use this, then the whole last_cpu "preserve the state in
>>> registers" optimization goes out the window.
>>> So does anything that
>>> happens to assume that merely saving the state doesn't destroy it on
>>> respectable modern CPUs XRSTORS will #GP if you XRSTORS twice, which
>>> makes me nervous and would need a serious audit of our XRSTORS paths.
>>
>> I have no idea what you are fantasizing about. You can XRSTORS five
>> times in a row as long as your XSTATE memory image is correct.
>
> I'm just reading TFM, which is some kind of dystopian fantasy.
>
> 11.8.2.4 XRSTORS
>
> Before restoring the user-interrupt state component, XRSTORS verifies
> that UINV is 0. If it is not, XRSTORS causes a general-protection
> fault (#GP) before loading any part of the user-interrupt state
> component. (UINV is IA32_UINTR_MISC[39:32]; XRSTORS does not check the
> contents of the remainder of that MSR.)

Duh. I was staring at the SDM and searching for a hint. Stupid me!

> So if UINV is set in the memory image and you XRSTORS five times in a
> row, the first one will work assuming UINV was zero. The second one
> will #GP.

Yes. I can see what you mean now :)

> 11.8.2.3 XSAVES
> After saving the user-interrupt state component, XSAVES clears UINV. (UINV is IA32_UINTR_MISC[39:32];
> XSAVES does not modify the remainder of that MSR.)
>
> So if we're running a UPID-enabled user task and we switch to a kernel
> thread, we do XSAVES and UINV is cleared. Then we switch back to the
> same task and don't do XRSTORS (or otherwise write IA32_UINTR_MISC)
> and UINV is still clear.

Yes, that has to be mopped up on the way to user space.

> And we had better clear UINV when running a kernel thread because the
> UPID might get freed or the kernel thread might do some CPL3
> shenanigans (via EFI, perhaps? I don't know if any firmwares actually
> do this).

Right. That's what happens already with the current pile.

> So all this seems to put UINV into the "independent" category of
> feature along with LBR.
> And the 512-byte wastes from extra copies of
> the legacy area and the loss of the XMODIFIED optimization will just
> be collateral damage.

So we'd end up with two XSAVES on context switch. We can simply do:

        XSAVES();
        fpu.state.xstate.uintr.uinv = 0;

which allows us to do as many XRSTORS in a row as we want. Only the final
one on the way to user space will have to restore the real vector if the
register state is not valid:

        if (fpu_state_valid()) {
                if (needs_uinv(current))
                        wrmsrl(UINV, vector);
        } else {
                if (needs_uinv(current))
                        fpu.state.xstate.uintr.uinv = vector;
                XRSTORS();
        }

Hmm?

Thanks,

        tglx
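
The sequence above can be tried out in a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: every `toy_*` name and the vector value 0xec are hypothetical, and the model covers just the two rules quoted in this thread (XSAVES clears UINV after saving it, XRSTORS faults when UINV in the register is non-zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): one UINV register and one XSAVE memory image. */
struct toy_cpu   { uint8_t uinv; };
struct toy_image { uint8_t uinv; };

/* XSAVES rule: save UINV into the image, then clear it in the register. */
static void toy_xsaves(struct toy_cpu *cpu, struct toy_image *img)
{
	img->uinv = cpu->uinv;
	cpu->uinv = 0;
}

/* XRSTORS rule: return false (standing in for #GP) if register UINV != 0. */
static bool toy_xrstors(struct toy_cpu *cpu, const struct toy_image *img)
{
	if (cpu->uinv != 0)
		return false;
	cpu->uinv = img->uinv;
	return true;
}

/* Context switch out: XSAVES, then zero UINV in the memory image. */
static void toy_switch_out(struct toy_cpu *cpu, struct toy_image *img)
{
	toy_xsaves(cpu, img);
	img->uinv = 0;
}

/* Exit to user space: the only place where the real vector comes back. */
static bool toy_exit_to_user(struct toy_cpu *cpu, struct toy_image *img,
			     bool regs_valid, bool needs_uinv, uint8_t vector)
{
	if (regs_valid) {
		if (needs_uinv)
			cpu->uinv = vector;	/* stands in for the MSR write */
		return true;
	}
	if (needs_uinv)
		img->uinv = vector;
	return toy_xrstors(cpu, img);
}

/* Walk through the scenario discussed in the thread. */
static void toy_demo(void)
{
	struct toy_cpu cpu = { .uinv = 0xec };	/* arbitrary example vector */
	struct toy_image img = { .uinv = 0 };

	toy_switch_out(&cpu, &img);
	/* With UINV zeroed in the image, repeated restores do not fault. */
	assert(toy_xrstors(&cpu, &img));
	assert(toy_xrstors(&cpu, &img));
	/* The final restore on the way to user space brings the vector back. */
	assert(toy_exit_to_user(&cpu, &img, false, true, 0xec));
	assert(cpu.uinv == 0xec);
	/* A further restore with UINV set in the register would fault. */
	assert(!toy_xrstors(&cpu, &img));
}
```

Calling `toy_demo()` from a test harness exercises both branches of the model; a real implementation would of course have to deal with the actual XSAVE area layout and MSR accessors, which this sketch deliberately leaves out.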