From: Andy Lutomirski
Subject: Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall
Date: Fri, 1 Oct 2021 16:04:27 -0700
Message-Id: <266EFCFB-D0B6-4922-8538-AA3D1146C588@amacapital.net>
References: <875yug4eos.ffs@tglx>
Cc: Andy Lutomirski, Sohil Mehta, the arch/x86 maintainers, Tony Luck,
 Dave Hansen, Ingo Molnar, Borislav Petkov, "H.
 Peter Anvin", Jens Axboe, Christian Brauner, "Peter Zijlstra (Intel)",
 Shuah Khan, Arnd Bergmann, Jonathan Corbet, Raj Ashok, Jacob Pan,
 Gayatri Kammela, Zeng Guang, "Williams, Dan J", Randy E Witt,
 "Shankar, Ravi V", Ramesh Thomas, Linux API, linux-arch@vger.kernel.org,
 Linux Kernel Mailing List, linux-kselftest@vger.kernel.org
In-Reply-To: <875yug4eos.ffs@tglx>
To: Thomas Gleixner
X-Mailing-List: linux-kernel@vger.kernel.org

> On Oct 1, 2021, at 2:29 PM, Thomas Gleixner wrote:
>
> On Fri, Oct 01 2021 at 08:13, Andy Lutomirski wrote:
>
>>> On Fri, Oct 1, 2021, at 2:56 AM, Thomas Gleixner wrote:
>>> On Thu, Sep 30 2021 at 21:41, Andy Lutomirski wrote:
>>>>> On Thu, Sep 30, 2021, at 5:01 PM, Thomas Gleixner wrote:
>>>
>>>> Now that I read the docs some more, I'm seriously concerned about this
>>>> XSAVE design. XSAVES with UINTR is destructive -- it clears UINV. If
>>>> we actually use this, then the whole last_cpu "preserve the state in
>>>> registers" optimization goes out the window. So does anything that
>>>> happens to assume that merely saving the state doesn't destroy it on
>>>> respectable modern CPUs. XRSTORS will #GP if you XRSTORS twice, which
>>>> makes me nervous and would need a serious audit of our XRSTORS paths.
>>>
>>> I have no idea what you are fantasizing about. You can XRSTORS five
>>> times in a row as long as your XSTATE memory image is correct.
>>
>> I'm just reading TFM, which is some kind of dystopian fantasy.
>>
>> 11.8.2.4 XRSTORS
>>
>> Before restoring the user-interrupt state component, XRSTORS verifies
>> that UINV is 0. If it is not, XRSTORS causes a general-protection
>> fault (#GP) before loading any part of the user-interrupt state
>> component. (UINV is IA32_UINTR_MISC[39:32]; XRSTORS does not check the
>> contents of the remainder of that MSR.)
>
> Duh. I was staring at the SDM and searching for a hint. Stupid me!
>
>> So if UINV is set in the memory image and you XRSTORS five times in a
>> row, the first one will work assuming UINV was zero. The second one
>> will #GP.
>
> Yes. I can see what you mean now :)
>
>> 11.8.2.3 XSAVES
>> After saving the user-interrupt state component, XSAVES clears UINV.
>> (UINV is IA32_UINTR_MISC[39:32]; XSAVES does not modify the remainder
>> of that MSR.)
>>
>> So if we're running a UPID-enabled user task and we switch to a kernel
>> thread, we do XSAVES and UINV is cleared. Then we switch back to the
>> same task and don't do XRSTORS (or otherwise write IA32_UINTR_MISC)
>> and UINV is still clear.
>
> Yes, that has to be mopped up on the way to user space.
>
>> And we had better clear UINV when running a kernel thread because the
>> UPID might get freed or the kernel thread might do some CPL3
>> shenanigans (via EFI, perhaps? I don't know if any firmwares actually
>> do this).
>
> Right. That's what happens already with the current pile.
>
>> So all this seems to put UINV into the "independent" category of
>> feature along with LBR. And the 512-byte wastes from extra copies of
>> the legacy area and the loss of the XMODIFIED optimization will just
>> be collateral damage.
>
> So we'd end up with two XSAVES on context switch. We can simply do:
>
>     XSAVES();
>     fpu.state.xstate.uintr.uinv = 0;

Could work. As long as UINV is armed, RR can change at any time (maybe
just when IF=1? The manual is unclear). But the first XSAVES disarms
UINV, so maybe this won’t confuse any callers.

>
> which allows to do as many XRSTORS in a row as we want. Only the final
> one on the way to user space will have to restore the real vector if the
> register state is not valid:
>
>     if (fpu_state_valid()) {
>         if (needs_uinv(current))
>             wrmsrl(UINV, vector);
>     } else {
>         if (needs_uinv(current))
>             fpu.state.xstate.uintr.uinv = vector;
>         XRSTORS();
>     }
>
> Hmm?

I like it better than anything else I’ve seen.
>
> Thanks,
>
>        tglx
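
For anyone following along, the scheme discussed above can be sketched as a
toy user-space model in C. This is only an illustration under stated
assumptions, not kernel code: every name in it (model_cpu, model_task,
model_xsaves, model_xrstors, model_save, model_return_to_user) is
hypothetical, a #GP is modeled as a false return value, and the only
hardware behavior it encodes is the two manual excerpts quoted in the
thread (XSAVES clears UINV after saving it; XRSTORS faults if UINV is
non-zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names here are hypothetical, not kernel APIs. */
struct model_cpu {
    uint8_t uinv;       /* models UINV, i.e. IA32_UINTR_MISC[39:32] */
    bool regs_valid;    /* models "register state is still valid" */
};

struct model_task {
    uint8_t image_uinv; /* UINV field in the task's XSAVE memory image */
    uint8_t vector;     /* the task's real notification vector */
    bool needs_uinv;    /* task uses user interrupts */
};

/* XSAVES per the quoted 11.8.2.3: save UINV, then clear it in the CPU. */
static void model_xsaves(struct model_cpu *cpu, struct model_task *t)
{
    t->image_uinv = cpu->uinv;
    cpu->uinv = 0;
}

/* XRSTORS per the quoted 11.8.2.4: fault if UINV is non-zero beforehand.
 * Returns false to model the #GP, true on success. */
static bool model_xrstors(struct model_cpu *cpu, const struct model_task *t)
{
    if (cpu->uinv != 0)
        return false;
    cpu->uinv = t->image_uinv;
    return true;
}

/* Save path proposed in the thread: XSAVES, then zero UINV in the image
 * so that any number of later XRSTORS calls leave UINV at 0. */
static void model_save(struct model_cpu *cpu, struct model_task *t)
{
    model_xsaves(cpu, t);
    t->image_uinv = 0;
}

/* Return-to-user path proposed in the thread: only here is the real
 * vector put back, either by a direct write or via the memory image. */
static bool model_return_to_user(struct model_cpu *cpu, struct model_task *t)
{
    if (cpu->regs_valid) {
        if (t->needs_uinv)
            cpu->uinv = t->vector;      /* models wrmsrl(UINV, vector) */
        return true;
    }
    if (t->needs_uinv)
        t->image_uinv = t->vector;
    if (!model_xrstors(cpu, t))
        return false;
    cpu->regs_valid = true;
    return true;
}
```

With an unmodified image that still carries a non-zero UINV, the second
model_xrstors() call in a row fails, which is the double-XRSTORS problem
from the manual. After model_save(), repeated model_xrstors() calls succeed
and only model_return_to_user() re-arms the vector.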