Subject: Re: [PATCH v10 01/26] Documentation/x86: Add CET description
From: Andrew Cooper
Date: Fri, 22 May 2020 18:48:08 +0100
To: Peter Zijlstra
Cc: "H.J. Lu", Dave Hansen, Yu-cheng Yu, the arch/x86 maintainers,
    "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, LKML,
    "open list:DOCUMENTATION", Linux-MM, linux-arch, Linux API,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
    Cyrill Gorcunov, Eugene Syromiatnikov, Florian Weimer, Jann Horn,
    Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
    Pavel Machek, Randy Dunlap, "Ravi V. Shankar", Vedvyas Shanbhogue,
    Dave Martin, Weijiang Yang
In-Reply-To: <20200522164953.GA411971@hirez.programming.kicks-ass.net>

On 22/05/2020 17:49, Peter Zijlstra wrote:
> On Sat, May 16, 2020 at 03:09:22PM +0100, Andrew Cooper wrote:
>
>> Sadly, the same is not true for kernel shadow stacks.
>>
>> SSP is 0 after SYSCALL, SYSENTER and CLRSSBSY, and you've got to be
>> careful to re-establish the shadow stack before a CALL, interrupt or
>> exception tries pushing a word onto the shadow stack at 0xfffffffffffffff8.
> Oh man, I can only imagine the joy that brings to #NM and friends :-(

Establishing a supervisor shadow stack for the first time involves a
large leap of faith, even by usual x86 standards.

You need to have prepared MSR_PL0_SSP with correct mappings and
supervisor tokens, such that when you enable CR4.CET and
MSR_S_CET.SHSTK_EN, your SETSSBSY instruction succeeds at its atomic
"check the token and set the busy bit" shadow stack access. Any failure
here tends to be a triple fault, and I didn't get around to figuring out
why #DF wasn't taken cleanly.

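
The token check described above can be sketched as a small software
model. This is only an illustration, not how any kernel or hypervisor
implements it: memory is a plain dict, all function and class names are
hypothetical, and the token layout (the token holds its own 8-byte-aligned
address, with bit 0 as the busy bit) is my reading of the CET
specification rather than something stated in this thread.

```python
MASK64 = (1 << 64) - 1
BUSY = 1  # assumed: bit 0 of a supervisor shadow stack token is the busy bit


class ShadowStackFault(Exception):
    """Stand-in for whatever fault a failed token check raises (hypothetical)."""


def make_supervisor_token(addr):
    """Token for a free shadow stack: its own aligned address, busy bit clear."""
    if addr & 7:
        raise ValueError("token address must be 8-byte aligned")
    return addr & MASK64


def setssbsy(memory, pl0_ssp):
    """Model the check-and-set: verify the token, mark it busy, return new SSP."""
    if pl0_ssp & 7 or pl0_ssp not in memory:
        raise ShadowStackFault("misaligned or unmapped MSR_PL0_SSP")
    if memory[pl0_ssp] != pl0_ssp:  # wrong address, or busy bit already set
        raise ShadowStackFault("token mismatch or stack already busy")
    memory[pl0_ssp] = pl0_ssp | BUSY
    return pl0_ssp


def push_address(ssp):
    """Where a CALL, interrupt or exception would write its shadow stack word."""
    return (ssp - 8) & MASK64
```

With a token prepared at 0x1000, a first setssbsy() on that address
succeeds and a second one fails because the busy bit is already set;
push_address(0) gives 0xfffffffffffffff8, the address from the quoted
text. On real hardware the check and the busy-bit update are a single
atomic access, which a dict lookup followed by a store does not capture.
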
You also need to have prepared MSR_IST_SSP beforehand with the IST
shadow stack pointers matching any IST configuration in the IDT, lest an
NMI ruin your day on the instruction boundary before SETSSBSY.

A less obvious side effect of these "windows with an SSP of 0" is that
you're now forced to use IST for all non-maskable interrupts/exceptions,
even if you choose not to use SYSCALL (and so no longer need IST to
remove the risk of a userspace privilege escalation) and would prefer to
avoid IST because of its problematic reentrancy characteristics.

For anyone counting the number of IST-necessary vectors across all
potential configurations in modern hardware, it's #DB, NMI, #DF, #MC,
#VE, #HV, #VC and #SX: eight vectors against an architectural limit of
7 IST slots.

There are several other amusing aspects, such as iret-to-self needing to
use call-oriented-programming to keep itself shadow-stack-safe, or the
fact that IRET to user mode doesn't fault if it fails to clear the
supervisor busy bit, instead leaving you to double fault at some point
in the future at the next syscall/interrupt/exception because the stack
is still busy.

~Andrew

P.S. For anyone interested, see
https://lore.kernel.org/xen-devel/20200501225838.9866-1-andrew.cooper3@citrix.com/T/#u
for the series getting supervisor shadow stacks working on Xen, which is
far simpler to manage than Linux. I do not envy whoever has the fun of
trying to make this work for Linux.
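
The IST counting argument above can be made concrete with a few lines of
arithmetic. This is a rough sketch: the vector list is copied from the
message, the limit of 7 is the one stated there, and assign_ist() is a
hypothetical helper, not an interface from Linux or Xen.

```python
IST_SLOTS = 7  # architectural limit quoted in the message

# Vectors listed in the message as needing IST across all configurations.
IST_CANDIDATES = ["#DB", "NMI", "#DF", "#MC", "#VE", "#HV", "#VC", "#SX"]


def assign_ist(vectors, slots=IST_SLOTS):
    """Give each vector its own 1-based IST index; return (assigned, left_over)."""
    assigned = {vec: idx + 1 for idx, vec in enumerate(vectors[:slots])}
    return assigned, vectors[slots:]
```

Calling assign_ist(IST_CANDIDATES) leaves one vector without a dedicated
slot. The list is a union across configurations, so a given system may
need fewer entries than this.
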