From: Andy Lutomirski
Date: Thu, 8 Nov 2018 12:05:42 -0800
Subject: Re: RFC: userspace exception fixups
To: "Christopherson, Sean J"
Cc: Dave Hansen, Andrew Lutomirski, Jann Horn, Linus Torvalds,
    Rich Felker, Dave Hansen, Jethro Beekman, Jarkko Sakkinen,
    Florian Weimer, Linux API, X86 ML, linux-arch, LKML,
    Peter Zijlstra, nhorman@redhat.com, npmccallum@redhat.com,
    "Ayoun, Serge", shay.katz-zamir@intel.com,
    linux-sgx@vger.kernel.org, Andy Shevchenko, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, "Carlos O'Donell",
    adhemerval.zanella@linaro.org

On Thu, Nov 8, 2018 at 11:54 AM Sean Christopherson wrote:
>
> On Tue, Nov 06, 2018 at 01:07:54PM -0800, Andy Lutomirski wrote:
> >
> >
> > > On Nov 6, 2018, at 1:00 PM, Dave Hansen wrote:
> > >
> > >> On 11/6/18 12:12 PM, Andy Lutomirski wrote:
> > >> True, but what if we have a nasty enclave that writes to memory just
> > >> below SP *before* decrementing SP?
> > >
> > > Yeah, that would be unfortunate.  If an enclave did this (roughly):
> > >
> > >   1. EENTER
> > >   2. Hardware sets eenter_hwframe->sp = %sp
> > >   3. Enclave runs... wants to do out-call
> > >   4. Enclave sets up parameters:
> > >        memcpy(&eenter_hwframe->sp[-offset], arg1, size);
> > >        ...
> > >   5. Enclave sets eenter_hwframe->sp -= offset
> > >
> > > If we got a signal between 4 and 5, we'd clobber the copy of 'arg1' that
> > > was on the stack.  The enclave could easily fix this by moving ->sp first.
> > >
> > > But, this is one of those "fun" parts of the ABI that I think we need to
> > > talk about.  If we do this, we also basically require that the code
> > > which handles asynchronous exits must *not* write to the stack.  That's
> > > not hard because it's typically just a single ERESUME instruction, but
> > > it *is* a requirement.
> > >
> >
> > I was assuming that the async exit stuff was completely hidden by the
> > API. The AEP code would decide whether the exit got fixed up by the
> > kernel (which may or may not be easy to tell -- can the code even tell
> > without kernel help whether it was, say, an IRQ vs #UD?) and then either
> > do ERESUME or cause sgx_enter_enclave() to return with an appropriate
> > return value.
>
> Ok, SDK folks came up with an idea that would allow them to use vDSO,
> albeit with a bit of ugliness and potentially a ROP-attack issue.
> Definitely some weirdness, but the weirdness is well contained, unlike
> the magic prefix approach.
>
> Provide two enter_enclave() vDSO "functions".  The first is a normal
> function with a normal C interface.  The second is a blob of code that
> is "called" and "returns" via indirect jmp, and can be used by SGX
> runtimes that want to use the untrusted stack for out-calls from the
> enclave.
>
> For the indirect jmp "function", use %rbp to stash the return address
> of the caller (either in %rbp itself or in memory pointed to by %rbp).
> It works because hardware also saves/restores %rbp along with %rsp when
> doing enclave transitions, and the SDK can live with %rbp being
> off-limits.  Fault info is passed via registers.

Hmm.  The idea being that the SDK preserves RBP but not RSP.  That's
not the most terrible thing in the world.  But could the SDK live with
something more like my suggestion where the vDSO supplies a normal
function that takes a struct containing registers that are visible to
the enclave?  This would make it extremely awkward for the enclave to
use the untrusted stack per se, but it would make it quite easy (I
think) for the untrusted part of the SDK to allocate some extra memory
and just tell the enclave that *that* memory is the stack.

AFAICS we do have two registers that genuinely are preserved: FSBASE
and GSBASE.  Which is a good thing, because otherwise SGX enablement
would currently be a privilege escalation issue due to making GSBASE
writable when it should not be.

This whole thing is a mess.  I'm starting to think that the cleanest
solution would be to provide a way to just tell the kernel that
certain RIP values have exception fixups.
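
To make "certain RIP values have exception fixups" a little more
concrete, here is a rough sketch of the lookup such a scheme might
boil down to.  It is purely illustrative: the struct name, the field
names and user_fixup_lookup() are all made up for this example, the
lookup runs as ordinary userspace C rather than in the kernel, and
nothing in this thread specifies how such a table would be registered.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical table entry (not an existing kernel ABI): a fault taken
 * at fault_rip would resume execution at fixup_rip instead of raising
 * a signal.
 */
struct user_fixup {
	uint64_t fault_rip;
	uint64_t fixup_rip;
};

/*
 * Return the fixup target registered for a faulting RIP, or 0 if the
 * RIP has no entry (the fault would then be delivered as usual).
 * Plain linear search; the table is not assumed to be sorted.
 */
static uint64_t user_fixup_lookup(const struct user_fixup *tbl, size_t n,
				  uint64_t rip)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].fault_rip == rip)
			return tbl[i].fixup_rip;
	}
	return 0;
}
```

With a table holding { 0x1000, 0x2000 }, a fault at 0x1000 maps to
0x2000, while a fault at any unlisted address yields 0 and falls
through to normal handling.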