Subject: Re: RFC: userspace exception fixups
From: Andy Lutomirski
Date: Tue, 6 Nov 2018 08:57:35 -0800
To: Sean Christopherson
Cc: Andy Lutomirski, Jann Horn, Dave Hansen, Linus Torvalds, Rich Felker,
 Dave Hansen, Jethro Beekman, Jarkko Sakkinen, Florian Weimer, Linux API,
 X86 ML, linux-arch, LKML, Peter Zijlstra, nhorman@redhat.com,
 npmccallum@redhat.com, "Ayoun, Serge", shay.katz-zamir@intel.com,
 linux-sgx@vger.kernel.org, Andy Shevchenko, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Carlos O'Donell, adhemerval.zanella@linaro.org
In-Reply-To: <1541518670.7839.31.camel@intel.com>
References: <20181102163034.GB7393@linux.intel.com>
 <7050972d-a874-dc08-3214-93e81181da60@intel.com>
 <20181102170627.GD7393@linux.intel.com>
 <20181102173350.GF7393@linux.intel.com>
 <20181102182712.GG7393@linux.intel.com>
 <20181102220437.GI7393@linux.intel.com>
 <1541518670.7839.31.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Nov 6, 2018, at 7:37 AM, Sean Christopherson wrote:
>
>> On Fri, 2018-11-02 at 16:32 -0700, Andy Lutomirski wrote:
>>> On Fri, Nov 2, 2018 at 4:28 PM Jann Horn wrote:
>>>
>>> On Fri, Nov 2, 2018 at 11:04 PM Sean Christopherson wrote:
>>>>
>>>>> On Fri, Nov 02, 2018 at 08:02:23PM +0100, Jann Horn wrote:
>>>>>
>>>>> On Fri, Nov 2, 2018 at 7:27 PM Sean Christopherson wrote:
>>>>>>
>>>>>>> On Fri, Nov 02, 2018 at 10:48:38AM -0700, Andy Lutomirski wrote:
>>>>>>>
>>>>>>> This whole mechanism seems very complicated, and it's not clear
>>>>>>> exactly what behavior user code wants.
>>>>>> No argument there. That's why I like the approach of dumping the
>>>>>> exception to userspace without trying to do anything intelligent in
>>>>>> the kernel. Userspace can then do whatever it wants AND we don't
>>>>>> have to worry about mucking with stacks.
>>>>>>
>>>>>> One of the hiccups with the VDSO approach is that the enclave may
>>>>>> want to use the untrusted stack, i.e. the stack that has the VDSO's
>>>>>> stack frame. For example, Intel's SDK uses the untrusted stack to
>>>>>> pass parameters for EEXIT, which means an AEX might occur with what
>>>>>> is effectively a bad stack from the VDSO's perspective.
>>>>> What exactly does "uses the untrusted stack to pass parameters for
>>>>> EEXIT" mean? I guess you're saying that the enclave is writing to
>>>>> RSP+[0...some_positive_offset], and the written data needs to be
>>>>> visible to the code outside the enclave afterwards?
>>>> As is, they actually do it the other way around, i.e. negative offsets
>>>> relative to the untrusted %RSP. Going into the enclave there is no
>>>> reserved space on the stack. The SDK uses EEXIT like a function call,
>>>> i.e. pushing parameters on the stack and making a call outside of the
>>>> enclave, hence the name out-call.
>>>> This allows the SDK to handle any
>>>> reasonable out-call without a priori knowledge of the application's
>>>> maximum out-call "size".
>>> But presumably this is bounded to be at most 128 bytes (the red zone
>>> size), right? Otherwise this would be incompatible with
>>> non-sigaltstack signal delivery.
>>
>> I think Sean is saying that the enclave also updates RSP.
>
> Yeah, the enclave saves/restores RSP from/to the current save state area.
>
>> One might reasonably wonder how the SDK knows the offset from RSP to
>> the function ID. Presumably using RBP?
>
> Here's pseudocode for how the SDK uses the untrusted stack, minus a
> bunch of error checking and gory details.
>
> The function ID and a pointer to a marshalling struct are passed to
> the untrusted runtime via normal register params, e.g. RDI and RSI.
> The marshalling struct is what's actually allocated on the untrusted
> stack, like alloca() but more complex and explicit. The marshalling
> struct size is not artificially restricted by the SDK, e.g. AFAIK it
> could span multiple 4k pages.
>
>
> int sgx_out_call(const unsigned int func_index, void *marshalling_struct)
> {
>         struct sgx_encl_tls *tls = get_encl_tls();
>
>         %RBP = tls->save_state_area[SSA_RBP];
>         %RSP = tls->save_state_area[SSA_RSP];
>         %RDI = func_index;
>         %RSI = marshalling_struct;
>
>         EEXIT
>
>         /* magic elsewhere to get back here on an EENTER(OUT_CALL_RETURN) */
>         return %RAX
> }
>
> void *sgx_alloc_untrusted_stack(size_t size)
> {
>         struct sgx_encl_tls *tls = get_encl_tls();
>         struct sgx_out_call_context *context;
>         void *tmp;
>
>         /* create a frame on the trusted stack to hold the out-call context */
>         tls->trusted_stack -= sizeof(struct sgx_out_call_context);
>
>         /* save the untrusted %RSP into the out-call context */
>         context = (struct sgx_out_call_context *)tls->trusted_stack;
>         context->untrusted_stack = tls->save_state_area[SSA_RSP];
>
>         /* allocate space on the untrusted stack */
>         tmp = (void *)(tls->save_state_area[SSA_RSP] - size);
>         tls->save_state_area[SSA_RSP] = tmp;
>
>         return tmp;
> }
>
> void sgx_pop_untrusted_stack(void)
> {
>         struct sgx_encl_tls *tls = get_encl_tls();
>         struct sgx_out_call_context *context;
>
>         /* retrieve the current out-call context from the trusted stack */
>         context = (struct sgx_out_call_context *)tls->trusted_stack;
>
>         /* restore untrusted %RSP */
>         tls->save_state_area[SSA_RSP] = context->untrusted_stack;
>
>         /* pop the out-call context frame */
>         tls->trusted_stack += sizeof(struct sgx_out_call_context);
> }
>
> int sgx_main(void)
> {
>         struct my_out_call_struct *params;
>
>         params = sgx_alloc_untrusted_stack(sizeof(*params));
>
>         params->0..N = XYZ;
>
>         ret = sgx_out_call(DO_WORK, params);
>
>         sgx_pop_untrusted_stack();
>
>         return ret;
> }

So I guess the non-enclave code basically can’t trust its stack pointer
because of these shenanigans.
And the AEP code has to live with the fact that its RSP is basically
arbitrary and probably can’t even be unwound by a debugger? And the
EENTER code has to deal with the fact that its red zone can be blatantly
violated by the enclave?

I’m assuming it’s way too late for the SGX SDK to be changed to use a
normal RPC mechanism? I’m a bit disappointed that enclaves can even
manipulate outside state like this. I assume Intel had some reason for
making it possible, but still.
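
To make the red-zone concern concrete, here is a minimal sketch in plain C
that models the bookkeeping from the quoted pseudocode. It is an
illustration under stated assumptions, not SDK code: `struct encl_tls`,
`alloc_untrusted_stack()`, `pop_untrusted_stack()` and
`clobbers_red_zone()` are hypothetical names, addresses are plain integers
rather than real stack memory, and a small fixed array stands in for the
trusted stack. The 128-byte red zone is the x86-64 System V ABI value
mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the save state area slots used in the pseudocode. */
enum { SSA_RSP, SSA_RBP, SSA_NREGS };

#define RED_ZONE_SIZE 128 /* x86-64 System V ABI red zone, in bytes */
#define MAX_DEPTH     8   /* assumption: nesting limit for this model only */

struct out_call_context {
        uint64_t untrusted_stack; /* untrusted RSP saved before allocating */
};

struct encl_tls {
        uint64_t save_state_area[SSA_NREGS];
        struct out_call_context ctx[MAX_DEPTH]; /* stands in for the trusted stack */
        int depth;
};

/* Save the current untrusted RSP, then move it down by `size` bytes. */
static uint64_t alloc_untrusted_stack(struct encl_tls *tls, size_t size)
{
        assert(tls->depth < MAX_DEPTH);
        tls->ctx[tls->depth++].untrusted_stack = tls->save_state_area[SSA_RSP];
        tls->save_state_area[SSA_RSP] -= size;
        return tls->save_state_area[SSA_RSP];
}

/* Restore the untrusted RSP saved by the matching allocation. */
static void pop_untrusted_stack(struct encl_tls *tls)
{
        assert(tls->depth > 0);
        tls->save_state_area[SSA_RSP] = tls->ctx[--tls->depth].untrusted_stack;
}

/*
 * Does [buf, buf + size) overlap the red zone [rsp - 128, rsp) of the
 * code that executed EENTER with stack pointer `rsp`?
 */
static int clobbers_red_zone(uint64_t rsp, uint64_t buf, size_t size)
{
        return size > 0 && buf < rsp && buf + size > rsp - RED_ZONE_SIZE;
}
```

In this model every non-empty allocation ends exactly at the RSP value
that was live at EENTER, so `clobbers_red_zone()` reports an overlap for
any size, and a request such as 8192 bytes moves the saved RSP well past
the 128 bytes that ordinary signal delivery leaves alone.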