Date: Thu, 8 Nov 2018 11:54:20 -0800
From: Sean Christopherson
To: Andy Lutomirski
Cc: Dave Hansen, Andy Lutomirski, Jann Horn, Linus Torvalds, Rich Felker,
    Dave Hansen, Jethro Beekman, Jarkko Sakkinen, Florian Weimer,
    Linux API, X86 ML, linux-arch, LKML, Peter Zijlstra,
    nhorman@redhat.com, npmccallum@redhat.com, "Ayoun, Serge",
    shay.katz-zamir@intel.com, linux-sgx@vger.kernel.org,
    Andy Shevchenko, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Carlos O'Donell, adhemerval.zanella@linaro.org
Subject: Re: RFC: userspace exception fixups
Message-ID: <20181108195420.GA14715@linux.intel.com>
References: <1541518670.7839.31.camel@intel.com>
    <1541524750.7839.51.camel@intel.com>
    <22596E35-F5D1-4935-86AB-B510DCA0FABE@amacapital.net>
    <1C426267-492F-4AE7-8BE8-C7FE278531F9@amacapital.net>
    <209cf4a5-eda9-2495-539f-fed22252cf02@intel.com>
    <9B76E95B-5745-412E-8007-7FAA7F83D6FB@amacapital.net>
In-Reply-To: <9B76E95B-5745-412E-8007-7FAA7F83D6FB@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 06, 2018 at 01:07:54PM -0800, Andy Lutomirski wrote:
>
>
> > On Nov 6, 2018, at 1:00 PM, Dave Hansen wrote:
> >
> >> On 11/6/18 12:12 PM, Andy Lutomirski wrote:
> >> True, but what if we have a nasty enclave that writes to memory just
> >> below SP *before* decrementing SP?
> >
> > Yeah, that would be unfortunate.  If an enclave did this (roughly):
> >
> >   1. EENTER
> >   2. Hardware sets eenter_hwframe->sp = %sp
> >   3. Enclave runs... wants to do out-call
> >   4. Enclave sets up parameters:
> >        memcpy(&eenter_hwframe->sp[-offset], arg1, size);
> >        ...
> >   5. Enclave sets eenter_hwframe->sp -= offset
> >
> > If we got a signal between 4 and 5, we'd clobber the copy of 'arg1' that
> > was on the stack.  The enclave could easily fix this by moving ->sp first.
> >
> > But, this is one of those "fun" parts of the ABI that I think we need to
> > talk about.  If we do this, we also basically require that the code
> > which handles asynchronous exits must *not* write to the stack.  That's
> > not hard because it's typically just a single ERESUME instruction, but
> > it *is* a requirement.
> >
>
> I was assuming that the async exit stuff was completely hidden by the
> API.
> The AEP code would decide whether the exit got fixed up by the
> kernel (which may or may not be easy to tell -- can the code even tell
> without kernel help whether it was, say, an IRQ vs #UD?) and then either
> do ERESUME or cause sgx_enter_enclave() to return with an appropriate
> return value.

Ok, SDK folks came up with an idea that would allow them to use vDSO,
albeit with a bit of ugliness and potentially a ROP-attack issue.
Definitely some weirdness, but the weirdness is well contained, unlike
the magic prefix approach.

Provide two enter_enclave() vDSO "functions".  The first is a normal
function with a normal C interface.  The second is a blob of code that
is "called" and "returns" via indirect jmp, and can be used by SGX
runtimes that want to use the untrusted stack for out-calls from the
enclave.

For the indirect jmp "function", use %rbp to stash the return address
of the caller (either in %rbp itself or in memory pointed to by %rbp).
It works because hardware also saves/restores %rbp along with %rsp when
doing enclave transitions, and the SDK can live with %rbp being
off-limits.  Fault info is passed via registers.

Basic idea for the "functions" below.  The fixup stuff is obviously not
wired up correctly, just trying to convey the concept.

struct enclu_fault_info {
	unsigned int leaf;
	unsigned int trapnr;
	unsigned int error_code;
	unsigned long address;
};

int __vdso_enter_enclave(void *tcs, struct enclu_fault_info *fault_info)
{
	unsigned int leaf, trapnr;

	asm volatile (
		"lea 2f(%%rip), %%rcx\n\t"
		"1: enclu\n\t"
		"jmp 3f\n\t"

		/* ERESUME trampoline */
		"2: enclu\n\t"
		"ud2\n\t"

		/* out: */
		"3:\n"

		/* EENTER fixup */
		".pushsection .fixup,\"ax\"\n\t"
		"4:\n\t"
		"mov %%eax, %%edi\n\t"
		"movl $"__stringify(SGX_EENTER)", %%eax\n\t"
		"jmp 3b\n\t"
		".popsection\n\t"
		_ASM_EXTABLE_FAULT(1b, 4b)

		/* ERESUME FIXUP */
		".pushsection .fixup,\"ax\"\n\t"
		"5:\n\t"
		"mov %%eax, %%edi\n\t"
		"movl $"__stringify(SGX_ERESUME)", %%eax\n\t"
		"jmp 3b\n\t"
		".popsection\n\t"
		_ASM_EXTABLE_FAULT(2b, 5b)

		: "=a"(leaf), "=D" (trapnr)
		: "a" (SGX_EENTER), "b" (tcs)
		: "cc", "memory",
		  "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
		  "r12", "r13", "r14", "r15"
	);

	if (leaf == SGX_EEXIT)
		return 0;

	if (fault_info) {
		fault_info->leaf = leaf;
		fault_info->trapnr = trapnr;
		fault_info->error_code = 0;
		fault_info->address = 0;
	}
	return -EFAULT;
}

GLOBAL(__vdso_enter_enclave_no_stack)
	endbr64

	/* %rbp = return target, %rbx = tcs */
	leaq	3f(%rip), %rcx
	movl	$2, %eax
1:	enclu

	/* "return" to "caller" */
2:	jmp	*%rbp

	/* ERESUME trampoline */
3:	enclu
	ud2

	/* EENTER fixup handler */
4:	movq	%rax, %rdi
	movl	$2, %eax
	/* %rsi = error code, %rdx = address */
	jmp	2b

	/* ERESUME fixup handler */
5:	movq	%rax, %rdi
	movl	$3, %eax
	/* %rsi = error code, %rdx = address */
	jmp	2b
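
For illustration only, here is a rough sketch of how a runtime might consume the return value and fault info of the C-interface variant. Nothing here enters an enclave: mock_enter_enclave(), run_enclave(), leaf_name() and the "NULL tcs means fault" trigger are all made up for the example, and the real interface may end up looking quite different. The leaf numbers match the $2/$3 constants in the asm above, and 13 is the x86 #GP vector, used purely as sample data.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* ENCLU leaf numbers; 2 and 3 match the constants in the asm above. */
#define SGX_EENTER	2
#define SGX_ERESUME	3

/* x86 #GP vector, used only as sample data for the mock. */
#define X86_TRAP_GP	13

struct enclu_fault_info {
	unsigned int leaf;
	unsigned int trapnr;
	unsigned int error_code;
	unsigned long address;
};

/*
 * HYPOTHETICAL stand-in for __vdso_enter_enclave(): no enclave is entered.
 * A NULL tcs pretends EENTER took a #GP; anything else pretends a clean EEXIT.
 */
static int mock_enter_enclave(void *tcs, struct enclu_fault_info *fault_info)
{
	if (tcs)
		return 0;

	if (fault_info) {
		fault_info->leaf = SGX_EENTER;
		fault_info->trapnr = X86_TRAP_GP;
		fault_info->error_code = 0;
		fault_info->address = 0;
	}
	return -EFAULT;
}

/* Map a leaf number to a printable name. */
static const char *leaf_name(unsigned int leaf)
{
	switch (leaf) {
	case SGX_EENTER:
		return "EENTER";
	case SGX_ERESUME:
		return "ERESUME";
	default:
		return "unknown";
	}
}

/* Hypothetical caller: enter once and describe the outcome in buf. */
static int run_enclave(void *tcs, char *buf, size_t len)
{
	struct enclu_fault_info info;
	int ret;

	assert(buf && len > 0);
	memset(&info, 0, sizeof(info));

	ret = mock_enter_enclave(tcs, &info);
	if (ret == 0)
		snprintf(buf, len, "clean EEXIT");
	else
		snprintf(buf, len, "fault in %s: trap %u",
			 leaf_name(info.leaf), info.trapnr);
	return ret;
}
```

The point of the sketch is just that the caller needs nothing beyond the return value plus the struct to tell "enclave exited" from "EENTER/ERESUME faulted", i.e. no signal handler is involved on this path. Calling run_enclave() with any non-NULL pointer takes the mock's clean-exit branch, and calling it with NULL takes the fault branch.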