Subject: Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
From: Andy Lutomirski
Date: Tue, 11 Jun 2019 17:09:28 -0700
To: "Xing, Cedric"
Cc: Andy Lutomirski, "Christopherson, Sean J", Jarkko Sakkinen,
    Stephen Smalley, James Morris, "Serge E. Hallyn", LSM List,
    Paul Moore, Eric Paris, "selinux@vger.kernel.org", Jethro Beekman,
    "Hansen, Dave", Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
    "linux-sgx@vger.kernel.org", Andrew Morton, "nhorman@redhat.com",
    "npmccallum@redhat.com", "Ayoun, Serge", "Katz-zamir, Shay",
    "Huang, Haitao", Andy Shevchenko, "Svahn, Kai", Borislav Petkov,
    Josh Triplett, "Huang, Kai", David Rientjes, "Roberts, William C",
    "Tricca, Philip B"
Message-Id: <331B31BF-9892-4FB3-9265-3E37412F80F4@amacapital.net>
In-Reply-To: <960B34DE67B9E140824F1DCDEC400C0F655010EF@ORSMSX116.amr.corp.intel.com>
References: <20190606021145.12604-1-sean.j.christopherson@intel.com>
    <20190606021145.12604-3-sean.j.christopherson@intel.com>
    <960B34DE67B9E140824F1DCDEC400C0F65500E13@ORSMSX116.amr.corp.intel.com>
    <960B34DE67B9E140824F1DCDEC400C0F655010EF@ORSMSX116.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Jun 10, 2019, at 3:28 PM, Xing, Cedric wrote:

>> From: Andy Lutomirski [mailto:luto@kernel.org]
>> Sent: Monday, June 10, 2019 12:15 PM
>>
>> On Mon, Jun 10, 2019 at 11:29 AM Xing, Cedric wrote:
>>>
>>>> From: Christopherson, Sean J
>>>> Sent: Wednesday, June 05, 2019 7:12 PM
>>>>
>>>> +/**
>>>> + * sgx_map_allowed - check vma protections against the associated enclave page
>>>> + * @encl: an enclave
>>>> + * @start: start address of the mapping (inclusive)
>>>> + * @end: end address of the mapping (exclusive)
>>>> + * @prot: protection bits of the mapping
>>>> + *
>>>> + * Verify a userspace mapping to an enclave page would not violate the security
>>>> + * requirements of the *kernel*. Note, this is in no way related to the
>>>> + * page protections enforced by hardware via the EPCM. The EPCM protections
>>>> + * can be directly extended by the enclave, i.e. cannot be relied upon by the
>>>> + * kernel for security guarantees of any kind.
>>>> + *
>>>> + * Return:
>>>> + *   0 on success,
>>>> + *   -EACCES if the mapping is disallowed
>>>> + */
>>>> +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
>>>> +                    unsigned long end, unsigned long prot)
>>>> +{
>>>> +        struct sgx_encl_page *page;
>>>> +        unsigned long addr;
>>>> +
>>>> +        prot &= (VM_READ | VM_WRITE | VM_EXEC);
>>>> +        if (!prot || !encl)
>>>> +                return 0;
>>>> +
>>>> +        mutex_lock(&encl->lock);
>>>> +
>>>> +        for (addr = start; addr < end; addr += PAGE_SIZE) {
>>>> +                page = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
>>>> +
>>>> +                /*
>>>> +                 * Do not allow R|W|X to a non-existent page, or protections
>>>> +                 * beyond those of the existing enclave page.
>>>> +                 */
>>>> +                if (!page || (prot & ~page->prot))
>>>> +                        return -EACCES;
>>>
>>> In SGX2, pages will be "mapped" before being populated.
>>>
>>> Here's a brief summary for those who don't have enough background on
>>> how new EPC pages could be added to a running enclave in SGX2:
>>> - There are 2 new instructions - EACCEPT and EAUG.
>>> - EAUG is used by SGX module to add (augment) a new page to an
>>>   existing enclave. The newly added page is *inaccessible* until the
>>>   enclave *accepts* it.
>>> - EACCEPT is the instruction for an enclave to accept a new page.
>>>
>>> And the s/w flow for an enclave to request new EPC pages is expected
>>> to be something like the following:
>>> - The enclave issues EACCEPT at the linear address that it would
>>>   like a new page.
>>> - EACCEPT results in #PF, as there's no page at the linear address
>>>   above.
>>> - SGX module is notified about the #PF, in form of its
>>>   vma->vm_ops->fault() being called by kernel.
>>> - SGX module EAUGs a new EPC page at the fault address, and resumes
>>>   the enclave.
>>> - EACCEPT is reattempted, and succeeds at this time.
>>
>> This seems like an odd workflow.
>> Shouldn't the #PF return back to
>> untrusted userspace so that the untrusted user code can make its own
>> decision as to whether it wants to EAUG a page there as opposed to, say,
>> killing the enclave or waiting to keep resource usage under control?
>
> This may seem odd to some at first glance. But if you think of how a
> static heap (pre-allocated by EADD before EINIT) works, the loader parses
> the "metadata" coming with the enclave to decide the address/size of the
> heap, EADDs it, and calls it done. In the case of a "dynamic" heap
> (allocated dynamically by EAUG after EINIT), the same thing applies - the
> loader determines the range of the heap, tells the SGX module about it,
> and calls it done. Everything else is between the enclave and the SGX
> module.
>
> In practice, untrusted code usually doesn't know much about enclaves,
> just like it doesn't know much about the shared objects loaded into its
> address space either. Without the necessary knowledge, untrusted code
> usually just does what it is told (via o-calls, or return value from
> e-calls), without judging whether that's right or wrong.
>
> When it comes to #PF like what I described, of course a signal could be
> sent to the untrusted code but what would it do then? Usually it'd just
> come back asking for a page at the fault address. So we figured it'd be
> more efficient to just have the kernel EAUG at #PF.
>
> Please don't get me wrong though, as I'm not dictating what the s/w flow
> shall be. It's just going to be a choice offered to user mode. And that
> choice was planned to be offered via mprotect() - i.e. a writable vma
> causes kernel to EAUG while a non-writable vma will result in a signal
> (then the user mode could decide whether to EAUG). The key point is
> flexibility - as we want to allow all reasonable s/w flows instead of
> dictating one over others. We had similar discussions on vDSO API before.
> And I think you accepted my approach because of its flexibility.
> Am I right?

As long as user code can turn this off, I have no real objection. But it
might make sense to have it be more explicit: have an ioctl set up a
range as "EAUG-on-demand".

But this is all currently irrelevant. We can argue about it when the
patches show up. :)
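
For illustration only, here is a rough userspace toy model in C of the two
behaviors discussed in this thread: the protection check from the quoted
sgx_map_allowed(), and a fault path that either EAUGs on demand or hands
the fault back to userspace. It is not kernel code. Every toy_* name, the
fixed-size page array, and the per-page eaug_on_demand flag (standing in
for the suggested ioctl) are assumptions made for the sketch, as is the
choice to give a freshly added page read/write protections.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)
#define TOY_NR_PAGES   16UL            /* size of the toy enclave, in pages */

#define TOY_READ  0x1UL
#define TOY_WRITE 0x2UL
#define TOY_EXEC  0x4UL

/* Hypothetical per-page state; loosely stands in for struct sgx_encl_page. */
struct toy_page {
    bool present;                      /* page has been added to the enclave */
    unsigned long prot;                /* maximal protections for mappings */
};

struct toy_encl {
    struct toy_page pages[TOY_NR_PAGES];
    bool eaug_on_demand[TOY_NR_PAGES]; /* opt-in flag, set per page */
};

enum toy_fault_action { TOY_FAULT_EAUG, TOY_FAULT_SIGNAL };

/* Same rule as the quoted check: 0 if allowed, -EACCES otherwise. */
int toy_map_allowed(const struct toy_encl *encl, unsigned long start,
                    unsigned long end, unsigned long prot)
{
    unsigned long addr;

    prot &= TOY_READ | TOY_WRITE | TOY_EXEC;
    if (!prot || !encl)
        return 0;

    assert(start <= end);

    for (addr = start; addr < end; addr += TOY_PAGE_SIZE) {
        unsigned long idx = addr >> TOY_PAGE_SHIFT;

        /* Deny non-existent pages and protections beyond the page's own. */
        if (idx >= TOY_NR_PAGES)
            return -EACCES;
        if (!encl->pages[idx].present || (prot & ~encl->pages[idx].prot))
            return -EACCES;
    }
    return 0;
}

/* Explicitly mark [start, end) as EAUG-on-demand, or turn it back off. */
int toy_set_eaug_on_demand(struct toy_encl *encl, unsigned long start,
                           unsigned long end, bool enable)
{
    unsigned long addr;

    if (!encl || start > end || end > TOY_NR_PAGES * TOY_PAGE_SIZE)
        return -EINVAL;

    for (addr = start; addr < end; addr += TOY_PAGE_SIZE)
        encl->eaug_on_demand[addr >> TOY_PAGE_SHIFT] = enable;
    return 0;
}

/* Decide what a fault on a missing page does under the current policy. */
enum toy_fault_action toy_handle_fault(struct toy_encl *encl,
                                       unsigned long addr)
{
    unsigned long idx = addr >> TOY_PAGE_SHIFT;

    if (idx < TOY_NR_PAGES && !encl->pages[idx].present &&
        encl->eaug_on_demand[idx]) {
        /* Stand-in for EAUG; the enclave would then retry EACCEPT. */
        encl->pages[idx].present = true;
        encl->pages[idx].prot = TOY_READ | TOY_WRITE;
        return TOY_FAULT_EAUG;
    }

    /* Not opted in: untrusted userspace gets to decide what happens. */
    return TOY_FAULT_SIGNAL;
}
```

With a zero-initialized struct toy_encl, a fault on any page reports
TOY_FAULT_SIGNAL until toy_set_eaug_on_demand() has covered that page, so
the default stays "userspace decides" and the in-kernel EAUG is strictly
opt-in.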