From: Andy Lutomirski
Date: Wed, 15 May 2019 11:27:04 -0700
Subject: SGX vs LSM (Re: [PATCH v20 00/28] Intel SGX1 support)
To: Sean Christopherson, James Morris, "Serge E. Hallyn", LSM List,
    Paul Moore, Stephen Smalley, Eric Paris, selinux@vger.kernel.org
Cc: Andy Lutomirski, Jarkko Sakkinen, Jethro Beekman, "Xing, Cedric",
    "Hansen, Dave", Thomas Gleixner, "Dr. Greg", Linus Torvalds, LKML,
    X86 ML, "linux-sgx@vger.kernel.org", Andrew Morton,
    "nhorman@redhat.com", "npmccallum@redhat.com", "Ayoun, Serge",
    "Katz-zamir, Shay", "Huang, Haitao", Andy Shevchenko, "Svahn, Kai",
    Borislav Petkov, Josh Triplett, "Huang, Kai", David Rientjes

Hi, LSM and SELinux people-

We're trying to figure out how SGX fits in with LSMs.  For background,
an SGX library is functionally a bit like a DSO, except that it's
nominally resistant to attack from outside and the process of loading
it is complicated.  To load an enclave, a program can open
/dev/sgx/enclave, do some ioctls to load the code and data segments
into the enclave, call a special ioctl to "initialize" the enclave,
and then call into the enclave (using special CPU instructions).

One nastiness is that there is not actually a universally agreed upon,
documented file format for enclaves.
Windows has an undocumented format, and there are probably a few
others out there.  No one really wants to teach the kernel to parse
enclave files.

There are two issues with how this interacts with LSMs:

1) LSMs might want to be able to whitelist, blacklist, or otherwise
   restrict what enclaves can run at all.  The current proposal that
   everyone seems to dislike the least is to have a .sigstruct file on
   disk that contains a hash and signature of the enclave in a
   CPU-defined format.  To initialize an enclave, a program will pass
   an fd to this file, and a new LSM hook can be called to allow or
   disallow the operation.  In a SELinux context, the idea is that
   policy could require the .sigstruct file to be labeled with a type
   like sgx_sigstruct_t, and only enclaves that have a matching
   .sigstruct with such a label could run.

2) Just like any other DSO, there are potential issues with how
   enclaves deal with writable vs executable memory.  This takes two
   forms.  First, a task should probably require EXECMEM, EXECMOD, or
   similar permission to run an enclave that can modify its own text.
   Second, it would be nice if a task that did *not* have EXECMEM,
   EXECMOD, or similar could still run the enclave if it had EXECUTE
   permission on the file containing the enclave.

Currently, this all works because DSOs are run by mmapping the file to
create multiple VMAs, some of which are executable, non-writable, and
non-CoWed, and some of which are writable but not executable.  With
SGX, there's only really one inode per enclave (the anon_inode that
comes from /dev/sgx/enclave), and it can only be sensibly mapped
MAP_SHARED.

With the current version of the SGX driver, to run an enclave, I think
you'll need either EXECUTE rights to /dev/sgx/enclave or EXECMOD or
similar, all of which more or less mean that you can run any modified
code you want, and none of which is useful to prevent enclaves from
containing RWX segments.

So my question is: what, if anything, should change to make this work
better?
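
To make issue 1 above a little more concrete, here is a toy Python
model of the .sigstruct idea.  It is only a sketch under assumptions:
SigstructFile, measure() and enclave_init_allowed() are hypothetical
names, plain SHA-256 stands in for the CPU-defined hash, the signature
itself is not modeled, and the hash comparison is included only to
illustrate what "matching" means (on real hardware the CPU is expected
to do that comparison when the enclave is initialized).  The only name
taken from the text is the sgx_sigstruct_t type.

```python
import hashlib
from dataclasses import dataclass

# File types that policy accepts for a .sigstruct file.  The type name comes
# from the message above; the set itself is an illustrative assumption.
ALLOWED_SIGSTRUCT_TYPES = {"sgx_sigstruct_t"}


@dataclass(frozen=True)
class SigstructFile:
    """Hypothetical view of an opened .sigstruct file as a hook might see it."""
    selinux_type: str    # security label (type) of the file on disk
    enclave_hash: bytes  # hash of the enclave that this sigstruct vouches for


def measure(pages):
    """Stand-in for the enclave measurement: SHA-256 over pages in load order."""
    digest = hashlib.sha256()
    for page in pages:
        digest.update(page)
    return digest.digest()


def enclave_init_allowed(sigstruct, pages):
    """Allow initialization only for a suitably labeled, matching sigstruct."""
    if sigstruct.selinux_type not in ALLOWED_SIGSTRUCT_TYPES:
        return False  # policy does not trust files with this label
    return sigstruct.enclave_hash == measure(pages)
```

With this model, a sigstruct labeled sgx_sigstruct_t whose hash matches
the loaded pages is accepted, while the same content under a different
label, or a correctly labeled sigstruct paired with different pages, is
refused.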

Here's a very vague proposal that's kind of like what I've been
thinking over the past few days.  The SGX inode could track, for each
page, a "safe-to-execute" bit.  When you first open /dev/sgx/enclave,
you get a blank enclave and all pages are safe-to-execute.  When you
do the ioctl to load content (which could be code, data, or anything
else), the kernel will check whether the *source* VMA is executable
and, if not, mark the page of the enclave being loaded as unsafe.
Once the enclave is initialized, the driver will clear the
safe-to-execute bit for any page that is successfully mapped writably.

The intent is that a page of the enclave is safe-to-execute if that
page was populated from executable memory and not modified since then.
LSMs could then enforce a policy that you can map an enclave page RX
if the page is safe-to-execute, you can map any page you want for
write if there are no executable mappings, and you can only map a page
for write and execute simultaneously if you have EXECMOD permission.
This should allow an enclave to be loaded by userspace from a file
with EXECUTE rights.

So here are my questions:

Are the goals I mentioned reasonable?

Is the design I just outlined reasonable?  Would SELinux support this?

Is there a better solution that works well enough?

Thanks, all!

> On May 14, 2019, at 6:30 PM, Sean Christopherson wrote:
>
>> But thinking this all through, it's a bit more complicated than any of
>> this.  Looking at the SELinux code for inspiration, there are quite a
>> few paths, but they boil down to two cases: EXECUTE is the right to
>> map an unmodified file executably, and EXECMOD/EXECMEM (the
>> distinction seems mostly irrelevant) is the right to create (via mmap
>> or mprotect) a modified anonymous file mapping or a non-file-backed
>> mapping that is executable.
>> So, if we do nothing, then mapping an
>> enclave with execute permission will require either EXECUTE on the
>> enclave inode or EXECMOD/EXECMEM, depending on exactly how this gets
>> set up.
>
> If we do literally nothing, then I'm pretty sure mapping an enclave will
> require PROCESS__EXECMEM.  The mmap() for the actual enclave is done
> using an anon inode, e.g. from /dev/sgx/enclave.  Anon inodes are marked
> private, which means inode_has_perm() will always return "success".  The
> only effective check is in file_map_prot_check() when default_noexec is
> true, in which case requesting PROT_EXEC on private inodes requires
> PROCESS__EXECMEM.
>
>> So all is well, sort of.  The problem is that I expect there will be
>> people who want enclaves to work in a process that does not have these
>> rights.  To make this work, we probably need to do some surgery on
>> SELinux.  ISTM the act of copying (via the EADD ioctl) data from a
>> PROT_EXEC mapping to an enclave should not be construed as "modifying"
>> the enclave for SELinux purposes.  Actually doing this could be
>> awkward, since the same inode will have executable parts and
>> non-executable parts, and SELinux can't really tell the difference.
>
> Rather than do surgery on SELinux, why not go with Cedric's original
> proposal and propagate the permissions from the source VMA to the EPC
> VMA?

Which EPC VMA?  Users can map the enclave fd again after EADD,
resulting in a new VMA.  And any realistic enclave will presumably
have RO, RW, and RX pages.

> The enclave mmap() from userspace could then be done with RO
> permissions so as to not run afoul of LSMs.  Adding PROT_EXEC after
> EADD would require PROCESS__EXECMEM, but that's in line with mprotect()
> on regular memory.

How does this help anything?  The driver currently only works with
EXECMEM and, with this change, it still needs EXECMEM.
I think that, if we're going to make changes along these lines, the
goal should be that you can have an enclave serialized in a file on
disk such that you have EXECUTE on the file, and you should be able to
load and run the enclave without needing EXECMEM.  (Unless the enclave
is self-modifying, of course.)
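
For illustration, the "safe-to-execute" proposal from earlier in this
message can be modeled as a small Python state machine.  This is a
sketch under assumptions, not driver code: the Enclave class and its
method names are hypothetical, "no executable mappings" is read as
being tracked per page, and an execute-only mapping of an unsafe page
(or a writable mapping of a page that is already mapped executable) is
assumed to need EXECMOD, which the text above does not spell out.

```python
from dataclasses import dataclass, field

PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4


@dataclass
class Enclave:
    """Toy stand-in for the per-enclave inode state; all names are hypothetical."""
    initialized: bool = False
    unsafe: set = field(default_factory=set)       # pages NOT safe-to-execute
    exec_mapped: set = field(default_factory=set)  # pages mapped with PROT_EXEC

    def add_page(self, page, source_vma_prot):
        # Loading a page: it stays safe-to-execute only if the *source* VMA
        # was executable; otherwise it is marked unsafe.
        if not source_vma_prot & PROT_EXEC:
            self.unsafe.add(page)

    def init(self):
        self.initialized = True

    def map_page(self, page, prot, has_execmod=False):
        """Return True if the mapping is allowed, updating the tracked state."""
        wants_w = bool(prot & PROT_WRITE)
        wants_x = bool(prot & PROT_EXEC)
        if wants_w and wants_x:
            allowed = has_execmod  # simultaneous W+X always needs EXECMOD
        elif wants_x:
            # RX is fine on a safe page (assumed: EXECMOD needed otherwise)
            allowed = page not in self.unsafe or has_execmod
        elif wants_w:
            # Write is fine if the page has no executable mapping
            # (assumed: EXECMOD needed otherwise)
            allowed = page not in self.exec_mapped or has_execmod
        else:
            allowed = True
        if not allowed:
            return False
        if wants_w and self.initialized:
            self.unsafe.add(page)  # writable after init: no longer safe
        if wants_x:
            self.exec_mapped.add(page)
        return True
```

In this model, a page loaded from an executable source VMA can be
mapped RX by a task without EXECMOD, a page loaded from a
non-executable VMA cannot, and any RWX mapping requires EXECMOD, which
is the behavior the goal in the last paragraph asks for.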