2016-04-25 17:37:08

by Jarkko Sakkinen

Subject: [PATCH 0/6] Intel Software Guard Extensions

Intel(R) SGX is a set of CPU instructions that applications can use to
set aside private regions of code and data called enclaves. CPU access
control prevents code outside the enclave from accessing memory inside
the enclave.

The firmware uses PRMRR registers to reserve an area of physical memory
called the Enclave Page Cache (EPC). A hardware unit in the processor
called the Memory Encryption Engine (MEE) encrypts and decrypts the EPC
pages as they leave and enter the processor package.
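
As a rough illustration of the feature bit that patch 1/6 adds, the sketch
below decodes the (9*32 + 2) encoding into a cpufeature word and bit. The
SKETCH_ name and the helper functions are hypothetical and not part of the
series; the word9 argument is assumed to hold the EBX value returned by
CPUID leaf 7, subleaf 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the cpufeature defines: word * 32 + bit. */
#define SKETCH_FEATURE_SGX (9 * 32 + 2)

static_assert(SKETCH_FEATURE_SGX / 32 == 9, "SGX lives in cpufeature word 9");

/* Index of the 32-bit cpufeature word that holds the flag. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the flag inside that word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/*
 * word9 is assumed to be EBX from CPUID.(EAX=7, ECX=0). Returns true when
 * the CPU reports the SGX feature flag.
 */
static bool sgx_reported(uint32_t word9)
{
	return (word9 >> feature_bit(SKETCH_FEATURE_SGX)) & 1u;
}
```

The flag only says that the CPU reports SGX; whether the firmware has
actually enabled it is a separate question that this sketch does not cover.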

Jarkko Sakkinen (5):
x86, sgx: common macros and definitions
intel_sgx: driver for Intel Software Guard eXtensions
intel_sgx: ptrace() support for the driver
intel_sgx: driver documentation
intel_sgx: TODO file for the staging area

Kai Huang (1):
x86: add SGX definition to cpufeature

Documentation/x86/intel_sgx.txt | 86 +++
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/include/asm/sgx.h | 253 +++++++
drivers/staging/Kconfig | 2 +
drivers/staging/Makefile | 1 +
drivers/staging/intel_sgx/Kconfig | 13 +
drivers/staging/intel_sgx/Makefile | 12 +
drivers/staging/intel_sgx/TODO | 25 +
drivers/staging/intel_sgx/isgx.h | 238 +++++++
drivers/staging/intel_sgx/isgx_compat_ioctl.c | 179 +++++
drivers/staging/intel_sgx/isgx_ioctl.c | 926 ++++++++++++++++++++++++++
drivers/staging/intel_sgx/isgx_main.c | 369 ++++++++++
drivers/staging/intel_sgx/isgx_page_cache.c | 485 ++++++++++++++
drivers/staging/intel_sgx/isgx_user.h | 113 ++++
drivers/staging/intel_sgx/isgx_util.c | 334 ++++++++++
drivers/staging/intel_sgx/isgx_vma.c | 400 +++++++++++
16 files changed, 3437 insertions(+)
create mode 100644 Documentation/x86/intel_sgx.txt
create mode 100644 arch/x86/include/asm/sgx.h
create mode 100644 drivers/staging/intel_sgx/Kconfig
create mode 100644 drivers/staging/intel_sgx/Makefile
create mode 100644 drivers/staging/intel_sgx/TODO
create mode 100644 drivers/staging/intel_sgx/isgx.h
create mode 100644 drivers/staging/intel_sgx/isgx_compat_ioctl.c
create mode 100644 drivers/staging/intel_sgx/isgx_ioctl.c
create mode 100644 drivers/staging/intel_sgx/isgx_main.c
create mode 100644 drivers/staging/intel_sgx/isgx_page_cache.c
create mode 100644 drivers/staging/intel_sgx/isgx_user.h
create mode 100644 drivers/staging/intel_sgx/isgx_util.c
create mode 100644 drivers/staging/intel_sgx/isgx_vma.c

--
2.7.4


2016-04-25 17:37:38

by Jarkko Sakkinen

Subject: [PATCH 1/6] x86: add SGX definition to cpufeature

From: Kai Huang <[email protected]>

Signed-off-by: Kai Huang <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7ad8c94..f6be49f 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -208,6 +208,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3b */
+#define X86_FEATURE_SGX ( 9*32+ 2) /* Software Guard Extensions */
#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */
#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */
#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */
--
2.7.4

2016-04-25 17:37:50

by Jarkko Sakkinen

Subject: [PATCH 2/6] x86, sgx: common macros and definitions

Add arch/x86/include/asm/sgx.h, which contains common architectural
macros and definitions that are shared by the SGX driver and the kernel
virtualization code.
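
For illustration only, here is a user-space sketch of how the SECINFO
masks defined in this header partition the 64-bit flags field. GENMASK64
and the three helper functions are hypothetical stand-ins (the kernel
uses GENMASK_ULL), and the assumption that the page type field carries
the sgx_page_type values is mine, not something this patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set, inclusive. */
#define GENMASK64(h, l) (((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Same bit ranges as enum sgx_secinfo_masks in the header. */
#define SECINFO_PERMISSION_MASK GENMASK64(2, 0)
#define SECINFO_PAGE_TYPE_MASK  GENMASK64(15, 8)
#define SECINFO_RESERVED_MASK   (GENMASK64(7, 3) | GENMASK64(63, 16))

/* The three masks are disjoint and together cover every bit. */
static_assert((SECINFO_PERMISSION_MASK | SECINFO_PAGE_TYPE_MASK |
	       SECINFO_RESERVED_MASK) == ~0ULL, "masks must cover 64 bits");

/* A flags value is acceptable only if no reserved bit is set. */
static bool secinfo_flags_valid(uint64_t flags)
{
	return (flags & SECINFO_RESERVED_MASK) == 0;
}

/* Page type lives in bits 15:8 (assumed to hold sgx_page_type values). */
static unsigned int secinfo_page_type(uint64_t flags)
{
	return (unsigned int)((flags & SECINFO_PAGE_TYPE_MASK) >> 8);
}

/* Permission bits live in bits 2:0. */
static unsigned int secinfo_permissions(uint64_t flags)
{
	return (unsigned int)(flags & SECINFO_PERMISSION_MASK);
}
```

With these helpers, a flags value of 0x0203 decodes to page type 0x02
(SGX_PAGE_TYPE_REG) with permission bits 0x3, while any value with bit 3
set is rejected as using a reserved bit.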

Signed-off-by: Jarkko Sakkinen <[email protected]>
---
arch/x86/include/asm/sgx.h | 253 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 253 insertions(+)
create mode 100644 arch/x86/include/asm/sgx.h

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
new file mode 100644
index 0000000..ef9f20f
--- /dev/null
+++ b/arch/x86/include/asm/sgx.h
@@ -0,0 +1,253 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef _ASM_X86_SGX_H
+#define _ASM_X86_SGX_H
+
+#include <asm/asm.h>
+#include <linux/bitops.h>
+#include <linux/types.h>
+
+#define SGX_CPUID 0x12
+
+enum sgx_page_type {
+ SGX_PAGE_TYPE_SECS = 0x00,
+ SGX_PAGE_TYPE_TCS = 0x01,
+ SGX_PAGE_TYPE_REG = 0x02,
+ SGX_PAGE_TYPE_VA = 0x03,
+};
+
+enum sgx_secs_attributes {
+ SGX_SECS_A_DEBUG = BIT_ULL(1),
+ SGX_SECS_A_MODE64BIT = BIT_ULL(2),
+ SGX_SECS_A_PROVISION_KEY = BIT_ULL(4),
+ SGX_SECS_A_LICENSE_KEY = BIT_ULL(5),
+ SGX_SECS_A_RESERVED_MASK = (BIT_ULL(0) |
+ BIT_ULL(3) |
+ GENMASK_ULL(63, 6)),
+};
+
+#define SGX_SECS_RESERVED1_SIZE 28
+#define SGX_SECS_RESERVED2_SIZE 32
+#define SGX_SECS_RESERVED3_SIZE 96
+#define SGX_SECS_RESERVED4_SIZE 3836
+
+struct sgx_secs {
+ u64 size;
+ u64 base;
+ u32 ssaframesize;
+ uint8_t reserved1[SGX_SECS_RESERVED1_SIZE];
+ u64 flags;
+ u64 xfrm;
+ u32 mrenclave[8];
+ uint8_t reserved2[SGX_SECS_RESERVED2_SIZE];
+ u32 mrsigner[8];
+ uint8_t reserved3[SGX_SECS_RESERVED3_SIZE];
+ u16 isvvprodid;
+ u16 isvsvn;
+ uint8_t reserved[SGX_SECS_RESERVED4_SIZE];
+};
+
+struct sgx_tcs {
+ u64 state;
+ u64 flags;
+ u64 ossa;
+ u32 cssa;
+ u32 nssa;
+ u64 oentry;
+ u64 aep;
+ u64 ofsbase;
+ u64 ogsbase;
+ u32 fslimit;
+ u32 gslimit;
+ u64 reserved[503];
+};
+
+enum sgx_secinfo_masks {
+ ISGX_SECINFO_PERMISSION_MASK = GENMASK_ULL(2, 0),
+ ISGX_SECINFO_PAGE_TYPE_MASK = GENMASK_ULL(15, 8),
+ ISGX_SECINFO_RESERVED_MASK = (GENMASK_ULL(7, 3) |
+ GENMASK_ULL(63, 16)),
+};
+
+struct sgx_pcmd {
+ struct isgx_secinfo secinfo;
+ u64 enclave_id;
+ u8 reserved[40];
+ u8 mac[16];
+};
+
+struct sgx_page_info {
+ u64 linaddr;
+ u64 srcpge;
+ union {
+ u64 secinfo;
+ u64 pcmd;
+ };
+ u64 secs;
+} __aligned(32);
+
+#define SIGSTRUCT_SIZE 1808
+#define EINITTOKEN_SIZE 304
+
+enum {
+ ECREATE = 0x0,
+ EADD = 0x1,
+ EINIT = 0x2,
+ EREMOVE = 0x3,
+ EDGBRD = 0x4,
+ EDGBWR = 0x5,
+ EEXTEND = 0x6,
+ ELDU = 0x8,
+ EBLOCK = 0x9,
+ EPA = 0xA,
+ EWB = 0xB,
+ ETRACK = 0xC,
+};
+
+#define __encls_ret(rax, rbx, rcx, rdx) \
+ ({ \
+ int ret; \
+ asm volatile( \
+ "1: .byte 0x0f, 0x01, 0xcf;\n\t" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=a"(ret) \
+ : "a"(rax), "b"(rbx), "c"(rcx), "d"(rdx) \
+ : "memory"); \
+ ret; \
+ })
+
+#ifdef CONFIG_X86_64
+#define __encls(rax, rbx, rcx, rdx...) \
+ ({ \
+ int ret; \
+ asm volatile( \
+ "1: .byte 0x0f, 0x01, 0xcf;\n\t" \
+ " xor %%eax,%%eax;\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: movq $-1,%%rax\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=a"(ret), "=b"(rbx), "=c"(rcx) \
+ : "a"(rax), "b"(rbx), "c"(rcx), rdx \
+ : "memory"); \
+ ret; \
+ })
+#else
+#define __encls(rax, rbx, rcx, rdx...) \
+ ({ \
+ int ret; \
+ asm volatile( \
+ "1: .byte 0x0f, 0x01, 0xcf;\n\t" \
+ " xor %%eax,%%eax;\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov $-1,%%eax\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=a"(ret), "=b"(rbx), "=c"(rcx) \
+ : "a"(rax), "b"(rbx), "c"(rcx), rdx \
+ : "memory"); \
+ ret; \
+ })
+#endif
+
+static inline unsigned long __ecreate(struct sgx_page_info *pginfo, void *secs)
+{
+ return __encls(ECREATE, pginfo, secs, "d"(0));
+}
+
+static inline int __eextend(void *secs, void *epc)
+{
+ return __encls(EEXTEND, secs, epc, "d"(0));
+}
+
+static inline int __eadd(struct sgx_page_info *pginfo, void *epc)
+{
+ return __encls(EADD, pginfo, epc, "d"(0));
+}
+
+static inline int __einit(void *sigstruct, struct isgx_einittoken *einittoken,
+ void *secs)
+{
+ return __encls_ret(EINIT, sigstruct, secs, einittoken);
+}
+
+static inline int __eremove(void *epc)
+{
+ unsigned long rbx = 0;
+ unsigned long rdx = 0;
+
+ return __encls_ret(EREMOVE, rbx, epc, rdx);
+}
+
+static inline int __edbgwr(void *epc, unsigned long *data)
+{
+ return __encls(EDGBWR, *data, epc, "d"(0));
+}
+
+static inline int __edbgrd(void *epc, unsigned long *data)
+{
+ unsigned long rbx = 0;
+ int ret;
+
+ ret = __encls(EDGBRD, rbx, epc, "d"(0));
+ if (!ret)
+ *(unsigned long *) data = rbx;
+
+ return ret;
+}
+
+static inline int __etrack(void *epc)
+{
+ unsigned long rbx = 0;
+ unsigned long rdx = 0;
+
+ return __encls_ret(ETRACK, rbx, epc, rdx);
+}
+
+static inline int __eldu(unsigned long rbx, unsigned long rcx,
+ unsigned long rdx)
+{
+ return __encls_ret(ELDU, rbx, rcx, rdx);
+}
+
+static inline int __eblock(unsigned long rcx)
+{
+ unsigned long rbx = 0;
+ unsigned long rdx = 0;
+
+ return __encls_ret(EBLOCK, rbx, rcx, rdx);
+}
+
+static inline int __epa(void *epc)
+{
+ unsigned long rbx = SGX_PAGE_TYPE_VA;
+
+ return __encls(EPA, rbx, epc, "d"(0));
+}
+
+static inline int __ewb(struct sgx_page_info *pginfo, void *epc, void *va)
+{
+ return __encls_ret(EWB, pginfo, epc, va);
+}
+
+#endif /* _ASM_X86_SGX_H */
--
2.7.4

2016-04-25 17:38:15

by Jarkko Sakkinen

Subject: [PATCH 4/6] intel_sgx: ptrace() support for the driver

This commit implements the 'access' callback for the enclave VMA, which
enables reading and writing the memory of debug enclaves. The accessed
page is first faulted in and marked as reserved so that the EPC evictor
knows not to swap it out while it is being manipulated.
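
The word-at-a-time splitting that the callback performs can be sketched
in user space as follows. This mirrors the offset/align/cnt arithmetic
of isgx_vma_access_word() but is not the driver code: next_chunk() and
count_chunks() are hypothetical helpers, and the sketch assumes 4 KiB
pages and an 8-byte debug access word (the driver uses
sizeof(unsigned long)).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define SKETCH_WORD_SIZE 8UL	/* assumption: 64-bit unsigned long */

struct word_chunk {
	unsigned long offset;	/* word-aligned offset within the page */
	unsigned long align;	/* byte position of the address in that word */
	unsigned long cnt;	/* bytes of this word covered by the request */
};

/* Describe the word that holds byte i of a len-byte access at addr. */
static struct word_chunk next_chunk(unsigned long addr, size_t len, size_t i)
{
	struct word_chunk c;
	size_t remaining;

	assert(i < len);
	remaining = len - i;

	c.offset = ((addr + i) & (SKETCH_PAGE_SIZE - 1)) &
		   ~(SKETCH_WORD_SIZE - 1);
	c.align = (addr + i) & (SKETCH_WORD_SIZE - 1);
	c.cnt = SKETCH_WORD_SIZE - c.align;
	if (c.cnt > remaining)
		c.cnt = remaining;

	return c;
}

/* Number of word-sized debug accesses needed to cover the whole request. */
static size_t count_chunks(unsigned long addr, size_t len)
{
	size_t i = 0;
	size_t n = 0;

	while (i < len) {
		i += next_chunk(addr, len, i).cnt;
		n++;
	}

	return n;
}
```

A write that covers only part of a word (align != 0 or cnt smaller than
the word size) is why the write path in the patch first reads the word
with EDBGRD, patches the affected bytes, and then writes it back with
EDBGWR.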

Signed-off-by: Jarkko Sakkinen <[email protected]>
---
drivers/staging/intel_sgx/isgx_vma.c | 118 +++++++++++++++++++++++++++++++++++
1 file changed, 118 insertions(+)

diff --git a/drivers/staging/intel_sgx/isgx_vma.c b/drivers/staging/intel_sgx/isgx_vma.c
index f6cfb02..2788ab9 100644
--- a/drivers/staging/intel_sgx/isgx_vma.c
+++ b/drivers/staging/intel_sgx/isgx_vma.c
@@ -275,8 +275,126 @@ static int isgx_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
return VM_FAULT_SIGBUS;
}

+static inline int isgx_vma_access_word(struct isgx_enclave *enclave,
+ unsigned long addr,
+ void *buf,
+ int len,
+ int write,
+ struct isgx_enclave_page *enclave_page,
+ int i)
+{
+ char data[sizeof(unsigned long)];
+ int align, cnt, offset;
+ void *vaddr;
+ int ret;
+
+ offset = ((addr + i) & (PAGE_SIZE - 1)) & ~(sizeof(unsigned long) - 1);
+ align = (addr + i) & (sizeof(unsigned long) - 1);
+ cnt = sizeof(unsigned long) - align;
+ cnt = min(cnt, len - i);
+
+ if (write) {
+ if (enclave_page->flags & ISGX_ENCLAVE_PAGE_TCS &&
+ (offset < 8 || (offset + (len - i)) > 16))
+ return -ECANCELED;
+
+ if (align || (cnt != sizeof(unsigned long))) {
+ vaddr = isgx_get_epc_page(enclave_page->epc_page);
+ ret = __edbgrd((void *)((unsigned long)vaddr + offset),
+ (unsigned long *)data);
+ isgx_put_epc_page(vaddr);
+ if (ret) {
+ isgx_dbg(enclave, "EDBGRD returned %d\n", ret);
+ return -EFAULT;
+ }
+ }
+
+ memcpy(data + align, buf + i, cnt);
+ vaddr = isgx_get_epc_page(enclave_page->epc_page);
+ ret = __edbgwr((void *)((unsigned long)vaddr + offset),
+ (unsigned long *)data);
+ isgx_put_epc_page(vaddr);
+ if (ret) {
+ isgx_dbg(enclave, "EDBGWR returned %d\n", ret);
+ return -EFAULT;
+ }
+ } else {
+ if (enclave_page->flags & ISGX_ENCLAVE_PAGE_TCS &&
+ (offset + (len - i)) > 72)
+ return -ECANCELED;
+
+ vaddr = isgx_get_epc_page(enclave_page->epc_page);
+ ret = __edbgrd((void *)((unsigned long)vaddr + offset),
+ (unsigned long *)data);
+ isgx_put_epc_page(vaddr);
+ if (ret) {
+ isgx_dbg(enclave, "EDBGRD returned %d\n", ret);
+ return -EFAULT;
+ }
+
+ memcpy(buf + i, data + align, cnt);
+ }
+
+ return cnt;
+}
+
+static int isgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
+ void *buf, int len, int write)
+{
+ struct isgx_enclave *enclave = vma->vm_private_data;
+ struct isgx_enclave_page *entry = NULL;
+ const char *op_str = write ? "EDBGWR" : "EDBGRD";
+ int ret = 0;
+ int i;
+
+ /* If process was forked, VMA is still there but vm_private_data is set
+ * to NULL.
+ */
+ if (!enclave)
+ return -EFAULT;
+
+ if (!(enclave->flags & ISGX_ENCLAVE_DEBUG) ||
+ !(enclave->flags & ISGX_ENCLAVE_INITIALIZED) ||
+ (enclave->flags & ISGX_ENCLAVE_SUSPEND))
+ return -EFAULT;
+
+ isgx_dbg(enclave, "%s addr=0x%lx, len=%d\n", op_str, addr, len);
+
+ for (i = 0; i < len; i += ret) {
+ if (!entry || !((addr + i) & (PAGE_SIZE - 1))) {
+ if (entry)
+ entry->flags &= ~ISGX_ENCLAVE_PAGE_RESERVED;
+
+ do {
+ entry = isgx_vma_do_fault(
+ vma, (addr + i) & PAGE_MASK, true);
+ } while (entry == ERR_PTR(-EBUSY));
+
+ if (IS_ERR(entry)) {
+ ret = PTR_ERR(entry);
+ entry = NULL;
+ break;
+ }
+ }
+
+ /* No locks are needed because used fields are immutable after
+ * initialization.
+ */
+ ret = isgx_vma_access_word(enclave, addr, buf, len, write,
+ entry, i);
+ if (ret < 0)
+ break;
+ }
+
+ if (entry)
+ entry->flags &= ~ISGX_ENCLAVE_PAGE_RESERVED;
+
+ return (ret < 0 && ret != -ECANCELED) ? ret : i;
+}
+
struct vm_operations_struct isgx_vm_ops = {
.close = isgx_vma_close,
.open = isgx_vma_open,
.fault = isgx_vma_fault,
+ .access = isgx_vma_access,
};
--
2.7.4

2016-04-25 17:38:18

by Jarkko Sakkinen

Subject: [PATCH 5/6] intel_sgx: driver documentation

Signed-off-by: Jarkko Sakkinen <[email protected]>
---
Documentation/x86/intel_sgx.txt | 86 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 86 insertions(+)
create mode 100644 Documentation/x86/intel_sgx.txt

diff --git a/Documentation/x86/intel_sgx.txt b/Documentation/x86/intel_sgx.txt
new file mode 100644
index 0000000..f26b50b
--- /dev/null
+++ b/Documentation/x86/intel_sgx.txt
@@ -0,0 +1,86 @@
+1. Intel(R) SGX overview
+========================
+
+Intel(R) SGX is a set of CPU instructions that applications can use to set
+aside private regions of code and data called enclaves. CPU access control
+prevents code outside the enclave from accessing memory inside the enclave.
+
+Starting with the Skylake microarchitecture, the processor contains a hardware
+unit called the Memory Encryption Engine (MEE). The BIOS can define one or more
+MEE regions that hold enclave data by configuring them with PRMRR registers.
+
+The MEE automatically encrypts the data leaving the processor package to the MEE
+regions. The data is encrypted using a random key whose lifetime is exactly one
+power cycle.
+
+You can tell if your CPU supports SGX by looking into /proc/cpuinfo:
+
+ grep -w sgx /proc/cpuinfo
+
+2. Enclaves overview
+====================
+
+SGX defines new data types to maintain information about the enclaves and their
+security properties.
+
+The following data structures exist in MEE regions:
+
+* Enclave Page Cache (EPC): protected code and data
+* Enclave Page Cache Map (EPCM): meta-data for each EPC page
+
+The Enclave Page Cache can hold the following types of EPC pages:
+
+* SGX Enclave Control Structure (SECS): contains meta-data defining the global
+ properties of an enclave such as range of addresses it can access.
+* Regular EPC pages containing code and data for the enclave.
+* Thread Control Structure (TCS): defines an entry point for a hardware thread
+ to enter into the enclave. The enclave can only be entered through these entry
+ points.
+* Version Array (VA): an EPC page receives a unique version number when it is
+ evicted that is stored into a VA page. A VA page can hold up to 512 version
+ numbers.
+
+The EADD and EEXTEND leaf instructions are used to add pages to an enclave and
+to measure their contents.
+
+When initializing an enclave, a SIGSTRUCT that contains a signed measurement of
+the enclave binary must be provided to the EINIT leaf instruction. For so-called
+architectural enclaves (AEs), this structure is signed with the Intel Root of
+Trust.
+
+For normal application-specific enclaves, a cryptographic token called
+EINITTOKEN, signed with the Intel Root of Trust, must also be provided. An AE
+called the Launch Enclave (LE) generates this token for a given SIGSTRUCT
+instance: it checks whether the public key contained in SIGSTRUCT is
+whitelisted and, if it is, generates the EINITTOKEN.
+
+There is a special type of enclave called a debug enclave that is convenient
+while the enclave code is being developed. The memory of these enclaves can be
+read and written using the EDBGRD and EDBGWR leaf instructions. The kernel
+driver provides a ptrace() interface for enclaves by using these instructions.
+
+Another benefit of debug enclaves is that the LE ignores the whitelist and
+always generates an EINITTOKEN.
+
+3. IOCTL API
+============
+
+The ioctl API is defined in drivers/staging/intel_sgx/isgx_user.h.
+
+SGX_IOC_ENCLAVE_CREATE
+
+Creates a VMA and a SECS page for the enclave.
+
+SGX_IOC_ENCLAVE_ADD_PAGE
+
+Adds and measures a new EPC page for the enclave. The page must be in the range
+defined by SGX_IOC_ENCLAVE_CREATE. The ioctl copies the page data and queues it
+to a workqueue that eventually executes the EADD and EEXTEND leaf instructions,
+which add and measure the page.
+
+SGX_IOC_ENCLAVE_INIT
+
+Initializes an enclave given by SIGSTRUCT and EINITTOKEN. Executes the EINIT
+leaf instruction, which checks that the measurement matches the one in SIGSTRUCT
+and EINITTOKEN. EINITTOKEN is a data blob given by a special enclave called the
+Launch Enclave, and it is signed with a CPU's Launch Key.
--
2.7.4

2016-04-25 17:38:30

by Jarkko Sakkinen

Subject: [PATCH 3/6] intel_sgx: driver for Intel Software Guard eXtensions

Intel(R) SGX is a set of CPU instructions that applications can use to
set aside private regions of code and data called enclaves. CPU access
control prevents code outside the enclave from accessing memory inside
the enclave.

The Intel SGX driver provides an ioctl interface for loading and
initializing enclaves, and a pager that supports oversubscription.
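
As background for the pager, the Version Array slot bookkeeping in
isgx.h can be approximated in user space like this. It is a simplified
sketch rather than the driver code: struct va_page_sketch and the two
helpers are hypothetical, and a plain loop over a uint64_t array stands
in for the kernel bitmap helpers. It keeps the same convention of 512
slots of 8 bytes each, with an offset equal to the page size meaning
that no slot is free.

```c
#include <assert.h>
#include <stdint.h>

#define VA_SLOT_COUNT 512
#define VA_SLOT_SIZE  8
#define VA_PAGE_SIZE  (VA_SLOT_COUNT * VA_SLOT_SIZE)	/* 4096 bytes */

/* One bit per slot; a set bit means the slot is in use. */
struct va_page_sketch {
	uint64_t slots[VA_SLOT_COUNT / 64];
};

/*
 * Claim the first free slot and return its byte offset within the VA page.
 * Returns VA_PAGE_SIZE when every slot is taken.
 */
static unsigned int va_alloc_slot(struct va_page_sketch *page)
{
	unsigned int slot;

	for (slot = 0; slot < VA_SLOT_COUNT; slot++) {
		uint64_t bit = 1ULL << (slot % 64);

		if (!(page->slots[slot / 64] & bit)) {
			page->slots[slot / 64] |= bit;
			return slot * VA_SLOT_SIZE;
		}
	}

	return VA_PAGE_SIZE;
}

/* Release the slot that starts at the given byte offset. */
static void va_free_slot(struct va_page_sketch *page, unsigned int offset)
{
	unsigned int slot = offset / VA_SLOT_SIZE;

	assert(slot < VA_SLOT_COUNT);
	page->slots[slot / 64] &= ~(1ULL << (slot % 64));
}
```

Returning an offset rather than a slot index matches the header, where
the slot number is shifted left by 3, so the result can be added directly
to the address of the VA page.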

Signed-off-by: Jarkko Sakkinen <[email protected]>
---
arch/x86/include/asm/sgx.h | 4 +-
drivers/staging/Kconfig | 2 +
drivers/staging/Makefile | 1 +
drivers/staging/intel_sgx/Kconfig | 13 +
drivers/staging/intel_sgx/Makefile | 12 +
drivers/staging/intel_sgx/isgx.h | 238 +++++++
drivers/staging/intel_sgx/isgx_compat_ioctl.c | 179 +++++
drivers/staging/intel_sgx/isgx_ioctl.c | 926 ++++++++++++++++++++++++++
drivers/staging/intel_sgx/isgx_main.c | 369 ++++++++++
drivers/staging/intel_sgx/isgx_page_cache.c | 485 ++++++++++++++
drivers/staging/intel_sgx/isgx_user.h | 113 ++++
drivers/staging/intel_sgx/isgx_util.c | 334 ++++++++++
drivers/staging/intel_sgx/isgx_vma.c | 282 ++++++++
13 files changed, 2956 insertions(+), 2 deletions(-)
create mode 100644 drivers/staging/intel_sgx/Kconfig
create mode 100644 drivers/staging/intel_sgx/Makefile
create mode 100644 drivers/staging/intel_sgx/isgx.h
create mode 100644 drivers/staging/intel_sgx/isgx_compat_ioctl.c
create mode 100644 drivers/staging/intel_sgx/isgx_ioctl.c
create mode 100644 drivers/staging/intel_sgx/isgx_main.c
create mode 100644 drivers/staging/intel_sgx/isgx_page_cache.c
create mode 100644 drivers/staging/intel_sgx/isgx_user.h
create mode 100644 drivers/staging/intel_sgx/isgx_util.c
create mode 100644 drivers/staging/intel_sgx/isgx_vma.c

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index ef9f20f..5e2692d 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -82,7 +82,7 @@ enum sgx_secinfo_masks {
};

struct sgx_pcmd {
- struct isgx_secinfo secinfo;
+ struct sgx_secinfo secinfo;
u64 enclave_id;
u8 reserved[40];
u8 mac[16];
@@ -185,7 +185,7 @@ static inline int __eadd(struct sgx_page_info *pginfo, void *epc)
return __encls(EADD, pginfo, epc, "d"(0));
}

-static inline int __einit(void *sigstruct, struct isgx_einittoken *einittoken,
+static inline int __einit(void *sigstruct, struct sgx_einittoken *einittoken,
void *secs)
{
return __encls_ret(EINIT, sigstruct, secs, einittoken);
diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig
index 5d3b86a..dc64d4b 100644
--- a/drivers/staging/Kconfig
+++ b/drivers/staging/Kconfig
@@ -110,4 +110,6 @@ source "drivers/staging/wilc1000/Kconfig"

source "drivers/staging/most/Kconfig"

+source "drivers/staging/intel_sgx/Kconfig"
+
endif # STAGING
diff --git a/drivers/staging/Makefile b/drivers/staging/Makefile
index 30918ed..992377b 100644
--- a/drivers/staging/Makefile
+++ b/drivers/staging/Makefile
@@ -47,3 +47,4 @@ obj-$(CONFIG_FB_TFT) += fbtft/
obj-$(CONFIG_FSL_MC_BUS) += fsl-mc/
obj-$(CONFIG_WILC1000) += wilc1000/
obj-$(CONFIG_MOST) += most/
+obj-$(CONFIG_INTEL_SGX) += intel_sgx/
diff --git a/drivers/staging/intel_sgx/Kconfig b/drivers/staging/intel_sgx/Kconfig
new file mode 100644
index 0000000..74e3880
--- /dev/null
+++ b/drivers/staging/intel_sgx/Kconfig
@@ -0,0 +1,13 @@
+config INTEL_SGX
+ tristate "Intel(R) SGX Driver"
+ depends on X86
+ ---help---
+ Intel(R) SGX is a set of CPU instructions that applications can use
+ to set aside private regions of code and data called enclaves. CPU
+ access control prevents code outside the enclave from accessing
+ memory inside the enclave.
+
+ The firmware uses PRMRR registers to reserve an area of physical memory
+ called the Enclave Page Cache (EPC). A hardware unit in the processor
+ called the Memory Encryption Engine (MEE) encrypts and decrypts the
+ EPC pages as they leave and enter the processor package.
diff --git a/drivers/staging/intel_sgx/Makefile b/drivers/staging/intel_sgx/Makefile
new file mode 100644
index 0000000..cc38853
--- /dev/null
+++ b/drivers/staging/intel_sgx/Makefile
@@ -0,0 +1,12 @@
+obj-$(CONFIG_INTEL_SGX) += intel_sgx.o
+
+intel_sgx-$(CONFIG_INTEL_SGX) += \
+ isgx_ioctl.o \
+ isgx_main.o \
+ isgx_page_cache.o \
+ isgx_util.o \
+ isgx_vma.o
+
+ifdef CONFIG_COMPAT
+intel_sgx-$(CONFIG_INTEL_SGX) += isgx_compat_ioctl.o
+endif
diff --git a/drivers/staging/intel_sgx/isgx.h b/drivers/staging/intel_sgx/isgx.h
new file mode 100644
index 0000000..ec3e649
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx.h
@@ -0,0 +1,238 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef __ARCH_ISGX_H__
+#define __ARCH_ISGX_H__
+
+#include "isgx_user.h"
+#include <asm/sgx.h>
+#include <linux/kref.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
+#include <linux/sched.h>
+#include <linux/workqueue.h>
+
+/* Number of times to spin before going to sleep because of an interrupt
+ * storm.
+ */
+#define EINIT_SPIN_COUNT 20
+
+/* Number of tries in total before giving up with EINIT. During each try
+ * EINIT is called the number of times specified by EINIT_SPIN_COUNT.
+ */
+#define EINIT_TRY_COUNT 50
+
+/* Time to sleep between each try. */
+#define EINIT_BACKOFF_TIME 20
+
+#define ISGX_ENCLAVE_PAGE_TCS 0x1
+#define ISGX_ENCLAVE_PAGE_RESERVED 0x2
+
+struct isgx_epc_page {
+ resource_size_t pa;
+ struct list_head free_list;
+};
+
+#define ISGX_VA_SLOT_COUNT 512
+
+struct isgx_va_page {
+ struct isgx_epc_page *epc_page;
+ DECLARE_BITMAP(slots, ISGX_VA_SLOT_COUNT);
+ struct list_head list;
+};
+
+/**
+ * isgx_alloc_va_slot() - allocate VA slot from a VA page
+ *
+ * @page: VA page
+ *
+ * Returns offset to a free VA slot. If there are no free slots, an offset of
+ * PAGE_SIZE is returned.
+ */
+static inline unsigned int isgx_alloc_va_slot(struct isgx_va_page *page)
+{
+ int slot = find_first_zero_bit(page->slots, ISGX_VA_SLOT_COUNT);
+
+ if (slot < ISGX_VA_SLOT_COUNT)
+ set_bit(slot, page->slots);
+
+ return slot << 3;
+}
+
+/**
+ * isgx_free_va_slot() - free VA slot from a VA page
+ *
+ * @page: VA page
+ * @offset: the offset of the VA slot
+ *
+ * Releases VA slot.
+ */
+static inline void isgx_free_va_slot(struct isgx_va_page *page,
+ unsigned int offset)
+{
+ clear_bit(offset >> 3, page->slots);
+}
+
+struct isgx_enclave_page {
+ unsigned long addr;
+ unsigned int flags;
+ struct isgx_epc_page *epc_page;
+ struct list_head load_list;
+ struct isgx_enclave *enclave;
+ struct isgx_va_page *va_page;
+ unsigned int va_offset;
+ struct sgx_pcmd pcmd;
+ struct rb_node node;
+};
+
+#define ISGX_ENCLAVE_INITIALIZED 0x01
+#define ISGX_ENCLAVE_DEBUG 0x02
+#define ISGX_ENCLAVE_SECS_EVICTED 0x04
+#define ISGX_ENCLAVE_SUSPEND 0x08
+
+struct isgx_vma {
+ struct vm_area_struct *vma;
+ struct list_head vma_list;
+};
+
+struct isgx_tgid_ctx {
+ struct pid *tgid;
+ atomic_t epc_cnt;
+ struct kref refcount;
+ struct list_head enclave_list;
+ struct list_head list;
+};
+
+struct isgx_enclave {
+ /* the enclave lock */
+ struct mutex lock;
+ unsigned int flags;
+ struct task_struct *owner;
+ struct mm_struct *mm;
+ struct file *backing;
+ struct list_head vma_list;
+ struct list_head load_list;
+ struct kref refcount;
+ unsigned long base;
+ unsigned long size;
+ struct list_head va_pages;
+ struct rb_root enclave_rb;
+ struct list_head add_page_reqs;
+ struct work_struct add_page_work;
+ unsigned int secs_child_cnt;
+ struct isgx_enclave_page secs_page;
+ struct isgx_tgid_ctx *tgid_ctx;
+ struct list_head enclave_list;
+};
+
+extern struct workqueue_struct *isgx_add_page_wq;
+extern unsigned long isgx_epc_base;
+extern unsigned long isgx_epc_size;
+#ifdef CONFIG_X86_64
+extern void *isgx_epc_mem;
+#endif
+extern u64 isgx_enclave_size_max_32;
+extern u64 isgx_enclave_size_max_64;
+extern u64 isgx_xfrm_mask;
+extern u32 isgx_ssaframesize_tbl[64];
+
+extern struct vm_operations_struct isgx_vm_ops;
+extern atomic_t isgx_nr_pids;
+
+/* Message macros */
+#define isgx_dbg(encl, fmt, ...) \
+ pr_debug_ratelimited("isgx: [%d:0x%p] " fmt, \
+ pid_nr((encl)->tgid_ctx->tgid), \
+ (void *)(encl)->base, ##__VA_ARGS__)
+#define isgx_info(encl, fmt, ...) \
+ pr_info_ratelimited("isgx: [%d:0x%p] " fmt, \
+ pid_nr((encl)->tgid_ctx->tgid), \
+ (void *)(encl)->base, ##__VA_ARGS__)
+#define isgx_warn(encl, fmt, ...) \
+ pr_warn_ratelimited("isgx: [%d:0x%p] " fmt, \
+ pid_nr((encl)->tgid_ctx->tgid), \
+ (void *)(encl)->base, ##__VA_ARGS__)
+#define isgx_err(encl, fmt, ...) \
+ pr_err_ratelimited("isgx: [%d:0x%p] " fmt, \
+ pid_nr((encl)->tgid_ctx->tgid), \
+ (void *)(encl)->base, ##__VA_ARGS__)
+
+/*
+ * Ioctl subsystem.
+ */
+
+long isgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
+#ifdef CONFIG_COMPAT
+long isgx_compat_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
+#endif
+void isgx_add_page_worker(struct work_struct *work);
+
+/*
+ * Utility functions
+ */
+
+void *isgx_get_epc_page(struct isgx_epc_page *entry);
+void isgx_put_epc_page(void *epc_page_vaddr);
+struct page *isgx_get_backing(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *entry);
+void isgx_put_backing(struct page *backing, bool write);
+void isgx_insert_pte(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *enclave_page,
+ struct isgx_epc_page *epc_page,
+ struct vm_area_struct *vma);
+int isgx_eremove(struct isgx_epc_page *epc_page);
+int isgx_test_and_clear_young(struct isgx_enclave_page *page);
+struct isgx_vma *isgx_find_vma(struct isgx_enclave *enclave,
+ unsigned long addr);
+void isgx_zap_tcs_ptes(struct isgx_enclave *enclave,
+ struct vm_area_struct *vma);
+bool isgx_pin_mm(struct isgx_enclave *encl);
+void isgx_unpin_mm(struct isgx_enclave *encl);
+void isgx_invalidate(struct isgx_enclave *encl);
+int isgx_find_enclave(struct mm_struct *mm, unsigned long addr,
+ struct vm_area_struct **vma);
+struct isgx_enclave_page *isgx_enclave_find_page(struct isgx_enclave *enclave,
+ unsigned long enclave_la);
+void isgx_enclave_release(struct kref *ref);
+void release_tgid_ctx(struct kref *ref);
+
+/*
+ * Page cache subsystem.
+ */
+
+#define ISGX_NR_LOW_EPC_PAGES_DEFAULT 32
+#define ISGX_NR_SWAP_CLUSTER_MAX 16
+
+extern struct mutex isgx_tgid_ctx_mutex;
+extern struct list_head isgx_tgid_ctx_list;
+extern struct task_struct *kisgxswapd_tsk;
+
+enum isgx_alloc_flags {
+ ISGX_ALLOC_ATOMIC = BIT(0),
+};
+
+enum isgx_free_flags {
+ ISGX_FREE_SKIP_EREMOVE = BIT(0),
+};
+
+int kisgxswapd(void *p);
+int isgx_page_cache_init(resource_size_t start, unsigned long size);
+void isgx_page_cache_teardown(void);
+struct isgx_epc_page *isgx_alloc_epc_page(
+ struct isgx_tgid_ctx *tgid_epc_cnt, unsigned int flags);
+void isgx_free_epc_page(struct isgx_epc_page *entry,
+ struct isgx_enclave *encl,
+ unsigned int flags);
+
+#endif /* __ARCH_ISGX_H__ */
diff --git a/drivers/staging/intel_sgx/isgx_compat_ioctl.c b/drivers/staging/intel_sgx/isgx_compat_ioctl.c
new file mode 100644
index 0000000..e75b0cf
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_compat_ioctl.c
@@ -0,0 +1,179 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <linux/acpi.h>
+#include <linux/compat.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/suspend.h>
+
+#define ISGX32_IOC_ENCLAVE_CREATE \
+ _IOWR('p', 0x02, struct sgx_create_param32)
+#define ISGX32_IOC_ENCLAVE_ADD_PAGE \
+ _IOW('p', 0x03, struct sgx_add_param32)
+#define ISGX32_IOC_ENCLAVE_INIT \
+ _IOW('p', 0x04, struct sgx_init_param32)
+#define ISGX32_IOC_ENCLAVE_DESTROY \
+ _IOW('p', 0x06, struct sgx_destroy_param32)
+
+struct sgx_create_param32 {
+ u32 secs;
+ u32 addr;
+};
+
+static long enclave_create_compat(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_create_param32 create_param32;
+ struct sgx_create_param *create_param;
+ unsigned long addr;
+ int ret;
+
+ if (copy_from_user(&create_param32, (void *)arg,
+ sizeof(create_param32)))
+ return -EFAULT;
+
+ create_param = compat_alloc_user_space(sizeof(*create_param));
+ if (!create_param ||
+ __put_user((void __user *)(unsigned long)create_param32.secs,
+ &create_param->secs))
+ return -EFAULT;
+
+ ret = isgx_ioctl(filep, SGX_IOC_ENCLAVE_CREATE,
+ (unsigned long)create_param);
+ if (ret)
+ return ret;
+
+ if (__get_user(addr, &create_param->addr))
+ return -EFAULT;
+
+ create_param32.addr = addr;
+
+ if (copy_to_user((void *)arg, &create_param32, sizeof(create_param32)))
+ return -EFAULT;
+
+ return 0;
+}
+
+struct sgx_add_param32 {
+ u32 addr;
+ u32 user_addr;
+ u32 secinfo;
+ u32 flags;
+};
+
+static long enclave_add_page_compat(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_add_param32 add_param32;
+ struct sgx_add_param *add_param;
+
+ if (copy_from_user(&add_param32, (void *)arg,
+ sizeof(add_param32)))
+ return -EFAULT;
+
+ add_param = compat_alloc_user_space(sizeof(*add_param));
+ if (!add_param)
+ return -EFAULT;
+
+ if (__put_user((unsigned long)add_param32.addr,
+ &add_param->addr) ||
+ __put_user((unsigned long)add_param32.user_addr,
+ &add_param->user_addr) ||
+ __put_user((unsigned long)add_param32.secinfo,
+ &add_param->secinfo) ||
+ __put_user((unsigned long)add_param32.flags,
+ &add_param->flags))
+ return -EFAULT;
+
+ return isgx_ioctl(filep, SGX_IOC_ENCLAVE_ADD_PAGE,
+ (unsigned long)add_param);
+}
+
+struct sgx_init_param32 {
+ u32 addr;
+ u32 sigstruct;
+ u32 einittoken;
+};
+
+static long enclave_init_compat(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_init_param32 init_param32;
+ struct sgx_init_param *init_param;
+
+ if (copy_from_user(&init_param32, (void *)arg,
+ sizeof(init_param32)))
+ return -EFAULT;
+
+ init_param = compat_alloc_user_space(sizeof(*init_param));
+ if (!init_param)
+ return -EFAULT;
+
+ if (__put_user((void __user *)(unsigned long)init_param32.addr,
+ &init_param->addr) ||
+ __put_user((void __user *)(unsigned long)init_param32.sigstruct,
+ &init_param->sigstruct) ||
+ __put_user((void __user *)(unsigned long)init_param32.einittoken,
+ &init_param->einittoken))
+ return -EFAULT;
+
+ return isgx_ioctl(filep, SGX_IOC_ENCLAVE_INIT,
+ (unsigned long)init_param);
+}
+
+struct sgx_destroy_param32 {
+ u32 addr;
+};
+
+static long enclave_destroy_compat(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_destroy_param32 destroy_param32;
+ struct sgx_destroy_param *destroy_param;
+
+ if (copy_from_user(&destroy_param32, (void __user *)arg,
+ sizeof(destroy_param32)))
+ return -EFAULT;
+
+ destroy_param = compat_alloc_user_space(sizeof(*destroy_param));
+ if (!destroy_param)
+ return -EFAULT;
+
+ if (__put_user((void __user *)(unsigned long)destroy_param32.addr,
+ &destroy_param->addr))
+ return -EFAULT;
+
+ return isgx_ioctl(filep, SGX_IOC_ENCLAVE_DESTROY,
+ (unsigned long)destroy_param);
+}
+
+long isgx_compat_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+ switch (cmd) {
+ case ISGX32_IOC_ENCLAVE_CREATE:
+ return enclave_create_compat(filep, cmd, arg);
+ case ISGX32_IOC_ENCLAVE_ADD_PAGE:
+ return enclave_add_page_compat(filep, cmd, arg);
+ case ISGX32_IOC_ENCLAVE_INIT:
+ return enclave_init_compat(filep, cmd, arg);
+ case ISGX32_IOC_ENCLAVE_DESTROY:
+ return enclave_destroy_compat(filep, cmd, arg);
+ default:
+ return -EINVAL;
+ }
+}
diff --git a/drivers/staging/intel_sgx/isgx_ioctl.c b/drivers/staging/intel_sgx/isgx_ioctl.c
new file mode 100644
index 0000000..9d8b36b
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_ioctl.c
@@ -0,0 +1,926 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <asm/mman.h>
+#include <linux/delay.h>
+#include <linux/file.h>
+#include <linux/highmem.h>
+#include <linux/ratelimit.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/hashtable.h>
+#include <linux/shmem_fs.h>
+
+struct isgx_add_page_req {
+ struct list_head list;
+ struct isgx_enclave_page *enclave_page;
+ struct sgx_secinfo secinfo;
+ u64 flags;
+};
+
+static u16 isgx_isvsvnle_min;
+atomic_t isgx_nr_pids = ATOMIC_INIT(0);
+
+static struct isgx_tgid_ctx *find_tgid_epc_cnt(struct pid *tgid)
+{
+ struct isgx_tgid_ctx *ctx;
+
+ list_for_each_entry(ctx, &isgx_tgid_ctx_list, list)
+ if (pid_nr(ctx->tgid) == pid_nr(tgid))
+ return ctx;
+
+ return NULL;
+}
+
+static int add_tgid_ctx(struct isgx_enclave *enclave)
+{
+ struct isgx_tgid_ctx *ctx;
+ struct pid *tgid = get_pid(task_tgid(current));
+
+ mutex_lock(&isgx_tgid_ctx_mutex);
+
+ ctx = find_tgid_epc_cnt(tgid);
+ if (ctx) {
+ kref_get(&ctx->refcount);
+ enclave->tgid_ctx = ctx;
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ put_pid(tgid);
+ return 0;
+ }
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx) {
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ put_pid(tgid);
+ return -ENOMEM;
+ }
+
+ ctx->tgid = tgid;
+ kref_init(&ctx->refcount);
+ INIT_LIST_HEAD(&ctx->enclave_list);
+
+ list_add(&ctx->list, &isgx_tgid_ctx_list);
+ atomic_inc(&isgx_nr_pids);
+
+ enclave->tgid_ctx = ctx;
+
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ return 0;
+}
+
+void release_tgid_ctx(struct kref *ref)
+{
+ struct isgx_tgid_ctx *pe =
+ container_of(ref, struct isgx_tgid_ctx, refcount);
+ mutex_lock(&isgx_tgid_ctx_mutex);
+ list_del(&pe->list);
+ atomic_dec(&isgx_nr_pids);
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ put_pid(pe->tgid);
+ kfree(pe);
+}
+
+static int enclave_rb_insert(struct rb_root *root,
+ struct isgx_enclave_page *data)
+{
+ struct rb_node **new = &root->rb_node;
+ struct rb_node *parent = NULL;
+
+ /* Figure out where to put new node */
+ while (*new) {
+ struct isgx_enclave_page *this =
+ container_of(*new, struct isgx_enclave_page, node);
+
+ parent = *new;
+ if (data->addr < this->addr)
+ new = &((*new)->rb_left);
+ else if (data->addr > this->addr)
+ new = &((*new)->rb_right);
+ else
+ return -EFAULT;
+ }
+
+ /* Add new node and rebalance tree. */
+ rb_link_node(&data->node, parent, new);
+ rb_insert_color(&data->node, root);
+
+ return 0;
+}
+
+/**
+ * construct_enclave_page() - populate a new enclave page instance
+ * @enclave: an enclave
+ * @entry: the enclave page to be populated
+ * @addr: the linear address of the enclave page
+ *
+ * Allocates a VA slot for the enclave page and fills out its fields. Returns
+ * an error code on failure that can be either a POSIX error code or one of the
+ * error codes defined in isgx_user.h.
+ */
+static int construct_enclave_page(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *entry,
+ unsigned long addr)
+{
+ struct isgx_va_page *va_page;
+ struct isgx_epc_page *epc_page = NULL;
+ unsigned int va_offset = PAGE_SIZE;
+ void *vaddr;
+ int ret = 0;
+
+ list_for_each_entry(va_page, &enclave->va_pages, list) {
+ va_offset = isgx_alloc_va_slot(va_page);
+ if (va_offset < PAGE_SIZE)
+ break;
+ }
+
+ if (va_offset == PAGE_SIZE) {
+ va_page = kzalloc(sizeof(*va_page), GFP_KERNEL);
+ if (!va_page)
+ return -ENOMEM;
+
+ epc_page = isgx_alloc_epc_page(NULL, 0);
+ if (IS_ERR(epc_page)) {
+ kfree(va_page);
+ return PTR_ERR(epc_page);
+ }
+
+ vaddr = isgx_get_epc_page(epc_page);
+ if (!vaddr) {
+ isgx_warn(enclave, "kmap of a new VA page failed %d\n",
+ ret);
+ isgx_free_epc_page(epc_page, NULL,
+ ISGX_FREE_SKIP_EREMOVE);
+ kfree(va_page);
+ return -EFAULT;
+ }
+
+ ret = __epa(vaddr);
+ isgx_put_epc_page(vaddr);
+
+ if (ret) {
+ isgx_warn(enclave, "EPA returned %d\n", ret);
+ isgx_free_epc_page(epc_page, NULL, 0);
+ kfree(va_page);
+ return -EFAULT;
+ }
+
+ va_page->epc_page = epc_page;
+ va_offset = isgx_alloc_va_slot(va_page);
+ list_add(&va_page->list, &enclave->va_pages);
+ }
+
+ entry->enclave = enclave;
+ entry->va_page = va_page;
+ entry->va_offset = va_offset;
+ entry->addr = addr;
+
+ return 0;
+}
+
+static int get_enclave(unsigned long addr, struct isgx_enclave **enclave)
+{
+ struct mm_struct *mm = current->mm;
+ struct vm_area_struct *vma;
+ int ret;
+
+ down_read(&mm->mmap_sem);
+
+ ret = isgx_find_enclave(mm, addr, &vma);
+ if (!ret) {
+ *enclave = vma->vm_private_data;
+ kref_get(&(*enclave)->refcount);
+ }
+
+ up_read(&mm->mmap_sem);
+
+ return ret;
+}
+
+static int set_enclave(unsigned long addr, struct isgx_enclave *enclave)
+{
+ struct mm_struct *mm = current->mm;
+ struct vm_area_struct *vma;
+ struct isgx_vma *evma;
+ int ret;
+
+ down_read(&mm->mmap_sem);
+
+ ret = isgx_find_enclave(mm, addr, &vma);
+ if (ret != -ENOENT)
+ goto out;
+ else
+ ret = 0;
+
+ vma->vm_private_data = enclave;
+
+ evma = kzalloc(sizeof(*evma), GFP_KERNEL);
+ if (!evma) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ evma->vma = vma;
+ list_add_tail(&evma->vma_list, &enclave->vma_list);
+out:
+ up_read(&mm->mmap_sem);
+ return ret;
+}
+
+static int validate_secs(const struct sgx_secs *secs)
+{
+ u32 needed_ssaframesize = 1;
+ u32 tmp;
+ int i;
+
+ if (secs->flags & SGX_SECS_A_RESERVED_MASK)
+ return -EINVAL;
+
+ if (secs->flags & SGX_SECS_A_MODE64BIT) {
+#ifdef CONFIG_X86_64
+ if (secs->size > isgx_enclave_size_max_64)
+ return -EINVAL;
+#else
+ return -EINVAL;
+#endif
+ } else {
+ /* On a 64-bit architecture, allow 32-bit enclaves only in
+ * compatibility mode.
+ */
+#ifdef CONFIG_X86_64
+ if (!test_thread_flag(TIF_ADDR32))
+ return -EINVAL;
+#endif
+ if (secs->size > isgx_enclave_size_max_32)
+ return -EINVAL;
+ }
+
+ if ((secs->xfrm & 0x3) != 0x3 || (secs->xfrm & ~isgx_xfrm_mask))
+ return -EINVAL;
+
+ /* SKL quirk */
+ if ((secs->xfrm & BIT(3)) != (secs->xfrm & BIT(4)))
+ return -EINVAL;
+
+ for (i = 2; i < 64; i++) {
+ tmp = isgx_ssaframesize_tbl[i];
+ if (((1ULL << i) & secs->xfrm) && (tmp > needed_ssaframesize))
+ needed_ssaframesize = tmp;
+ }
+
+ if (!secs->ssaframesize || !needed_ssaframesize ||
+ needed_ssaframesize > secs->ssaframesize)
+ return -EINVAL;
+
+ /* Must be power of two */
+ if (secs->size == 0 || (secs->size & (secs->size - 1)) != 0)
+ return -EINVAL;
+
+ for (i = 0; i < SGX_SECS_RESERVED1_SIZE; i++)
+ if (secs->reserved1[i])
+ return -EINVAL;
+
+ for (i = 0; i < SGX_SECS_RESERVED2_SIZE; i++)
+ if (secs->reserved2[i])
+ return -EINVAL;
+
+ for (i = 0; i < SGX_SECS_RESERVED3_SIZE; i++)
+ if (secs->reserved3[i])
+ return -EINVAL;
+
+ for (i = 0; i < SGX_SECS_RESERVED4_SIZE; i++)
+ if (secs->reserved[i])
+ return -EINVAL;
+
+ return 0;
+}
+
+static long isgx_ioctl_enclave_create(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_page_info pginfo;
+ struct sgx_secinfo secinfo;
+ struct sgx_create_param *createp = (struct sgx_create_param *)arg;
+ void *secs_la = createp->secs;
+ struct isgx_enclave *enclave = NULL;
+ struct sgx_secs *secs = NULL;
+ struct isgx_epc_page *secs_epc_page;
+ void *secs_vaddr = NULL;
+ struct file *backing;
+ long ret;
+
+ secs = kzalloc(sizeof(*secs), GFP_KERNEL);
+ if (!secs)
+ return -ENOMEM;
+ ret = copy_from_user((void *)secs, secs_la, sizeof(*secs));
+ if (ret) {
+ kfree(secs);
+ return -EFAULT;
+ }
+
+ if (validate_secs(secs)) {
+ kfree(secs);
+ return -EINVAL;
+ }
+
+ secs->base = vm_mmap(filep, 0, secs->size,
+ PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_SHARED, 0);
+ if (IS_ERR((void *)(unsigned long)secs->base)) {
+ ret = PTR_ERR((void *)(unsigned long)secs->base);
+ kfree(secs);
+ pr_warn("isgx: creating VMA for an enclave failed\n");
+ return ret;
+ }
+
+ backing = shmem_file_setup("dev/isgx", secs->size + PAGE_SIZE,
+ VM_NORESERVE);
+ if (IS_ERR(backing)) {
+ ret = PTR_ERR(backing);
+ vm_munmap(secs->base, secs->size);
+ kfree(secs);
+
+ pr_warn("isgx: creating backing storage for enclave failed\n");
+ return ret;
+ }
+
+ enclave = kzalloc(sizeof(*enclave), GFP_KERNEL);
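+ if (!enclave) {
+ /* No enclave owns the backing file yet, so drop it here. */
+ fput(backing);
+ ret = -ENOMEM;
+ goto out;
+ }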
+
+ kref_init(&enclave->refcount);
+ INIT_LIST_HEAD(&enclave->add_page_reqs);
+ INIT_LIST_HEAD(&enclave->va_pages);
+ INIT_LIST_HEAD(&enclave->vma_list);
+ INIT_LIST_HEAD(&enclave->load_list);
+ INIT_LIST_HEAD(&enclave->enclave_list);
+ mutex_init(&enclave->lock);
+ INIT_WORK(&enclave->add_page_work, isgx_add_page_worker);
+
+ enclave->owner = current->group_leader;
+ enclave->mm = current->mm;
+ enclave->base = secs->base;
+ enclave->size = secs->size;
+ enclave->backing = backing;
+
+ ret = add_tgid_ctx(enclave);
+ if (ret)
+ goto out;
+
+ secs_epc_page = isgx_alloc_epc_page(NULL, 0);
+ if (IS_ERR(secs_epc_page)) {
+ ret = PTR_ERR(secs_epc_page);
+ secs_epc_page = NULL;
+ goto out;
+ }
+
+ enclave->secs_page.epc_page = secs_epc_page;
+
+ ret = construct_enclave_page(enclave, &enclave->secs_page,
+ enclave->base + enclave->size);
+ if (ret)
+ goto out;
+
+ secs_vaddr = isgx_get_epc_page(enclave->secs_page.epc_page);
+
+ pginfo.srcpge = (unsigned long)secs;
+ pginfo.linaddr = 0;
+ pginfo.secinfo = (unsigned long)&secinfo;
+ pginfo.secs = 0;
+ memset(&secinfo, 0, sizeof(secinfo));
+ ret = __ecreate((void *)&pginfo, secs_vaddr);
+
+ isgx_put_epc_page(secs_vaddr);
+
+ if (ret) {
+ isgx_info(enclave, "ECREATE returned %ld\n", ret);
+ goto out;
+ }
+
+ if (secs->flags & SGX_SECS_A_DEBUG)
+ enclave->flags |= ISGX_ENCLAVE_DEBUG;
+
+ ret = set_enclave(secs->base, enclave);
+
+ mutex_lock(&isgx_tgid_ctx_mutex);
+ list_add_tail(&enclave->enclave_list, &enclave->tgid_ctx->enclave_list);
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+out:
+ if (ret) {
+ vm_munmap(secs->base, secs->size);
+ if (enclave)
+ kref_put(&enclave->refcount, isgx_enclave_release);
+ } else {
+ createp->addr = (unsigned long)enclave->base;
+ }
+ kfree(secs);
+ return ret;
+}
+
+static int validate_secinfo(struct sgx_secinfo *secinfo)
+{
+ u64 perm = secinfo->flags & ISGX_SECINFO_PERMISSION_MASK;
+ u64 page_type = secinfo->flags & ISGX_SECINFO_PAGE_TYPE_MASK;
+ int i;
+
+ if ((secinfo->flags & ISGX_SECINFO_RESERVED_MASK) ||
+ ((perm & SGX_SECINFO_FL_W) && !(perm & SGX_SECINFO_FL_R)) ||
+ (page_type != SGX_SECINFO_PT_TCS &&
+ page_type != SGX_SECINFO_PT_REG))
+ return -EINVAL;
+
+ for (i = 0; i < sizeof(secinfo->reserved) / sizeof(u64); i++)
+ if (secinfo->reserved[i])
+ return -EINVAL;
+
+ return 0;
+}
+
+static int validate_tcs(struct sgx_tcs *tcs)
+{
+ int i;
+
+ /* If FLAGS is not zero, ECALL will fail. */
+ if ((tcs->flags != 0) ||
+ (tcs->ossa & (PAGE_SIZE - 1)) ||
+ (tcs->ofsbase & (PAGE_SIZE - 1)) ||
+ (tcs->ogsbase & (PAGE_SIZE - 1)) ||
+ ((tcs->fslimit & 0xFFF) != 0xFFF) ||
+ ((tcs->gslimit & 0xFFF) != 0xFFF))
+ return -EINVAL;
+
+ for (i = 0; i < sizeof(tcs->reserved) / sizeof(u64); i++)
+ if (tcs->reserved[i])
+ return -EINVAL;
+
+ return 0;
+}
+
+static int __enclave_add_page(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *enclave_page,
+ struct sgx_add_param *addp,
+ struct sgx_secinfo *secinfo)
+{
+ u64 page_type = secinfo->flags & ISGX_SECINFO_PAGE_TYPE_MASK;
+ struct sgx_tcs *tcs;
+ struct page *backing;
+ struct isgx_add_page_req *req = NULL;
+ int ret;
+ int empty;
+ void *user_vaddr;
+ void *tmp_vaddr;
+ struct page *tmp_page;
+
+ tmp_page = alloc_page(GFP_HIGHUSER);
+ if (!tmp_page)
+ return -ENOMEM;
+
+ tmp_vaddr = kmap(tmp_page);
+ ret = copy_from_user(tmp_vaddr, (void __user *)addp->user_addr,
+ PAGE_SIZE);
+ kunmap(tmp_page);
+ if (ret) {
+ __free_page(tmp_page);
+ return -EFAULT;
+ }
+
+ if (validate_secinfo(secinfo)) {
+ __free_page(tmp_page);
+ return -EINVAL;
+ }
+
+ if (page_type == SGX_SECINFO_PT_TCS) {
+ tcs = (struct sgx_tcs *)kmap(tmp_page);
+ ret = validate_tcs(tcs);
+ kunmap(tmp_page);
+ if (ret) {
+ __free_page(tmp_page);
+ return ret;
+ }
+ }
+
+ ret = construct_enclave_page(enclave, enclave_page, addp->addr);
+ if (ret) {
+ __free_page(tmp_page);
+ return ret;
+ }
+
+ mutex_lock(&enclave->lock);
+
+ if (enclave->flags & ISGX_ENCLAVE_INITIALIZED) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (isgx_enclave_find_page(enclave, addp->addr)) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ req = kzalloc(sizeof(*req), GFP_KERNEL);
+ if (!req) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ backing = isgx_get_backing(enclave, enclave_page);
+ if (IS_ERR((void *)backing)) {
+ ret = PTR_ERR((void *)backing);
+ goto out;
+ }
+
+ user_vaddr = kmap(backing);
+ tmp_vaddr = kmap(tmp_page);
+ memcpy(user_vaddr, tmp_vaddr, PAGE_SIZE);
+ kunmap(backing);
+ kunmap(tmp_page);
+
+ if (page_type == SGX_SECINFO_PT_TCS)
+ enclave_page->flags |= ISGX_ENCLAVE_PAGE_TCS;
+
+ memcpy(&req->secinfo, secinfo, sizeof(*secinfo));
+
+ req->enclave_page = enclave_page;
+ req->flags = addp->flags;
+ empty = list_empty(&enclave->add_page_reqs);
+ kref_get(&enclave->refcount);
+ list_add_tail(&req->list, &enclave->add_page_reqs);
+ if (empty)
+ queue_work(isgx_add_page_wq, &enclave->add_page_work);
+
+ isgx_put_backing(backing, true /* write */);
+out:
+
+ if (ret) {
+ kfree(req);
+ isgx_free_va_slot(enclave_page->va_page,
+ enclave_page->va_offset);
+ } else {
+ ret = enclave_rb_insert(&enclave->enclave_rb, enclave_page);
+ WARN_ON(ret);
+ }
+
+ mutex_unlock(&enclave->lock);
+ __free_page(tmp_page);
+ return ret;
+}
+
+static long isgx_ioctl_enclave_add_page(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_add_param *addp;
+ struct isgx_enclave *enclave;
+ struct isgx_enclave_page *page;
+ struct sgx_secinfo secinfo;
+ int ret;
+
+ addp = (struct sgx_add_param *)arg;
+ if (addp->addr & (PAGE_SIZE - 1))
+ return -EINVAL;
+
+ if (copy_from_user(&secinfo, (void __user *)addp->secinfo,
+ sizeof(secinfo)))
+ return -EFAULT;
+
+ ret = get_enclave(addp->addr, &enclave);
+ if (ret)
+ return ret;
+
+ if (addp->addr < enclave->base ||
+ addp->addr > (enclave->base + enclave->size - PAGE_SIZE)) {
+ kref_put(&enclave->refcount, isgx_enclave_release);
+ return -EINVAL;
+ }
+
+ page = kzalloc(sizeof(*page), GFP_KERNEL);
+ if (!page) {
+ kref_put(&enclave->refcount, isgx_enclave_release);
+ return -ENOMEM;
+ }
+
+ ret = __enclave_add_page(enclave, page, addp, &secinfo);
+ kref_put(&enclave->refcount, isgx_enclave_release);
+
+ if (ret)
+ kfree(page);
+
+ return ret;
+}
+
+static int __isgx_enclave_init(struct isgx_enclave *enclave,
+ char *sigstruct,
+ struct sgx_einittoken *einittoken)
+{
+ int ret = SGX_UNMASKED_EVENT;
+ struct isgx_epc_page *secs_epc_page = enclave->secs_page.epc_page;
+ void *secs_va = NULL;
+ int i;
+ int j;
+
+ if (einittoken->valid && einittoken->isvsvnle < isgx_isvsvnle_min)
+ return SGX_LE_ROLLBACK;
+
+ for (i = 0; i < EINIT_TRY_COUNT; i++) {
+ for (j = 0; j < EINIT_SPIN_COUNT; j++) {
+ mutex_lock(&enclave->lock);
+ secs_va = isgx_get_epc_page(secs_epc_page);
+ ret = __einit(sigstruct, einittoken, secs_va);
+ isgx_put_epc_page(secs_va);
+ mutex_unlock(&enclave->lock);
+ if (ret == SGX_UNMASKED_EVENT)
+ continue;
+ else
+ break;
+ }
+
+ if (ret != SGX_UNMASKED_EVENT)
+ goto out;
+
+ msleep_interruptible(EINIT_BACKOFF_TIME);
+ if (signal_pending(current))
+ return -EINTR;
+ }
+
+out:
+ if (ret) {
+ isgx_info(enclave, "EINIT returned %d\n", ret);
+ } else {
+ enclave->flags |= ISGX_ENCLAVE_INITIALIZED;
+
+ if (einittoken->isvsvnle > isgx_isvsvnle_min)
+ isgx_isvsvnle_min = einittoken->isvsvnle;
+ }
+
+ return ret;
+}
+
+static long isgx_ioctl_enclave_init(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ int ret = -EINVAL;
+ struct sgx_init_param *initp = (struct sgx_init_param *)arg;
+ unsigned long enclave_id = initp->addr;
+ char *sigstruct;
+ struct sgx_einittoken *einittoken;
+ struct isgx_enclave *enclave;
+ struct page *initp_page;
+
+ initp_page = alloc_page(GFP_HIGHUSER);
+ if (!initp_page)
+ return -ENOMEM;
+
+ sigstruct = kmap(initp_page);
+ einittoken = (struct sgx_einittoken *)
+ ((unsigned long)sigstruct + PAGE_SIZE / 2);
+
+ if (copy_from_user(sigstruct, initp->sigstruct, SIGSTRUCT_SIZE) ||
+ copy_from_user(einittoken, initp->einittoken, EINITTOKEN_SIZE)) {
+ ret = -EFAULT;
+ goto out_free_page;
+ }
+
+ ret = get_enclave(enclave_id, &enclave);
+ if (ret)
+ goto out_free_page;
+
+ mutex_lock(&enclave->lock);
+ if (enclave->flags & ISGX_ENCLAVE_INITIALIZED) {
+ ret = -EINVAL;
+ mutex_unlock(&enclave->lock);
+ goto out;
+ }
+ mutex_unlock(&enclave->lock);
+
+ flush_work(&enclave->add_page_work);
+
+ ret = __isgx_enclave_init(enclave, sigstruct, einittoken);
+out:
+ kref_put(&enclave->refcount, isgx_enclave_release);
+out_free_page:
+ kunmap(initp_page);
+ __free_page(initp_page);
+ return ret;
+}
+
+static long isgx_ioctl_enclave_destroy(struct file *filep, unsigned int cmd,
+ unsigned long arg)
+{
+ struct sgx_destroy_param *destroyp =
+ (struct sgx_destroy_param *)arg;
+ unsigned long enclave_id = destroyp->addr;
+ struct isgx_enclave *enclave;
+ int ret;
+
+ ret = get_enclave(enclave_id, &enclave);
+ if (ret)
+ return ret;
+
+ vm_munmap(enclave->base, enclave->size);
+ kref_put(&enclave->refcount, isgx_enclave_release);
+
+ return 0;
+}
+
+typedef long (*isgx_ioctl_t)(struct file *filep, unsigned int cmd,
+ unsigned long arg);
+
+long isgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+ char data[256];
+ isgx_ioctl_t handler = NULL;
+ long ret;
+
+ switch (cmd) {
+ case SGX_IOC_ENCLAVE_CREATE:
+ handler = isgx_ioctl_enclave_create;
+ break;
+ case SGX_IOC_ENCLAVE_ADD_PAGE:
+ handler = isgx_ioctl_enclave_add_page;
+ break;
+ case SGX_IOC_ENCLAVE_INIT:
+ handler = isgx_ioctl_enclave_init;
+ break;
+ case SGX_IOC_ENCLAVE_DESTROY:
+ handler = isgx_ioctl_enclave_destroy;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (copy_from_user(data, (void __user *)arg, _IOC_SIZE(cmd)))
+ return -EFAULT;
+
+ ret = handler(filep, cmd, (unsigned long)((void *)data));
+ if (!ret && (cmd & IOC_OUT)) {
+ if (copy_to_user((void __user *)arg, data, _IOC_SIZE(cmd)))
+ return -EFAULT;
+ }
+
+ return ret;
+}
+
+static int do_eadd(struct isgx_epc_page *secs_page,
+ struct isgx_epc_page *epc_page,
+ unsigned long linaddr,
+ struct sgx_secinfo *secinfo,
+ struct page *backing)
+{
+ struct sgx_page_info pginfo;
+ void *epc_page_vaddr;
+ int ret;
+
+ pginfo.srcpge = (unsigned long)kmap_atomic(backing);
+ pginfo.secs = (unsigned long)isgx_get_epc_page(secs_page);
+ epc_page_vaddr = isgx_get_epc_page(epc_page);
+
+ pginfo.linaddr = linaddr;
+ pginfo.secinfo = (unsigned long)secinfo;
+ ret = __eadd(&pginfo, epc_page_vaddr);
+
+ isgx_put_epc_page(epc_page_vaddr);
+ isgx_put_epc_page((void *)(unsigned long)pginfo.secs);
+ kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
+
+ return ret;
+}
+
+static int do_eextend(struct isgx_epc_page *secs_page,
+ struct isgx_epc_page *epc_page)
+{
+ void *secs;
+ void *epc;
+ int ret = 0;
+ int i;
+
+ for (i = 0; i < 0x1000 && !ret; i += 0x100) {
+ secs = isgx_get_epc_page(secs_page);
+ epc = isgx_get_epc_page(epc_page);
+
+ ret = __eextend(secs, (void *)((unsigned long)epc + i));
+
+ isgx_put_epc_page(epc);
+ isgx_put_epc_page(secs);
+ }
+
+ return ret;
+}
+
+static bool process_add_page_req(struct isgx_add_page_req *req)
+{
+ struct page *backing;
+ struct isgx_epc_page *epc_page;
+ struct isgx_enclave_page *enclave_page = req->enclave_page;
+ unsigned int flags = req->flags;
+ struct isgx_enclave *enclave = enclave_page->enclave;
+ unsigned int free_flags = ISGX_FREE_SKIP_EREMOVE;
+ struct vm_area_struct *vma;
+ int ret;
+
+ epc_page = isgx_alloc_epc_page(enclave->tgid_ctx, 0);
+ if (IS_ERR(epc_page))
+ return false;
+
+ if (!isgx_pin_mm(enclave)) {
+ isgx_free_epc_page(epc_page, enclave, free_flags);
+ return false;
+ }
+
+ mutex_lock(&enclave->lock);
+
+ if (list_empty(&enclave->vma_list) ||
+ isgx_find_enclave(enclave->mm, enclave_page->addr, &vma))
+ goto out;
+
+ backing = isgx_get_backing(enclave, enclave_page);
+ if (IS_ERR(backing))
+ goto out;
+
+ /* Do not race with do_exit() */
+ if (!atomic_read(&enclave->mm->mm_users)) {
+ isgx_put_backing(backing, 0);
+ goto out;
+ }
+
+ ret = vm_insert_pfn(vma, enclave_page->addr, PFN_DOWN(epc_page->pa));
+ if (ret) {
+ isgx_put_backing(backing, 0);
+ goto out;
+ }
+
+ ret = do_eadd(enclave->secs_page.epc_page, epc_page,
+ enclave_page->addr, &req->secinfo, backing);
+
+ isgx_put_backing(backing, 0);
+ free_flags = 0;
+ if (ret) {
+ isgx_dbg(enclave, "EADD returned %d\n", ret);
+ zap_vma_ptes(vma, enclave_page->addr, PAGE_SIZE);
+ goto out;
+ }
+
+ enclave->secs_child_cnt++;
+
+ if (!(flags & SGX_ADD_SKIP_EEXTEND)) {
+ ret = do_eextend(enclave->secs_page.epc_page, epc_page);
+ if (ret) {
+ isgx_dbg(enclave, "EEXTEND returned %d\n", ret);
+ zap_vma_ptes(vma, enclave_page->addr, PAGE_SIZE);
+ goto out;
+ }
+ }
+
+ isgx_test_and_clear_young(enclave_page);
+
+ enclave_page->epc_page = epc_page;
+ list_add_tail(&enclave_page->load_list, &enclave->load_list);
+
+ mutex_unlock(&enclave->lock);
+ isgx_unpin_mm(enclave);
+ return true;
+out:
+ isgx_free_epc_page(epc_page, enclave, free_flags);
+ mutex_unlock(&enclave->lock);
+ isgx_unpin_mm(enclave);
+ return false;
+}
+
+void isgx_add_page_worker(struct work_struct *work)
+{
+ struct isgx_enclave *enclave;
+ struct isgx_add_page_req *req;
+ bool skip_rest = false;
+ bool is_empty = false;
+
+ enclave = container_of(work, struct isgx_enclave, add_page_work);
+
+ do {
+ schedule();
+
+ mutex_lock(&enclave->lock);
+ req = list_first_entry(&enclave->add_page_reqs,
+ struct isgx_add_page_req, list);
+ list_del(&req->list);
+ is_empty = list_empty(&enclave->add_page_reqs);
+ mutex_unlock(&enclave->lock);
+
+ if (!skip_rest)
+ if (!process_add_page_req(req))
+ skip_rest = true;
+
+ kfree(req);
+ } while (!kref_put(&enclave->refcount, isgx_enclave_release) &&
+ !is_empty);
+}
diff --git a/drivers/staging/intel_sgx/isgx_main.c b/drivers/staging/intel_sgx/isgx_main.c
new file mode 100644
index 0000000..6554efc5
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_main.c
@@ -0,0 +1,369 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <linux/acpi.h>
+#include <linux/compat.h>
+#include <linux/file.h>
+#include <linux/highmem.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/suspend.h>
+#include <linux/hashtable.h>
+#include <linux/kthread.h>
+#include <linux/platform_device.h>
+
+#define DRV_DESCRIPTION "Intel SGX Driver"
+#define DRV_VERSION "0.10"
+
+#define ENCLAVE_SIZE_MAX_64 (64ULL * 1024ULL * 1024ULL * 1024ULL)
+#define ENCLAVE_SIZE_MAX_32 (2ULL * 1024ULL * 1024ULL * 1024ULL)
+
+MODULE_DESCRIPTION(DRV_DESCRIPTION);
+MODULE_AUTHOR("Jarkko Sakkinen <[email protected]>");
+MODULE_VERSION(DRV_VERSION);
+
+/*
+ * Global data.
+ */
+
+struct workqueue_struct *isgx_add_page_wq;
+unsigned long isgx_epc_base;
+unsigned long isgx_epc_size;
+#ifdef CONFIG_X86_64
+void *isgx_epc_mem;
+#endif
+u64 isgx_enclave_size_max_32 = ENCLAVE_SIZE_MAX_32;
+u64 isgx_enclave_size_max_64 = ENCLAVE_SIZE_MAX_64;
+u64 isgx_xfrm_mask = 0x3;
+u32 isgx_ssaframesize_tbl[64];
+
+/*
+ * Local data.
+ */
+
+static int isgx_mmap(struct file *file, struct vm_area_struct *vma);
+
+static unsigned long isgx_get_unmapped_area(struct file *file,
+ unsigned long addr,
+ unsigned long len,
+ unsigned long pgoff,
+ unsigned long flags);
+
+static const struct file_operations isgx_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = isgx_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = isgx_compat_ioctl,
+#endif
+ .mmap = isgx_mmap,
+ .get_unmapped_area = isgx_get_unmapped_area,
+};
+
+static struct miscdevice isgx_dev = {
+ .name = "sgx",
+ .fops = &isgx_fops,
+ .mode = S_IRUGO | S_IWUGO,
+};
+
+static int isgx_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ vma->vm_ops = &isgx_vm_ops;
+#if !defined(VM_RESERVED)
+ vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
+#else
+ vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_RESERVED | VM_IO;
+#endif
+
+ return 0;
+}
+
+static int isgx_init_platform(void)
+{
+ unsigned int eax, ebx, ecx, edx;
+ int i;
+
+ cpuid(0, &eax, &ebx, &ecx, &edx);
+ if (eax < SGX_CPUID) {
+ pr_err("isgx: CPUID does not provide the SGX leaf\n");
+ return -ENODEV;
+ }
+
+ if (!boot_cpu_has(X86_FEATURE_SGX)) {
+ pr_err("isgx: CPU is missing the SGX feature\n");
+ return -ENODEV;
+ }
+
+ cpuid_count(SGX_CPUID, 0x0, &eax, &ebx, &ecx, &edx);
+ if (!(eax & 1)) {
+ pr_err("isgx: CPU does not support the SGX 1.0 instruction set\n");
+ return -ENODEV;
+ }
+
+ if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ cpuid_count(SGX_CPUID, 0x1, &eax, &ebx, &ecx, &edx);
+ isgx_xfrm_mask = (((u64)edx) << 32) + (u64)ecx;
+ for (i = 2; i < 64; i++) {
+ cpuid_count(0x0D, i, &eax, &ebx, &ecx, &edx);
+ if ((1ULL << i) & isgx_xfrm_mask)
+ isgx_ssaframesize_tbl[i] =
+ (168 + eax + ebx + PAGE_SIZE - 1) /
+ PAGE_SIZE;
+ }
+ }
+
+ cpuid_count(SGX_CPUID, 0x0, &eax, &ebx, &ecx, &edx);
+ if (edx & 0xFFFF) {
+#ifdef CONFIG_X86_64
+ isgx_enclave_size_max_64 = 2ULL << (edx & 0xFF);
+#endif
+ isgx_enclave_size_max_32 = 2ULL << ((edx >> 8) & 0xFF);
+ }
+
+ cpuid_count(SGX_CPUID, 0x2, &eax, &ebx, &ecx, &edx);
+
+ /* There should be at least one EPC area or something is wrong. */
+ if ((eax & 0xf) != 0x1)
+ return -ENODEV;
+
+ isgx_epc_base = (((u64)(ebx & 0xfffff)) << 32) +
+ (u64)(eax & 0xfffff000);
+ isgx_epc_size = (((u64)(edx & 0xfffff)) << 32) +
+ (u64)(ecx & 0xfffff000);
+
+ if (!isgx_epc_base)
+ return -ENODEV;
+
+ return 0;
+}
+
+static int isgx_pm_suspend(struct device *dev)
+{
+ struct isgx_tgid_ctx *ctx;
+ struct isgx_enclave *encl;
+
+ kthread_stop(kisgxswapd_tsk);
+ kisgxswapd_tsk = NULL;
+
+ list_for_each_entry(ctx, &isgx_tgid_ctx_list, list) {
+ list_for_each_entry(encl, &ctx->enclave_list, enclave_list) {
+ isgx_invalidate(encl);
+ encl->flags |= ISGX_ENCLAVE_SUSPEND;
+ }
+ }
+
+ return 0;
+}
+
+static int isgx_pm_resume(struct device *dev)
+{
+ kisgxswapd_tsk = kthread_run(kisgxswapd, NULL, "kisgxswapd");
+ return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(isgx_drv_pm, isgx_pm_suspend, isgx_pm_resume);
+
+static int isgx_drv_init(struct device *dev)
+{
+ unsigned int wq_flags;
+ int ret;
+
+ pr_info("isgx: " DRV_DESCRIPTION " v" DRV_VERSION "\n");
+
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return -ENODEV;
+
+ ret = isgx_init_platform();
+ if (ret)
+ return ret;
+
+ pr_info("isgx: EPC memory range 0x%lx-0x%lx\n", isgx_epc_base,
+ isgx_epc_base + isgx_epc_size);
+
+#ifdef CONFIG_X86_64
+ isgx_epc_mem = ioremap_cache(isgx_epc_base, isgx_epc_size);
+ if (!isgx_epc_mem)
+ return -ENOMEM;
+#endif
+
+ ret = isgx_page_cache_init(isgx_epc_base, isgx_epc_size);
+ if (ret)
+ goto out_iounmap;
+
+ wq_flags = WQ_UNBOUND | WQ_FREEZABLE;
+#ifdef WQ_NON_REENTRANT
+ wq_flags |= WQ_NON_REENTRANT;
+#endif
+ isgx_add_page_wq = alloc_workqueue("isgx-add-page-wq", wq_flags, 1);
+ if (!isgx_add_page_wq) {
+ pr_err("isgx: alloc_workqueue() failed\n");
+ ret = -ENOMEM;
+ goto out_iounmap;
+ }
+
+ isgx_dev.parent = dev;
+ ret = misc_register(&isgx_dev);
+ if (ret) {
+ pr_err("isgx: misc_register() failed\n");
+ goto out_workqueue;
+ }
+
+ return 0;
+out_workqueue:
+ destroy_workqueue(isgx_add_page_wq);
+out_iounmap:
+#ifdef CONFIG_X86_64
+ iounmap(isgx_epc_mem);
+#endif
+ return ret;
+}
+
+static int isgx_drv_probe(struct platform_device *pdev)
+{
+ unsigned int eax, ebx, ecx, edx;
+ int i;
+
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return -ENODEV;
+
+ cpuid(0, &eax, &ebx, &ecx, &edx);
+ if (eax < SGX_CPUID) {
+ pr_err("isgx: CPUID does not provide the SGX leaf\n");
+ return -ENODEV;
+ }
+
+ if (!boot_cpu_has(X86_FEATURE_SGX)) {
+ pr_err("isgx: CPU is missing the SGX feature\n");
+ return -ENODEV;
+ }
+
+ cpuid_count(SGX_CPUID, 0x0, &eax, &ebx, &ecx, &edx);
+ if (!(eax & 1)) {
+ pr_err("isgx: CPU does not support the SGX 1.0 instruction set\n");
+ return -ENODEV;
+ }
+
+ if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ cpuid_count(SGX_CPUID, 0x1, &eax, &ebx, &ecx, &edx);
+ isgx_xfrm_mask = (((u64)edx) << 32) + (u64)ecx;
+ for (i = 2; i < 64; i++) {
+ cpuid_count(0x0D, i, &eax, &ebx, &ecx, &edx);
+ if ((1ULL << i) & isgx_xfrm_mask)
+ isgx_ssaframesize_tbl[i] =
+ (168 + eax + ebx + PAGE_SIZE - 1) /
+ PAGE_SIZE;
+ }
+ }
+
+ cpuid_count(SGX_CPUID, 0x0, &eax, &ebx, &ecx, &edx);
+ if (edx & 0xFFFF) {
+#ifdef CONFIG_X86_64
+ isgx_enclave_size_max_64 = 2ULL << (edx & 0xFF);
+#endif
+ isgx_enclave_size_max_32 = 2ULL << ((edx >> 8) & 0xFF);
+ }
+
+ return isgx_drv_init(&pdev->dev);
+}
+
+static int isgx_drv_remove(struct platform_device *pdev)
+{
+ misc_deregister(&isgx_dev);
+ destroy_workqueue(isgx_add_page_wq);
+#ifdef CONFIG_X86_64
+ iounmap(isgx_epc_mem);
+#endif
+ isgx_page_cache_teardown();
+
+ return 0;
+}
+
+static struct platform_driver isgx_drv = {
+ .probe = isgx_drv_probe,
+ .remove = isgx_drv_remove,
+ .driver = {
+ .name = "intel_sgx",
+ .pm = &isgx_drv_pm,
+ },
+};
+
+static struct platform_device *isgx_pdev;
+
+static int __init isgx_init(void)
+{
+ struct platform_device *pdev;
+ int rc;
+
+ rc = platform_driver_register(&isgx_drv);
+ if (rc < 0)
+ return rc;
+
+ pdev = platform_device_register_simple("intel_sgx", -1, NULL, 0);
+ if (IS_ERR(pdev)) {
+ platform_driver_unregister(&isgx_drv);
+ return PTR_ERR(pdev);
+ }
+
+ isgx_pdev = pdev;
+
+ return 0;
+}
+
+static void __exit isgx_exit(void)
+{
+ platform_device_unregister(isgx_pdev);
+ platform_driver_unregister(&isgx_drv);
+}
+
+static unsigned long isgx_get_unmapped_area(struct file *file,
+ unsigned long addr,
+ unsigned long len,
+ unsigned long pgoff,
+ unsigned long flags)
+{
+ if (len < 2 * PAGE_SIZE || (len & (len - 1)))
+ return -EINVAL;
+
+ /* On a 64-bit architecture, allow mmap() to exceed the 32-bit
+ * enclave limit only if the task is not running in 32-bit
+ * compatibility mode.
+ */
+ if (len > isgx_enclave_size_max_32)
+#ifdef CONFIG_X86_64
+ if (test_thread_flag(TIF_ADDR32))
+ return -EINVAL;
+#else
+ return -EINVAL;
+#endif
+
+#ifdef CONFIG_X86_64
+ if (len > isgx_enclave_size_max_64)
+ return -EINVAL;
+#endif
+
+ addr = current->mm->get_unmapped_area(file, addr, 2 * len, pgoff,
+ flags);
+ if (IS_ERR_VALUE(addr))
+ return addr;
+
+ addr = (addr + (len - 1)) & ~(len - 1);
+
+ return addr;
+}
+
+module_init(isgx_init);
+module_exit(isgx_exit);
+MODULE_LICENSE("GPL");
diff --git a/drivers/staging/intel_sgx/isgx_page_cache.c b/drivers/staging/intel_sgx/isgx_page_cache.c
new file mode 100644
index 0000000..f0224e8
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_page_cache.c
@@ -0,0 +1,485 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <linux/freezer.h>
+#include <linux/highmem.h>
+#include <linux/kthread.h>
+#include <linux/ratelimit.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+static LIST_HEAD(isgx_free_list);
+static DEFINE_SPINLOCK(isgx_free_list_lock);
+
+LIST_HEAD(isgx_tgid_ctx_list);
+/* mutex for the TGID list */
+DEFINE_MUTEX(isgx_tgid_ctx_mutex);
+static unsigned int isgx_nr_total_epc_pages;
+static unsigned int isgx_nr_free_epc_pages;
+static unsigned int isgx_nr_low_epc_pages = ISGX_NR_LOW_EPC_PAGES_DEFAULT;
+static unsigned int isgx_nr_high_epc_pages;
+struct task_struct *kisgxswapd_tsk;
+static DECLARE_WAIT_QUEUE_HEAD(kisgxswapd_waitq);
+
+static struct isgx_tgid_ctx *isolate_tgid_ctx(unsigned long nr_to_scan)
+{
+ struct isgx_tgid_ctx *ctx = NULL;
+ int i;
+
+ mutex_lock(&isgx_tgid_ctx_mutex);
+
+ if (list_empty(&isgx_tgid_ctx_list)) {
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ return NULL;
+ }
+
+ for (i = 0; i < nr_to_scan; i++) {
+ /* Peek TGID context from the head. */
+ ctx = list_first_entry(&isgx_tgid_ctx_list,
+ struct isgx_tgid_ctx,
+ list);
+
+ /* Move to the tail so that we do not encounter it in the
+ * next iteration.
+ */
+ list_move_tail(&ctx->list, &isgx_tgid_ctx_list);
+
+ /* Non-empty TGID context? */
+ if (!list_empty(&ctx->enclave_list) &&
+ kref_get_unless_zero(&ctx->refcount))
+ break;
+
+ ctx = NULL;
+ }
+
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+
+ return ctx;
+}
+
+static struct isgx_enclave *isolate_enclave(struct isgx_tgid_ctx *ctx,
+ unsigned long nr_to_scan)
+{
+ struct isgx_enclave *encl = NULL;
+ int i;
+
+ mutex_lock(&isgx_tgid_ctx_mutex);
+
+ if (list_empty(&ctx->enclave_list)) {
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+ return NULL;
+ }
+
+ for (i = 0; i < nr_to_scan; i++) {
+ /* Peek enclave from the head. */
+ encl = list_first_entry(&ctx->enclave_list,
+ struct isgx_enclave,
+ enclave_list);
+
+ /* Move to the tail so that we do not encounter it in the
+ * next iteration.
+ */
+ list_move_tail(&encl->enclave_list, &ctx->enclave_list);
+
+ /* Enclave with faulted pages? */
+ if (!list_empty(&encl->load_list) &&
+ kref_get_unless_zero(&encl->refcount))
+ break;
+
+ encl = NULL;
+ }
+
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+
+ return encl;
+}
+
+static void sgx_isolate_pages(struct isgx_enclave *encl,
+ struct list_head *dst,
+ unsigned long nr_to_scan)
+{
+ struct isgx_enclave_page *entry;
+ int i;
+
+ mutex_lock(&encl->lock);
+
+ for (i = 0; i < nr_to_scan; i++) {
+ if (list_empty(&encl->load_list))
+ break;
+
+ entry = list_first_entry(&encl->load_list,
+ struct isgx_enclave_page,
+ load_list);
+
+ if (!(entry->flags & ISGX_ENCLAVE_PAGE_RESERVED)) {
+ entry->flags |= ISGX_ENCLAVE_PAGE_RESERVED;
+ list_move_tail(&entry->load_list, dst);
+ } else {
+ list_move_tail(&entry->load_list, &encl->load_list);
+ }
+ }
+
+ mutex_unlock(&encl->lock);
+}
+
+static void isgx_ipi_cb(void *info)
+{
+}
+
+static void do_eblock(struct isgx_epc_page *epc_page)
+{
+ void *vaddr;
+
+ vaddr = isgx_get_epc_page(epc_page);
+ BUG_ON(__eblock((unsigned long)vaddr));
+ isgx_put_epc_page(vaddr);
+}
+
+static void do_etrack(struct isgx_epc_page *epc_page)
+{
+ void *epc;
+
+ epc = isgx_get_epc_page(epc_page);
+ BUG_ON(__etrack(epc));
+ isgx_put_epc_page(epc);
+}
+
+static int do_ewb(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *enclave_page,
+ struct page *backing)
+{
+ struct sgx_page_info pginfo;
+ void *epc;
+ void *va;
+ int ret;
+
+ pginfo.srcpge = (unsigned long)kmap_atomic(backing);
+ epc = isgx_get_epc_page(enclave_page->epc_page);
+ va = isgx_get_epc_page(enclave_page->va_page->epc_page);
+
+ pginfo.pcmd = (unsigned long)&enclave_page->pcmd;
+ pginfo.linaddr = 0;
+ pginfo.secs = 0;
+ ret = __ewb(&pginfo, epc,
+ (void *)((unsigned long)va + enclave_page->va_offset));
+
+ isgx_put_epc_page(va);
+ isgx_put_epc_page(epc);
+ kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
+
+ if (ret != 0 && ret != SGX_NOT_TRACKED)
+ isgx_err(enclave, "EWB returned %d\n", ret);
+
+ return ret;
+}
+
+void sgx_evict_page(struct isgx_enclave_page *entry,
+ struct isgx_enclave *encl,
+ unsigned int flags)
+{
+ isgx_free_epc_page(entry->epc_page, encl, flags);
+ entry->epc_page = NULL;
+ entry->flags &= ~ISGX_ENCLAVE_PAGE_RESERVED;
+}
+
+static void sgx_write_pages(struct list_head *src)
+{
+ struct isgx_enclave *enclave;
+ struct isgx_enclave_page *entry;
+ struct isgx_enclave_page *tmp;
+ struct page *pages[ISGX_NR_SWAP_CLUSTER_MAX + 1];
+ struct isgx_vma *evma;
+ int cnt = 0;
+ int i = 0;
+ int ret;
+
+ if (list_empty(src))
+ return;
+
+ entry = list_first_entry(src, struct isgx_enclave_page, load_list);
+ enclave = entry->enclave;
+
+ if (!isgx_pin_mm(enclave)) {
+ while (!list_empty(src)) {
+ entry = list_first_entry(src, struct isgx_enclave_page,
+ load_list);
+ list_del(&entry->load_list);
+ mutex_lock(&enclave->lock);
+ sgx_evict_page(entry, enclave, 0);
+ mutex_unlock(&enclave->lock);
+ }
+
+ return;
+ }
+
+ /* EBLOCK */
+
+ list_for_each_entry_safe(entry, tmp, src, load_list) {
+ mutex_lock(&enclave->lock);
+ evma = isgx_find_vma(enclave, entry->addr);
+ if (!evma) {
+ list_del(&entry->load_list);
+ sgx_evict_page(entry, enclave, 0);
+ mutex_unlock(&enclave->lock);
+ continue;
+ }
+
+ pages[cnt] = isgx_get_backing(enclave, entry);
+ if (IS_ERR(pages[cnt])) {
+ list_del(&entry->load_list);
+ list_add_tail(&entry->load_list, &enclave->load_list);
+ entry->flags &= ~ISGX_ENCLAVE_PAGE_RESERVED;
+ mutex_unlock(&enclave->lock);
+ continue;
+ }
+
+ zap_vma_ptes(evma->vma, entry->addr, PAGE_SIZE);
+ do_eblock(entry->epc_page);
+ cnt++;
+ mutex_unlock(&enclave->lock);
+ }
+
+ /* ETRACK */
+
+ mutex_lock(&enclave->lock);
+ do_etrack(enclave->secs_page.epc_page);
+ mutex_unlock(&enclave->lock);
+
+ /* EWB */
+
+ mutex_lock(&enclave->lock);
+ i = 0;
+
+ while (!list_empty(src)) {
+ entry = list_first_entry(src, struct isgx_enclave_page,
+ load_list);
+ list_del(&entry->load_list);
+
+ evma = isgx_find_vma(enclave, entry->addr);
+ if (evma) {
+ ret = do_ewb(enclave, entry, pages[i]);
+ BUG_ON(ret != 0 && ret != SGX_NOT_TRACKED);
+ /* Only kick out threads with an IPI if needed. */
+ if (ret) {
+ smp_call_function(isgx_ipi_cb, NULL, 1);
+ BUG_ON(do_ewb(enclave, entry, pages[i]));
+ }
+ enclave->secs_child_cnt--;
+ }
+
+ sgx_evict_page(entry, enclave, evma ? ISGX_FREE_SKIP_EREMOVE : 0);
+ isgx_put_backing(pages[i++], evma);
+ }
+
+ /* Allow SECS page eviction only when the enclave is initialized. */
+ if (!enclave->secs_child_cnt &&
+ (enclave->flags & ISGX_ENCLAVE_INITIALIZED)) {
+ pages[cnt] = isgx_get_backing(enclave, &enclave->secs_page);
+ if (!IS_ERR(pages[cnt])) {
+ BUG_ON(do_ewb(enclave, &enclave->secs_page,
+ pages[cnt]));
+ enclave->flags |= ISGX_ENCLAVE_SECS_EVICTED;
+
+ sgx_evict_page(&enclave->secs_page, NULL,
+ ISGX_FREE_SKIP_EREMOVE);
+ isgx_put_backing(pages[cnt], true);
+ }
+ }
+
+ mutex_unlock(&enclave->lock);
+ BUG_ON(i != cnt);
+
+ isgx_unpin_mm(enclave);
+}
+
+static void sgx_swap_pages(unsigned long nr_to_scan)
+{
+ struct isgx_tgid_ctx *ctx;
+ struct isgx_enclave *encl;
+ LIST_HEAD(cluster);
+
+ ctx = isolate_tgid_ctx(nr_to_scan);
+ if (!ctx)
+ return;
+
+ encl = isolate_enclave(ctx, nr_to_scan);
+ if (!encl)
+ goto out;
+
+ sgx_isolate_pages(encl, &cluster, nr_to_scan);
+ sgx_write_pages(&cluster);
+
+ kref_put(&encl->refcount, isgx_enclave_release);
+out:
+ kref_put(&ctx->refcount, release_tgid_ctx);
+}
+
+int kisgxswapd(void *p)
+{
+ DEFINE_WAIT(wait);
+ unsigned int nr_free;
+ unsigned int nr_high;
+
+ for ( ; ; ) {
+ if (kthread_should_stop())
+ break;
+
+ spin_lock(&isgx_free_list_lock);
+ nr_free = isgx_nr_free_epc_pages;
+ nr_high = isgx_nr_high_epc_pages;
+ spin_unlock(&isgx_free_list_lock);
+
+ if (nr_free < nr_high) {
+ sgx_swap_pages(ISGX_NR_SWAP_CLUSTER_MAX);
+ schedule();
+ } else {
+ prepare_to_wait(&kisgxswapd_waitq,
+ &wait, TASK_INTERRUPTIBLE);
+
+ if (!kthread_should_stop())
+ schedule();
+
+ finish_wait(&kisgxswapd_waitq, &wait);
+ }
+ }
+
+ pr_info("%s: done\n", __func__);
+ return 0;
+}
+
+int isgx_page_cache_init(resource_size_t start, unsigned long size)
+{
+ unsigned long i;
+ struct isgx_epc_page *new_epc_page, *entry;
+ struct list_head *parser, *temp;
+
+ for (i = 0; i < size; i += PAGE_SIZE) {
+ new_epc_page = kzalloc(sizeof(*new_epc_page), GFP_KERNEL);
+ if (!new_epc_page)
+ goto err_freelist;
+ new_epc_page->pa = start + i;
+
+ spin_lock(&isgx_free_list_lock);
+ list_add_tail(&new_epc_page->free_list, &isgx_free_list);
+ isgx_nr_total_epc_pages++;
+ isgx_nr_free_epc_pages++;
+ spin_unlock(&isgx_free_list_lock);
+ }
+
+ isgx_nr_high_epc_pages = 2 * isgx_nr_low_epc_pages;
+ kisgxswapd_tsk = kthread_run(kisgxswapd, NULL, "kisgxswapd");
+
+ return 0;
+err_freelist:
+ list_for_each_safe(parser, temp, &isgx_free_list) {
+ spin_lock(&isgx_free_list_lock);
+ entry = list_entry(parser, struct isgx_epc_page, free_list);
+ list_del(&entry->free_list);
+ spin_unlock(&isgx_free_list_lock);
+ kfree(entry);
+ }
+ return -ENOMEM;
+}
+
+void isgx_page_cache_teardown(void)
+{
+ struct isgx_epc_page *entry;
+ struct list_head *parser, *temp;
+
+ if (kisgxswapd_tsk)
+ kthread_stop(kisgxswapd_tsk);
+
+ spin_lock(&isgx_free_list_lock);
+ list_for_each_safe(parser, temp, &isgx_free_list) {
+ entry = list_entry(parser, struct isgx_epc_page, free_list);
+ list_del(&entry->free_list);
+ kfree(entry);
+ }
+ spin_unlock(&isgx_free_list_lock);
+}
+
+static struct isgx_epc_page *isgx_alloc_epc_page_fast(void)
+{
+ struct isgx_epc_page *entry = NULL;
+
+ spin_lock(&isgx_free_list_lock);
+
+ if (!list_empty(&isgx_free_list)) {
+ entry = list_first_entry(&isgx_free_list, struct isgx_epc_page,
+ free_list);
+ list_del(&entry->free_list);
+ isgx_nr_free_epc_pages--;
+ }
+
+ spin_unlock(&isgx_free_list_lock);
+
+ return entry;
+}
+
+struct isgx_epc_page *isgx_alloc_epc_page(
+ struct isgx_tgid_ctx *tgid_epc_cnt,
+ unsigned int flags)
+{
+ struct isgx_epc_page *entry;
+
+ for ( ; ; ) {
+ entry = isgx_alloc_epc_page_fast();
+ if (entry) {
+ if (tgid_epc_cnt)
+ atomic_inc(&tgid_epc_cnt->epc_cnt);
+ break;
+ } else if (flags & ISGX_ALLOC_ATOMIC) {
+ entry = ERR_PTR(-EBUSY);
+ break;
+ }
+
+ if (signal_pending(current)) {
+ entry = ERR_PTR(-ERESTARTSYS);
+ break;
+ }
+
+ sgx_swap_pages(ISGX_NR_SWAP_CLUSTER_MAX);
+ schedule();
+ }
+
+ if (isgx_nr_free_epc_pages < isgx_nr_low_epc_pages)
+ wake_up(&kisgxswapd_waitq);
+
+ return entry;
+}
+
+void isgx_free_epc_page(struct isgx_epc_page *entry,
+ struct isgx_enclave *encl,
+ unsigned int flags)
+{
+ BUG_ON(!entry);
+
+ if (encl) {
+ atomic_dec(&encl->tgid_ctx->epc_cnt);
+
+ if (encl->flags & ISGX_ENCLAVE_SUSPEND)
+ flags |= ISGX_FREE_SKIP_EREMOVE;
+ }
+
+ if (!(flags & ISGX_FREE_SKIP_EREMOVE))
+ BUG_ON(isgx_eremove(entry));
+
+ spin_lock(&isgx_free_list_lock);
+ list_add(&entry->free_list, &isgx_free_list);
+ isgx_nr_free_epc_pages++;
+ spin_unlock(&isgx_free_list_lock);
+}
diff --git a/drivers/staging/intel_sgx/isgx_user.h b/drivers/staging/intel_sgx/isgx_user.h
new file mode 100644
index 0000000..672d19c
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_user.h
@@ -0,0 +1,113 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef _UAPI_ASM_X86_SGX_H
+#define _UAPI_ASM_X86_SGX_H
+
+#include <linux/bitops.h>
+#include <linux/ioctl.h>
+#include <linux/stddef.h>
+#include <linux/types.h>
+
+#define SGX_IOC_ENCLAVE_CREATE \
+ _IOWR('p', 0x02, struct sgx_create_param)
+#define SGX_IOC_ENCLAVE_ADD_PAGE \
+ _IOW('p', 0x03, struct sgx_add_param)
+#define SGX_IOC_ENCLAVE_INIT \
+ _IOW('p', 0x04, struct sgx_init_param)
+#define SGX_IOC_ENCLAVE_DESTROY \
+ _IOW('p', 0x06, struct sgx_destroy_param)
+
+/* SGX leaf instruction return values */
+#define SGX_SUCCESS 0
+#define SGX_INVALID_SIG_STRUCT 1
+#define SGX_INVALID_ATTRIBUTE 2
+#define SGX_BLKSTATE 3
+#define SGX_INVALID_MEASUREMENT 4
+#define SGX_NOTBLOCKABLE 5
+#define SGX_PG_INVLD 6
+#define SGX_LOCKFAIL 7
+#define SGX_INVALID_SIGNATURE 8
+#define SGX_MAC_COMPARE_FAIL 9
+#define SGX_PAGE_NOT_BLOCKED 10
+#define SGX_NOT_TRACKED 11
+#define SGX_VA_SLOT_OCCUPIED 12
+#define SGX_CHILD_PRESENT 13
+#define SGX_ENCLAVE_ACT 14
+#define SGX_ENTRYEPOCH_LOCKED 15
+#define SGX_INVALID_LICENSE 16
+#define SGX_PREV_TRK_INCMPL 17
+#define SGX_PG_IS_SECS 18
+#define SGX_INVALID_CPUSVN 32
+#define SGX_INVALID_ISVSVN 64
+#define SGX_UNMASKED_EVENT 128
+#define SGX_INVALID_KEYNAME 256
+
+/* IOCTL return values */
+#define SGX_POWER_LOST_ENCLAVE 0xc0000002
+#define SGX_LE_ROLLBACK 0xc0000003
+
+/* SECINFO flags */
+enum isgx_secinfo_flags {
+ SGX_SECINFO_FL_R = BIT_ULL(0),
+ SGX_SECINFO_FL_W = BIT_ULL(1),
+ SGX_SECINFO_FL_X = BIT_ULL(2),
+};
+
+/* SECINFO page types */
+enum isgx_secinfo_pt {
+ SGX_SECINFO_PT_SECS = 0x000ULL,
+ SGX_SECINFO_PT_TCS = 0x100ULL,
+ SGX_SECINFO_PT_REG = 0x200ULL,
+};
+
+struct sgx_secinfo {
+ __u64 flags;
+ __u64 reserved[7];
+} __aligned(128);
+
+struct sgx_einittoken {
+ __u32 valid;
+ __u8 reserved1[206];
+ __u16 isvsvnle;
+ __u8 reserved2[92];
+} __aligned(512);
+
+struct sgx_create_param {
+ void *secs;
+ unsigned long addr;
+};
+
+#define SGX_ADD_SKIP_EEXTEND 0x1
+
+struct sgx_add_param {
+ unsigned long addr;
+ unsigned long user_addr;
+ struct isgx_secinfo *secinfo;
+ unsigned int flags;
+};
+
+struct sgx_init_param {
+ unsigned long addr;
+ void *sigstruct;
+ struct isgx_einittoken *einittoken;
+};
+
+struct sgx_destroy_param {
+ unsigned long addr;
+};
+
+#endif /* _UAPI_ASM_X86_SGX_H */
diff --git a/drivers/staging/intel_sgx/isgx_util.c b/drivers/staging/intel_sgx/isgx_util.c
new file mode 100644
index 0000000..c635014
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_util.c
@@ -0,0 +1,334 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <linux/highmem.h>
+#include <linux/shmem_fs.h>
+
+void *isgx_get_epc_page(struct isgx_epc_page *entry)
+{
+#ifdef CONFIG_X86_32
+ return kmap_atomic_pfn(PFN_DOWN(entry->pa));
+#else
+ return isgx_epc_mem + (entry->pa - isgx_epc_base);
+#endif
+}
+
+void isgx_put_epc_page(void *epc_page_vaddr)
+{
+#ifdef CONFIG_X86_32
+ kunmap_atomic(epc_page_vaddr);
+#else
+#endif
+}
+
+struct page *isgx_get_backing(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *entry)
+{
+ struct page *backing;
+ struct inode *inode;
+ struct address_space *mapping;
+ gfp_t gfpmask;
+ pgoff_t index;
+
+ inode = enclave->backing->f_path.dentry->d_inode;
+ mapping = inode->i_mapping;
+ gfpmask = mapping_gfp_mask(mapping);
+
+ index = (entry->addr - enclave->base) >> PAGE_SHIFT;
+ backing = shmem_read_mapping_page_gfp(mapping, index, gfpmask);
+
+ return backing;
+}
+
+void isgx_put_backing(struct page *backing_page, bool write)
+{
+ if (write)
+ set_page_dirty(backing_page);
+
+ page_cache_release(backing_page);
+}
+
+int isgx_eremove(struct isgx_epc_page *epc_page)
+{
+ void *epc;
+ int ret;
+
+ epc = isgx_get_epc_page(epc_page);
+ ret = __eremove(epc);
+ isgx_put_epc_page(epc);
+
+ if (ret)
+ pr_debug_ratelimited("EREMOVE returned %d\n", ret);
+
+ return ret;
+}
+
+static int isgx_test_and_clear_young_cb(pte_t *ptep, pgtable_t token,
+ unsigned long addr, void *data)
+{
+ pte_t pte;
+ int rc;
+
+ rc = pte_young(*ptep);
+ if (rc) {
+ pte = pte_mkold(*ptep);
+ set_pte_at((struct mm_struct *)data, addr, ptep, pte);
+ }
+
+ return rc;
+}
+
+/**
+ * isgx_test_and_clear_young() - is the enclave page recently accessed?
+ * @page: enclave page to be tested for recent access
+ *
+ * Checks the Accessed (A) bit from the PTE corresponding to the
+ * enclave page and clears it. Returns 1 if the page has been
+ * recently accessed and 0 if not.
+ */
+int isgx_test_and_clear_young(struct isgx_enclave_page *page)
+{
+ struct mm_struct *mm;
+ struct isgx_vma *evma = isgx_find_vma(page->enclave, page->addr);
+
+ if (!evma)
+ return 0;
+
+ mm = evma->vma->vm_mm;
+
+ return apply_to_page_range(mm, page->addr, PAGE_SIZE,
+ isgx_test_and_clear_young_cb, mm);
+}
+
+/**
+ * isgx_find_vma() - find VMA for the enclave address
+ * @enclave: the enclave to be searched
+ * @addr: the linear address to query
+ *
+ * Finds VMA for the given address of the enclave. Returns the VMA if
+ * there is one containing the given address.
+ */
+struct isgx_vma *isgx_find_vma(struct isgx_enclave *enclave,
+ unsigned long addr)
+{
+ struct isgx_vma *tmp;
+ struct isgx_vma *evma;
+
+ list_for_each_entry_safe(evma, tmp, &enclave->vma_list, vma_list) {
+ if (evma->vma->vm_start <= addr && evma->vma->vm_end > addr)
+ return evma;
+ }
+
+ isgx_dbg(enclave, "cannot find VMA at 0x%lx\n", addr);
+ return NULL;
+}
+
+/**
+ * isgx_zap_tcs_ptes() - clear PTEs that contain TCS pages
+ * @enclave: an enclave
+ * @vma: a VMA of the enclave
+ */
+void isgx_zap_tcs_ptes(struct isgx_enclave *enclave, struct vm_area_struct *vma)
+{
+ struct isgx_enclave_page *entry;
+ struct rb_node *rb;
+
+ rb = rb_first(&enclave->enclave_rb);
+ while (rb) {
+ entry = container_of(rb, struct isgx_enclave_page, node);
+ rb = rb_next(rb);
+ if (entry->epc_page && (entry->flags & ISGX_ENCLAVE_PAGE_TCS) &&
+ entry->addr >= vma->vm_start &&
+ entry->addr < vma->vm_end)
+ zap_vma_ptes(vma, entry->addr, PAGE_SIZE);
+ }
+}
+
+/**
+ * isgx_pin_mm - pin the mm_struct of an enclave
+ *
+ * @encl: an enclave
+ *
+ * Locks down mmap_sem of an enclave if it still has VMAs and was not suspended.
+ * Returns true if this is the case.
+ */
+bool isgx_pin_mm(struct isgx_enclave *encl)
+{
+ if (encl->flags & ISGX_ENCLAVE_SUSPEND)
+ return false;
+
+ mutex_lock(&encl->lock);
+ if (!list_empty(&encl->vma_list)) {
+ atomic_inc(&encl->mm->mm_count);
+ } else {
+ mutex_unlock(&encl->lock);
+ return false;
+ }
+ mutex_unlock(&encl->lock);
+
+ down_read(&encl->mm->mmap_sem);
+
+ if (list_empty(&encl->vma_list)) {
+ isgx_unpin_mm(encl);
+ return false;
+ }
+
+ return true;
+}
+
+/**
+ * isgx_unpin_mm - unpin the mm_struct of an enclave
+ *
+ * @encl: an enclave
+ *
+ * Unlocks the mmap_sem.
+ */
+void isgx_unpin_mm(struct isgx_enclave *encl)
+{
+ up_read(&encl->mm->mmap_sem);
+ mmdrop(encl->mm);
+}
+
+/**
+ * isgx_invalidate - invalidate the enclave
+ *
+ * @encl: an enclave
+ *
+ * Unmap TCS pages and empty the VMA list.
+ */
+void isgx_invalidate(struct isgx_enclave *encl)
+{
+ struct isgx_vma *vma;
+
+ list_for_each_entry(vma, &encl->vma_list, vma_list)
+ isgx_zap_tcs_ptes(encl, vma->vma);
+
+ while (!list_empty(&encl->vma_list)) {
+ vma = list_first_entry(&encl->vma_list, struct isgx_vma,
+ vma_list);
+ list_del(&vma->vma_list);
+ kfree(vma);
+ }
+}
+
+/**
+ * isgx_find_enclave() - find enclave given a virtual address
+ * @mm: the address space where we query the enclave
+ * @addr: the virtual address to query
+ * @vma: VMA if an enclave is found or NULL if not
+ *
+ * Finds an enclave given a virtual address and the address space in which to
+ * look for it. The return value is zero on success. Otherwise, it is either
+ * positive for SGX-specific errors or negative for system errors.
+ */
+int isgx_find_enclave(struct mm_struct *mm, unsigned long addr,
+ struct vm_area_struct **vma)
+{
+ struct isgx_enclave *enclave;
+
+ *vma = find_vma(mm, addr);
+
+ if (!(*vma) || (*vma)->vm_ops != &isgx_vm_ops ||
+ addr < (*vma)->vm_start)
+ return -EINVAL;
+
+ /* Is ECREATE already done? */
+ enclave = (*vma)->vm_private_data;
+ if (!enclave)
+ return -ENOENT;
+
+ if (enclave->flags & ISGX_ENCLAVE_SUSPEND) {
+ isgx_info(enclave, "suspend ID has been changed\n");
+ return SGX_POWER_LOST_ENCLAVE;
+ }
+
+ return 0;
+}
+
+/**
+ * isgx_enclave_find_page() - find an enclave page
+ * @enclave: the enclave to query
+ * @enclave_la: the virtual address to query
+ */
+struct isgx_enclave_page *isgx_enclave_find_page(struct isgx_enclave *enclave,
+ unsigned long enclave_la)
+{
+ struct rb_node *node = enclave->enclave_rb.rb_node;
+
+ while (node) {
+ struct isgx_enclave_page *data =
+ container_of(node, struct isgx_enclave_page, node);
+
+ if (data->addr > enclave_la)
+ node = node->rb_left;
+ else if (data->addr < enclave_la)
+ node = node->rb_right;
+ else
+ return data;
+ }
+
+ return NULL;
+}
+
+void isgx_enclave_release(struct kref *ref)
+{
+ struct rb_node *rb1, *rb2;
+ struct isgx_enclave_page *entry;
+ struct isgx_va_page *va_page;
+ struct isgx_enclave *enclave =
+ container_of(ref, struct isgx_enclave, refcount);
+
+ mutex_lock(&isgx_tgid_ctx_mutex);
+ if (!list_empty(&enclave->enclave_list))
+ list_del(&enclave->enclave_list);
+
+ mutex_unlock(&isgx_tgid_ctx_mutex);
+
+ rb1 = rb_first(&enclave->enclave_rb);
+ while (rb1) {
+ entry = container_of(rb1, struct isgx_enclave_page, node);
+ rb2 = rb_next(rb1);
+ rb_erase(rb1, &enclave->enclave_rb);
+ if (entry->epc_page) {
+ list_del(&entry->load_list);
+ isgx_free_epc_page(entry->epc_page, enclave, 0);
+ }
+ kfree(entry);
+ rb1 = rb2;
+ }
+
+ while (!list_empty(&enclave->va_pages)) {
+ va_page = list_first_entry(&enclave->va_pages,
+ struct isgx_va_page, list);
+ list_del(&va_page->list);
+ isgx_free_epc_page(va_page->epc_page, NULL, 0);
+ kfree(va_page);
+ }
+
+ if (enclave->secs_page.epc_page)
+ isgx_free_epc_page(enclave->secs_page.epc_page, NULL, 0);
+
+ enclave->secs_page.epc_page = NULL;
+
+ if (enclave->tgid_ctx)
+ kref_put(&enclave->tgid_ctx->refcount, release_tgid_ctx);
+
+ if (enclave->backing)
+ fput(enclave->backing);
+
+ kfree(enclave);
+}
diff --git a/drivers/staging/intel_sgx/isgx_vma.c b/drivers/staging/intel_sgx/isgx_vma.c
new file mode 100644
index 0000000..f6cfb02
--- /dev/null
+++ b/drivers/staging/intel_sgx/isgx_vma.c
@@ -0,0 +1,282 @@
+/*
+ * (C) Copyright 2016 Intel Corporation
+ *
+ * Authors:
+ *
+ * Jarkko Sakkinen <[email protected]>
+ * Suresh Siddha <[email protected]>
+ * Serge Ayoun <[email protected]>
+ * Shay Katz-zamir <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include "isgx.h"
+#include <asm/mman.h>
+#include <linux/delay.h>
+#include <linux/file.h>
+#include <linux/highmem.h>
+#include <linux/ratelimit.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/hashtable.h>
+#include <linux/shmem_fs.h>
+
+static void isgx_vma_open(struct vm_area_struct *vma)
+{
+ struct isgx_enclave *enclave;
+ struct isgx_vma *evma;
+
+ /* Was vm_private_data nullified as a result of the previous fork? */
+ enclave = vma->vm_private_data;
+ if (!enclave)
+ goto out_fork;
+
+ /* Was the process forked? mm_struct changes when the process is
+ * forked.
+ */
+ mutex_lock(&enclave->lock);
+ evma = list_first_entry(&enclave->vma_list,
+ struct isgx_vma, vma_list);
+ if (evma->vma->vm_mm != vma->vm_mm) {
+ mutex_unlock(&enclave->lock);
+ goto out_fork;
+ }
+ mutex_unlock(&enclave->lock);
+
+ mutex_lock(&enclave->lock);
+ if (!list_empty(&enclave->vma_list)) {
+ evma = kzalloc(sizeof(*evma), GFP_KERNEL);
+ if (!evma) {
+ isgx_invalidate(enclave);
+ } else {
+ evma->vma = vma;
+ list_add_tail(&evma->vma_list, &enclave->vma_list);
+ }
+ }
+ mutex_unlock(&enclave->lock);
+
+ kref_get(&enclave->refcount);
+ return;
+out_fork:
+ zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
+ vma->vm_private_data = NULL;
+}
+
+static void isgx_vma_close(struct vm_area_struct *vma)
+{
+ struct isgx_enclave *enclave = vma->vm_private_data;
+ struct isgx_vma *evma;
+
+ /* If process was forked, VMA is still there but
+ * vm_private_data is set to NULL.
+ */
+ if (!enclave)
+ return;
+
+ mutex_lock(&enclave->lock);
+
+ /* On vma_close() we remove the VMA from vma_list.
+ * It is possible that evma is not found: if
+ * vma_open() failed on memory allocation, the
+ * VMA list has already been emptied.
+ */
+ evma = isgx_find_vma(enclave, vma->vm_start);
+ if (evma) {
+ list_del(&evma->vma_list);
+ kfree(evma);
+ }
+
+ vma->vm_private_data = NULL;
+
+ isgx_zap_tcs_ptes(enclave, vma);
+ zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
+
+ mutex_unlock(&enclave->lock);
+
+ kref_put(&enclave->refcount, isgx_enclave_release);
+}
+
+static int do_eldu(struct isgx_enclave *enclave,
+ struct isgx_enclave_page *enclave_page,
+ struct isgx_epc_page *epc_page,
+ struct page *backing,
+ bool is_secs)
+{
+ struct sgx_page_info pginfo;
+ void *secs_ptr = NULL;
+ void *epc_ptr;
+ void *va_ptr;
+ int ret;
+
+ pginfo.srcpge = (unsigned long)kmap_atomic(backing);
+ if (!is_secs)
+ secs_ptr = isgx_get_epc_page(enclave->secs_page.epc_page);
+ pginfo.secs = (unsigned long)secs_ptr;
+
+ epc_ptr = isgx_get_epc_page(epc_page);
+ va_ptr = isgx_get_epc_page(enclave_page->va_page->epc_page);
+
+ pginfo.linaddr = is_secs ? 0 : enclave_page->addr;
+ pginfo.pcmd = (unsigned long)&enclave_page->pcmd;
+
+ ret = __eldu((unsigned long)&pginfo,
+ (unsigned long)epc_ptr,
+ (unsigned long)va_ptr +
+ enclave_page->va_offset);
+
+ isgx_put_epc_page(va_ptr);
+ isgx_put_epc_page(epc_ptr);
+
+ if (!is_secs)
+ isgx_put_epc_page(secs_ptr);
+
+ kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
+ WARN_ON(ret);
+ if (ret)
+ return -EFAULT;
+
+ return 0;
+}
+
+static struct isgx_enclave_page *isgx_vma_do_fault(struct vm_area_struct *vma,
+ unsigned long addr,
+ int reserve)
+{
+ struct isgx_enclave *enclave = vma->vm_private_data;
+ struct isgx_enclave_page *entry;
+ struct isgx_epc_page *epc_page;
+ struct isgx_epc_page *secs_epc_page = NULL;
+ struct page *backing;
+ unsigned free_flags = ISGX_FREE_SKIP_EREMOVE;
+ int rc;
+
+ /* If process was forked, VMA is still there but vm_private_data is set
+ * to NULL.
+ */
+ if (!enclave)
+ return ERR_PTR(-EFAULT);
+
+ entry = isgx_enclave_find_page(enclave, addr);
+ if (!entry)
+ return ERR_PTR(-EFAULT);
+
+ /* We use atomic allocation in the #PF handler in order to avoid ABBA
+ * deadlock with mmap_sems.
+ */
+ epc_page = isgx_alloc_epc_page(enclave->tgid_ctx, ISGX_ALLOC_ATOMIC);
+ if (IS_ERR(epc_page))
+ return (struct isgx_enclave_page *)epc_page;
+
+ /* The SECS page is not currently accounted. */
+ secs_epc_page = isgx_alloc_epc_page(NULL, ISGX_ALLOC_ATOMIC);
+ if (IS_ERR(secs_epc_page)) {
+ isgx_free_epc_page(epc_page, enclave, ISGX_FREE_SKIP_EREMOVE);
+ return (struct isgx_enclave_page *)secs_epc_page;
+ }
+
+ mutex_lock(&enclave->lock);
+
+ if (list_empty(&enclave->vma_list)) {
+ entry = ERR_PTR(-EFAULT);
+ goto out;
+ }
+
+ if (!(enclave->flags & ISGX_ENCLAVE_INITIALIZED)) {
+ isgx_dbg(enclave, "cannot fault, uninitialized\n");
+ entry = ERR_PTR(-EFAULT);
+ goto out;
+ }
+
+ if (reserve && (entry->flags & ISGX_ENCLAVE_PAGE_RESERVED)) {
+ isgx_dbg(enclave, "cannot fault, 0x%lx is reserved\n",
+ entry->addr);
+ entry = ERR_PTR(-EBUSY);
+ goto out;
+ }
+
+ /* Legal race condition, page is already faulted. */
+ if (entry->epc_page) {
+ if (reserve)
+ entry->flags |= ISGX_ENCLAVE_PAGE_RESERVED;
+ goto out;
+ }
+
+ /* If SECS is evicted then reload it first */
+ if (enclave->flags & ISGX_ENCLAVE_SECS_EVICTED) {
+ backing = isgx_get_backing(enclave, &enclave->secs_page);
+ if (IS_ERR(backing)) {
+ entry = (void *)backing;
+ goto out;
+ }
+
+ rc = do_eldu(enclave, &enclave->secs_page, secs_epc_page,
+ backing, true /* is_secs */);
+ isgx_put_backing(backing, 0);
+ if (rc)
+ goto out;
+
+ enclave->secs_page.epc_page = secs_epc_page;
+ enclave->flags &= ~ISGX_ENCLAVE_SECS_EVICTED;
+
+ /* Do not free */
+ secs_epc_page = NULL;
+ }
+
+ backing = isgx_get_backing(enclave, entry);
+ if (IS_ERR(backing)) {
+ entry = (void *)backing;
+ goto out;
+ }
+
+ do_eldu(enclave, entry, epc_page, backing, false /* is_secs */);
+ rc = vm_insert_pfn(vma, entry->addr, PFN_DOWN(epc_page->pa));
+ isgx_put_backing(backing, 0);
+
+ if (rc) {
+ free_flags = 0;
+ goto out;
+ }
+
+ enclave->secs_child_cnt++;
+
+ entry->epc_page = epc_page;
+
+ if (reserve)
+ entry->flags |= ISGX_ENCLAVE_PAGE_RESERVED;
+
+ /* Do not free */
+ epc_page = NULL;
+
+ list_add_tail(&entry->load_list, &enclave->load_list);
+out:
+ mutex_unlock(&enclave->lock);
+ if (epc_page)
+ isgx_free_epc_page(epc_page, enclave, free_flags);
+ if (secs_epc_page)
+ isgx_free_epc_page(secs_epc_page, NULL,
+ ISGX_FREE_SKIP_EREMOVE);
+ return entry;
+}
+
+static int isgx_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+ unsigned long addr = (unsigned long)vmf->virtual_address;
+ struct isgx_enclave_page *entry;
+
+ entry = isgx_vma_do_fault(vma, addr, 0);
+
+ if (!IS_ERR(entry) || PTR_ERR(entry) == -EBUSY)
+ return VM_FAULT_NOPAGE;
+ else
+ return VM_FAULT_SIGBUS;
+}
+
+struct vm_operations_struct isgx_vm_ops = {
+ .close = isgx_vma_close,
+ .open = isgx_vma_open,
+ .fault = isgx_vma_fault,
+};
--
2.7.4
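Editorial note on isgx_get_unmapped_area() in the patch above: the function accepts only power-of-two lengths of at least two pages, asks the mm for a region of 2 * len, and then rounds the returned address up to a multiple of len, presumably because an enclave's linear range has to be naturally aligned to its size. A minimal userspace sketch of that arithmetic follows. The names enclave_len_ok(), align_up_pow2() and SKETCH_PAGE_SIZE are hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for the sketch; the driver uses the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Mirrors the length check: at least two pages and a power of two. */
static bool enclave_len_ok(uint64_t len)
{
	return len >= 2 * SKETCH_PAGE_SIZE && (len & (len - 1)) == 0;
}

/*
 * Round addr up to the next multiple of len, where len is a power of two.
 * If the caller reserved 2 * len bytes starting at addr, the aligned
 * range [result, result + len) always fits inside that reservation.
 */
static uint64_t align_up_pow2(uint64_t addr, uint64_t len)
{
	assert(len != 0 && (len & (len - 1)) == 0);
	return (addr + (len - 1)) & ~(len - 1);
}
```

Over-allocating by a factor of two is what makes the round-up safe: in the worst case the start moves forward by len - 1 bytes, which still leaves len bytes before the end of the reservation.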

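Editorial note on the pager in isgx_page_cache.c above: isgx_alloc_epc_page() wakes kisgxswapd once the free count drops below the low watermark, and the thread keeps swapping clusters until the free count reaches the high watermark, which isgx_page_cache_init() sets to twice the low one. The following is a simplified single-threaded model of that hysteresis, not the driver's code. The struct, the function names and the numbers used with it are hypothetical, and the model ignores locking and the direct reclaim done by non-atomic allocations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the free-page pool; not the driver's own types. */
struct epc_pool_model {
	unsigned int nr_free;
	unsigned int nr_low;	/* swapper is woken below this */
	unsigned int nr_high;	/* swapper reclaims until this is reached */
	bool swapper_awake;
};

static void model_init(struct epc_pool_model *p, unsigned int total,
		       unsigned int low)
{
	p->nr_free = total;
	p->nr_low = low;
	p->nr_high = 2 * low;	/* same ratio as isgx_page_cache_init() */
	p->swapper_awake = false;
}

/* Take one page if available and wake the swapper below the low mark. */
static bool model_alloc(struct epc_pool_model *p)
{
	bool ok = p->nr_free > 0;

	if (ok)
		p->nr_free--;
	if (p->nr_free < p->nr_low)
		p->swapper_awake = true;
	return ok;
}

/*
 * One pass of the swapper loop. The model pretends that each pass frees
 * exactly 'cluster' pages; the real thread frees at most one cluster.
 */
static void model_swapper_pass(struct epc_pool_model *p, unsigned int cluster)
{
	if (!p->swapper_awake)
		return;
	if (p->nr_free < p->nr_high)
		p->nr_free += cluster;
	else
		p->swapper_awake = false;	/* back to the wait queue */
}
```

The gap between the two watermarks keeps the thread from being woken and put back to sleep on every allocation near the threshold.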
2016-04-25 17:38:43

by Jarkko Sakkinen

Subject: [PATCH 6/6] intel_sgx: TODO file for the staging area

Signed-off-by: Jarkko Sakkinen <[email protected]>
---
drivers/staging/intel_sgx/TODO | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
create mode 100644 drivers/staging/intel_sgx/TODO

diff --git a/drivers/staging/intel_sgx/TODO b/drivers/staging/intel_sgx/TODO
new file mode 100644
index 0000000..05f68c2
--- /dev/null
+++ b/drivers/staging/intel_sgx/TODO
@@ -0,0 +1,25 @@
+Documentation
+=============
+
+* Improve Documentation/x86/intel_sgx.txt based on the feedback and
+ questions that pop up.
+
+Internals
+=========
+
+* Move structures needed by the allocator to arch/x86/include/asm/sgx.h
+* Move EPC page allocation and eviction code to arch/x86/mm as they
+ will be shared with virtualization code.
+* Move enclave management functions to arch/x86/mm as they will be
+ shared with virtualization code.
+* Use reserve_memtype() in order to add EPC to the PAT memtype list
+ with WB caching.
+* Implement proper recovery code for the pager for cases when
+ ETRACK/EBLOCK/EWB fail, instead of calling BUG_ON(). Probably the
+ sanest way to recover is to clear TCS PTEs, kick threads out of the
+ enclave and remove EPC pages.
+* Implement ACPI hot-plug for SGX.
+
+===
+
+* Move isgx_user.h to arch/x86/include/uapi/asm/sgx.h
--
2.7.4

2016-04-25 17:54:00

by Greg Kroah-Hartman

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Mon, Apr 25, 2016 at 08:34:07PM +0300, Jarkko Sakkinen wrote:
> Intel(R) SGX is a set of CPU instructions that can be used by
> applications to set aside private regions of code and data. The code
> outside the enclave is disallowed to access the memory inside the
> enclave by the CPU access control.
>
> The firmware uses PRMRR registers to reserve an area of physical memory
> called Enclave Page Cache (EPC). There is a hardware unit in the
> processor called Memory Encryption Engine. The MEE encrypts and decrypts
> the EPC pages as they enter and leave the processor package.
>
> Jarkko Sakkinen (5):
> x86, sgx: common macros and definitions
> intel_sgx: driver for Intel Secure Guard eXtensions
> intel_sgx: ptrace() support for the driver
> intel_sgx: driver documentation
> intel_sgx: TODO file for the staging area
>
> Kai Huang (1):
> x86: add SGX definition to cpufeature
>
> Documentation/x86/intel_sgx.txt | 86 +++
> arch/x86/include/asm/cpufeature.h | 1 +
> arch/x86/include/asm/sgx.h | 253 +++++++

Why are you asking for this to go into staging?

What is keeping it out of the "real" part of the kernel tree?

And staging code is self-contained, putting files in arch/* isn't ok for
it, which kind of implies that you should get this merged correctly.

I need a lot more information here before I can take this code...

thanks,

greg k-h

2016-04-25 17:54:34

by Greg Kroah-Hartman

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

On Mon, Apr 25, 2016 at 08:34:13PM +0300, Jarkko Sakkinen wrote:
> Signed-off-by: Jarkko Sakkinen <[email protected]>
> ---
> drivers/staging/intel_sgx/TODO | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
> create mode 100644 drivers/staging/intel_sgx/TODO
>
> diff --git a/drivers/staging/intel_sgx/TODO b/drivers/staging/intel_sgx/TODO
> new file mode 100644
> index 0000000..05f68c2
> --- /dev/null
> +++ b/drivers/staging/intel_sgx/TODO
> @@ -0,0 +1,25 @@
> +Documentation
> +=============
> +
> +* Improve Documents/x86/intel-sgx.txt based on the feedback and
> + questions that pop up.
> +
> +Internals
> +=========
> +
> +* Move structures needed by the allocator to arch/x86/include/asm/sgx.h
> +* Move EPC page allocation and eviction code to arch/x86/mm as they
> + will shared with virtualization code.
> +* Move enclave management functions to arch/x86/mm as they will be
> + shared with virtualization code.
> +* Use reserve_memtype() in order to add EPC to the PAT memtype list
> + with WB caching.
> +* Implement proper recovery code for the pager for cases when
> + ETRACK/EBLOCK/EWB fails instead of BUG_ON(). Probably the sanest
> + way to recover is to clear TCS PTEs, kick threads out of enclave
> + and remove EPC pages.
> +* Implement ACPI hot-lug for SGX.

What is keeping you from doing all of this work this week, making this
todo list empty?

thanks,

greg k-h

2016-04-25 17:55:39

by Greg Kroah-Hartman

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On Mon, Apr 25, 2016 at 08:34:10PM +0300, Jarkko Sakkinen wrote:
> Intel(R) SGX is a set of CPU instructions that can be used by
> applications to set aside private regions of code and data. The code
> outside the enclave is disallowed to access the memory inside the
> enclave by the CPU access control.
>
> Intel SGX driver provides a ioctl interface for loading and initializing
> enclaves and a pager in order to support oversubscription.
>
> Signed-off-by: Jarkko Sakkinen <[email protected]>
> ---
> arch/x86/include/asm/sgx.h | 4 +-
> drivers/staging/Kconfig | 2 +
> drivers/staging/Makefile | 1 +
> drivers/staging/intel_sgx/Kconfig | 13 +
> drivers/staging/intel_sgx/Makefile | 12 +
> drivers/staging/intel_sgx/isgx.h | 238 +++++++
> drivers/staging/intel_sgx/isgx_compat_ioctl.c | 179 +++++

No "new" kernel code should ever need compat_ioctl support, please just
create your structures to not require this at all, it isn't that
difficult.

thanks,

greg k-h
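
To illustrate the point about compat_ioctl, here is a minimal sketch. It
assumes a hypothetical argument structure: struct example_add_page_arg and
its field names are made up for this note and are not taken from
isgx_user.h. The idea is to use only fixed-width members, carry user
pointers as 64-bit integers, and pad explicitly, so that 32-bit and 64-bit
callers see the same layout and no translation layer is needed. In a kernel
UAPI header the types would be __u64/__u32; <stdint.h> types are used here
only so the sketch builds in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical ioctl argument (illustration only, not the driver's ABI).
 *
 * The 64-bit members come first and the two 32-bit members fill the last
 * eight bytes. This matters because i386 aligns 64-bit struct members to
 * 4 bytes while x86-64 aligns them to 8, so mixing the order can produce
 * different offsets for 32-bit and 64-bit user space.
 */
struct example_add_page_arg {
	uint64_t enclave_addr;	/* target address inside the enclave */
	uint64_t src;		/* user pointer carried as an integer */
	uint64_t secinfo;	/* user pointer carried as an integer */
	uint32_t flags;
	uint32_t reserved;	/* explicit padding, must be zero */
};

/* The layout is checked at compile time for whichever ABI is in use. */
static_assert(sizeof(struct example_add_page_arg) == 32,
	      "argument size must not depend on the ABI");
static_assert(offsetof(struct example_add_page_arg, flags) == 24,
	      "no hidden padding before flags");

/* User-space helper: widen a pointer to the 64-bit field. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Inverse of ptr_to_u64(), valid for values produced by it. */
static inline void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

A caller would fill src with ptr_to_u64(buffer); on the kernel side the
existing u64_to_user_ptr() helper performs the reverse conversion.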

2016-04-25 18:56:39

by Jarkko Sakkinen

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

On Mon, Apr 25, 2016 at 10:54:26AM -0700, Greg KH wrote:
> On Mon, Apr 25, 2016 at 08:34:13PM +0300, Jarkko Sakkinen wrote:
> > Signed-off-by: Jarkko Sakkinen <[email protected]>
> > ---
> > drivers/staging/intel_sgx/TODO | 25 +++++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> > create mode 100644 drivers/staging/intel_sgx/TODO
> >
> > diff --git a/drivers/staging/intel_sgx/TODO b/drivers/staging/intel_sgx/TODO
> > new file mode 100644
> > index 0000000..05f68c2
> > --- /dev/null
> > +++ b/drivers/staging/intel_sgx/TODO
> > @@ -0,0 +1,25 @@
> > +Documentation
> > +=============
> > +
> > +* Improve Documents/x86/intel-sgx.txt based on the feedback and
> > + questions that pop up.
> > +
> > +Internals
> > +=========
> > +
> > +* Move structures needed by the allocator to arch/x86/include/asm/sgx.h
> > +* Move EPC page allocation and eviction code to arch/x86/mm as they
> > + will shared with virtualization code.
> > +* Move enclave management functions to arch/x86/mm as they will be
> > + shared with virtualization code.
> > +* Use reserve_memtype() in order to add EPC to the PAT memtype list
> > + with WB caching.
> > +* Implement proper recovery code for the pager for cases when
> > + ETRACK/EBLOCK/EWB fails instead of BUG_ON(). Probably the sanest
> > + way to recover is to clear TCS PTEs, kick threads out of enclave
> > + and remove EPC pages.
> > +* Implement ACPI hot-lug for SGX.
>
> What is keeping you from doing all of this work this week, making this
> todo list empty?

I could. I took the internal driver code and just enumerated the tasks
that I saw that need to be done before it's ready from my point of view.
I just wanted initial feedback before starting to work through these so
that I know my aim is right.

> thanks,
>
> greg k-h

/Jarkko

2016-04-25 19:04:09

by Jarkko Sakkinen

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Mon, Apr 25, 2016 at 10:53:52AM -0700, Greg KH wrote:
> On Mon, Apr 25, 2016 at 08:34:07PM +0300, Jarkko Sakkinen wrote:
> > Intel(R) SGX is a set of CPU instructions that can be used by
> > applications to set aside private regions of code and data. The code
> > outside the enclave is disallowed to access the memory inside the
> > enclave by the CPU access control.
> >
> > The firmware uses PRMRR registers to reserve an area of physical memory
> > called Enclave Page Cache (EPC). There is a hardware unit in the
> > processor called Memory Encryption Engine. The MEE encrypts and decrypts
> > the EPC pages as they enter and leave the processor package.
> >
> > Jarkko Sakkinen (5):
> > x86, sgx: common macros and definitions
> > intel_sgx: driver for Intel Secure Guard eXtensions
> > intel_sgx: ptrace() support for the driver
> > intel_sgx: driver documentation
> > intel_sgx: TODO file for the staging area
> >
> > Kai Huang (1):
> > x86: add SGX definition to cpufeature
> >
> > Documentation/x86/intel_sgx.txt | 86 +++
> > arch/x86/include/asm/cpufeature.h | 1 +
> > arch/x86/include/asm/sgx.h | 253 +++++++
>
> Why are you asking for this to go into staging?
>
> What is keeping it out of the "real" part of the kernel tree?

Now that I think of it, nothing, as long as the API is fixed the way you
suggested and my TODO list is cleared.

I think I'll prepare a new version of the patches and point it directly
to arch/x86.

> And staging code is self-contained, putting files in arch/* isn't ok for
> it, which kind of implies that you should get this merged correctly.
>
> I need a lot more information here before I can take this code...
>
> thanks,
>
> greg k-h

/Jarkko

2016-04-25 19:04:52

by Jarkko Sakkinen

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On Mon, Apr 25, 2016 at 10:55:36AM -0700, Greg KH wrote:
> On Mon, Apr 25, 2016 at 08:34:10PM +0300, Jarkko Sakkinen wrote:
> > Intel(R) SGX is a set of CPU instructions that can be used by
> > applications to set aside private regions of code and data. The code
> > outside the enclave is disallowed to access the memory inside the
> > enclave by the CPU access control.
> >
> > Intel SGX driver provides a ioctl interface for loading and initializing
> > enclaves and a pager in order to support oversubscription.
> >
> > Signed-off-by: Jarkko Sakkinen <[email protected]>
> > ---
> > arch/x86/include/asm/sgx.h | 4 +-
> > drivers/staging/Kconfig | 2 +
> > drivers/staging/Makefile | 1 +
> > drivers/staging/intel_sgx/Kconfig | 13 +
> > drivers/staging/intel_sgx/Makefile | 12 +
> > drivers/staging/intel_sgx/isgx.h | 238 +++++++
> > drivers/staging/intel_sgx/isgx_compat_ioctl.c | 179 +++++
>
> No "new" kernel code should ever need compat_ioctl support, please just
> create your structures to not require this at all, it isn't that
> difficult.

I'll rework this. Thanks for the feedback.

> thanks,
>
> greg k-h

/Jarkko

2016-04-25 19:06:47

by Alan Cox

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

> +* Implement ACPI hot-lug for SGX.

hot-plug

Also with an upstream hat on I would add being able to check the keys on
the enclave against a kernel keychain because not everyone will want to
solely trust whatever keys the hardware thinks it wants to trust.

Alan

2016-04-25 19:21:28

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Mon, Apr 25, 2016 at 12:03 PM, Jarkko Sakkinen
<[email protected]> wrote:
> On Mon, Apr 25, 2016 at 10:53:52AM -0700, Greg KH wrote:
>> On Mon, Apr 25, 2016 at 08:34:07PM +0300, Jarkko Sakkinen wrote:
>> > Intel(R) SGX is a set of CPU instructions that can be used by
>> > applications to set aside private regions of code and data. The code
>> > outside the enclave is disallowed to access the memory inside the
>> > enclave by the CPU access control.
>> >
>> > The firmware uses PRMRR registers to reserve an area of physical memory
>> > called Enclave Page Cache (EPC). There is a hardware unit in the
>> > processor called Memory Encryption Engine. The MEE encrypts and decrypts
>> > the EPC pages as they enter and leave the processor package.
>> >
>> > Jarkko Sakkinen (5):
>> > x86, sgx: common macros and definitions
>> > intel_sgx: driver for Intel Secure Guard eXtensions
>> > intel_sgx: ptrace() support for the driver
>> > intel_sgx: driver documentation
>> > intel_sgx: TODO file for the staging area
>> >
>> > Kai Huang (1):
>> > x86: add SGX definition to cpufeature
>> >
>> > Documentation/x86/intel_sgx.txt | 86 +++
>> > arch/x86/include/asm/cpufeature.h | 1 +
>> > arch/x86/include/asm/sgx.h | 253 +++++++
>>
>> Why are you asking for this to go into staging?
>>
>> What is keeping it out of the "real" part of the kernel tree?
>
> Now that I think of it nothing as long as the API is fixed the way you
> suggested and my TODO list is cleared.
>
> I think I prepare a new version of the patches and point it directly
> to arch/x86.

Thanks. Please cc me as well.

--Andy

>
>> And staging code is self-contained, putting files in arch/* isn't ok for
>> it, which kind of implies that you should get this merged correctly.
>>
>> I need a lot more information here before I can take this code...
>>
>> thanks,
>>
>> greg k-h
>
> /Jarkko



--
Andy Lutomirski
AMA Capital Management, LLC

2016-04-25 19:32:14

by Andy Lutomirski

Subject: Re: [PATCH 1/6] x86: add SGX definition to cpufeature

On Mon, Apr 25, 2016 at 10:34 AM, Jarkko Sakkinen
<[email protected]> wrote:
> From: Kai Huang <[email protected]>

Should this come with a nosgx boot option and an IA32_FEATURE_CONTROL
check to disable the feature if BIOS doesn't support it?

--Andy
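
The firmware check asked about here could look roughly like the following.
This is a minimal sketch, assuming the bit layout the Intel SDM documents
for IA32_FEATURE_CONTROL (MSR 0x3a): bit 0 is the lock bit and bit 18 is
the SGX enable bit. The macro and function names are illustrative, and the
MSR value is passed in as a plain integer because reading it needs ring 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MSR_IA32_FEATURE_CONTROL	0x3a

/* Bit positions as documented in the SDM; names are illustrative. */
#define EXAMPLE_FEATURE_CONTROL_LOCKED		(1ULL << 0)
#define EXAMPLE_FEATURE_CONTROL_SGX_ENABLE	(1ULL << 18)

/*
 * Return true only if firmware both enabled SGX and locked the MSR.
 * With the lock bit clear the configuration has not been finalised,
 * so the feature is treated as unavailable.
 */
static bool example_sgx_enabled_by_firmware(uint64_t feature_control)
{
	const uint64_t required = EXAMPLE_FEATURE_CONTROL_LOCKED |
				  EXAMPLE_FEATURE_CONTROL_SGX_ENABLE;

	return (feature_control & required) == required;
}
```

In the kernel the value would come from rdmsrl() during CPU setup, and a
false result would be the place to clear the SGX feature flag.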

2016-04-25 19:48:57

by Andi Kleen

Subject: Re: [PATCH 1/6] x86: add SGX definition to cpufeature

Andy Lutomirski <[email protected]> writes:

> On Mon, Apr 25, 2016 at 10:34 AM, Jarkko Sakkinen
> <[email protected]> wrote:
>> From: Kai Huang <[email protected]>
>
> Should this come with a nosgx boot option and an IA32_FEATURE_CONTROL
> check to disable the feature if BIOS doesn't support it?

You can already disable every CPUID feature at boot with clearcpuid=...
No point in adding redundant options for everything.

-Andi

--
[email protected] -- Speaking for myself only
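
For reference, clearcpuid= takes the kernel's internal feature number,
which is word * 32 + bit. The sketch below works the arithmetic under the
assumption that the SGX flag from patch 1/6 lives in word 9 (the
CPUID.(EAX=7,ECX=0):EBX word) at bit 2, matching the CPUID bit for SGX.
The helper names are made up for this note.

```c
#include <stddef.h>
#include <stdio.h>

#define EXAMPLE_BITS_PER_WORD	32u

/* Assumed position of the SGX flag: word 9, bit 2 (see lead-in). */
#define EXAMPLE_SGX_WORD	9u
#define EXAMPLE_SGX_BIT		2u

/* Feature number as used by clearcpuid=: word * 32 + bit. */
static unsigned int example_feature_number(unsigned int word,
					   unsigned int bit)
{
	return word * EXAMPLE_BITS_PER_WORD + bit;
}

/*
 * Format the boot parameter into buf. Returns the length snprintf()
 * reports, i.e. the length the full string needs.
 */
static int example_format_clearcpuid(char *buf, size_t len,
				     unsigned int word, unsigned int bit)
{
	return snprintf(buf, len, "clearcpuid=%u",
			example_feature_number(word, bit));
}
```

Under that assumption the parameter works out to clearcpuid=290.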

2016-04-25 20:01:14

by Andy Lutomirski

Subject: Re: [PATCH 5/6] intel_sgx: driver documentation

On 04/25/2016 10:34 AM, Jarkko Sakkinen wrote:
> +SGX_IOCTL_ENCLAVE_INIT
> +
> +Initializes an enclave given by SIGSTRUCT and EINITTOKEN. Executes EINIT leaf
> +instruction that will check that the measurement matches the one SIGSTRUCT and
> +EINITTOKEN. EINITTOKEN is a data blob given by a special enclave called Launch
> +Enclave and it is signed with a CPU's Launch Key.
>

Having thought about this for ten minutes, I have the following thought:

I think that we should seriously consider not allowing user code to
supply EINITTOKEN at all. Here's why:

1. The nominal purpose of this thing is "launch control." I think that
the decision of whether to launch an enclave belongs in the kernel to
the extent that the kernel has the ability to control this.

2. I think that launch control is actively insecure (assuming that the
use case is what I think it is). Since the kernel might have some
interest in controlling whether an enclave can launch (I think this is
entirely reasonable) and since that policy might reasonably be expressed
in the form of a launch enclave, I think that the *kernel* should
generate the actual EINITTOKEN object. (I also reported, off-list, what
I think is a significant security issue under some usage models that is
mitigated if the user isn't allowed to supply their own EINITTOKEN of
unknown provenance.)

3. On a CPU with unlocked IA32_SGXLEPUBKEYHASH, I think that the kernel
should ship, *in the kernel image*, a binary corresponding to an
open-source "launch anything" enclave. The kernel should, when
appropriate, use this thing to generate EINITTOKEN objects. User code
should *not* have to think about where this "launch anything" enclave
comes from or whether it's the same on all kernels. (I think that the
best way to do this would be to try to build it deterministically using
a well-known key pair. This should be very easy to do.) If someone
wants to turn this feature off, let them do so via sysctl.

If someone wants to supply their own launch enclave, then let them
either feed it to the kernel or enable some non-default privileged
option to allow them to supply EINITTOKEN directly.

Actually implementing this is going to be interesting, because the
kernel will have to call out to CPL 3 to do it. It's not *that* hard,
though, as the existing kernel thread API should be more or less adequate.

--Andy

2016-04-25 20:01:30

by Andi Kleen

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

Jarkko Sakkinen <[email protected]> writes:


> diff --git a/drivers/staging/intel_sgx/TODO b/drivers/staging/intel_sgx/TODO
> new file mode 100644
> index 0000000..05f68c2
> --- /dev/null
> +++ b/drivers/staging/intel_sgx/TODO
> @@ -0,0 +1,25 @@
> +Documentation
> +=============
> +
> +* Improve Documents/x86/intel-sgx.txt based on the feedback and
> + questions that pop up.
> +
> +Internals
> +=========
> +
> +* Move structures needed by the allocator to arch/x86/include/asm/sgx.h
> +* Move EPC page allocation and eviction code to arch/x86/mm as they
> + will shared with virtualization code.
> +* Move enclave management functions to arch/x86/mm as they will be
> + shared with virtualization code.
> +* Use reserve_memtype() in order to add EPC to the PAT memtype list
> + with WB caching.
> +* Implement proper recovery code for the pager for cases when
> + ETRACK/EBLOCK/EWB fails instead of BUG_ON(). Probably the sanest
> + way to recover is to clear TCS PTEs, kick threads out of enclave
> + and remove EPC pages.
> +* Implement ACPI hot-lug for SGX.

- Write proper patch descriptions.

Especially how the "new VM" in 3/6 works needs a lot more explanation ...

- Add some test code

-Andi

2016-04-26 11:23:47

by Jarkko Sakkinen

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

On Mon, Apr 25, 2016 at 01:01:24PM -0700, Andi Kleen wrote:
> Jarkko Sakkinen <[email protected]> writes:
>
>
> > diff --git a/drivers/staging/intel_sgx/TODO b/drivers/staging/intel_sgx/TODO
> > new file mode 100644
> > index 0000000..05f68c2
> > --- /dev/null
> > +++ b/drivers/staging/intel_sgx/TODO
> > @@ -0,0 +1,25 @@
> > +Documentation
> > +=============
> > +
> > +* Improve Documents/x86/intel-sgx.txt based on the feedback and
> > + questions that pop up.
> > +
> > +Internals
> > +=========
> > +
> > +* Move structures needed by the allocator to arch/x86/include/asm/sgx.h
> > +* Move EPC page allocation and eviction code to arch/x86/mm as they
> > + will shared with virtualization code.
> > +* Move enclave management functions to arch/x86/mm as they will be
> > + shared with virtualization code.
> > +* Use reserve_memtype() in order to add EPC to the PAT memtype list
> > + with WB caching.
> > +* Implement proper recovery code for the pager for cases when
> > + ETRACK/EBLOCK/EWB fails instead of BUG_ON(). Probably the sanest
> > + way to recover is to clear TCS PTEs, kick threads out of enclave
> > + and remove EPC pages.
> > +* Implement ACPI hot-lug for SGX.
>
> - Write proper patch descriptions.
>
> Especially how the "new VM" in 3/6 works needs a lot more explanation ...

Agreed. I now have an idea of how to improve this given the feedback so
far from you, Andy and Greg. Thanks. It was hard to figure out which areas
require more explanation before putting something out first.

> - Add some test code

Skylake, the only microarchitecture available at the moment supporting
SGX, does not support IA32_SGXLEPUBKEYHASH* MSRs documented in Volume 3C
of the Intel x86 SDM.

There will be an Open Source SDK available in the near future. It comes
with a Launch Enclave [1] that automatically generates EINITTOKENs for
debug enclaves. At the moment there is no process for signing production
enclaves with the Intel root of trust for Linux (there is a process for
Windows).

In order to write test code I would need to use the SDK at minimum to
generate EINITTOKEN for the test enclave.

[1] The source code is available but with Skylake you cannot sign your
own Launch Enclave binary, which will of course be possible in the future
when the MSRs become available for having your own root of trust.

> -Andi

/Jarkko

2016-04-26 19:00:16

by Pavel Machek

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
> Intel(R) SGX is a set of CPU instructions that can be used by
> applications to set aside private regions of code and data. The code
> outside the enclave is disallowed to access the memory inside the
> enclave by the CPU access control.
>
> The firmware uses PRMRR registers to reserve an area of physical memory
> called Enclave Page Cache (EPC). There is a hardware unit in the
> processor called Memory Encryption Engine. The MEE encrypts and decrypts
> the EPC pages as they enter and leave the processor package.

What are non-evil use cases for this?

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2016-04-26 19:06:12

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue, Apr 26, 2016 at 12:00 PM, Pavel Machek <[email protected]> wrote:
> On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
>> Intel(R) SGX is a set of CPU instructions that can be used by
>> applications to set aside private regions of code and data. The code
>> outside the enclave is disallowed to access the memory inside the
>> enclave by the CPU access control.
>>
>> The firmware uses PRMRR registers to reserve an area of physical memory
>> called Enclave Page Cache (EPC). There is a hardware unit in the
>> processor called Memory Encryption Engine. The MEE encrypts and decrypts
>> the EPC pages as they enter and leave the processor package.
>
> What are non-evil use cases for this?

Storing your ssh private key encrypted such that even someone who
completely compromises your system can't get the actual private key
out. Using this in conjunction with an RPMB device to make it Rather
Difficult (tm) for third parties to decrypt your disk even if your
password has low entropy. There are plenty more.

Think of this as the first time that a secure enclave will be widely
available that anyone can program themselves. Well, almost: Skylake
doesn't actually permit that for, ahem, reasons. But if you read the
very most recent SDM update, it would appear that future Intel CPUs
will allow this as long as the firmware doesn't get in the way. Look
for IA32_SGXLEHASHSIG (possibly misspelled slightly -- the first few
letters are correct).

--Andy

2016-04-26 19:41:24

by Pavel Machek

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue 2016-04-26 12:05:48, Andy Lutomirski wrote:
> On Tue, Apr 26, 2016 at 12:00 PM, Pavel Machek <[email protected]> wrote:
> > On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
> >> Intel(R) SGX is a set of CPU instructions that can be used by
> >> applications to set aside private regions of code and data. The code
> >> outside the enclave is disallowed to access the memory inside the
> >> enclave by the CPU access control.
> >>
> >> The firmware uses PRMRR registers to reserve an area of physical memory
> >> called Enclave Page Cache (EPC). There is a hardware unit in the
> >> processor called Memory Encryption Engine. The MEE encrypts and decrypts
> >> the EPC pages as they enter and leave the processor package.
> >
> > What are non-evil use cases for this?
>
> Storing your ssh private key encrypted such that even someone who
> completely compromises your system can't get the actual private key

Well, if someone gets root on my system, he can get my ssh private
key.... right?

So, you can use this to prevent "cold boot" attacks? (You know,
stealing machine, liquid nitrogen, moving DIMMs to different machine
to read them?) Ok. That's non-evil.

Is there reason not to enable this for whole RAM if the hw can do it?

> out. Using this in conjunction with an RPMB device to make it Rather
> Difficult (tm) for third parties to decrypt your disk even if you
> password has low entropy. There are plenty more.

I'm not sure what RPMB is, but I don't think you can make it too hard
to decrypt my disk if my password has low entropy. ... And I don't see
how encrypting RAM helps there.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2016-04-26 19:56:55

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue, Apr 26, 2016 at 12:41 PM, Pavel Machek <[email protected]> wrote:
> On Tue 2016-04-26 12:05:48, Andy Lutomirski wrote:
>> On Tue, Apr 26, 2016 at 12:00 PM, Pavel Machek <[email protected]> wrote:
>> > On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
>> >> Intel(R) SGX is a set of CPU instructions that can be used by
>> >> applications to set aside private regions of code and data. The code
>> >> outside the enclave is disallowed to access the memory inside the
>> >> enclave by the CPU access control.
>> >>
>> >> The firmware uses PRMRR registers to reserve an area of physical memory
>> >> called Enclave Page Cache (EPC). There is a hardware unit in the
>> >> processor called Memory Encryption Engine. The MEE encrypts and decrypts
>> >> the EPC pages as they enter and leave the processor package.
>> >
>> > What are non-evil use cases for this?
>>
>> Storing your ssh private key encrypted such that even someone who
>> completely compromises your system can't get the actual private key
>
> Well, if someone gets root on my system, he can get my ssh private
> key.... right?
>
> So, you can use this to prevent "cold boot" attacks? (You know,
> stealing machine, liquid nitrogen, moving DIMMs to different machine
> to read them?) Ok. That's non-evil.

Preventing cold boot attacks is really just icing on the cake. The
real point of this is to allow you to run an "enclave". An SGX
enclave has unencrypted code but gets access to a key that only it can
access. It could use that key to unwrap your ssh private key and sign
with it without ever revealing the unwrapped key. No one, not even
root, can read enclave memory once the enclave is initialized and gets
access to its personalized key. The point of the memory encryption
engine is to prevent even cold boot attacks from being used to read
enclave memory.

This could probably be used for evil, but I think the evil uses are
outweighed by the good uses.

>
> Is there reason not to enable this for whole RAM if the hw can do it?

The HW can't, at least not in the current implementation. Also, the
metadata has considerable overhead (no clue whether there's a
performance hit, but there's certainly a memory usage hit).

>
>> out. Using this in conjunction with an RPMB device to make it Rather
>> Difficult (tm) for third parties to decrypt your disk even if you
>> password has low entropy. There are plenty more.
>
> I'm not sure what RPMB is, but I don't think you can make it too hard
> to decrypt my disk if my password has low entropy. ... And I don't see
> how encrypting RAM helps there.

Replay Protected Memory Block. It's a device that allows someone to
write to it and confirm that the write happened and the old contents
is no longer available. You could use it to implement an enclave that
checks a password for your disk but only allows you to try a certain
number of times.

There are some hints in the whitepapers that such a mechanism might be
present on existing Skylake chipsets. I'm not really sure.
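
The attempt-limiting idea can be sketched as follows. This is only an
illustration under loud assumptions: struct example_rp_store is an
in-memory stand-in for replay-protected storage, the function and constant
names are invented for this note, and nothing here talks to real RPMB
hardware or runs inside an enclave. The point it shows is the ordering:
the attempt is recorded before the guess is checked, so interrupting the
check cannot win back a try.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_ATTEMPTS	3u

/*
 * Stand-in for replay-protected storage. A real device would guarantee
 * that an old counter value cannot be written back; this struct does not.
 */
struct example_rp_store {
	uint32_t failed_attempts;
};

/*
 * Check a password guess against the secret, allowing at most
 * EXAMPLE_MAX_ATTEMPTS consecutive failures.
 */
static bool example_try_unlock(struct example_rp_store *store,
			       const char *guess, const char *secret)
{
	if (store->failed_attempts >= EXAMPLE_MAX_ATTEMPTS)
		return false;		/* locked out for good */

	/* Record the attempt first, then compare. */
	store->failed_attempts++;

	/* strcmp() is not constant time; acceptable for a sketch only. */
	if (strcmp(guess, secret) != 0)
		return false;

	store->failed_attempts = 0;	/* success resets the budget */
	return true;
}
```

With a low-entropy password the security then rests on the counter being
impossible to roll back, which is the property RPMB is meant to provide.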

>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



--
Andy Lutomirski
AMA Capital Management, LLC

2016-04-26 20:12:00

by Pavel Machek

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

Hi!

> >> >> The firmware uses PRMRR registers to reserve an area of physical memory
> >> >> called Enclave Page Cache (EPC). There is a hardware unit in the
> >> >> processor called Memory Encryption Engine. The MEE encrypts and decrypts
> >> >> the EPC pages as they enter and leave the processor package.
> >> >
> >> > What are non-evil use cases for this?
> >>
> >> Storing your ssh private key encrypted such that even someone who
> >> completely compromises your system can't get the actual private key
> >
> > Well, if someone gets root on my system, he can get my ssh private
> > key.... right?
> >
> > So, you can use this to prevent "cold boot" attacks? (You know,
> > stealing machine, liquid nitrogen, moving DIMMs to different machine
> > to read them?) Ok. That's non-evil.
>
> Preventing cold boot attacks is really just icing on the cake. The
> real point of this is to allow you to run an "enclave". An SGX
> enclave has unencrypted code but gets access to a key that only it can
> access. It could use that key to unwrap your ssh private key and sign
> with it without ever revealing the unwrapped key. No one, not even
> root, can read enclave memory once the enclave is initialized and gets
> access to its personalized key. The point of the memory encryption
> engine is to prevent even cold boot attacks from being used to read
> enclave memory.

Ok, so the attacker can still access the "other" machine, but ok, key
is protected.

But... that will mean that my ssh will need to be SGX-aware, and that
I will not be able to switch to AMD machine in future. ... or to other
Intel machine for that matter, right?

What new syscalls would be needed for ssh to get all this support?

> > Is there reason not to enable this for whole RAM if the hw can do it?
>
> The HW can't, at least not in the current implementation. Also, the
> metadata has considerable overhead (no clue whether there's a
> performance hit, but there's certainly a memory usage hit).

:-(.

> >> out. Using this in conjunction with an RPMB device to make it Rather
> >> Difficult (tm) for third parties to decrypt your disk even if you
> >> password has low entropy. There are plenty more.
> >
> > I'm not sure what RPMB is, but I don't think you can make it too hard
> > to decrypt my disk if my password has low entropy. ... And I don't see
> > how encrypting RAM helps there.
>
> Replay Protected Memory Block. It's a device that allows someone to
> write to it and confirm that the write happened and the old contents
> is no longer available. You could use it to implement an enclave that
> checks a password for your disk but only allows you to try a certain
> number of times.

Ookay... I guess I can get a fake Replay Protected Memory block, which
will confirm that write happened and not do anything from China, but
ok, if you put that memory on the CPU, you raise the bar to a "rather
difficult" (tm) level. Nice.

But that also means that when my CPU dies, I'll no longer be able to
access the encrypted data.

And, again, it means that quite complex new kernel-user interface will
be needed, right?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2016-04-26 20:19:32

by Alan Cox

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

> Replay Protected Memory Block. It's a device that allows someone to
> write to it and confirm that the write happened and the old contents
> is no longer available. You could use it to implement an enclave that
> checks a password for your disk but only allows you to try a certain
> number of times.

rpmb is found in a load of hardware today notably MMC/SD cards. Android
phones often use it to store sensitive system data.

Alan

2016-04-26 20:22:30

by Alan Cox

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

> > Storing your ssh private key encrypted such that even someone who
> > completely compromises your system can't get the actual private key
>
> Well, if someone gets root on my system, he can get my ssh private
> key.... right?

Potentially not. If you are using a TPM or other TEE (such as SGX) they
can't because the authentication is done from within the TEE. They may be
able to hack your box and use the TEE to login somewhere but not to get
the key out.

Stopping the latter requires a TEE with its own secure input keypad (like
some of the USB dongles)

Other uses might be things like keeping a copy of the rpm database so you
can ask the TEE if the database you have right now happens to match the
one you signed as authentic. I suspect there are lots of interesting
things that can be done with dm_crypt and also IMA in this area too.

Alan

2016-04-26 21:00:32

by Alan Cox

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

> But... that will mean that my ssh will need to be SGX-aware, and that
> I will not be able to switch to AMD machine in future. ... or to other
> Intel machine for that matter, right?

I'm not privy to AMD's CPU design plans.

However I think for the ssl/ssh case you'd use the same interfaces
currently available for plugging in TPMs and dongles. It's a solved
problem in the crypto libraries.

> What new syscalls would be needed for ssh to get all this support?

I don't see why you'd need new syscalls.

> Ookay... I guess I can get a fake Replay Protected Memory block, which
> will confirm that write happened and not do anything from China, but

It's not quite that simple because there are keys and a counter involved
but I am sure it is doable.

> And, again, it means that quite complex new kernel-user interface will
> be needed, right?

Why ? For user space we have perfectly good existing system calls, for
kernel space we have existing interfaces to the crypto and key layers for
modules to use.

Alan

2016-04-26 21:52:41

by Pavel Machek

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue 2016-04-26 21:59:52, One Thousand Gnomes wrote:
> > But... that will mean that my ssh will need to be SGX-aware, and that
> > I will not be able to switch to AMD machine in future. ... or to other
> > Intel machine for that matter, right?
>
> I'm not privy to AMD's CPU design plans.
>
> However I think for the ssl/ssh case you'd use the same interfaces
> currently available for plugging in TPMs and dongles. It's a solved
> problem in the crypto libraries.
>
> > What new syscalls would be needed for ssh to get all this support?
>
> I don't see why you'd need new syscalls.

So the kernel will implement a few selected crypto algorithms, similar
to what TPM would provide, using SGX, and then userspace no longer
needs to know about SGX?

Ok, I guess that's simple.

It also means it is boring, and the multiuser-game-of-the-day will not
be able to protect the (plain text) password from a cold boot
attack.

Nor will emacs be able to protect the in-memory copy of my diary from
a cold boot attack.

So I guess yes, some new syscalls would be nice :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2016-04-26 22:34:18

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Apr 26, 2016 1:11 PM, "Pavel Machek" <[email protected]> wrote:
>
> Hi!
>
> > >> >> The firmware uses PRMRR registers to reserve an area of physical memory
> > >> >> called Enclave Page Cache (EPC). There is a hardware unit in the
> > >> >> processor called Memory Encryption Engine. The MEE encrypts and decrypts
> > >> >> the EPC pages as they enter and leave the processor package.
> > >> >
> > >> > What are non-evil use cases for this?
> > >>
> > >> Storing your ssh private key encrypted such that even someone who
> > >> completely compromises your system can't get the actual private key
> > >
> > > Well, if someone gets root on my system, he can get my ssh private
> > > key.... right?
> > >
> > > So, you can use this to prevent "cold boot" attacks? (You know,
> > > stealing machine, liquid nitrogen, moving DIMMs to different machine
> > > to read them?) Ok. That's non-evil.
> >
> > Preventing cold boot attacks is really just icing on the cake. The
> > real point of this is to allow you to run an "enclave". An SGX
> > enclave has unencrypted code but gets access to a key that only it can
> > access. It could use that key to unwrap your ssh private key and sign
> > with it without ever revealing the unwrapped key. No one, not even
> > root, can read enclave memory once the enclave is initialized and gets
> > access to its personalized key. The point of the memory encryption
> > > engine is to prevent even cold boot attacks from being used to read
> > enclave memory.
>
> Ok, so the attacker can still access the "other" machine, but ok, key
> is protected.
>
> But... that will mean that my ssh will need to be SGX-aware, and that
> I will not be able to switch to AMD machine in future. ... or to other
> Intel machine for that matter, right?

That's the whole point. You could keep an unwrapped copy of the key
offline so you could provision another machine if needed.

>
> What new syscalls would be needed for ssh to get all this support?

This patchset or similar, plus some user code and an enclave to use.

Sadly, on current CPUs, you also need Intel to bless the enclave. It
looks like new CPUs might relax that requirement.

>
> > > Is there reason not to enable this for whole RAM if the hw can do it?
> >
> > The HW can't, at least not in the current implementation. Also, the
> > metadata has considerable overhead (no clue whether there's a
> > performance hit, but there's certainly a memory usage hit).
>
> :-(.
>
> > >> out. Using this in conjunction with an RPMB device to make it Rather
> > >> Difficult (tm) for third parties to decrypt your disk even if your
> > >> password has low entropy. There are plenty more.
> > >
> > > I'm not sure what RPMB is, but I don't think you can make it too hard
> > > to decrypt my disk if my password has low entropy. ... And I don't see
> > > how encrypting RAM helps there.
> >
> > Replay Protected Memory Block. It's a device that allows someone to
> > write to it and confirm that the write happened and the old contents
> > > are no longer available. You could use it to implement an enclave that
> > checks a password for your disk but only allows you to try a certain
> > number of times.
>
> Ookay... I guess I can get a fake Replay Protected Memory block, which
> will confirm that write happened and not do anything from China, but
> ok, if you put that memory on the CPU, you raise the bar to a "rather
> difficult" (tm) level. Nice.

It's not so easy for the RPMB to leak things. It would be much easier
for it to simply not provide replay protection (i.e. more or less what
the FBI asked from Apple: keep allowing guesses even though that
shouldn't work).
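
As a rough illustration of the guess-limiting idea discussed above (this is
not code from the patch set), the sketch below charges each attempt against a
counter before the password is compared. The `fake_rpmb` struct, `try_unlock()`
and `MAX_ATTEMPTS` are hypothetical names; a real RPMB write is authenticated
and cannot be rolled back, which an in-memory struct obviously is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ATTEMPTS 3u	/* hypothetical guess budget */

/* Stand-in for a replay-protected counter (assumption: real hardware
 * would make this write authenticated and impossible to roll back). */
struct fake_rpmb {
	uint32_t failed_attempts;
};

/* Compare without an early exit, so timing does not reveal how many
 * leading bytes matched. */
static bool secret_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
	uint8_t diff = 0;

	for (size_t i = 0; i < len; i++)
		diff |= a[i] ^ b[i];
	return diff == 0;
}

/* Returns true only if the guess is right and the budget is not spent. */
static bool try_unlock(struct fake_rpmb *rpmb, const uint8_t *guess,
		       const uint8_t *secret, size_t len)
{
	assert(rpmb && guess && secret);

	if (rpmb->failed_attempts >= MAX_ATTEMPTS)
		return false;

	/* Charge the attempt first, so interrupting the check (for
	 * example by cutting power) cannot buy a free guess. */
	rpmb->failed_attempts++;

	if (!secret_equal(guess, secret, len))
		return false;

	/* Correct guess: give the budget back. */
	rpmb->failed_attempts = 0;
	return true;
}
```

If the counter store silently dropped the increment, the limit would never be
reached, which is the "not providing replay protection" failure described
above.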

>
> But that also means that when my CPU dies, I'll no longer be able to
> access the encrypted data.

You could implement your own escrow policy and keep a copy in the safe.

>
> And, again, it means that quite complex new kernel-user interface will
> be needed, right?

It's actually fairly straightforward, and the kernel part doesn't care
what you use it for (the kernel part is the same for disk encryption
and ssh, for example, except that disk encryption would care about
replay protection, whereas ssh wouldn't).

--Andy

2016-04-26 22:36:19

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue, Apr 26, 2016 at 2:52 PM, Pavel Machek <[email protected]> wrote:
> On Tue 2016-04-26 21:59:52, One Thousand Gnomes wrote:
>> > But... that will mean that my ssh will need to be SGX-aware, and that
>> > I will not be able to switch to AMD machine in future. ... or to other
>> > Intel machine for that matter, right?
>>
>> I'm not privy to AMD's CPU design plans.
>>
>> However I think for the ssl/ssh case you'd use the same interfaces
>> currently available for plugging in TPMs and dongles. It's a solved
>> problem in the crypto libraries.
>>
>> > What new syscalls would be needed for ssh to get all this support?
>>
>> I don't see why you'd need new syscalls.
>
> So the kernel will implement a few selected crypto algorithms, similar
> to what TPM would provide, using SGX, and then userspace no longer
> needs to know about SGX?

No, other way around. The kernel will provide a basic interface to
SGX and userspace can do whatever it wants. If userspace wants to use
RSA, userspace will provide an actual RSA implementation, in more or
less normal x86 binary form, and will map it into user addresses. It
will tell the kernel "hey, this address range is an enclave", and the
kernel will set it up as such and tell the CPU about it. Userspace
will then use SGX instructions to communicate with the enclave.

It's pretty neat, and it's completely agnostic to the purpose of the enclave.
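
A minimal sketch of the userspace side of that flow, under stated assumptions:
the `sgx_add_param` layout is copied from the isgx_user.h hunk quoted elsewhere
in this thread, but the meaning given to its fields here (`addr` as the address
inside the enclave, `user_addr` as the source page, `flags == 0` as "measure the
whole page") is a guess, and `plan_add_pages()` is a hypothetical helper. It
only prepares one request per 4 KiB page and issues no ioctl.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_4K 0x1000UL

struct isgx_secinfo;	/* opaque in this sketch */

/* Layout as quoted from isgx_user.h in this thread. */
struct sgx_add_param {
	unsigned long addr;
	unsigned long user_addr;
	struct isgx_secinfo *secinfo;
	unsigned int flags;
};

/*
 * Fill one add request per page of an enclave image.
 * Returns the number of requests written, or 0 if an argument is not
 * page aligned or the image needs more than max requests.
 */
static size_t plan_add_pages(unsigned long enclave_base, unsigned long src,
			     size_t len, struct isgx_secinfo *secinfo,
			     struct sgx_add_param *out, size_t max)
{
	size_t n = len / PAGE_SIZE_4K;

	assert(out);
	if (enclave_base % PAGE_SIZE_4K || src % PAGE_SIZE_4K ||
	    len % PAGE_SIZE_4K || n > max)
		return 0;

	for (size_t i = 0; i < n; i++) {
		out[i].addr = enclave_base + i * PAGE_SIZE_4K;	/* assumed: enclave address */
		out[i].user_addr = src + i * PAGE_SIZE_4K;	/* assumed: source page */
		out[i].secinfo = secinfo;
		out[i].flags = 0;	/* assumed: measure the whole page */
	}
	return n;
}
```

Each prepared request would then presumably be handed to the driver one at a
time; nothing in the kernel-facing part depends on what the enclave code does.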

--Andy

2016-04-27 06:50:01

by Jethro Beekman

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On 25-04-16 10:34, Jarkko Sakkinen wrote:
> diff --git a/drivers/staging/intel_sgx/isgx_ioctl.c b/drivers/staging/intel_sgx/isgx_ioctl.c
> new file mode 100644
> index 0000000..9d8b36b
> --- /dev/null
> +++ b/drivers/staging/intel_sgx/isgx_ioctl.c
>
> +static long isgx_ioctl_enclave_create(struct file *filep, unsigned int cmd,
> + unsigned long arg)
>
> + secs->base = vm_mmap(filep, 0, secs->size,
> + PROT_READ | PROT_WRITE | PROT_EXEC,
> + MAP_SHARED, 0);

Why does the ioctl interface map userspace memory for an open device? There's
already a perfectly good syscall for that: mmap.

> diff --git a/drivers/staging/intel_sgx/isgx_user.h b/drivers/staging/intel_sgx/isgx_user.h
> new file mode 100644
> index 0000000..672d19c
> --- /dev/null
> +++ b/drivers/staging/intel_sgx/isgx_user.h
>
> +#define SGX_ADD_SKIP_EEXTEND 0x1
> +
> +struct sgx_add_param {
> + unsigned long addr;
> + unsigned long user_addr;
> + struct isgx_secinfo *secinfo;
> + unsigned int flags;
> +};

The hardware supports calling EEXTEND on only a part of a page, I think the
driver should also support that.

Jethro

2016-04-27 06:59:56

by Jethro Beekman

Subject: Re: [PATCH 6/6] intel_sgx: TODO file for the staging area

On 26-04-16 04:23, Jarkko Sakkinen wrote:
> In order to write test code I would need to use the SDK at minimum to
> generate EINITTOKEN for the test enclave.

You could do this right now with the Rust tools for SGX [1]

[1] https://github.com/jethrogb/sgx-utils/

> /Jarkko

Jethro

2016-04-27 07:32:59

by Pavel Machek

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

Hi!

> > > Preventing cold boot attacks is really just icing on the cake. The
> > > real point of this is to allow you to run an "enclave". An SGX
> > > enclave has unencrypted code but gets access to a key that only it can
> > > access. It could use that key to unwrap your ssh private key and sign
> > > with it without ever revealing the unwrapped key. No one, not even
> > > root, can read enclave memory once the enclave is initialized and gets
> > > access to its personalized key. The point of the memory encryption
> > > engine is to prevent even cold boot attacks from being used to read
> > > enclave memory.
> >
> > Ok, so the attacker can still access the "other" machine, but ok, key
> > is protected.
> >
> > But... that will mean that my ssh will need to be SGX-aware, and that
> > I will not be able to switch to AMD machine in future. ... or to other
> > Intel machine for that matter, right?
>
> That's the whole point. You could keep an unwrapped copy of the key
> offline so you could provision another machine if needed.
>
> >
> > What new syscalls would be needed for ssh to get all this support?
>
> This patchset or similar, plus some user code and an enclave to use.
>
> Sadly, on current CPUs, you also need Intel to bless the enclave. It
> looks like new CPUs might relax that requirement.

Umm. I'm afraid my evil meter just went over "smells evil" and "bit
evil" areas straight to "certainly looks evil".

> > > Replay Protected Memory Block. It's a device that allows someone to
> > > write to it and confirm that the write happened and the old contents
> > > are no longer available. You could use it to implement an enclave that
> > > checks a password for your disk but only allows you to try a certain
> > > number of times.
> >
> > Ookay... I guess I can get a fake Replay Protected Memory block, which
> > will confirm that write happened and not do anything from China, but
> > ok, if you put that memory on the CPU, you raise the bar to a "rather
> > difficult" (tm) level. Nice.
>
> It's not so easy for the RPMB to leak things. It would be much easier
> for it to simply not provide replay protection (i.e. more or less what
> the FBI asked from Apple: keep allowing guesses even though that
> shouldn't work).

Yup.

> > But that also means that when my CPU dies, I'll no longer be able to
> > access the encrypted data.
>
> You could implement your own escrow policy and keep a copy in the
> safe.

And then Intel would have to bless my own escrow policy, which is,
realistically, not going to happen, right?

> > And, again, it means that quite complex new kernel-user interface will
> > be needed, right?
>
> It's actually fairly straightforward, and the kernel part doesn't care
> what you use it for (the kernel part is the same for disk encryption
> and ssh, for example, except that disk encryption would care about
> replay protection, whereas ssh wouldn't).

So we end up with parts of the kernel we cannot change, and where we may
not even change the compiler. That means assembly. Hey, user, you have
the freedom to change this code, except it will not work. That was called
TiVo before. We'd have security-relevant parts of the kernel where we could
not even fix a security hole without Intel.

If anything, this is reason to switch to GPLv3.

I'm sorry. This is evil.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2016-04-27 08:18:14

by Ingo Molnar

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions


* Andy Lutomirski <[email protected]> wrote:

> > What new syscalls would be needed for ssh to get all this support?
>
> This patchset or similar, plus some user code and an enclave to use.
>
> Sadly, on current CPUs, you also need Intel to bless the enclave. It looks like
> new CPUs might relax that requirement.

That looks like a fundamental technical limitation in my book - to an open source
user this is essentially a capability very similar to tboot: it only allows the
execution of externally blessed static binary blobs...

I don't think we can merge any of this upstream until it's clear that the hardware
owner running open-source user-space can also freely define/start his own secure
enclaves without having to sign the enclave with any external party. I.e.
self-signed enclaves should be fundamentally supported as well.

Thanks,

Ingo

2016-04-27 12:41:06

by Jarkko Sakkinen

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On Tue, Apr 26, 2016 at 11:49:38PM -0700, Jethro Beekman wrote:
> On 25-04-16 10:34, Jarkko Sakkinen wrote:
> > diff --git a/drivers/staging/intel_sgx/isgx_ioctl.c b/drivers/staging/intel_sgx/isgx_ioctl.c
> > new file mode 100644
> > index 0000000..9d8b36b
> > --- /dev/null
> > +++ b/drivers/staging/intel_sgx/isgx_ioctl.c
> >
> > +static long isgx_ioctl_enclave_create(struct file *filep, unsigned int cmd,
> > + unsigned long arg)
> >
> > + secs->base = vm_mmap(filep, 0, secs->size,
> > + PROT_READ | PROT_WRITE | PROT_EXEC,
> > + MAP_SHARED, 0);
>
> Why does the ioctl interface map userspace memory for an open device?
> There's already a perfectly good syscall for that: mmap.

You didn't explain what the value in doing this would be, but after
thinking for a short while I found two good reasons:

* The current API is ugly in that you can call mmap directly anyway
  and end up with a useless zombie enclave. Using plain mmap would
  make the API less ambiguous.
* SGX_IOC_ENCLAVE_CREATE could be removed. SECS could be passed
  through SGX_IOC_ENCLAVE_ADD_PAGE, thus simplifying the API a lot.

Given these circumstances I think this does make sense.

> > diff --git a/drivers/staging/intel_sgx/isgx_user.h b/drivers/staging/intel_sgx/isgx_user.h
> > new file mode 100644
> > index 0000000..672d19c
> > --- /dev/null
> > +++ b/drivers/staging/intel_sgx/isgx_user.h
> >
> > +#define SGX_ADD_SKIP_EEXTEND 0x1
> > +
> > +struct sgx_add_param {
> > + unsigned long addr;
> > + unsigned long user_addr;
> > + struct isgx_secinfo *secinfo;
> > + unsigned int flags;
> > +};
>
> The hardware supports calling EEXTEND on only a part of a page, I think the
> driver should also support that.

Why would you want to do that?

> Jethro

/Jarkko

2016-04-27 14:06:19

by Andy Lutomirski

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Apr 27, 2016 1:18 AM, "Ingo Molnar" <[email protected]> wrote:
>
>
> * Andy Lutomirski <[email protected]> wrote:
>
> > > What new syscalls would be needed for ssh to get all this support?
> >
> > This patchset or similar, plus some user code and an enclave to use.
> >
> > Sadly, on current CPUs, you also need Intel to bless the enclave. It looks like
> > new CPUs might relax that requirement.
>
> That looks like a fundamental technical limitation in my book - to an open source
> user this is essentially a capability very similar to tboot: it only allows the
> execution of externally blessed static binary blobs...
>
> I don't think we can merge any of this upstream until it's clear that the hardware
> owner running open-source user-space can also freely define/start his own secure
> enclaves without having to sign the enclave with any external party. I.e.
> self-signed enclaves should be fundamentally supported as well.

Certainly, if this were a *graphics* driver, airlied would refuse to
merge it without open source userspace available.

We're all used to Intel sending patches that no one outside Intel can
test because no one else has the hardware. Heck, I recently sent a
vdso patch that *I* can't test. But in this case I have the hardware
and there is still no way that I can test it, and I don't like this at all.

See my earlier comments about not allowing user code to provide
EINITTOKEN. Implementing that would mostly solve this problem, with
the big caveat that it may be impossible to implement that suggestion
until Intel changes its stance (which is clearly in progress, given
the recent SDM updates).

This could easily end up being a CNL-only feature in Linux. (Or
whatever generation that change is in.)

--Andy

2016-04-27 23:32:47

by Jethro Beekman

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On 27-04-16 05:40, Jarkko Sakkinen wrote:
>> The hardware supports calling EEXTEND on only a part of a page, I think the
>> driver should also support that.
>
> Why would you want to do that?

You might have segments in a binary that don't start at the beginning of a page
or that end before the end of a page. For example:

  Type   Offset             VirtAddr           PhysAddr
         FileSiz            MemSiz             Flags  Align
  LOAD   0x0000000000000000 0x0000000000000000 0x0000000000000000
         0x000000000001bcac 0x000000000001bcac R E    1000
  LOAD   0x000000000001c8e8 0x000000000001c8e8 0x000000000001c8e8
         0x0000000000000790 0x0000000000000c68 RW     1000

There's no need to measure the extra padding (0x1bd00--0x1c7ff and
0x1cb00--0x1cfff) in this case.
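
To make the arithmetic behind the first padding range concrete, here is a small
sketch. It assumes measurement happens in 256-byte chunks and that only chunks
touched by a segment's file contents are measured; `chunk_floor()`,
`chunk_ceil()` and `unmeasured_gap()` are hypothetical helpers, not driver code.

```c
#include <assert.h>
#include <stdint.h>

#define EEXTEND_CHUNK 0x100u	/* 256 bytes per measured chunk */

/* Round an address down to the start of its 256-byte chunk. */
static uint64_t chunk_floor(uint64_t addr)
{
	return addr & ~(uint64_t)(EEXTEND_CHUNK - 1);
}

/* Round an address up to the next 256-byte chunk boundary. */
static uint64_t chunk_ceil(uint64_t addr)
{
	return chunk_floor(addr + EEXTEND_CHUNK - 1);
}

struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Chunks lying strictly between the end of one segment's file data and
 * the start of the next segment hold only padding.
 */
static struct range unmeasured_gap(uint64_t prev_seg_end,
				   uint64_t next_seg_start)
{
	struct range gap = {
		chunk_ceil(prev_seg_end),
		chunk_floor(next_seg_start),
	};

	assert(prev_seg_end <= next_seg_start);
	if (gap.start > gap.end)
		gap.end = gap.start;	/* both ends share a chunk: no gap */
	return gap;
}
```

With the first segment ending at 0x1bcac and the second starting at 0x1c8e8,
this gives the half-open range 0x1bd00 to 0x1c800, i.e. 0x1bd00--0x1c7ff.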

> /Jarkko

Jethro

2016-04-29 20:05:01

by Jarkko Sakkinen

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On Wed, Apr 27, 2016 at 04:32:23PM -0700, Jethro Beekman wrote:
> On 27-04-16 05:40, Jarkko Sakkinen wrote:
> >> The hardware supports calling EEXTEND on only a part of a page, I think the
> >> driver should also support that.
> >
> > Why would you want to do that?
>
> You might have segments in a binary that don't start at the beginning of a page
> or that end before the end of a page. For example:
>
>   Type   Offset             VirtAddr           PhysAddr
>          FileSiz            MemSiz             Flags  Align
>   LOAD   0x0000000000000000 0x0000000000000000 0x0000000000000000
>          0x000000000001bcac 0x000000000001bcac R E    1000
>   LOAD   0x000000000001c8e8 0x000000000001c8e8 0x000000000001c8e8
>          0x0000000000000790 0x0000000000000c68 RW     1000
>
> There's no need to measure the extra padding (0x1bd00--0x1c7ff and
> 0x1cb00--0x1cfff) in this case.

Do you see this as a performance issue? If not, why do you think
measuring the padding would hurt that much?

> Jethro

/Jarkko

2016-04-29 20:17:58

by Jarkko Sakkinen

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue, Apr 26, 2016 at 09:00:10PM +0200, Pavel Machek wrote:
> On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
> > Intel(R) SGX is a set of CPU instructions that can be used by
> > applications to set aside private regions of code and data. The code
> > outside the enclave is disallowed to access the memory inside the
> > enclave by the CPU access control.
> >
> > The firmware uses PRMRR registers to reserve an area of physical memory
> > called Enclave Page Cache (EPC). There is a hardware unit in the
> > processor called Memory Encryption Engine. The MEE encrypts and decrypts
> > the EPC pages as they enter and leave the processor package.
>
> What are non-evil use cases for this?

I'm not sure what you mean by non-evil.

>
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

/Jarkko

2016-04-29 22:08:17

by Jarkko Sakkinen

Subject: Re: [PATCH 0/6] Intel Secure Guard Extensions

On Tue, Apr 26, 2016 at 09:00:10PM +0200, Pavel Machek wrote:
> On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
> > Intel(R) SGX is a set of CPU instructions that can be used by
> > applications to set aside private regions of code and data. The code
> > outside the enclave is disallowed to access the memory inside the
> > enclave by the CPU access control.
> >
> > The firmware uses PRMRR registers to reserve an area of physical memory
> > called Enclave Page Cache (EPC). There is a hardware unit in the
> > processor called Memory Encryption Engine. The MEE encrypts and decrypts
> > the EPC pages as they enter and leave the processor package.
>
> What are non-evil use cases for this?

Virtual TPMs for containers/guests would be one such use case.

/Jarkko

2016-04-29 22:22:50

by Jethro Beekman

Subject: Re: [PATCH 3/6] intel_sgx: driver for Intel Secure Guard eXtensions

On 29-04-16 13:04, Jarkko Sakkinen wrote:
>>> Why would you want to do that?
>>
>> ...
>
> Do you see this as a performance issue? If not, why do you think
> measuring the padding would hurt that much?

I don't think it's a performance issue at all. I'm just giving an example of why
you'd want to do this. I'm sure people who want to use this instruction set can
come up with other uses, so I think the driver should support it. Drivers on
other platforms might support this, in which case we should be compatible
(to achieve the same enclave measurement). Other Linux drivers support it [1]. I
would ask: why would you not want to do this? It seems trivial to expand the
current flag into 16 separate flags; one for each 256-byte chunk in the page.

[1] https://github.com/jethrogb/sgx-utils/tree/master/linux-driver
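
One way the "16 separate flags" idea could look, as a sketch only: a 16-bit
mask with one bit per 256-byte chunk of a 4 KiB page, set when the chunk
overlaps the segment's bytes. `eextend_mask()` and the macro names are
hypothetical and not part of the posted driver or of the driver linked above.

```c
#include <assert.h>
#include <stdint.h>

#define SGX_PAGE_SIZE	0x1000u
#define EEXTEND_CHUNK	0x100u
#define CHUNKS_PER_PAGE	(SGX_PAGE_SIZE / EEXTEND_CHUNK)	/* 16 */

/*
 * Bit i of the result is set when chunk i of the page at page_base
 * overlaps the half-open segment range [seg_start, seg_end).
 */
static uint16_t eextend_mask(uint64_t page_base, uint64_t seg_start,
			     uint64_t seg_end)
{
	uint16_t mask = 0;

	assert(page_base % SGX_PAGE_SIZE == 0 && seg_start <= seg_end);
	if (seg_start == seg_end)
		return 0;	/* empty segment: nothing to measure */

	for (unsigned int i = 0; i < CHUNKS_PER_PAGE; i++) {
		uint64_t lo = page_base + (uint64_t)i * EEXTEND_CHUNK;
		uint64_t hi = lo + EEXTEND_CHUNK;

		if (seg_start < hi && lo < seg_end)
			mask |= (uint16_t)(1u << i);
	}
	return mask;
}
```

For a segment occupying bytes 0 to 0x1bcac, the page at 0x1b000 would get mask
0x1fff (chunks 0 through 12), while every earlier page would get 0xffff.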

> /Jarkko

Jethro