Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2259946ybi; Sat, 13 Jul 2019 10:15:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqwNjcW/J/rYFLtC2RccJ/KL5UKaLvVySn1Xg2/Iqmwkrk73VQj+G052bHVbbp4Sj7jkGfeO X-Received: by 2002:a63:e953:: with SMTP id q19mr17908707pgj.313.1563038147893; Sat, 13 Jul 2019 10:15:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1563038147; cv=none; d=google.com; s=arc-20160816; b=WaN/hPZZ7RuKXEpimNlZFD8iBtEZIyZd8nL23g+pYxu6yDVIYumr67WE/KvPN1iiJi Y4MjfnwL1AAoZ8+YttGpDpvFkJoIXD8AfQdMhR0Z2KSY69/YwTDg2Ou1M7BadlvP+CiB ZVHTSBO+aOichuslL/Br5hpHhEW22FnPsOcDANE8lVSC2m1XJcLFNUWUCbLzma+a+rv6 nB/0yfuul52kpnBDErulJ33zhLcu3AHBeIe1Nt2DgbN60X8pDI4S8W7si2ewoax7CWRX nlZOJKO05Dh1E2A9Vttg73i8W9pyMj6PtKTmg+ABlOubf8DWNBDUMWSP20W5kbAU7sgg 9fDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=WcXJK3fAyn7kTVlCxeUf7VHw5qwebK3tRpRocCBgYSs=; b=h2y+6fb+/8RZ1nNk89PYSrQ7XYqxXTcEfQhEJUiluQBM3252+Y5ON92zpHztPKdohd EYzbTcd30Nh9FEQYGUEfY55XgSF/D5BpGrENgmHeHZmfcIFqEfTzOzAxH2SWJe5iDmh/ Z32Y/cFB3aaVa9snTbKVb+lz7wID3AJbKFVgZJnzcLXyhyoZcS1gqbzfQ70qrNmu4t8/ LUO19lTH1U6VeBTOr07vI0RIuuwYyZ8QZzKpE3sqM+GTJw/MsDZ1CJQ2yTHjFqZzOfAx /HVB3itCL5ZNzF36ehWbSq9qjSUIpWdgD0L4R16NTUKDNRmW3K3Ve+aVcNW0sZYneWar yEsw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d5si10267467pla.17.2019.07.13.10.15.32; Sat, 13 Jul 2019 10:15:47 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728298AbfGMRN2 (ORCPT + 99 others); Sat, 13 Jul 2019 13:13:28 -0400 Received: from mga14.intel.com ([192.55.52.115]:18336 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727961AbfGMRN1 (ORCPT ); Sat, 13 Jul 2019 13:13:27 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Jul 2019 10:13:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,487,1557212400"; d="scan'208";a="341981730" Received: from hbriegel-mobl.ger.corp.intel.com (HELO localhost) ([10.252.50.48]) by orsmga005.jf.intel.com with ESMTP; 13 Jul 2019 10:13:17 -0700 From: Jarkko Sakkinen To: linux-kernel@vger.kernel.org, x86@kernel.org, linux-sgx@vger.kernel.org Cc: akpm@linux-foundation.org, dave.hansen@intel.com, sean.j.christopherson@intel.com, nhorman@redhat.com, npmccallum@redhat.com, serge.ayoun@intel.com, shay.katz-zamir@intel.com, haitao.huang@intel.com, andriy.shevchenko@linux.intel.com, tglx@linutronix.de, kai.svahn@intel.com, bp@alien8.de, josh@joshtriplett.org, luto@kernel.org, kai.huang@intel.com, rientjes@google.com, cedric.xing@intel.com, Jarkko Sakkinen Subject: [PATCH v21 27/28] docs: x86/sgx: Document kernel internals Date: Sat, 13 Jul 2019 20:08:03 +0300 Message-Id: <20190713170804.2340-28-jarkko.sakkinen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190713170804.2340-1-jarkko.sakkinen@linux.intel.com> References: <20190713170804.2340-1-jarkko.sakkinen@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Sean Christopherson Document some of the more tricky parts of the kernel implementation internals. Signed-off-by: Sean Christopherson Co-developed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen --- Documentation/x86/sgx/2.Kernel-internals.rst | 76 ++++++++++++++++++++ Documentation/x86/sgx/index.rst | 1 + 2 files changed, 77 insertions(+) create mode 100644 Documentation/x86/sgx/2.Kernel-internals.rst diff --git a/Documentation/x86/sgx/2.Kernel-internals.rst b/Documentation/x86/sgx/2.Kernel-internals.rst new file mode 100644 index 000000000000..5c90a65936f2 --- /dev/null +++ b/Documentation/x86/sgx/2.Kernel-internals.rst @@ -0,0 +1,76 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================ +Kernel Internals +================ + +CPU configuration +================= + +Because SGX has an ever evolving and expanding feature set, it's possible for +a BIOS or VMM to configure a system in such a way that not all CPUs are equal, +e.g. where Launch Control is only enabled on a subset of CPUs. Linux does +*not* support such a heterogeneous system configuration, nor does it even +attempt to play nice in the face of a misconfigured system. With the exception +of Launch Control's hash MSRs, which can vary per CPU, Linux assumes that all +CPUs have a configuration that is identical to the boot CPU. + +EPC management +============== + +Because the kernel can't arbitrarily read EPC memory or share RO backing pages +between enclaves, traditional memory models such as CoW and fork() do not work +with enclaves. In other words, the architectural rules of EPC forces it to be +treated as MAP_SHARED at all times. + +The inability to employ traditional memory models also means that EPC memory +must be isolated from normal memory pools, e.g. attempting to use EPC memory +for normal mappings would result in faults and/or perceived data corruption. +Furthermore, EPC is not enumerated by as normal memory, e.g. BIOS enumerates +EPC as reserved memory in the e820 tables, or not at all. As a result, EPC +memory is directly managed by the SGX subsystem, e.g. SGX employs VM_PFNMAP to +manually insert/zap/swap page table entries, and exposes EPC to userspace via +a well known device, /dev/sgx/enclave. + +The net effect is that all enclave VMAs must be MAP_SHARED and are backed by +a single file, /dev/sgx/enclave. + +EPC oversubscription +==================== + +SGX allows to have larger enclaves than amount of available EPC by providing a +subset of leaf instruction for swapping EPC pages to the system memory. The +details of these instructions are discussed in the architecture document. Due +to the unique requirements for swapping EPC pages, and because EPC pages do not +have associated page structures, management of the EPC is not handled by the +standard memory subsystem. + +SGX directly handles swapping of EPC pages, including a thread to initiate the +reclaiming process and a rudimentary LRU mechanism. When the amount of free EPC +pages goes below a low watermark the swapping thread starts reclaiming pages. +The pages that have not been recently accessed (i.e. do not have the A bit set) +are selected as victim pages. Each enclave holds an shmem file as a backing +storage for reclaimed pages. + +Launch Control +============== + +The current kernel implementation supports only writable MSRs. The launch is +performed by setting the MSRs to the hash of the public key modulus of the +enclave signer and a token with the valid bit set to zero. Because kernel makes +ultimately all the launch decisions token are not needed for anything. We +don't need or have a launch enclave for generating them as the MSRs must always +be writable. + +Provisioning +============ + +The use of provisioning must be controlled because it allows to get access to +the provisioning keys to attest to a remote party that the software is running +inside a legit enclave. This could be used by a malware network to ensure that +its nodes are running inside legit enclaves. + +The driver introduces a special device file /dev/sgx/provision and a special +ioctl SGX_IOC_ENCLAVE_SET_ATTRIBUTE to accomplish this. A file descriptor +pointing to /dev/sgx/provision is passed to ioctl from which kernel authorizes +the PROVISION_KEY attribute to the enclave. diff --git a/Documentation/x86/sgx/index.rst b/Documentation/x86/sgx/index.rst index c5dfef62e612..5d660e83d984 100644 --- a/Documentation/x86/sgx/index.rst +++ b/Documentation/x86/sgx/index.rst @@ -14,3 +14,4 @@ potentially malicious. :maxdepth: 1 1.Architecture + 2.Kernel-internals -- 2.20.1