From: Andy Lutomirski
Date: Tue, 8 Jan 2019 14:54:11 -0800
Subject: Re: x86/sgx: uapi change proposal
To: Sean Christopherson
Cc: "Huang, Kai", Andy Lutomirski, Jethro Beekman, Jarkko Sakkinen,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, "x86@kernel.org",
 Dave Hansen, Peter Zijlstra, "H. Peter Anvin",
 "linux-kernel@vger.kernel.org", "linux-sgx@vger.kernel.org",
 Josh Triplett, Haitao Huang, "Dr . Greg Wettstein"
In-Reply-To: <20190108220946.GA30462@linux.intel.com>
References: <20181214215729.4221-1-sean.j.christopherson@intel.com>
 <7706b2aa71312e1f0009958bcab24e1e9d8d1237.camel@linux.intel.com>
 <598cd050-f0b5-d18c-96a0-915f02525e3e@fortanix.com>
 <20181219091148.GA5121@linux.intel.com>
 <613c6814-4e71-38e5-444a-545f0e286df8@fortanix.com>
 <20181219144515.GA30909@linux.intel.com>
 <20181221162825.GB26865@linux.intel.com>
 <105F7BF4D0229846AF094488D65A0989355A45B6@PGSMSX112.gar.corp.intel.com>
 <20190108220946.GA30462@linux.intel.com>

On Tue, Jan 8, 2019 at 2:09 PM Sean Christopherson wrote:
>
> On Tue, Jan 08, 2019 at 11:27:11AM -0800, Huang, Kai wrote:
> > > >
> > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > > > opening a new instance of /dev/sgx for each enclave?
> > >
> > > Directly associating /dev/sgx with an enclave means /dev/sgx can't be used
> > > to provide ioctl()'s for other SGX-related needs, e.g. to mmap() raw EPC and
> > > expose it to a VM. Proposed layout in the link below. I'll also respond to
> > > Jarkko's question about exposing EPC through /dev/sgx instead of having
> > > KVM allocate it on behalf of the VM.
> > >
> > > https://lkml.kernel.org/r/20181218185349.GC30082@linux.intel.com
> >
> > Hi Sean,
> >
> > Sorry for replying to old email.
> > But IMHO it is not a must that Qemu
> > needs to open some /dev/sgx and allocate/mmap EPC for guest's virtual
> > EPC slot, instead, KVM could create private slot, which is not visible
> > to Qemu, for virtual EPC, and KVM could call core-SGX EPC allocation
> > API directly.
>
> That's possible, but it has several downsides.
>
>   - Duplicates a lot of code in KVM for managing memory regions.
>   - Artificially restricts userspace to a single EPC region, unless
>     even more code is duplicated to handle multiple private regions.
>   - Requires additional ioctls() or capabilities to probe EPC support
>   - Does not fit with Qemu/KVM's memory model, e.g. all other types of
>     memory are exposed to a guest through KVM_SET_USER_MEMORY_REGION.
>   - Prevents userspace from debugging a guest's enclave. I'm not saying
>     this is a likely scenario, but I also don't think we should preclude
>     it without good reason.
>   - KVM is now responsible for managing the lifecycle of EPC, e.g. what
>     happens if an EPC cgroup limit is lowered on a running VM and
>     KVM can't gracefully reclaim EPC? The userspace hypervisor should
>     ultimately decide how to handle such an event.
>   - SGX logic is split between SGX and KVM, e.g. VA page management for
>     oversubscription will likely be common to SGX and KVM. From a long
>     term maintenance perspective, this means that changes to the EPC
>     management could potentially need to be Acked by KVM, and vice versa.
>
> > I am not sure what's the good of allowing userspace to alloc/mmap a
> > raw EPC region? Userspace is not allowed to touch EPC anyway, except
> > enclave code.
> >
> > To me KVM creates private EPC slot is cleaner than exposing /dev/sgx/epc
> > and allowing userspace to map some raw EPC region.
>
> Cleaner in the sense that it's faster to get basic support up and running
> since there are fewer touchpoints, but there are long term ramifications
> to cramming EPC management in KVM.
>
> And at this point I'm not stating any absolutes, e.g.
> how EPC will be handled by KVM. What I'm pushing for is to not
> eliminate the possibility of having the SGX subsystem own all EPC
> management, e.g. don't tie /dev/sgx to a single enclave.

I haven't gone and re-read all the relevant SDM bits, so I'll just
ask: what, if anything, are the actual semantics of mapping "raw EPC"
like this? You can't actually do anything with the mapping from user
mode unless you actually get an enclave created and initialized in it
and have it mapped at the correct linear address, right?

I still think you have the right idea, but it is a bit unusual.

I do think it makes sense to have QEMU delegate the various ENCLS
operations (especially EINIT) to the regular SGX interface, which will
mean that VM guests will have exactly the same access controls applied
as regular user programs, which is probably what we want. If so,
there will need to be a way to get INITTOKEN privilege for the purpose
of running non-Linux OSes in the VM, which isn't the end of the world.
We might still want the actual ioctl to do EINIT using an actual
explicit token to be somehow restricted in a way that strongly
discourages its use by anything other than a hypervisor. Or I suppose
we could just straight-up ignore the guest-provided init token.

--Andy

P.S. Is Intel ever going to consider a way to make guests get their
own set of keys that are different from the host's keys and other
guests' keys?