Date: Wed, 2 Jan 2019 12:47:52 -0800
From: Sean Christopherson
To: Jarkko Sakkinen
Cc: Andy Lutomirski, Jethro Beekman, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "x86@kernel.org", Dave Hansen, Peter Zijlstra, "H. Peter Anvin", "linux-kernel@vger.kernel.org", "linux-sgx@vger.kernel.org", Josh Triplett, Haitao Huang, "Dr .
Greg Wettstein"
Subject: Re: x86/sgx: uapi change proposal
Message-ID: <20190102204752.GG7460@linux.intel.com>
References: <20181214215729.4221-1-sean.j.christopherson@intel.com>
 <7706b2aa71312e1f0009958bcab24e1e9d8d1237.camel@linux.intel.com>
 <598cd050-f0b5-d18c-96a0-915f02525e3e@fortanix.com>
 <20181219091148.GA5121@linux.intel.com>
 <613c6814-4e71-38e5-444a-545f0e286df8@fortanix.com>
 <20181219144515.GA30909@linux.intel.com>
 <20181220103204.GB26410@linux.intel.com>
 <20181222081649.GB8895@linux.intel.com>
 <20181222082502.GA13275@linux.intel.com>
In-Reply-To: <20181222082502.GA13275@linux.intel.com>

On Sat, Dec 22, 2018 at 10:25:02AM +0200, Jarkko Sakkinen wrote:
> On Sat, Dec 22, 2018 at 10:16:49AM +0200, Jarkko Sakkinen wrote:
> > On Thu, Dec 20, 2018 at 12:32:04PM +0200, Jarkko Sakkinen wrote:
> > > On Wed, Dec 19, 2018 at 06:58:48PM -0800, Andy Lutomirski wrote:
> > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > > > opening a new instance of /dev/sgx for each enclave?
> > >
> > > I think that fits better to the SCM_RIGHTS scenario i.e. you could send
> > > the enclave to a process that does not necessarily have rights to
> > > /dev/sgx. Gives more robust environment to configure SGX.
> >
> > Sean, is this why you wanted enclave fd and anon inode and not just use
> > the address space of /dev/sgx? Just taking notes of all observations.
> > I'm not sure what your rationale was (maybe it was somewhere). This was
> > something I made up, and this one is wrong deduction. You can easily
> > get the same benefit with /dev/sgx associated fd representing the
> > enclave.
> >
> > This all means that for v19 I'm going without enclave fd involved with
> > fd to /dev/sgx representing the enclave.
> > No anon inodes will be involved.
>
> Based on these observations I updated the uapi.
>
> As far as I'm concerned there has to be a solution to do EPC mapping
> with a sequence:
>
> 1. Ping /dev/kvm to do something.
> 2. KVM asks SGX core to do something.
> 3. SGX core does something.
>
> I don't care what the something exactly is, but KVM is the only sane
> place for KVM uapi. I would be surprised if KVM maintainers didn't agree
> that they don't want to sprinkle KVM uapi to random places in other
> subsystems.

It's not a KVM uapi.

KVM isn't a hypervisor in the traditional sense. The "real" hypervisor
lives in userspace, e.g. Qemu; KVM is essentially just a (very fancy)
driver for hardware accelerators, e.g. VMX. Qemu, for example, is fully
capable of running an x86 VM without KVM; it's just substantially slower.

In terms of guest memory, KVM doesn't care or even know what a particular
region of memory represents or what, if anything, is backing a region in
the host. There are cases when KVM is made aware of certain aspects of
guest memory for performance or functional reasons, e.g. emulated MMIO
and encrypted memory, but in all cases the control logic ultimately
resides in userspace.

SGX is a weird case because ENCLS can't be emulated in software, i.e.
exposing SGX to a VM without KVM's help would be difficult. But it
wouldn't be impossible, just slow and ugly.

And so, ignoring host oversubscription for the moment, there is no hard
requirement that SGX EPC can only be exposed to a VM through KVM. In
other words, allocating and exposing EPC to a VM is orthogonal to KVM
supporting SGX. Exposing EPC to userspace via /dev/sgx/epc would mean
that KVM would handle it like any other guest memory region, and all
EPC-related code/logic would reside in the SGX subsystem.

Oversubscription throws a wrench in the system because ENCLV can only be
executed post-VMXON and EPC conflicts generate VMX VM-Exits. But even
then, KVM doesn't need to own the EPC uapi, e.g.
it can call into the SGX subsystem to handle EPC conflict VM-Exits and
the SGX subsystem can wrap ENCLV with exception fixup and forcefully
reclaim EPC pages if ENCLV faults.

I can't be 100% certain the oversubscription scheme will be sane without
actually writing the code, but I'd like to at least keep the option open,
i.e. not structure /dev/sgx/ in such a way that adding e.g. /dev/sgx/epc
is impossible or ugly.
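
For reference, the SCM_RIGHTS scenario mentioned in the quoted discussion can be sketched in userspace as below. This is only an illustration of fd passing over a Unix socket, not anything from the SGX driver: an unlinked temporary file stands in for the fd that represents an enclave, both ends of the socket live in one process for brevity, and the helper names (send_fd, recv_fd, pass_fd_demo) are made up for this sketch.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    # At least one byte of ordinary data must accompany the ancillary data.
    sock.sendmsg([b"F"], ancillary)


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def pass_fd_demo(payload=b"enclave-state"):
    """Pass an open fd across a socketpair and read through the received copy."""
    # Hypothetical stand-in for an enclave fd; a real one would come from
    # the SGX driver, which this sketch does not touch.
    with tempfile.TemporaryFile() as backing:
        backing.write(payload)
        backing.flush()

        sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        with sender, receiver:
            send_fd(sender, backing.fileno())
            received = recv_fd(receiver)

        try:
            # The receiver never opened the file by path. The kernel gave it
            # a new descriptor for the same open file description.
            return os.pread(received, len(payload), 0)
        finally:
            os.close(received)
```

In a real setup the receiver would be a separate process with no permission to open the device node itself. The point of the sketch is only that access travels with the descriptor, regardless of whether that descriptor came from an anon inode or from opening /dev/sgx directly.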