Date: Thu, 3 Jan 2019 17:02:56 +0200
From: Jarkko Sakkinen
To: Sean Christopherson
Cc: Andy Lutomirski, Jethro Beekman, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, "x86@kernel.org", Dave Hansen, Peter Zijlstra,
    "H. Peter Anvin", "linux-kernel@vger.kernel.org",
    "linux-sgx@vger.kernel.org", Josh Triplett, Haitao Huang,
    "Dr. Greg Wettstein"
Subject: Re: x86/sgx: uapi change proposal
Message-ID: <20190103150256.GA17015@linux.intel.com>
In-Reply-To: <20190102204752.GG7460@linux.intel.com>

On Wed, Jan 02, 2019 at 12:47:52PM -0800, Sean Christopherson wrote:
> On Sat, Dec 22, 2018 at 10:25:02AM +0200, Jarkko Sakkinen wrote:
> > On Sat, Dec 22, 2018 at 10:16:49AM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Dec 20, 2018 at 12:32:04PM +0200, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 19, 2018 at 06:58:48PM -0800, Andy Lutomirski wrote:
> > > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > > > > opening a new instance of /dev/sgx for each enclave?
> > > >
> > > > I think that fits better to the SCM_RIGHTS scenario, i.e. you could
> > > > send the enclave to a process that does not necessarily have rights
> > > > to /dev/sgx. Gives a more robust environment to configure SGX.
> > >
> > > Sean, is this why you wanted enclave fd and anon inode and not just use
> > > the address space of /dev/sgx? Just taking notes of all observations.
> > > I'm not sure what your rationale was (maybe it was somewhere). This was
> > > something I made up, and this one is a wrong deduction.
> > > You can easily get the same benefit with /dev/sgx associated fd
> > > representing the enclave.
> > >
> > > This all means that for v19 I'm going without an enclave fd, with the
> > > fd to /dev/sgx representing the enclave. No anon inodes will be
> > > involved.
> >
> > Based on these observations I updated the uapi.
> >
> > As far as I'm concerned there has to be a solution to do EPC mapping
> > with a sequence:
> >
> > 1. Ping /dev/kvm to do something.
> > 2. KVM asks SGX core to do something.
> > 3. SGX core does something.
> >
> > I don't care what the something exactly is, but KVM is the only sane
> > place for KVM uapi. I would be surprised if KVM maintainers didn't agree
> > that they don't want to sprinkle KVM uapi to random places in other
> > subsystems.
>
> It's not a KVM uapi.
>
> KVM isn't a hypervisor in the traditional sense. The "real" hypervisor
> lives in userspace, e.g. Qemu. KVM is essentially just a (very fancy)
> driver for hardware accelerators, e.g. VMX. Qemu for example is fully
> capable of running an x86 VM without KVM, it's just substantially slower.
>
> In terms of guest memory, KVM doesn't care or even know what a particular
> region of memory represents or what, if anything, is backing a region in
> the host. There are cases when KVM is made aware of certain aspects of
> guest memory for performance or functional reasons, e.g. emulated MMIO
> and encrypted memory, but in all cases the control logic ultimately
> resides in userspace.
>
> SGX is a weird case because ENCLS can't be emulated in software, i.e.
> exposing SGX to a VM without KVM's help would be difficult. But, it
> wouldn't be impossible, just slow and ugly.
>
> And so, ignoring host oversubscription for the moment, there is no hard
> requirement that SGX EPC can only be exposed to a VM through KVM. In
> other words, allocating and exposing EPC to a VM is orthogonal to KVM
> supporting SGX.
> Exposing EPC to userspace via /dev/sgx/epc would mean
> that KVM would handle it like any other guest memory region, and all EPC
> related code/logic would reside in the SGX subsystem.

I'm fine doing that if it makes sense. I just don't understand why you
cannot add ioctls to /dev/kvm for allocating the region. Why isn't that
possible? As I said to Andy earlier, adding new device files is easy as
everything related to device creation is nicely encapsulated.

> Oversubscription throws a wrench in the system because ENCLV can only
> be executed post-VMXON and EPC conflicts generate VMX VM-Exits. But
> even then, KVM doesn't need to own the EPC uapi, e.g. it can call into
> the SGX subsystem to handle EPC conflict VM-Exits and the SGX subsystem
> can wrap ENCLV with exception fixup and forcefully reclaim EPC pages if
> ENCLV faults.

If the uapi is *only* for KVM, it should definitely own it. KVM calling
SGX subsystem on a conflict is KVM using in-kernel APIs provided by the
SGX core.

> I can't be 100% certain the oversubscription scheme will be sane without
> actually writing the code, but I'd like to at least keep the option open,
> i.e. not structure /dev/sgx/ in such a way that adding e.g. /dev/sgx/epc
> is impossible or ugly.

/Jarkko
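
For readers unfamiliar with the SCM_RIGHTS scenario mentioned earlier in
the thread, here is a minimal userspace sketch of passing an open file
descriptor over an AF_UNIX socket. It is not taken from any SGX patch set.
The helper names send_fd(), recv_fd() and demo_roundtrip() are hypothetical,
and a pipe stands in for an fd obtained from /dev/sgx, since opening the
real device needs SGX hardware and a driver. The point it illustrates is
only that the receiver ends up with a working descriptor without ever
opening the underlying file itself.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one open fd over a connected AF_UNIX socket. */
int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one byte of ordinary data carries the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {             /* union guarantees cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);   /* buffer was sized with CMSG_SPACE */
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Round trip inside one process: the read end of a pipe (stand-in for an
 * enclave fd) is sent over a socketpair, and the received descriptor is
 * then used to read data written to the pipe. Returns 0 on success.
 * Error paths skip cleanup to keep the sketch short.
 */
int demo_roundtrip(void)
{
    int sv[2], pfd[2], got;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 || pipe(pfd) < 0)
        return -1;
    if (send_fd(sv[0], pfd[0]) < 0)
        return -1;
    got = recv_fd(sv[1]);
    if (got < 0)
        return -1;
    if (write(pfd[1], "x", 1) != 1 || read(got, &c, 1) != 1)
        return -1;

    close(got);
    close(pfd[0]);
    close(pfd[1]);
    close(sv[0]);
    close(sv[1]);
    return c == 'x' ? 0 : -1;
}
```

In a real deployment the sender and receiver would be separate processes,
and the receiver could run without permission to open the device node. The
mechanism works the same way whether the fd comes from an anon inode or
from opening /dev/sgx directly, which is the observation made above.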