Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp3145030pxk; Mon, 28 Sep 2020 09:23:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzB2aT9G+oMQIVFJHP7DikITEpMAD/3dMBS31+ZkbfvIlqoeUAjmLCN84a1F5vvu+GqsW/F X-Received: by 2002:aa7:c6cf:: with SMTP id b15mr2720900eds.134.1601310233576; Mon, 28 Sep 2020 09:23:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1601310233; cv=none; d=google.com; s=arc-20160816; b=tfBZRN5dumP+H1xNxeOyGvrN0RKZNIGIEWihjOL6tosYx0BtX/IRlL7ZTGnkJGZnZR +XFNrmuvd+6CBUOaKZS9WgNmuZIWeDER+yx3nJk8kQjrNNfs5DWnOGakYxBDBWZA7xHW ioj5jHNfg/ufIdIEWp7GRA9ZIxadBHir2YPR/6FSmppSSFKj+iOCWbLWyfPoe9XBrSaf mNakIOvqLRIu6njM9zU3qOiUj8P33/Ag7GXmcX538FO6sHMx9RSQPzRBEc0aCdbz2u7D qr5xOVFjtm/ST28N0j9PPVm+hTe7WjEoOqn6d5oMeTjQlmebl3SyX/PUQXf76qE6I6lv Dafw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:organization:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :ironport-sdr:ironport-sdr; bh=pUz4eaPUk8t91Tfu+/GrrPT6FxFnV7ZL7Z8giOHqfY4=; b=BdrBhNGlaOPDL8p5tBRL70oyGlZ46mqGTf6PdPMXciWIRQdnolGQWvQF5MbhiufA27 aHJaxBzMAPfjRByJVnRrgWbuBm3cbpMJnLtkpDoMWwok9rFQN8gFifbenxNY7bJUsgZ9 B56kyJmVAoZXNjKPPFmHS0+JPYCOSMB73TzfTKT1yAySB+VkIMBcarAOlP5OSQlhiaua aUG6nUGR+bMns4MR16+n6wpsrlPMnN1xjT1F/ox7g0AslM5jmjtcWO5g0hu/UO/NBtUy 7OCnXt3fgE6ls9W4D+cigAG2C+YImFfA/jvapbiJhJ7WKKg+IRXkdG9C5JSEJyTfypbQ hl8g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e6si875842ejh.487.2020.09.28.09.23.29; Mon, 28 Sep 2020 09:23:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726526AbgI1QUG (ORCPT + 99 others); Mon, 28 Sep 2020 12:20:06 -0400 Received: from mga11.intel.com ([192.55.52.93]:48309 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726344AbgI1QUG (ORCPT ); Mon, 28 Sep 2020 12:20:06 -0400 IronPort-SDR: ebBylnUyEkKRx6ZMHgDu7vBQmQ/RrIHCESBCEUaILo1EK2OPdJ/KdJq8BpRzpuwoIPumBA9MMl MC0kWrvtj+Ng== X-IronPort-AV: E=McAfee;i="6000,8403,9758"; a="159362211" X-IronPort-AV: E=Sophos;i="5.77,313,1596524400"; d="scan'208";a="159362211" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2020 09:20:03 -0700 IronPort-SDR: 6TZJ7Ehu/ELhw6vm2VOLjkupaHMzkSF96p16WXD+Pyxf8TnKh2UvfUtj/SAjsteIHkGVm6TJLD ViathMlqBWzA== X-IronPort-AV: E=Sophos;i="5.77,313,1596524400"; d="scan'208";a="488641778" Received: from gboudouk-mobl2.ger.corp.intel.com (HELO localhost) ([10.249.33.127]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2020 09:19:56 -0700 Date: Mon, 28 Sep 2020 19:19:54 +0300 From: Jarkko Sakkinen To: Dave Hansen Cc: Sean Christopherson , Haitao Huang , Andy Lutomirski , X86 ML , linux-sgx@vger.kernel.org, LKML , Linux-MM , Andrew Morton , Matthew Wilcox , Jethro Beekman , Darren Kenny , Andy Shevchenko , asapek@google.com, Borislav Petkov , "Xing, Cedric" , chenalexchen@google.com, Conrad Parker , cyhanish@google.com, "Huang, Haitao" , Josh Triplett , "Huang, Kai" , "Svahn, Kai" , Keith Moyer , Christian Ludloff , Neil Horman , Nathaniel McCallum , Patrick Uiterwijk , David Rientjes , Thomas Gleixner , yaozhangx@google.com Subject: Re: [PATCH v38 10/24] mm: Add vm_ops->mprotect() Message-ID: <20200928161954.GB92669@linux.intel.com> References: <20200924202549.GB19127@linux.intel.com> <20200924230501.GA20095@linux.intel.com> <20200925000052.GA20333@linux.intel.com> <32fc9df4-d4aa-6768-aa06-0035427b7535@intel.com> <20200925194304.GE31528@linux.intel.com> <230ce6da-7820-976f-f036-a261841d626f@intel.com> <20200928005347.GB6704@linux.intel.com> <6eca8490-d27d-25b8-da7c-df4f9a802e87@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6eca8490-d27d-25b8-da7c-df4f9a802e87@intel.com> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Sep 28, 2020 at 07:04:38AM -0700, Dave Hansen wrote: > On 9/27/20 5:53 PM, Jarkko Sakkinen wrote: > > On Fri, Sep 25, 2020 at 12:53:35PM -0700, Dave Hansen wrote: > >> On 9/25/20 12:43 PM, Sean Christopherson wrote: > >>>> That means that the intent argument (SGX_PROT_*) is currently unused. > >>> No, the intent argument is used (eventually) by SGX's ->mprotect() > >>> implementation, i.e. sgx_mprotect() enforces that the actual protections are a > >>> subset of the declared/intended protections. > >>> > >>> If ->mprotect() is not merged, then it yes, it will be unused. > >> > >> OK, I think I've got it. > >> > >> I think I'm OK with adding ->mprotect(). As long as folks buy into the > >> argument that intent needs to be checked at mmap() time, they obviously > >> need to be checked at mprotect() too. > >> > >> Jarkko, if you want to try and rewrite the changelog, capturing the > >> discussion here and reply, I think I can ack the resulting patch. I > >> don't know if that will satisfy the request from Boris from an ack from > >> a "mm person", but we can at least start there. :) > > > > I think what it needs, based on what I've read, is the step by step > > description of the EMODPE scenarion without this callback and with it. > > EMODPE is virtually irrelevant for this whole thing. The x86 PTE > permissions still specify the most restrictive permissions, which is > what matters the most. > > We care about the _worst_ the enclave can do, not what it imposes on > itself on top of that. AFAIK it is not, or what we are protecting against with this anyway then? Let say an LSM makes decision for the permissions based on origin. If we do not have this you can: 1. EMODPE 2. mprotect I.e. whatever LSM decides, won't matter. The other case, noexec, is now unconditionally denied. > > I think other important thing to underline is that an LSM or any other > > security measure can only do a sane decision when the enclave is loaded. > > At that point we know the source (vm_file). > > Right, you know the source, but it can be anonymous or a file. They are both origin, the point being that you know what you're dealing with when you build the enclave, not when you map it. This is my current rewrite of the commit message in my master branch: " mm: Add 'mprotect' callback to vm_ops Intel Sofware Guard eXtensions (SGX) allows creation of blobs called enclaves, for which page permissions are defined when the enclave is first loaded. Once an enclave is loaded and initialized, it can be mapped to the process address space. There is no standard file format for enclaves. They are dynamically built and the ways how enclaves are deployed differ greatly. For an app you might want to have a simple static binary, but on the other hand for a container you might want to dynamically create the whole thing at run-time. Also, the existing ecosystem for SGX is already large, which would make the task very hard. Finally, even if there was a standard format, one would still want a dynamic way to add pages to the enclave. One big reason for this is that enclaves have load time defined pages that represent entry points to the enclave. Each entry point can service one hardware thread at a time and you might want to run-time parametrize this depending on your environment. The consequence is that enclaves are best created with an ioctl API and the access control can be based only to the origin of the source file for the enclave data, i.e. on VMA file pointer and page permissions. For example, this could be done with LSM hooks that are triggered in the appropriate ioctl's and they could make the access control decision based on this information. Unfortunately, there is ENCLS[EMODPE] that a running enclave can use to upgrade its permissions. If we do not limit mmap() and mprotect(), enclave could upgrade its permissions by using EMODPE followed by an appropriate mprotect() call. This would be completely hidden from the kernel. Add 'mprotect' hook to vm_ops, so that a callback can be implemeted for SGX that will ensure that {mmap, mprotect}() permissions do not surpass any of the original page permissions. This feature allows to maintain and refine sane access control for enclaves. " I'm mostly happy with this but am open for change suggestions. /Jarkko