Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp1408201pxb; Fri, 6 Nov 2020 08:55:18 -0800 (PST) X-Google-Smtp-Source: ABdhPJyUrdyRwHgXXpqdTzck6XUbmMaj38bI4i41XnsEA8rvbM8ZAigVbN7ylMRxDJsoFo5SnbG0 X-Received: by 2002:aa7:c4c2:: with SMTP id p2mr3030862edr.371.1604681718095; Fri, 06 Nov 2020 08:55:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1604681718; cv=none; d=google.com; s=arc-20160816; b=hsdR0xYSS3YWtevbDOALukfxVVsHaDY1F10Xp3E+R4mm/QHwXah+K4rP1gkEoPMEut BKgAE0Hx5a6ke9CAUwNXtbPl2o7n3RTlB5ySc/Xx6LIwJH3Y0G+QS8preMjGfIhXDSnu EWpkQxlFCQ7KOLfonmarUG+8c3Lyw182Uh/65RcUWqRoRkwWb/Wd0Zzo33WKZkZ70cBx AtfEASG0b66nKryWWM7Xj95P1WsYs7RdvALVAN4BFmZSXLdPvCvhek+SrMsCfITYFUxw pkc9vbP5ZlmsCYyWxC3eXtcQlNtr1mT4rBVHefxoLs3/mxE7pAH4d4KEUR7FhomuQOIV p37w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=PkJwZd5XGNRRMIIbto21Jn9jh8DpnGi+Lc82FP5xhLs=; b=JYY0HUriGkN2daQGeA068bAeHOUz2qFuIPNoGW9FJKh5juZP1fol/xQ8xIdrC4TuUF 3QN+5JZ4rETjcHZk0sq08q71229jIzxE1qblUsfNJdE2cCj5frKJjkHrrvhl3AYUWccL COnWkZNIHEZAsS6+5/6SpdwqglIeddHHrx/LDUqMCLHwjv6dZfPG6uuM5ZS+AFUAs5Y0 3MYvBw4ZyolUE1Dygbh/EEBL6xLYa4BMDODZatIM7++5WWifSpwxcRLaKHzWqo6tkUEc 5hZA9ImvJOzow7D3W0z4y0NfOiSoMOJ0UMLJBWGNvol9Repegrz7moSLdUw46uzHwSdj 1kDg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=DwqQohcJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b16si1386615edn.23.2020.11.06.08.54.51; Fri, 06 Nov 2020 08:55:18 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=DwqQohcJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726422AbgKFQvT (ORCPT + 99 others); Fri, 6 Nov 2020 11:51:19 -0500 Received: from mail.kernel.org ([198.145.29.99]:54480 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726075AbgKFQvS (ORCPT ); Fri, 6 Nov 2020 11:51:18 -0500 Received: from kernel.org (83-245-197-237.elisa-laajakaista.fi [83.245.197.237]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5F7402151B; Fri, 6 Nov 2020 16:51:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1604681477; bh=IEal4DM6g9pzay8voUAwBH9HgqP88t6Jh9pbcjBimdo=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=DwqQohcJEgkBnM4uO/7BgnvX7ReasVA1yAl5vhjUy3mNbwNHBF8antXZ217mpTy5u wVc4b1yMgJhu92+vURdJa+GNLEfQ2FuU9leamNT//DOpLvKHBIIaRpxw4GN4LcIQ3u C9O+8rJcGVBsxuJ5uk+vJR91YWsWwA455jOVTwNo= Date: Fri, 6 Nov 2020 18:51:07 +0200 From: Jarkko Sakkinen To: Mel Gorman Cc: Jarkko Sakkinen , x86@kernel.org, linux-sgx@vger.kernel.org, linux-kernel@vger.kernel.org, Sean Christopherson , linux-mm@kvack.org, Andrew Morton , Matthew Wilcox , Jethro Beekman , Darren Kenny , andriy.shevchenko@linux.intel.com, asapek@google.com, bp@alien8.de, cedric.xing@intel.com, chenalexchen@google.com, conradparker@google.com, cyhanish@google.com, dave.hansen@intel.com, haitao.huang@intel.com, kai.huang@intel.com, kai.svahn@intel.com, kmoy@google.com, ludloff@google.com, luto@kernel.org, nhorman@redhat.com, npmccallum@redhat.com, puiterwijk@redhat.com, rientjes@google.com, tglx@linutronix.de, yaozhangx@google.com, mikko.ylinen@intel.com, Michal Hocko , Vlastimil Babka Subject: Re: [PATCH v40 10/24] mm: Add 'mprotect' hook to struct vm_operations_struct Message-ID: <20201106165107.GA52595@kernel.org> References: <20201104145430.300542-1-jarkko.sakkinen@linux.intel.com> <20201104145430.300542-11-jarkko.sakkinen@linux.intel.com> <20201106100409.GD3371@techsingularity.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201106100409.GD3371@techsingularity.net> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Nov 06, 2020 at 10:04:09AM +0000, Mel Gorman wrote: > On Wed, Nov 04, 2020 at 04:54:16PM +0200, Jarkko Sakkinen wrote: > > From: Sean Christopherson > > > > Background > > ========== > > > > 1. SGX enclave pages are populated with data by copying from normal memory > > via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in > > this series. > > 2. It is desirable to be able to restrict those normal memory data sources. > > For instance, to ensure that the source data is executable before > > copying data to an executable enclave page. > > 3. Enclave page permissions are dynamic (just like normal permissions) and > > can be adjusted at runtime with mprotect(). > > > > This creates a problem because the original data source may have long since > > vanished at the time when enclave page permissions are established (mmap() > > or mprotect()). > > > > The solution (elsewhere in this series) is to force enclaves creators to > > declare their paging permission *intent* up front to the ioctl(). This > > intent can me immediately compared to the source data???s mapping and > > rejected if necessary. > > > > The ???intent??? is also stashed off for later comparison with enclave > > PTEs. This ensures that any future mmap()/mprotect() operations > > performed by the enclave creator or done on behalf of the enclave > > can be compared with the earlier declared permissions. > > > > Problem > > ======= > > > > There is an existing mmap() hook which allows SGX to perform this > > permission comparison at mmap() time. However, there is no corresponding > > ->mprotect() hook. > > > > Solution > > ======== > > > > Add a vm_ops->mprotect() hook so that mprotect() operations which are > > inconsistent with any page's stashed intent can be rejected by the driver. > > > > I have not read the series so this is superficial only. That said... > > > Cc: linux-mm@kvack.org > > Cc: Andrew Morton > > Cc: Matthew Wilcox > > Acked-by: Jethro Beekman > > Reviewed-by: Darren Kenny > > Signed-off-by: Sean Christopherson > > Co-developed-by: Jarkko Sakkinen > > Signed-off-by: Jarkko Sakkinen > > --- > > include/linux/mm.h | 3 +++ > > mm/mprotect.c | 5 ++++- > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index ef360fe70aaf..eb38eabc5039 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -559,6 +559,9 @@ struct vm_operations_struct { > > void (*close)(struct vm_area_struct * area); > > int (*split)(struct vm_area_struct * area, unsigned long addr); > > int (*mremap)(struct vm_area_struct * area); > > + int (*mprotect)(struct vm_area_struct *vma, > > + struct vm_area_struct **pprev, unsigned long start, > > + unsigned long end, unsigned long newflags); > > The first user of this uses the following information > > ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > > It only needs start, end and newflags. The pprev is passed in so the > hook can call mprotect_fixup() which is redundant as the caller knows it > should do that. I don't think an arbitrary driver should be responsible > for poking too much into the mm internals to do the fixup because we do > not know what other users of this hook might require in the future. > > Hence, I would suggest that the hook receive the minimum possible > information to do the permissions check for the first in-tree user. If > it returns without failure then mm/mprotect.c would always do the fixup. > > > vm_fault_t (*fault)(struct vm_fault *vmf); > > vm_fault_t (*huge_fault)(struct vm_fault *vmf, > > enum page_entry_size pe_size); > > diff --git a/mm/mprotect.c b/mm/mprotect.c > > index 56c02beb6041..1fd4fa71ce16 100644 > > --- a/mm/mprotect.c > > +++ b/mm/mprotect.c > > @@ -616,7 +616,10 @@ static int do_mprotect_pkey(unsigned long start, size_t len, > > tmp = vma->vm_end; > > if (tmp > end) > > tmp = end; > > - error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > + if (vma->vm_ops && vma->vm_ops->mprotect) > > + error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); > > + else > > + error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > That would then become > > if (vma->vm_ops && vma->vm_ops->mprotect) > error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); > if (!error) > error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > and mprotect_fixup would be removed from the driver. > > While vm_operations_struct has borderline zero documentation, a hook for > one in-kernel user should have a comment explaining what the semantics > of the hook is -- what is it responsible for (permission check), what > can it change (nothing), etc. Maybe something like > > /* > * Called by mprotect in the event driver-specific permission > * checks need to be made before the mprotect is finalised. > * No modifications should be done to the VMA, returns 0 > * if the mprotect is permitted. > */ > int (*mprotect)(struct vm_area_struct *vma, > unsigned long start, unsigned long end, > unsigned long newflags); > > If a future driver *does* need to poke deeper into the VM for mprotect > then at least they'll have to explain why that's a good idea. Both comments make sense to me. I'll refine this patch on Monday and also "x86/sgx: Add SGX misc driver interface", which uses this callback. Thanks a lot for valuable feedback! > -- > Mel Gorman > SUSE Labs /Jarkko