Content-Type: text/plain; charset=iso-8859-15; format=flowed; delsp=yes
To: "Christopherson,, Sean", "Huang, Kai"
Cc: "Zhang, Bo", linux-sgx@vger.kernel.org, cgroups@vger.kernel.org,
 yangjie@microsoft.com, dave.hansen@linux.intel.com, "Li, Zhiquan1",
 linux-kernel@vger.kernel.org, mingo@redhat.com, tglx@linutronix.de,
 tj@kernel.org, anakrish@microsoft.com, jarkko@kernel.org, hpa@zytor.com,
 mikko.ylinen@linux.intel.com, "Mehta, Sohil", bp@alien8.de,
 x86@kernel.org, kristen@linux.intel.com
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
References: <20230923030657.16148-1-haitao.huang@linux.intel.com>
 <20230923030657.16148-13-haitao.huang@linux.intel.com>
 <1b265d0c9dfe17de2782962ed26a99cc9d330138.camel@intel.com>
 <06142144151da06772a9f0cc195a3c8ffcbc07b7.camel@intel.com>
 <1f7a740f3acff8a04ec95be39864fb3e32d2d96c.camel@intel.com>
 <631f34613bcc8b5aa41cf519fa9d76bcd57a7650.camel@intel.com>
Date: Mon, 16 Oct 2023 14:52:23 -0500
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Haitao Huang"
Organization: Intel
User-Agent: Opera Mail/1.0 (Win32)
On Mon, 16 Oct 2023 05:57:36 -0500, Huang, Kai wrote:

> On Thu, 2023-10-12 at 08:27 -0500, Haitao Huang wrote:
>> On Tue, 10 Oct 2023 19:51:17 -0500, Huang, Kai wrote:
>> [...]
>> > (btw, even you track VA/SECS pages in unreclaimable list, given they
>> > both have 'enclave' as the owner, do you still need
>> > SGX_EPC_OWNER_ENCL and SGX_EPC_OWNER_PAGE ?)
>>
>> Let me think about it, there might be also a way just track encl
>> objects not unreclaimable pages.
>>
>> I still not get why we need kill the VM not just remove just enough
>> pages. Is it due to the static allocation not able to reclaim?
>
> We can choose to "just remove enough EPC pages". The VM may or may not
> be killed when it wants the EPC pages back, depending on whether the
> current EPC cgroup can provide enough EPC pages or not. And this
> depends on how we implement the cgroup algorithm to reclaim EPC pages.
>
> One problem could be: for a EPC cgroup only has SGX VMs, you may end
> up with moving EPC pages from one VM to another and then vice versa
> endlessly,

This could be a valid use case, though, if people intend to share EPC
between two VMs. I understand no one would be able to use VMs this way
currently with the static allocation.

> because you never really actually mark any VM to be dead just like OOM
> does to the normal enclaves.
>
> From this point, you still need a way to kill a VM, IIUC.
>
> I think the key point of virtual EPC vs cgroup, as quoted from Sean,
> should be "having sane, well-defined behavior".
>
> Does "just remove enough EPC pages" meet this? If the problem
> mentioned above can be avoided, I suppose so? So if there's an easy
> way to achieve, I guess it can be an option too.
>
> But for the initial support, IMO we are not looking for a perfect but
> yet complicated solution. I would say, from review's point of view,
> it's preferred to have a simple implementation to achieve a
> not-perfect, but consistent, well-defined behaviour.
>
> So to me looks killing the VM when cgroup cannot reclaim any more EPC
> pages is a simple option.
>
> But I might have missed something, especially since middle of last
> week I have been having fever and headache :-)
>
> So as mentioned above, you can try other alternatives, but please
> avoid complicated ones.
>
> Also, I guess it will be helpful if we can understand the typical SGX
> app and/or SGX VM deployment under EPC cgroup use case. This may help
> us on justifying why the EPC cgroup algorithm to select victim is
> reasonable.
>

From this perspective, I think the current implementation is
"well-defined": EPC cgroup limits for VMs are only enforced at VM launch
time, not at runtime. In practice, an SGX VM can be launched only with a
fixed EPC size, and all of that EPC is fully committed to the VM once it
is launched. Because of that, I imagine people are using VMs primarily
to partition the physical EPC, i.e., the static size itself is the
'limit' for the workload of a single VM, and they do not expect EPC to
be taken away at runtime. So killing does not really add much value for
the existing usages, IIUC.

That said, I don't anticipate that adding enforcement by killing VMs at
runtime would break such usages, as the admin/user can simply choose to
set the limit equal to the static size, launch the VM, and forget about
it.
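To make the "set the limit equal to the static size" usage concrete,
here is a rough sketch of what I have in mind for the admin flow. Note
this assumes the cgroup interface ends up exposed through the misc
controller with an "sgx_epc" resource in misc.max, and uses QEMU's
memory-backend-epc options for the fixed virtual EPC; treat the exact
names and paths as illustrative, not something this thread settles:

```
# Assumed: EPC cgroup via the misc controller, resource name "sgx_epc".
mkdir /sys/fs/cgroup/sgx_vm1

# Set the EPC limit equal to the VM's static EPC size (64M here), so
# the limit can only ever be felt at launch time, never at runtime.
echo "sgx_epc 67108864" > /sys/fs/cgroup/sgx_vm1/misc.max

# Launch the VM inside that cgroup with a fixed, preallocated 64M
# virtual EPC section (QEMU SGX options; prealloc commits pages up
# front). Remaining QEMU options (disk, net, etc.) omitted.
sh -c 'echo $$ > /sys/fs/cgroup/sgx_vm1/cgroup.procs && \
       exec qemu-system-x86_64 -cpu host -machine q35 \
            -object memory-backend-epc,id=mem1,size=64M,prealloc=on \
            -M sgx-epc.0.memdev=mem1,sgx-epc.0.node=0'
```

With the limit equal to the static size, the cgroup can never be driven
below the VM's committed EPC after launch, so a kill-on-OOM path would
simply never trigger for such a VM.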
Given that, I'll propose an add-on patch to this series as an RFC and
gather feedback from the community before we decide whether it needs to
be included in the first version, or whether we can skip it until we
have EPC reclaiming for VMs.

Thanks
Haitao