From: "Haitao Huang"
Organization: Intel
To: "hpa@zytor.com", "tim.c.chen@linux.intel.com", "linux-sgx@vger.kernel.org", "x86@kernel.org", "dave.hansen@linux.intel.com", "jarkko@kernel.org", "cgroups@vger.kernel.org", "linux-kernel@vger.kernel.org", "mkoutny@suse.com", "tglx@linutronix.de", "Mehta, Sohil", "tj@kernel.org", "mingo@redhat.com", "bp@alien8.de", "Huang, Kai"
Cc: "mikko.ylinen@linux.intel.com", "seanjc@google.com", "anakrish@microsoft.com", "Zhang, Bo", "kristen@linux.intel.com", "yangjie@microsoft.com", "Li, Zhiquan1", "chrisyan@microsoft.com"
Subject: Re: [PATCH v10 05/14] x86/sgx: Implement basic EPC misc cgroup functionality
Date: Thu, 04 Apr 2024 20:24:04 -0500
References: <20240328002229.30264-1-haitao.huang@linux.intel.com> <20240328002229.30264-6-haitao.huang@linux.intel.com> <89b4e053db21c31859cf2572428fd9d4ab4475ab.camel@intel.com>
In-Reply-To: <89b4e053db21c31859cf2572428fd9d4ab4475ab.camel@intel.com>

On Thu, 28 Mar 2024 07:53:45 -0500, Huang, Kai wrote:

>
>> --- /dev/null
>> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> @@ -0,0 +1,74 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright(c) 2022 Intel Corporation.
>
> It's 2024 now.
>
> And looks you need to use C style comment for /* Copyright ... */, after looking
> at some other C files.
>

Ok, will update years and use C style.

>> +
>> +#include
>> +#include
>> +#include "epc_cgroup.h"
>> +
>> +/* The root SGX EPC cgroup */
>> +static struct sgx_cgroup sgx_cg_root;
>> +
>> +/**
>> + * sgx_cgroup_try_charge() - try to charge cgroup for a single EPC page
>> + *
>> + * @sgx_cg: The EPC cgroup to be charged for the page.
>> + * Return:
>> + * * %0 - If successfully charged.
>> + * * -errno - for failures.
>> + */
>> +int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg)
>> +{
>> +	return misc_cg_try_charge(MISC_CG_RES_SGX_EPC, sgx_cg->cg, PAGE_SIZE);
>> +}
>> +
>> +/**
>> + * sgx_cgroup_uncharge() - uncharge a cgroup for an EPC page
>> + * @sgx_cg: The charged sgx cgroup
>> + */
>> +void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg)
>> +{
>> +	misc_cg_uncharge(MISC_CG_RES_SGX_EPC, sgx_cg->cg, PAGE_SIZE);
>> +}
>> +
>> +static void sgx_cgroup_free(struct misc_cg *cg)
>> +{
>> +	struct sgx_cgroup *sgx_cg;
>> +
>> +	sgx_cg = sgx_cgroup_from_misc_cg(cg);
>> +	if (!sgx_cg)
>> +		return;
>> +
>> +	kfree(sgx_cg);
>> +}
>> +
>> +static int sgx_cgroup_alloc(struct misc_cg *cg);
>
> Again, this declaration can be removed if you move the below structure
> ...
>
>> +
>> +const struct misc_res_ops sgx_cgroup_ops = {
>> +	.alloc = sgx_cgroup_alloc,
>> +	.free = sgx_cgroup_free,
>> +};
>> +
>> +static void sgx_cgroup_misc_init(struct misc_cg *cg, struct sgx_cgroup *sgx_cg)
>> +{
>> +	cg->res[MISC_CG_RES_SGX_EPC].priv = sgx_cg;
>> +	sgx_cg->cg = cg;
>> +}
>> +
>> +static int sgx_cgroup_alloc(struct misc_cg *cg)
>> +{
>> +	struct sgx_cgroup *sgx_cg;
>> +
>> +	sgx_cg = kzalloc(sizeof(*sgx_cg), GFP_KERNEL);
>> +	if (!sgx_cg)
>> +		return -ENOMEM;
>> +
>> +	sgx_cgroup_misc_init(cg, sgx_cg);
>> +
>> +	return 0;
>> +}
>
> ... here.
>

yes, thanks

>> +
>> +void sgx_cgroup_init(void)
>> +{
>> +	misc_cg_set_ops(MISC_CG_RES_SGX_EPC, &sgx_cgroup_ops);
>> +	sgx_cgroup_misc_init(misc_cg_root(), &sgx_cg_root);
>> +}
>> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.h b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
>> new file mode 100644
>> index 000000000000..8f794e23fad6
>> --- /dev/null
>> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
>> @@ -0,0 +1,70 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2022 Intel Corporation. */
>> +#ifndef _SGX_EPC_CGROUP_H_
>> +#define _SGX_EPC_CGROUP_H_
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +#include "sgx.h"
>> +
>> +#ifndef CONFIG_CGROUP_SGX_EPC
>
> Nit: add an empty line to make text more breathable.
>

ok

>> +#define MISC_CG_RES_SGX_EPC MISC_CG_RES_TYPES
>> +struct sgx_cgroup;
>> +
>> +static inline struct sgx_cgroup *sgx_get_current_cg(void)
>> +{
>> +	return NULL;
>> +}
>> +
>> +static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg) { }
>> +
>> +static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg)
>> +{
>> +	return 0;
>> +}
>> +
>> +static inline void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg) { }
>> +
>> +static inline void sgx_cgroup_init(void) { }
>> +#else
>
> Nit: I prefer two empty lines before and after the 'else'.
>

ok

>> +struct sgx_cgroup {
>> +	struct misc_cg *cg;
>> +};
>> +
>> +static inline struct sgx_cgroup *sgx_cgroup_from_misc_cg(struct misc_cg *cg)
>> +{
>> +	return (struct sgx_cgroup *)(cg->res[MISC_CG_RES_SGX_EPC].priv);
>> +}
>> +
>> +/**
>> + * sgx_get_current_cg() - get the EPC cgroup of current process.
>> + *
>> + * Returned cgroup has its ref count increased by 1. Caller must call
>> + * sgx_put_cg() to return the reference.
>> + *
>> + * Return: EPC cgroup to which the current task belongs to.
>> + */
>> +static inline struct sgx_cgroup *sgx_get_current_cg(void)
>> +{
>> +	return sgx_cgroup_from_misc_cg(get_current_misc_cg());
>> +}
>
> Again, I _think_ you need to check whether get_current_misc_cg() returns NULL?
>
> Misc cgroup can be disabled by command line even it is on in the Kconfig.
>
> I am not expert on cgroup, so could you check on this?
>

Good catch. Will add NULL check in sgx_cgroup_from_misc_cg()

>> +
>> +/**
>> + * sgx_put_sgx_cg() - Put the EPC cgroup and reduce its ref count.
>> + * @sgx_cg - EPC cgroup to put.
>> + */
>> +static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg)
>> +{
>> +	if (sgx_cg)
>> +		put_misc_cg(sgx_cg->cg);
>> +}
>> +
>> +int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg);
>> +void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg);
>> +void sgx_cgroup_init(void);
>> +
>> +#endif
>> +
>> +#endif /* _SGX_EPC_CGROUP_H_ */
>> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
>> index d219f14365d4..023af54c1beb 100644
>> --- a/arch/x86/kernel/cpu/sgx/main.c
>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>> @@ -6,6 +6,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -17,6 +18,7 @@
>>  #include "driver.h"
>>  #include "encl.h"
>>  #include "encls.h"
>> +#include "epc_cgroup.h"
>>
>>  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
>>  static int sgx_nr_epc_sections;
>> @@ -558,7 +560,16 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
>>   */
>>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
>>  {
>> +	struct sgx_cgroup *sgx_cg;
>>  	struct sgx_epc_page *page;
>> +	int ret;
>> +
>> +	sgx_cg = sgx_get_current_cg();
>> +	ret = sgx_cgroup_try_charge(sgx_cg);
>> +	if (ret) {
>> +		sgx_put_cg(sgx_cg);
>> +		return ERR_PTR(ret);
>> +	}
>>
>>  	for ( ; ; ) {
>>  		page = __sgx_alloc_epc_page();
>> @@ -567,8 +578,10 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
>>  			break;
>>  		}
>>
>> -		if (list_empty(&sgx_active_page_list))
>> -			return ERR_PTR(-ENOMEM);
>> +		if (list_empty(&sgx_active_page_list)) {
>> +			page = ERR_PTR(-ENOMEM);
>> +			break;
>> +		}
>>
>>  		if (reclaim == SGX_NO_RECLAIM) {
>>  			page = ERR_PTR(-EBUSY);
>> @@ -580,10 +593,24 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
>>  			break;
>>  		}
>>
>> +		/*
>> +		 * Need to do a global reclamation if cgroup was not full but free
>> +		 * physical pages run out, causing __sgx_alloc_epc_page() to fail.
>> +		 */
>
> Again, to me this comment shouldn't be here, because it doesn't add any more
> information.
>
> If you can reach here, you have passed the charge(). In fact, I believe this
> doesn't matter:
> When you fail to allocate, you just need to reclaim.
>
> Now you only have the global reclamation, thus you need to reclaim from it.
>
> Perhaps it is useful when you have per-cgroup LRU list. In that case you can
> put this comment there.
>

Ok

>>  		sgx_reclaim_pages();
>>  		cond_resched();
>>  	}
>>
>> +#ifdef CONFIG_CGROUP_SGX_EPC
>> +	if (!IS_ERR(page)) {
>> +		WARN_ON_ONCE(page->sgx_cg);
>> +		/* sgx_put_cg() in sgx_free_epc_page() */
>> +		page->sgx_cg = sgx_cg;
>> +	} else {
>> +		sgx_cgroup_uncharge(sgx_cg);
>> +		sgx_put_cg(sgx_cg);
>> +	}
>> +#endif
>
> Again, IMHO having CONFIG_CGROUP_SGX_EPC here is ugly, because it doesn't even
> match the try_charge() above, which doesn't have the CONFIG_CGROUP_SGX_EPC.
>
> If you add a wrapper in "epc_cgroup.h"
>

Agreed, but I will put the wrapper in sgx.h so that struct sgx_epc_page is not
exposed in epc_cgroup.h.

> static inline void sgx_epc_page_set_cgroup(struct epc_page *epc_page,
> 					    struct sgx_cgroup *sgx_cg)
> {
> #ifdef CONFIG_CGROUP_SGX_EPC
> 	epc_page->sgx_cg = sgx_cg;
> #endif
> }
>
> Then I believe the above can be simplified to:
>
> 	if (!IS_ERR(page)) {
> 		sgx_epc_page_set_cgroup(page, sgx_cg);
> 	} else {
> 		sgx_cgroup_uncharge(sgx_cg);
> 		sgx_put_cg(sgx_cg);
> 	}
>
>>  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>>  		wake_up(&ksgxd_waitq);
>>
>> @@ -604,6 +631,14 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
>>  	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
>>  	struct sgx_numa_node *node = section->node;
>>
>> +#ifdef CONFIG_CGROUP_SGX_EPC
>> +	if (page->sgx_cg) {
>> +		sgx_cgroup_uncharge(page->sgx_cg);
>> +		sgx_put_cg(page->sgx_cg);
>> +		page->sgx_cg = NULL;
>> +	}
>> +#endif
>> +
>
> Similarly, how about adding a wrapper in "epc_cgroup.h"
>
> struct sgx_cgroup *sgx_epc_page_get_cgroup(struct sgx_epc_page *page)
> {
> #ifdef CONFIG_CGROUP_SGX_EPC
> 	return page->sgx_cg;
> #else
> 	return NULL;
> #endif
> }
>
> Then this can be:
>
> 	sgx_cg = sgx_epc_page_get_cgroup(page);
> 	sgx_cgroup_uncharge(sgx_cg);
> 	sgx_put_cg(sgx_cg);
> 	sgx_epc_page_set_cgroup(page, NULL);
>

sure.

>>  	spin_lock(&node->lock);
>>
>>  	page->owner = NULL;
>> @@ -643,6 +678,11 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
>>  		section->pages[i].flags = 0;
>>  		section->pages[i].owner = NULL;
>>  		section->pages[i].poison = 0;
>> +
>> +#ifdef CONFIG_CGROUP_SGX_EPC
>> +		section->pages[i].sgx_cg = NULL;
>> +#endif
>
> Can use the wrapper too.
>

yes.

> [...]
>
> (Btw, I'll be away for couple of days due to public holiday and will review the
> rest starting from late next week).

Thanks
Haitao
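
For illustration only, the two changes agreed on in this thread (a NULL check in sgx_cgroup_from_misc_cg() and set/get wrappers that hide the CONFIG_CGROUP_SGX_EPC conditional) could look roughly like the user-space sketch below. Everything outside the function names quoted in the thread is an assumption: the struct layouts are simplified stand-ins rather than the kernel definitions (the real misc_cg keeps priv in a res[] array), and the #define is a hypothetical substitute for the Kconfig option. It is not the code of any later revision of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the Kconfig option being enabled. */
#define CONFIG_CGROUP_SGX_EPC 1

/* Simplified mock-ups for illustration; not the kernel definitions. */
struct misc_cg {
	void *priv;	/* stands in for cg->res[MISC_CG_RES_SGX_EPC].priv */
};

struct sgx_cgroup {
	struct misc_cg *cg;
};

struct sgx_epc_page {
	int flags;
#ifdef CONFIG_CGROUP_SGX_EPC
	struct sgx_cgroup *sgx_cg;
#endif
};

/* NULL-safe lookup: misc cgroup may be disabled on the command line. */
static inline struct sgx_cgroup *sgx_cgroup_from_misc_cg(struct misc_cg *cg)
{
	if (!cg)
		return NULL;

	return (struct sgx_cgroup *)cg->priv;
}

/* Record the owning cgroup in the page; a no-op without cgroup support. */
static inline void sgx_epc_page_set_cgroup(struct sgx_epc_page *page,
					   struct sgx_cgroup *sgx_cg)
{
#ifdef CONFIG_CGROUP_SGX_EPC
	page->sgx_cg = sgx_cg;
#else
	(void)page;
	(void)sgx_cg;
#endif
}

/* Return the owning cgroup, or NULL when cgroup support is compiled out. */
static inline struct sgx_cgroup *sgx_epc_page_get_cgroup(struct sgx_epc_page *page)
{
#ifdef CONFIG_CGROUP_SGX_EPC
	return page->sgx_cg;
#else
	(void)page;
	return NULL;
#endif
}
```

With wrappers of this shape the callers in main.c would need no #ifdef of their own, which is the point raised in the review.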