Date: Mon, 9 Oct 2023 17:23:04 -0700
In-Reply-To: <1b265d0c9dfe17de2782962ed26a99cc9d330138.camel@intel.com>
References: <20230923030657.16148-1-haitao.huang@linux.intel.com>
 <20230923030657.16148-13-haitao.huang@linux.intel.com>
 <1b265d0c9dfe17de2782962ed26a99cc9d330138.camel@intel.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
From: Sean Christopherson
To: Kai Huang
Cc: hpa@zytor.com, linux-sgx@vger.kernel.org, x86@kernel.org,
 dave.hansen@linux.intel.com, cgroups@vger.kernel.org, bp@alien8.de,
 linux-kernel@vger.kernel.org, jarkko@kernel.org, tglx@linutronix.de,
 haitao.huang@linux.intel.com, Sohil Mehta, tj@kernel.org, mingo@redhat.com,
 kristen@linux.intel.com, yangjie@microsoft.com, Zhiquan1 Li,
 mikko.ylinen@linux.intel.com, Bo Zhang, anakrish@microsoft.com

On Mon, Oct 09, 2023, Kai Huang wrote:
> On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
> > +/**
> > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU
> > + * @lru:	LRU that is low
> > + *
> > + * Return:	%true if a victim was found and kicked.
> > + */
> > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru)
> > +{
> > +	struct sgx_epc_page *victim;
> > +
> > +	spin_lock(&lru->lock);
> > +	victim = sgx_oom_get_victim(lru);
> > +	spin_unlock(&lru->lock);
> > +
> > +	if (!victim)
> > +		return false;
> > +
> > +	if (victim->flags & SGX_EPC_OWNER_PAGE)
> > +		return sgx_oom_encl_page(victim->encl_page);
> > +
> > +	if (victim->flags & SGX_EPC_OWNER_ENCL)
> > +		return sgx_oom_encl(victim->encl);
>
> I hate to bring this up, at least at this stage, but I am wondering why we need
> to put VA and SECS pages to the unreclaimable list, but cannot keep an
> "enclave_list" instead?

The motivation for tracking EPC pages instead of enclaves was so that the EPC
OOM-killer could "kill" VMs as well as host-owned enclaves.
The virtual EPC code didn't actually kill the VM process; it instead just freed
all of the EPC pages and abused the SGX architecture to effectively make the
guest recreate all its enclaves (IIRC, QEMU does the same thing to "support"
live migration).

Looks like y'all punted on that with:

  The EPC pages allocated for KVM guests by the virtual EPC driver are not
  reclaimable by the host kernel [5]. Therefore they are not tracked by any
  LRU lists for reclaiming purposes in this implementation, but they are
  charged toward the cgroup of the user process (e.g., QEMU) launching the
  guest.  And when the cgroup EPC usage reaches its limit, the virtual EPC
  driver will stop allocating more EPC for the VM, and return SIGBUS to the
  user process which would abort the VM launch.

which IMO is a hack, unless returning SIGBUS is actually enforced somehow.
Relying on userspace to be kind enough to kill its VMs kinda defeats the
purpose of cgroup enforcement.  E.g. if the hard limit for an EPC cgroup is
lowered, userspace running enclaves in a VM could continue on and refuse to
give up its EPC, and thus run above its limit in perpetuity.

I can see userspace wanting to explicitly terminate the VM instead of
"silently" destroying the VM's enclaves, but that seems like it should be a
knob in the virtual EPC code.
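
To make the enforcement concern concrete, here is a rough userspace model, not
kernel code.  Everything in it is made up for illustration: struct
epc_cg_model, enum vepc_oom_policy (standing in for the "knob"), and
epc_cg_enforce() do not exist in the kernel or in this series.  It only counts
pages and assumes host-owned enclave pages can always be taken back, while
guest pages can be taken back only if the policy allows zapping them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy knob for virtual EPC; not an existing interface. */
enum vepc_oom_policy {
	VEPC_OOM_NONE,	/* host never takes guest EPC back */
	VEPC_OOM_ZAP,	/* host frees guest EPC, guest must recreate enclaves */
};

/* Toy model of one EPC cgroup; all counters are in pages. */
struct epc_cg_model {
	size_t limit;		/* hard limit */
	size_t host_pages;	/* host-owned enclave pages (assumed killable) */
	size_t vepc_pages;	/* pages handed to a VM via virtual EPC */
	enum vepc_oom_policy policy;
};

static size_t epc_cg_usage(const struct epc_cg_model *cg)
{
	return cg->host_pages + cg->vepc_pages;
}

/*
 * Try to bring usage back under the limit, e.g. after the limit was lowered.
 * Returns the number of pages the cgroup is still over its limit by.
 */
static size_t epc_cg_enforce(struct epc_cg_model *cg)
{
	size_t over, freed;

	assert(cg != NULL);

	if (epc_cg_usage(cg) <= cg->limit)
		return 0;

	over = epc_cg_usage(cg) - cg->limit;

	/* Take pages from host-owned enclaves first. */
	freed = over < cg->host_pages ? over : cg->host_pages;
	cg->host_pages -= freed;
	over -= freed;

	/*
	 * Guest EPC is only recoverable if the knob allows it.  Zapping drops
	 * all of the guest's pages, mirroring "freed all of the EPC pages".
	 */
	if (over && cg->policy == VEPC_OOM_ZAP) {
		cg->vepc_pages = 0;
		over = 0;
	}

	return over;
}
```

With a limit lowered to 100 pages, 20 host pages and 200 guest pages,
VEPC_OOM_NONE leaves the cgroup 100 pages over its limit indefinitely, whereas
VEPC_OOM_ZAP gets it back under the limit at the cost of the guest's enclaves.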