Date: Mon, 16 Oct 2023 14:32:31 -0700
References: <20230923030657.16148-13-haitao.huang@linux.intel.com>
 <1b265d0c9dfe17de2782962ed26a99cc9d330138.camel@intel.com>
 <06142144151da06772a9f0cc195a3c8ffcbc07b7.camel@intel.com>
 <1f7a740f3acff8a04ec95be39864fb3e32d2d96c.camel@intel.com>
 <631f34613bcc8b5aa41cf519fa9d76bcd57a7650.camel@intel.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
From: Sean Christopherson
To: Haitao Huang
Cc: Kai Huang, Bo Zhang, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org,
 yangjie@microsoft.com, dave.hansen@linux.intel.com, Zhiquan1 Li,
 linux-kernel@vger.kernel.org, mingo@redhat.com, tglx@linutronix.de,
 tj@kernel.org, anakrish@microsoft.com, jarkko@kernel.org, hpa@zytor.com,
 mikko.ylinen@linux.intel.com, Sohil Mehta, bp@alien8.de, x86@kernel.org,
 kristen@linux.intel.com

On Mon, Oct 16, 2023, Haitao Huang wrote:
> From this perspective, I think the current implementation is "well-defined":
> EPC cgroup limits for VMs are only enforced at VM launch time, not runtime.
> In practice, SGX VM can be launched only with fixed EPC size and all those
> EPCs are fully committed to the VM once launched.

Fully committed doesn't mean those numbers are reflected in the cgroup. A VM
scheduler can easily "commit" EPC to a guest, but allocate EPC on demand, i.e.
when the guest attempts to actually access a page. Preallocating memory isn't
free, e.g. it can slow down guest boot, so it's entirely reasonable to have
virtual EPC be allocated on-demand. Enforcing at launch time doesn't work for
such setups, because from the cgroup's perspective, the VM is using 0 pages of
EPC at launch.

> Because of that, I imagine people are using VMs to primarily partition the
> physical EPCs, i.e, the static size itself is the 'limit' for the workload of
> a single VM and not expecting EPCs taken away at runtime.

If everything goes exactly as planned, sure.
But it's not hard to imagine some configuration change way up the stack
resulting in the hard limit for an EPC cgroup being lowered.

> So killing does not really add much value for the existing usages IIUC.

As I said earlier, the behavior doesn't have to result in terminating a VM,
e.g. the virtual EPC code could provide a knob to send a signal/notification
if the owning cgroup has gone above the limit and the VM is targeted for
forced reclaim.

> That said, I don't anticipate adding the enforcement of killing VMs at
> runtime would break such usages as admin/user can simply choose to set the
> limit equal to the static size to launch the VM and forget about it.
>
> Given that, I'll propose an add-on patch to this series as RFC and have some
> feedback from community before we decide if that needs be included in first
> version or we can skip it until we have EPC reclaiming for VMs.

Gracefully *swapping* virtual EPC isn't required for oversubscribing virtual
EPC. Think of it like airlines overselling tickets. The airline sells more
tickets than they have seats, and banks on some passengers canceling. If too
many people show up, the airline doesn't swap passengers to the cargo bay,
they just shunt them to a different plane.

The same could easily be done for hosts and virtual EPC. E.g. if every VM
*might* use 1GiB, but in practice 99% of VMs only consume 128MiB, then it's
not too crazy to advertise 1GiB to each VM, but only actually carve out 256MiB
per VM in order to pack more VMs on a host. If the host needs to free up EPC,
then the most problematic VMs can be migrated to a different host.

Genuinely curious, who is asking for EPC cgroup support that *isn't* running
VMs? AFAIK, these days, SGX is primarily targeted at cloud. I assume virtual
EPC is the primary use case for an EPC cgroup.

I don't have any skin in the game beyond my name being attached to some of the
patches, i.e. I certainly won't stand in the way.
I just don't understand why you would go through all the effort of adding an EPC cgroup and then not go the extra few steps to enforce limits for virtual EPC. Compared to the complexity of the rest of the series, that little bit seems quite trivial.
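
For illustration only, the on-demand and oversubscription points above can be
modelled with a toy Python sketch. This is not kernel code and not what the
series implements: the names EpcCgroup, VirtualEpc, fault_in and vms_per_host
are hypothetical, and the 4KiB page size and 8GiB host EPC size are
assumptions picked to keep the arithmetic simple. The sketch shows that a VM
with on-demand virtual EPC is charged 0 pages at launch, and that lowering the
limit later leaves the cgroup over its limit with nothing at launch time to
catch it.

```python
# Toy model only. All names here are hypothetical and do not correspond to
# any kernel API; page size and host EPC size are assumptions.
PAGE_SIZE = 4096
MIB = 1024 * 1024


class EpcCgroup:
    """Tracks EPC pages charged against a hard limit."""

    def __init__(self, limit_pages):
        self.limit_pages = limit_pages
        self.usage_pages = 0

    def try_charge(self, pages=1):
        # Refuse the charge if it would push usage past the limit.
        if self.usage_pages + pages > self.limit_pages:
            return False
        self.usage_pages += pages
        return True

    def over_limit(self):
        # Can become true if the limit is lowered after pages were charged.
        return self.usage_pages > self.limit_pages


class VirtualEpc:
    """Virtual EPC that is advertised up front but backed on first access."""

    def __init__(self, cgroup, advertised_bytes):
        self.cgroup = cgroup
        self.advertised_pages = advertised_bytes // PAGE_SIZE
        self.backed = set()

    def fault_in(self, page_index):
        if not 0 <= page_index < self.advertised_pages:
            raise IndexError("page outside advertised virtual EPC")
        if page_index in self.backed:
            return True
        # The charge happens here, at access time, not at VM launch.
        if not self.cgroup.try_charge():
            return False
        self.backed.add(page_index)
        return True


def vms_per_host(host_epc_bytes, carved_out_bytes_per_vm):
    """How many VMs fit if each gets this much physical EPC carved out."""
    return host_epc_bytes // carved_out_bytes_per_vm


# Launch a VM advertising 1GiB in a cgroup limited to 256MiB.
cg = EpcCgroup(limit_pages=256 * MIB // PAGE_SIZE)
vm = VirtualEpc(cg, advertised_bytes=1024 * MIB)
usage_at_launch = cg.usage_pages            # nothing charged yet

# The guest touches 128MiB worth of EPC pages.
for i in range(128 * MIB // PAGE_SIZE):
    vm.fault_in(i)
usage_after_touch = cg.usage_pages

# A configuration change lowers the hard limit to 64MiB at runtime.
cg.limit_pages = 64 * MIB // PAGE_SIZE
needs_runtime_action = cg.over_limit()

# Packing on an assumed 8GiB-EPC host: full 1GiB vs. 256MiB carve-out per VM.
HOST_EPC = 8 * 1024 * MIB
packed_full = vms_per_host(HOST_EPC, 1024 * MIB)
packed_oversubscribed = vms_per_host(HOST_EPC, 256 * MIB)
```

In this model a launch-time check sees usage of 0 pages and passes trivially,
while the over-limit condition only shows up after the limit is lowered at
runtime, which is where a signal/notification or forced reclaim would have to
kick in. The packing numbers are just the 1GiB vs. 256MiB ratio (4x) applied
to the assumed host size.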