Date: Mon, 28 Sep 2020 18:14:39 -0700
From: Sean Christopherson
To: Jarkko Sakkinen
Cc: Borislav Petkov,
 x86@kernel.org, linux-sgx@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Jethro Beekman, Jordan Hand, Nathaniel McCallum,
 Chunyang Hui, Seth Moore, akpm@linux-foundation.org,
 andriy.shevchenko@linux.intel.com, asapek@google.com,
 cedric.xing@intel.com, chenalexchen@google.com, conradparker@google.com,
 cyhanish@google.com, dave.hansen@intel.com, haitao.huang@intel.com,
 josh@joshtriplett.org, kai.huang@intel.com, kai.svahn@intel.com,
 kmoy@google.com, ludloff@google.com, luto@kernel.org,
 nhorman@redhat.com, puiterwijk@redhat.com, rientjes@google.com,
 tglx@linutronix.de, yaozhangx@google.com
Subject: Re: [PATCH v38 16/24] x86/sgx: Add a page reclaimer
Message-ID: <20200929011438.GA31167@linux.intel.com>
References: <20200915112842.897265-1-jarkko.sakkinen@linux.intel.com>
 <20200915112842.897265-17-jarkko.sakkinen@linux.intel.com>
 <20200922104538.GE22660@zn.tnic>
 <20200922140314.GA164527@linux.intel.com>
In-Reply-To: <20200922140314.GA164527@linux.intel.com>

On Tue, Sep 22, 2020 at 05:03:23PM +0300, Jarkko Sakkinen wrote:
> On Tue, Sep 22, 2020 at 12:45:38PM +0200, Borislav Petkov wrote:
> > > +	spin_lock(&sgx_active_page_list_lock);
> > > +	for (i = 0; i < SGX_NR_TO_SCAN; i++) {
> > > +		if (list_empty(&sgx_active_page_list))
> >
> > Isn't it enough to do this once, i.e., not in the loop? You're
> > holding sgx_active_page_list_lock...

Argh, I missed this until I looked at Jarkko's updated tree.  The
reason for checking list_empty() on every iteration is that the loop is
greedy, i.e. it tries to grab and reclaim up to 16 (SGX_NR_TO_SCAN) EPC
pages at a time.

> I think that would make sense. Distantly analogous to the EINIT
> discussion. Too complex code for yet to be known problem workloads I'd
> say.

Nooooo.  Please no.  This will most definitely impact workloads running
a single large enclave, or a process running multiple enclaves, as
we'll lose the batching of ETRACK and IPIs.  ETRACK isn't a big deal,
but the IPIs are painful and could thrash the system.

E.g. in the current code, reclaiming 10 pages from an enclave with
actively running threads will generate one round of IPIs to the CPUs
associated with the enclave (based on the mm struct).  Going to
per-page reclaim will likely result in 10 rounds of IPIs, as threads
will reenter the enclave immediately after each IPI wave.

Having to grab the section spinlock for every page is also (very)
problematic.  Reclaiming one page at a time makes it nigh impossible
for the reclaimer to keep up, as every allocation attempt acquires the
spinlock.  I.e. the reclaimer has to contend with every CPU that is
faulting in an EPC page or adding a page to an enclave.

In my experiments with the EPC cgroup, even batching 16 pages may be
insufficient to make "forward progress".  That's not an
apples-to-apples comparison, but it's a rough equivalent of what will
happen with the reclaim thread versus EPC page faults.  We can (and
should) play with scaling the number of reclaim threads, but at this
stage, that's an exercise best left to post-upstreaming.  I can't do
A/B testing to provide numbers, because the changes in Jarkko's tree
break basic reclaim.
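For reference, here's a rough sketch of the loop shape being defended.
Only the three quoted lines above come from the patch itself;
everything else (the chunk array, the sgx_epc_page details, the
reclaim_batch() helper) is simplified illustration, not the literal
v38 code:

	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
	struct sgx_epc_page *epc_page;
	int cnt = 0, i;

	/*
	 * Grab up to SGX_NR_TO_SCAN (16) pages under a single lock
	 * acquisition.  list_empty() has to be re-checked on every
	 * iteration because each pass below removes a page, so the
	 * list can drain mid-loop.
	 */
	spin_lock(&sgx_active_page_list_lock);
	for (i = 0; i < SGX_NR_TO_SCAN; i++) {
		if (list_empty(&sgx_active_page_list))
			break;

		epc_page = list_first_entry(&sgx_active_page_list,
					    struct sgx_epc_page, list);
		list_del_init(&epc_page->list);
		chunk[cnt++] = epc_page;
	}
	spin_unlock(&sgx_active_page_list_lock);

	/*
	 * Process the whole chunk together: block the pages, do one
	 * ETRACK + IPI round for the affected mm, then evict.  This
	 * is a stand-in for the batched steps, not a real function.
	 */
	reclaim_batch(chunk, cnt);

The per-iteration list_empty() check is what lets the loop stop early
when the list drains; the batching is what keeps the lock traffic and
the IPI waves to one round per 16 pages instead of one round per page.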
Which is a nice segue into my last point: this is by far the most
subtle code in the SGX code base; I really, really don't want to be
making last-minute changes in this area unless they are absolutely
necessary or obviously benign.

Even if/when we get basic reclaim functional again, and assuming it
doesn't have performance issues, I wouldn't be comfortable merging the
code without hammering it with the EPC cgroup tests for multiple days
(on top of the 1+ days to rebase the EPC cgroup).  Past testing with
the cgroup found multiple bugs with races between CPUs that required
hitting a window that was open for less than 10 instructions.