Date: Thu, 4 Nov 2021 19:54:43 +0100
From: Greg KH
To: Reinette Chatre
Cc: dave.hansen@linux.intel.com, jarkko@kernel.org, tglx@linutronix.de,
    bp@alien8.de, mingo@redhat.com, linux-sgx@vger.kernel.org,
    x86@kernel.org, seanjc@google.com, tony.luck@intel.com, hpa@zytor.com,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] x86/sgx: Fix free page accounting
In-Reply-To: <373992d869cd356ce9e9afe43ef4934b70d604fd.1636049678.git.reinette.chatre@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 04, 2021 at 11:28:54AM -0700, Reinette Chatre wrote:
> The SGX driver maintains a single global free page counter,
> sgx_nr_free_pages, that reflects the number of free pages available
> across all NUMA nodes. Correspondingly, a list of free pages is
> associated with each NUMA node and sgx_nr_free_pages is updated
> every time a page is added or removed from any of the free page
> lists. The main usage of sgx_nr_free_pages is by the reclaimer
> that will run when the total free pages go below a watermark to
> ensure that there are always some free pages available to, for
> example, support efficient page faults.
>
> With sgx_nr_free_pages accessed and modified from a few places
> it is essential to ensure that these accesses are done safely but
> this is not the case. sgx_nr_free_pages is sometimes accessed
> without any protection and when it is protected it is done
> inconsistently with any one of the spin locks associated with the
> individual NUMA nodes.
>
> The consequence of sgx_nr_free_pages not being protected is that
> its value may not accurately reflect the actual number of free
> pages on the system, impacting the availability of free pages in
> support of many flows. The problematic scenario is when the
> reclaimer never runs because it believes there to be sufficient
> free pages while any attempt to allocate a page fails because there
> are no free pages available. The worst scenario observed was a
> user space hang because of repeated page faults caused by
> no free pages ever made available.
>
> Change the global free page counter to an atomic type that
> ensures simultaneous updates are done safely. While doing so, move
> the updating of the variable outside of the spin lock critical
> section to which it does not belong.
>
> Cc: stable@vger.kernel.org
> Fixes: 901ddbb9ecf5 ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()")
> Suggested-by: Dave Hansen
> Signed-off-by: Reinette Chatre
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 63d3de02bbcc..8558d7d5f3e7 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -28,8 +28,7 @@ static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq);
>  static LIST_HEAD(sgx_active_page_list);
>  static DEFINE_SPINLOCK(sgx_reclaimer_lock);
>
> -/* The free page list lock protected variables prepend the lock. */
> -static unsigned long sgx_nr_free_pages;
> +atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
>
>  /* Nodes with one or more EPC sections. */
>  static nodemask_t sgx_numa_mask;
> @@ -403,14 +402,15 @@ static void sgx_reclaim_pages(void)
>
>  		spin_lock(&node->lock);
>  		list_add_tail(&epc_page->list, &node->free_page_list);
> -		sgx_nr_free_pages++;
>  		spin_unlock(&node->lock);
> +		atomic_long_inc(&sgx_nr_free_pages);
>  	}
>  }
>
>  static bool sgx_should_reclaim(unsigned long watermark)
>  {
> -	return sgx_nr_free_pages < watermark && !list_empty(&sgx_active_page_list);
> +	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
> +	       !list_empty(&sgx_active_page_list);

What prevents the value from changing right after you test this? Why is
an atomic value somehow solving the problem?

The value changes were happening safely, it was just the reading of the
value that was not. You have not changed the fact that the value can
change right after reading given that there was not going to be a
problem with reading a stale value before.

In other words, what did you really fix here? And how did you test it
to verify it did fix things?

thanks,

greg k-h
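
For readers following the thread, here is a minimal userspace sketch of the update pattern the quoted commit message describes. It is not kernel code and not the author's test. It assumes, as the commit message states, that the counter is incremented by CPUs holding different per-node locks, so those increments are not serialized against each other. The names plain_free_pages, atomic_free_pages and both replay_* helpers are hypothetical, and C11 <stdatomic.h> stands in for the kernel's atomic_long_t API. The interleaving is replayed by hand in a single thread so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the pre-patch counter: a plain long. */
static long plain_free_pages;

/* Hypothetical stand-in for the post-patch counter (C11 atomic). */
static atomic_long atomic_free_pages;

/*
 * Replay one unlucky interleaving of two CPUs that each execute
 * "counter++" while holding *different* node locks. Because the locks
 * differ, nothing orders the two load/add/store sequences, so both
 * CPUs may load the same old value.
 */
long replay_plain_increment_race(long start)
{
	long cpu0_tmp, cpu1_tmp;

	assert(start >= 0);               /* a page count is never negative */
	plain_free_pages = start;

	cpu0_tmp = plain_free_pages;      /* CPU 0 loads the counter */
	cpu1_tmp = plain_free_pages;      /* CPU 1 loads the same value */
	plain_free_pages = cpu0_tmp + 1;  /* CPU 0 stores start + 1 */
	plain_free_pages = cpu1_tmp + 1;  /* CPU 1 overwrites with start + 1 */

	return plain_free_pages;          /* one increment has been lost */
}

/*
 * The same two increments as indivisible read-modify-write operations.
 * Neither can observe a half-finished update from the other, whatever
 * locks the callers hold.
 */
long replay_atomic_increments(long start)
{
	assert(start >= 0);
	atomic_store(&atomic_free_pages, start);

	atomic_fetch_add(&atomic_free_pages, 1);  /* "CPU 0" */
	atomic_fetch_add(&atomic_free_pages, 1);  /* "CPU 1" */

	return atomic_load(&atomic_free_pages);
}
```

Starting from 10, the first helper ends at 11 and the second at 12. The sketch covers only the update side. A value obtained with atomic_load(), like one obtained with atomic_long_read() in the patch, is still a snapshot that can change immediately afterwards, which is the point raised in the reply above.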