From: Dmitrii Kuvaiskii
To: dave.hansen@linux.intel.com, jarkko@kernel.org, kai.huang@intel.com,
    haitao.huang@linux.intel.com, reinette.chatre@intel.com,
    linux-sgx@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mona.vij@intel.com, kailun.qin@intel.com, stable@vger.kernel.org,
    Marcelina Kościelnicka
Subject: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
Date: Wed, 15 May 2024 06:12:39 -0700
Message-Id: <20240515131240.1304824-2-dmitrii.kuvaiskii@intel.com>
In-Reply-To: <20240515131240.1304824-1-dmitrii.kuvaiskii@intel.com>
References: <20240515131240.1304824-1-dmitrii.kuvaiskii@intel.com>

Two enclave threads may try to access the same non-present enclave page
simultaneously (e.g., if the SGX runtime supports lazy allocation). The
threads will end up in sgx_encl_eaug_page(), racing to acquire the
enclave lock.
The winning thread will perform EAUG, set up the page table entry, and
insert the page into encl->page_array. The losing thread will then get
-EBUSY on xa_insert(&encl->page_array) and proceed to the error handling
path.

This race condition can be illustrated as follows:

/*                             /*
 * Fault on CPU1                * Fault on CPU2
 * on enclave page X            * on enclave page X
 */                             */
sgx_vma_fault() {              sgx_vma_fault() {

  xa_load(&encl->page_array)     xa_load(&encl->page_array)
      == NULL -->                    == NULL -->

  sgx_encl_eaug_page() {         sgx_encl_eaug_page() {

    ...                            ...

    /*                             /*
     * alloc encl_page              * alloc encl_page
     */                             */
                                   mutex_lock(&encl->lock);
                                   /*
                                    * alloc EPC page
                                    */
                                   epc_page = sgx_alloc_epc_page(...);
                                   /*
                                    * add page to enclave's xarray
                                    */
                                   xa_insert(&encl->page_array, ...);
                                   /*
                                    * add page to enclave via EAUG
                                    * (page is in pending state)
                                    */
                                   /*
                                    * add PTE entry
                                    */
                                   vmf_insert_pfn(...);

                                   mutex_unlock(&encl->lock);
                                   return VM_FAULT_NOPAGE;
                                 }
                               }
                               /*
                                * All good up to here: enclave page
                                * successfully added to enclave,
                                * ready for EACCEPT from user space
                                */
    mutex_lock(&encl->lock);
    /*
     * alloc EPC page
     */
    epc_page = sgx_alloc_epc_page(...);
    /*
     * add page to enclave's xarray,
     * this fails with -EBUSY as this
     * page was already added by CPU2
     */
    xa_insert(&encl->page_array, ...);

  err_out_shrink:
    sgx_encl_free_epc_page(epc_page) {
      /*
       * remove page via EREMOVE
       *
       * *BUG*: page added by CPU2 is
       * yanked from enclave while it
       * remains accessible from OS
       * perspective (PTE installed)
       */
      /*
       * free EPC page
       */
      sgx_free_epc_page(epc_page);
    }

    mutex_unlock(&encl->lock);
    /*
     * *BUG*: SIGBUS is returned
     * for a valid enclave page
     */
    return VM_FAULT_SIGBUS;
  }
}

The err_out_shrink error handling path contains two bugs: (1) it calls
sgx_encl_free_epc_page(), which performs EREMOVE even though the enclave
page was never intended to be removed, and (2) SIGBUS is sent to user
space even though the enclave page is correctly installed by another
thread.
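
For illustration only, the outcome for the losing thread can be replayed
in a small userspace model. This is a sketch, not kernel code: sim_encl,
sim_xa_insert() and sim_eaug_fault() are hypothetical names, the enclave
lock is modelled by simply running the two faults one after the other,
and the only behaviour borrowed from the kernel is that xa_insert()
returns -EBUSY for an index that is already occupied.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum sim_fault { SIM_FAULT_NOPAGE, SIM_FAULT_SIGBUS };

#define SIM_NPAGES 4

struct sim_encl {
	int present[SIM_NPAGES];	/* stands in for encl->page_array */
};

/* Mimics xa_insert(): refuses to overwrite an occupied slot. */
static int sim_xa_insert(struct sim_encl *encl, size_t idx)
{
	assert(idx < SIM_NPAGES);
	if (encl->present[idx])
		return -EBUSY;
	encl->present[idx] = 1;
	return 0;
}

/*
 * One fault on page idx, modelling the pre-patch behaviour: any
 * non-zero result from the insert is reported as SIGBUS, including
 * -EBUSY, which only means that another thread got there first.
 */
static enum sim_fault sim_eaug_fault(struct sim_encl *encl, size_t idx)
{
	if (sim_xa_insert(encl, idx))
		return SIM_FAULT_SIGBUS;
	return SIM_FAULT_NOPAGE;
}
```

Replaying two faults on the same index against a zero-initialized
sim_encl, the first call returns SIM_FAULT_NOPAGE and the second returns
SIM_FAULT_SIGBUS, although the page is present in the model at that
point.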

The first bug renders the enclave page perpetually inaccessible (until
another SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl). This is because the page is
marked accessible in the PTE but is not EAUGed, so any subsequent access
to this page raises a fault. Because the kernel believes there is a
valid VMA, the unlikely error code X86_PF_SGX encountered in the code
path do_user_addr_fault() -> access_error() causes the SGX driver's
sgx_vma_fault() to be skipped, and user space receives a SIGSEGV
instead. The user space SIGSEGV handler cannot perform EACCEPT because
the page was not EAUGed. Thus, user space is stuck with the inaccessible
page.

The second bug is less severe: a spurious SIGBUS signal is sent to user
space.

Fix these two bugs (1) by replacing sgx_encl_free_epc_page() with
sgx_free_epc_page() so that no EREMOVE is performed, and (2) by
returning VM_FAULT_NOPAGE to the generic Linux fault handler so that no
signal is sent to user space.

Note that sgx_encl_free_epc_page() performs an additional WARN_ON_ONCE
check in comparison to sgx_free_epc_page(): whether the EPC page is
tracked by the reclaimer. However, the EPC page is allocated in
sgx_encl_eaug_page() and has zeroed-out flags in all error handling
paths. In other words, the page is marked as reclaimable only in the
happy path of sgx_encl_eaug_page(). Therefore, in the particular code
path affected by this commit, the "page reclaimer tracked" condition is
always false and the warning is never printed. Thus, it is safe to
replace sgx_encl_free_epc_page() with sgx_free_epc_page().
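
As a rough userspace sketch of the patched error path (an illustration
under stated assumptions, not the driver code): every model_* name below
is hypothetical, MODEL_PAGE_RECLAIMER_TRACKED merely stands in for the
reclaimer-tracked page flag, and EREMOVE and WARN_ON_ONCE are reduced to
counters so that the difference between the two free helpers is visible.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are not kernel symbols. */
#define MODEL_PAGE_RECLAIMER_TRACKED 0x1u

struct model_epc_page {
	unsigned int flags;
	bool free;		/* true once returned to the allocator */
};

static int model_eremove_calls;	/* counts modelled EREMOVE invocations */
static int model_warnings;	/* counts modelled WARN_ON_ONCE hits */

/* Models sgx_free_epc_page(): only hands the page back. */
static void model_free_epc_page(struct model_epc_page *page)
{
	page->free = true;
}

/* Models sgx_encl_free_epc_page(): flag check, EREMOVE, then free. */
static void model_encl_free_epc_page(struct model_epc_page *page)
{
	if (page->flags & MODEL_PAGE_RECLAIMER_TRACKED)
		model_warnings++;
	model_eremove_calls++;
	model_free_epc_page(page);
}

enum model_vmret { MODEL_VM_FAULT_NOPAGE, MODEL_VM_FAULT_SIGBUS };

/*
 * Patched error path. ret is the (non-zero) result of the modelled
 * insert; -EBUSY means another thread already installed the page.
 */
static enum model_vmret model_error_path(int ret, struct model_epc_page *page)
{
	enum model_vmret vmret = MODEL_VM_FAULT_SIGBUS;

	if (ret == -EBUSY)
		vmret = MODEL_VM_FAULT_NOPAGE;
	model_free_epc_page(page);	/* no EREMOVE on this path */
	return vmret;
}
```

With a page whose flags are zero, model_error_path(-EBUSY, ...) yields
MODEL_VM_FAULT_NOPAGE and leaves both counters untouched, whereas
model_encl_free_epc_page() on the same kind of page increments only the
EREMOVE counter.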

Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Cc: stable@vger.kernel.org
Reported-by: Marcelina Kościelnicka
Suggested-by: Reinette Chatre
Signed-off-by: Dmitrii Kuvaiskii
---
 arch/x86/kernel/cpu/sgx/encl.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 279148e72459..41f14b1a3025 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -382,8 +382,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	 * If ret == -EBUSY then page was created in another flow while
 	 * running without encl->lock
 	 */
-	if (ret)
+	if (ret) {
+		if (ret == -EBUSY)
+			vmret = VM_FAULT_NOPAGE;
 		goto err_out_shrink;
+	}
 
 	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
 	pginfo.addr = encl_page->desc & PAGE_MASK;
@@ -419,7 +422,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 err_out_shrink:
 	sgx_encl_shrink(encl, va_page);
 err_out_epc:
-	sgx_encl_free_epc_page(epc_page);
+	sgx_free_epc_page(epc_page);
 err_out_unlock:
 	mutex_unlock(&encl->lock);
 	kfree(encl_page);
-- 
2.34.1