From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
    erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack,
    Kai Huang, Zhi Wang, chen.bo@intel.com, Yuan Yao
Subject: [PATCH v14 053/113] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
Date: Sun, 28 May 2023 21:19:35 -0700

From: Yuan Yao

The TDX module uses internal locks to protect its resources. It only
tries to acquire a lock; if that fails, it returns a TDX_OPERAND_BUSY
error instead of spinning, because its execution time is limited.

The TDX SEAMCALL API reference describes which resources each SEAMCALL
uses, so it is known which SEAMCALLs can contend on which resources. The
VMM can avoid contention inside the TDX module by serializing the
contending SEAMCALLs itself, for example with a spinlock. Because the OS
knows its own process scheduling and scalability better, a lock at the
OS/VMM layer would work better than simply retrying TDX SEAMCALLs.

The TDH.MEM.* APIs, except for TDH.MEM.TRACK, operate on a secure EPT
tree, and the TDX module internally tries to acquire the lock of that
tree. They return TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT if they fail to
get the lock. TDX KVM allows the sept callbacks to return an error so
that the TDP MMU layer can retry.

TDH.VP.ENTER is an exception because of zero-step attack mitigation.
Normally TDH.VP.ENTER uses only TD vcpu resources and doesn't cause
contention. When a zero-step attack is suspected, it obtains the secure
EPT tree lock and tracks the GPAs causing secure EPT faults. Thus
TDH.VP.ENTER may result in TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT, and
so may the TDH.MEM.* SEAMCALLs.

Retry the TDH.MEM.* APIs and TDH.VP.ENTER on this error, because the
error is a rare event caused by zero-step attack mitigation and a
spinlock cannot be used for TDH.VP.ENTER, whose execution time is
unbounded.

Signed-off-by: Yuan Yao
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx_ops.h | 42 ++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index 5d9b28b21cf0..cb23cb0fe28f 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -38,6 +38,36 @@ static inline u64 kvm_seamcall(u64 op, u64 rcx, u64 rdx, u64 r8, u64 r9,
 void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_output *out);
 #endif
 
+/*
+ * TDX module acquires its internal lock for resources. It doesn't spin to get
+ * locks because of its restrictions of allowed execution time. Instead, it
+ * returns TDX_OPERAND_BUSY with an operand id.
+ *
+ * Multiple VCPUs can operate on SEPT. Also with zero-step attack mitigation,
+ * TDH.VP.ENTER may rarely acquire SEPT lock and release it when zero-step
+ * attack is suspected. It results in TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT
+ * with TDH.MEM.* operation. Note: TDH.MEM.TRACK is an exception.
+ *
+ * Because TDP MMU uses read lock for scalability, spin lock around SEAMCALL
+ * spoils TDP MMU effort. Retry several times with the assumption that SEPT
+ * lock contention is rare. But don't loop forever to avoid lockup. Let TDP
+ * MMU retry.
+ */
+#define TDX_ERROR_SEPT_BUSY    (TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)
+
+static inline u64 kvm_seamcall_sept(u64 op, u64 rcx, u64 rdx, u64 r8, u64 r9,
+				    struct tdx_module_output *out)
+{
+#define SEAMCALL_RETRY_MAX     16
+	int retry = SEAMCALL_RETRY_MAX;
+	u64 ret;
+
+	do {
+		ret = kvm_seamcall(op, rcx, rdx, r8, r9, out);
+	} while (ret == TDX_ERROR_SEPT_BUSY && retry-- > 0);
+	return ret;
+}
+
 static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 {
 	clflush_cache_range(__va(addr), PAGE_SIZE);
@@ -48,14 +78,14 @@ static inline u64 tdh_mem_page_add(hpa_t tdr, gpa_t gpa, hpa_t hpa, hpa_t source
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(hpa), PAGE_SIZE);
-	return kvm_seamcall(TDH_MEM_PAGE_ADD, gpa, tdr, hpa, source, out);
+	return kvm_seamcall_sept(TDH_MEM_PAGE_ADD, gpa, tdr, hpa, source, out);
 }
 
 static inline u64 tdh_mem_sept_add(hpa_t tdr, gpa_t gpa, int level, hpa_t page,
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(page), PAGE_SIZE);
-	return kvm_seamcall(TDH_MEM_SEPT_ADD, gpa | level, tdr, page, 0, out);
+	return kvm_seamcall_sept(TDH_MEM_SEPT_ADD, gpa | level, tdr, page, 0, out);
 }
 
 static inline u64 tdh_mem_sept_remove(hpa_t tdr, gpa_t gpa, int level,
@@ -81,13 +111,13 @@ static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(hpa), PAGE_SIZE);
-	return kvm_seamcall(TDH_MEM_PAGE_AUG, gpa, tdr, hpa, 0, out);
+	return kvm_seamcall_sept(TDH_MEM_PAGE_AUG, gpa, tdr, hpa, 0, out);
 }
 
 static inline u64 tdh_mem_range_block(hpa_t tdr, gpa_t gpa, int level,
 				      struct tdx_module_output *out)
 {
-	return kvm_seamcall(TDH_MEM_RANGE_BLOCK, gpa | level, tdr, 0, 0, out);
+	return kvm_seamcall_sept(TDH_MEM_RANGE_BLOCK, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_mng_key_config(hpa_t tdr)
@@ -169,7 +199,7 @@ static inline u64 tdh_phymem_page_reclaim(hpa_t page,
 static inline u64 tdh_mem_page_remove(hpa_t tdr, gpa_t gpa, int level,
 				      struct tdx_module_output *out)
 {
-	return kvm_seamcall(TDH_MEM_PAGE_REMOVE, gpa | level, tdr, 0, 0, out);
+	return kvm_seamcall_sept(TDH_MEM_PAGE_REMOVE, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_sys_lp_shutdown(void)
@@ -185,7 +215,7 @@ static inline u64 tdh_mem_track(hpa_t tdr)
 static inline u64 tdh_mem_range_unblock(hpa_t tdr, gpa_t gpa, int level,
 					struct tdx_module_output *out)
 {
-	return kvm_seamcall(TDH_MEM_RANGE_UNBLOCK, gpa | level, tdr, 0, 0, out);
+	return kvm_seamcall_sept(TDH_MEM_RANGE_UNBLOCK, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_phymem_cache_wb(bool resume)
-- 
2.25.1
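
Note on the retry bound: the loop in kvm_seamcall_sept() can be exercised
outside the kernel with a small user-space sketch. Everything below is
hypothetical and for illustration only: demo_seamcall() is a mock that
reports "busy" a configurable number of times, and the DEMO_* status values
are placeholders, not the real TDX encodings. Only the do/while condition
mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder status codes (assumptions, not the real TDX values). */
#define DEMO_SUCCESS		0ULL
#define DEMO_ERROR_SEPT_BUSY	1ULL
#define DEMO_RETRY_MAX		16

static int demo_busy_left;	/* remaining calls that will report busy */
static int demo_calls;		/* number of mock SEAMCALLs issued so far */

/* Mock SEAMCALL: reports busy until demo_busy_left reaches zero. */
static uint64_t demo_seamcall(void)
{
	demo_calls++;
	if (demo_busy_left > 0) {
		demo_busy_left--;
		return DEMO_ERROR_SEPT_BUSY;
	}
	return DEMO_SUCCESS;
}

/* Same loop shape as the patch: one initial call plus bounded retries. */
static uint64_t demo_seamcall_retry(void)
{
	int retry = DEMO_RETRY_MAX;
	uint64_t ret;

	do {
		ret = demo_seamcall();
	} while (ret == DEMO_ERROR_SEPT_BUSY && retry-- > 0);
	return ret;
}

/* Reset the mock, run one retried call, and report how many calls it took. */
static uint64_t demo_run(int busy_count, int *calls_out)
{
	uint64_t ret;

	assert(calls_out != NULL);
	demo_busy_left = busy_count;
	demo_calls = 0;
	ret = demo_seamcall_retry();
	*calls_out = demo_calls;
	return ret;
}
```

With this loop shape the callee is invoked at most DEMO_RETRY_MAX + 1 times
(the first attempt plus 16 retries). If the lock is still busy after that,
the busy status is returned to the caller, which matches the "Let TDP MMU
retry" intent stated in the comment of the patch.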