Received: by 2002:ab2:69cc:0:b0:1f4:be93:e15a with SMTP id n12csp253316lqp; Fri, 12 Apr 2024 17:20:43 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXNWX1qNtGmYpjHHsKFqwrtMz4iAT1FSkjcEZGdcl7naPSEsr2Q+PEAh6m8yNCEP6FxLORj4+fInai3ah4ibDM4dqfoTFheTIQcHXEO+g== X-Google-Smtp-Source: AGHT+IEb4sWWRMv1Y44wJ2FWKtbxmgSPRRNyXZ152wBD3WpJhd2T9FIYO6h2WvpShNHSfi/jNQEK X-Received: by 2002:a19:6919:0:b0:515:b9ee:e814 with SMTP id e25-20020a196919000000b00515b9eee814mr2788632lfc.24.1712967643208; Fri, 12 Apr 2024 17:20:43 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1712967643; cv=pass; d=google.com; s=arc-20160816; b=YqTHK3Q4XqfzygaQhXsGSB7cv/MszUgmVo3TW4pdGjvPriKP/owzY+03k6v3RIAhJj IIczc6cZ31I4MhgDdXLbpL81mY79k6Ja1OGkFordMLwbatdfTt3kf758aZM1gVuAcWxl r/Q58DaucjtQ5JEWnr0h+DEc/+SkA+mb45I3JdRmcIDYj0BsXkM0Tj9STzwInFcJYI3Q xe+oc628v6KnmpO1uz12W+rEngH1XPpwhNOaJQ/MiCPW33AqNT4subtszcBYLq9dFAzs Dm3Y6p7JS3WBBvn52LvYGk26hzau4SZduPnAQSVehl99W+xTjMXPTcI7HAw0qRbTd7Cw LEog== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date:dkim-signature; bh=xSBgmc6vEgQMhEAnathgeBiaq5OFz+aFNturIQTW0FY=; fh=FDpQc9WVsHfBnljdlTS05wVEO5IoYZwZpNdYlrf8N/M=; b=LWjV0HspRn4gfeF9kvFObGRh8/rVf9Dm/J9niaq+nTDCStiKZdsz8aNYjbiA50g3Dc qF6ubx5GdtEOAkRYw3eYvOMe5nbUHoeV50hHL55thOoLYtdb3nul3GwbFqiTfBm4qUSu BSy964HltjnbhYR4lLBkZuxrAgpSYZVv+tiONSXRpIvnL/Fi4hjFaz89fjvqjWTDgsuG MWDw/iKzk5S7gf1M2w3mrGUWf78hQ17M7tUckOm+3xuHKe6mkNMezGyZ/kkMhq6KM3Gd Sn+lUiZfmHuyV//ubTxn7DHaVsyH0tJN+40seb8oL+GKIGx3mJcxGppNaI7Dmn3Ul8MB hqVQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lKTorjlP; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-143438-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-143438-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id v24-20020a509558000000b0056c3783c709si2205424eda.292.2024.04.12.17.20.43 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 12 Apr 2024 17:20:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-143438-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lKTorjlP; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-143438-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-143438-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id BEAD51F22A0E for ; Sat, 13 Apr 2024 00:20:42 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5235B17FE; Sat, 13 Apr 2024 00:20:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="lKTorjlP" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C0D5173; Sat, 13 Apr 2024 00:20:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.10 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712967630; cv=none; b=eo/czndv0kzintZmfwLrPtxYvg5xor4o8TehY1fwaZvlrzddwBok/hVA6rlmd7rELBoufJbjyX4VE9uQwBPWVXCDEQLXtGXjiAilbbmRLQRxruKZVSEFdvo20wRVDjANOKLLHGiBTNEahe3BiX/M8VVFSSAhRV9EyLjRk/X5Np0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712967630; c=relaxed/simple; bh=18zdulWfNXqxnZk+DxpyaW+RxzbmWkSojubS4Cu+eC0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=dTj4nT95QWxmOi/9Is9jziYRSwWodbpU1JVkK+A7b60z/urNZMOW86xHlmlUfRkj/aix5dw2PXKcc/+G6SaM3BRht+j4qB1Nu8vr7dD+xz/BUOUPj+HJVr8GzYt1+RWeAETRb5NUkCbaso9WvQhqirr3cUqBVvCOXvzZhDE54lA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=lKTorjlP; arc=none smtp.client-ip=192.198.163.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1712967629; x=1744503629; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=18zdulWfNXqxnZk+DxpyaW+RxzbmWkSojubS4Cu+eC0=; b=lKTorjlPauTec4qIk6t9qSqfODtyJA/Czb3ieyZGGvkqe+K+TxtBmy5Q zDoQGQ8CwG9TyIXjq/AaWH6zYNg48zKdrFyjZG5QjshzEjHCHj7kPsS3d 9MRTGxZsCQkecwhTDxVBdcOST9MH8fY6PY6sMsz8C8XcALTNP397AvZCx k20clX0RUWOUisW7Ms/24++nffvFFnEhM9z/4xdvWVxOIsWT4pMwivhtt RlPK3/lGQJC+pEWTqvMtebrry/XifXWYR37KKYr0yx2pxOd/NVI/ahvqy B03AQ8mD3aVURrTHsQCCUsozheynnfOw1v4Zmp5i1LVHxFam6FwA7PkFl Q==; X-CSE-ConnectionGUID: UFh8Iu6/Q8GdTDQshDOt/Q== X-CSE-MsgGUID: xVGpPWb2RIO/bXI1tb2bXg== X-IronPort-AV: E=McAfee;i="6600,9927,11042"; a="19835101" X-IronPort-AV: E=Sophos;i="6.07,197,1708416000"; d="scan'208";a="19835101" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2024 17:20:28 -0700 X-CSE-ConnectionGUID: 8njdxwpMSz28Atsn66Oz5Q== X-CSE-MsgGUID: +LQ08F12RPmjdxp9sE/6cQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,197,1708416000"; d="scan'208";a="26071131" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by ORVIESA003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2024 17:20:28 -0700 Date: Fri, 12 Apr 2024 17:20:26 -0700 From: Isaku Yamahata To: Binbin Wu Cc: isaku.yamahata@intel.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li , Sean Christopherson , Chao Gao , isaku.yamahata@linux.intel.com Subject: Re: [PATCH v19 088/130] KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched behavior Message-ID: <20240413002026.GP3039520@ls.amr.corp.intel.com> References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: On Sun, Apr 07, 2024 at 06:52:44PM +0800, Binbin Wu wrote: > > > On 2/26/2024 4:26 PM, isaku.yamahata@intel.com wrote: > > From: Isaku Yamahata > > > > Add a flag, KVM_DEBUGREG_AUTO_SWITCHED_GUEST, to skip saving/restoring DRs > > irrespective of any other flags. > > Here "irrespective of any other flags" sounds like other flags will be > ignored if KVM_DEBUGREG_AUTO_SWITCHED_GUEST is set. > But the code below doesn't align with it. Sure, let's update the commit message. > > TDX-SEAM unconditionally saves and > > restores guest DRs and reset to architectural INIT state on TD exit. > > So, KVM needs to save host DRs before TD enter without restoring guest DRs > > and restore host DRs after TD exit. > > > > Opportunistically convert the KVM_DEBUGREG_* definitions to use BIT(). > > > > Reported-by: Xiaoyao Li > > Signed-off-by: Sean Christopherson > > Co-developed-by: Chao Gao > > Signed-off-by: Chao Gao > > Signed-off-by: Isaku Yamahata > > --- > > arch/x86/include/asm/kvm_host.h | 10 ++++++++-- > > arch/x86/kvm/vmx/tdx.c | 1 + > > arch/x86/kvm/x86.c | 11 ++++++++--- > > 3 files changed, 17 insertions(+), 5 deletions(-) > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index 3ab85c3d86ee..a9df898c6fbd 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -610,8 +610,14 @@ struct kvm_pmu { > > struct kvm_pmu_ops; > > enum { > > - KVM_DEBUGREG_BP_ENABLED = 1, > > - KVM_DEBUGREG_WONT_EXIT = 2, > > + KVM_DEBUGREG_BP_ENABLED = BIT(0), > > + KVM_DEBUGREG_WONT_EXIT = BIT(1), > > + /* > > + * Guest debug registers (DR0-3 and DR6) are saved/restored by hardware > > + * on exit from or enter to guest. KVM needn't switch them. Because DR7 > > + * is cleared on exit from guest, DR7 need to be saved/restored. > > + */ > > + KVM_DEBUGREG_AUTO_SWITCH = BIT(2), > > }; > > struct kvm_mtrr_range { > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c > > index 7aa9188f384d..ab7403a19c5d 100644 > > --- a/arch/x86/kvm/vmx/tdx.c > > +++ b/arch/x86/kvm/vmx/tdx.c > > @@ -586,6 +586,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) > > vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; > > + vcpu->arch.switch_db_regs = KVM_DEBUGREG_AUTO_SWITCH; > > vcpu->arch.cr0_guest_owned_bits = -1ul; > > vcpu->arch.cr4_guest_owned_bits = -1ul; > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 1b189e86a1f1..fb7597c22f31 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -11013,7 +11013,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) > > if (vcpu->arch.guest_fpu.xfd_err) > > wrmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); > > - if (unlikely(vcpu->arch.switch_db_regs)) { > > + if (unlikely(vcpu->arch.switch_db_regs & ~KVM_DEBUGREG_AUTO_SWITCH)) { > > set_debugreg(0, 7); > > set_debugreg(vcpu->arch.eff_db[0], 0); > > set_debugreg(vcpu->arch.eff_db[1], 1); > > @@ -11059,6 +11059,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) > > */ > > if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) { > > WARN_ON(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP); > > + WARN_ON(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH); > > static_call(kvm_x86_sync_dirty_debug_regs)(vcpu); > > kvm_update_dr0123(vcpu); > > kvm_update_dr7(vcpu); > > @@ -11071,8 +11072,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) > > * care about the messed up debug address registers. But if > > * we have some of them active, restore the old state. > > */ > > - if (hw_breakpoint_active()) > > - hw_breakpoint_restore(); > > + if (hw_breakpoint_active()) { > > + if (!(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH)) > > + hw_breakpoint_restore(); > > + else > > + set_debugreg(__this_cpu_read(cpu_dr7), 7); > > According to TDX module 1.5 ABI spec: > DR0-3, DR6 and DR7 are set to their architectural INIT value, why is only > DR7 restored? This hunk should be dropped. Thank you for finding this. I checked the base SPEC, the ABI spec, and the TDX module code. It seems the documentation bug of the TDX module 1.5 base architecture specification. The TDX module code: - restores guest DR on TD Entry to guest. - saves guest DR on TD Exit from guest TD - initializes DR on TD Exit to host VMM TDX module 1.5 base architecture specification: 15.1.2.1 Context Switch By design, the Intel TDX module context-switches all debug/tracing state that the guest TD is allowed to use. DR0-3, DR6 and IA32_DS_AREA MSR are context-switched in TDH.VP.ENTER and TD exit flows RFLAGS, IA32_DEBUGCTL MSR and DR7 are saved and cleared on VM exits from the guest TD and restored on VM entry to the guest TD. TDX module 1.5 ABI specification: 5.3.65. TDH.VP.ENTER Leaf CPU State Preservation Following a Successful TD Entry and a TD Exit Following a successful TD entry and a TD exit, some CPU state is modified: Registers DR0, DR1, DR2, DR3, DR6 and DR7 are set to their architectural INIT value. -- Isaku Yamahata