From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 078/108] KVM: TDX: Add support for find pending IRQ in a protected local APIC
Date: Sat, 29 Oct 2022 23:23:19 -0700
Message-Id: <46c55abae1c12364c9159ca2ad41c342518fc0f9.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

Add flag and hook to KVM's local APIC management to support determining
whether or not a TDX guest has a pending IRQ.  For TDX vCPUs, the virtual
APIC page is owned by the TDX module and cannot be accessed by KVM.  As a
result, registers that are virtualized by the CPU, e.g. PPR, cannot be
read or written by KVM.  To deliver interrupts for TDX guests, KVM must
send an IRQ to the CPU on the posted interrupt notification vector.  And
to determine if a TDX vCPU has a pending interrupt, KVM must check if
there is an outstanding notification.

Return "no interrupt" in kvm_apic_has_interrupt() if the guest APIC is
protected to short-circuit the various other flows that try to pull an
IRQ out of the vAPIC.  The only valid operation is querying _if_ an IRQ is
pending; KVM can't do anything based on _which_ IRQ is pending.

Intentionally omit sanity checks from other flows, e.g. PPR update, so as
not to degrade non-TDX guests with unnecessary checks.  A well-behaved KVM
and userspace will never reach those flows for TDX guests, but reaching
them is not fatal if something does go awry.

Note, this doesn't handle interrupts that have been delivered to the vCPU
but not yet recognized by the core, i.e. interrupts that are sitting in
vmcs.GUEST_INTR_STATUS.  Querying that state requires a SEAMCALL and will
be supported in a future patch.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/irq.c                 |  3 +++
 arch/x86/kvm/lapic.c               |  3 +++
 arch/x86/kvm/lapic.h               |  2 ++
 arch/x86/kvm/vmx/main.c            | 11 +++++++++++
 arch/x86/kvm/vmx/tdx.c             |  6 ++++++
 arch/x86/kvm/vmx/x86_ops.h         |  2 ++
 8 files changed, 29 insertions(+)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 1b01dc2098b0..17c3828d42a3 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -116,6 +116,7 @@ KVM_X86_OP_OPTIONAL(pi_update_irte)
 KVM_X86_OP_OPTIONAL(pi_start_assignment)
 KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
 KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
+KVM_X86_OP_OPTIONAL(protected_apic_has_interrupt)
 KVM_X86_OP_OPTIONAL(set_hv_timer)
 KVM_X86_OP_OPTIONAL(cancel_hv_timer)
 KVM_X86_OP(setup_mce)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 082e94f78c66..70549018987d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1671,6 +1671,7 @@ struct kvm_x86_ops {
 	void (*pi_start_assignment)(struct kvm *kvm);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
+	bool (*protected_apic_has_interrupt)(struct kvm_vcpu *vcpu);
 
 	int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
 			    bool *expired);
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index f371f1292ca3..56e52eef0269 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -100,6 +100,9 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
 	if (kvm_cpu_has_extint(v))
 		return 1;
 
+	if (lapic_in_kernel(v) && v->arch.apic->guest_apic_protected)
+		return static_call(kvm_x86_protected_apic_has_interrupt)(v);
+
 	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
 }
 EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index d7639d126e6c..bcf339d02c0a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2624,6 +2624,9 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
 	if (!kvm_apic_present(vcpu))
 		return -1;
 
+	if (apic->guest_apic_protected)
+		return -1;
+
 	__apic_update_ppr(apic, &ppr);
 	return apic_has_interrupt_for_ppr(apic, ppr);
 }
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index a5ac4a5a5179..44a9b5131323 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -66,6 +66,8 @@ struct kvm_lapic {
 	bool sw_enabled;
 	bool irr_pending;
 	bool lvt0_in_nmi_mode;
+	/* Select registers in the vAPIC cannot be read/written. */
+	bool guest_apic_protected;
 	/* Number of bits set in ISR. */
 	s16 isr_count;
 	/* The highest vector set in ISR; if -1 - invalid, must scan ISR. */
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 6d46ae9c5dce..1dfffc6c1533 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -46,6 +46,9 @@ static __init int vt_hardware_setup(void)
 
 	enable_tdx = enable_tdx && !tdx_hardware_setup(&vt_x86_ops);
 
+	if (!enable_tdx)
+		vt_x86_ops.protected_apic_has_interrupt = NULL;
+
 	if (enable_ept)
 		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
 				      cpu_has_vmx_ept_execute_only());
@@ -168,6 +171,13 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	return vmx_vcpu_load(vcpu, cpu);
 }
 
+static bool vt_protected_apic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	KVM_BUG_ON(!is_td_vcpu(vcpu), vcpu->kvm);
+
+	return tdx_protected_apic_has_interrupt(vcpu);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -339,6 +349,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.sync_pir_to_irr = vmx_sync_pir_to_irr,
 	.deliver_interrupt = vmx_deliver_interrupt,
 	.dy_apicv_has_pending_interrupt = pi_has_pending_interrupt,
+	.protected_apic_has_interrupt = vt_protected_apic_has_interrupt,
 
 	.set_tss_addr = vmx_set_tss_addr,
 	.set_identity_map_addr = vmx_set_identity_map_addr,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 57767ef3353b..19a9263e5788 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -426,6 +426,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 		return -EINVAL;
 
 	fpstate_set_confidential(&vcpu->arch.guest_fpu);
+	vcpu->arch.apic->guest_apic_protected = true;
 
 	vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX;
 
@@ -464,6 +465,11 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	local_irq_enable();
 }
 
+bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	return pi_has_pending_interrupt(vcpu);
+}
+
 void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index d4fcb6b29ffe..6bdd956b44c2 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -153,6 +153,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu);
 void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
 void tdx_vcpu_put(struct kvm_vcpu *vcpu);
 void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
+bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -181,6 +182,7 @@ static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTP
 static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {}
+static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { return false; }
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
-- 
2.25.1
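
For readers following the control flow outside the kernel tree, the sketch below is a rough userspace model of the query path this patch sets up. It is not kernel code: every `model_*` name and struct is hypothetical, the posted-interrupt descriptor is reduced to a PIR bitmap plus ON/SN flags, `highest_irr` stands in for whatever KVM would normally read from the vAPIC, and the other branches of the real kvm_cpu_has_interrupt() are left out. The `assert()` plays the role of the KVM_BUG_ON() in vt_protected_apic_has_interrupt().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified posted-interrupt descriptor (not struct pi_desc). */
struct model_pi_desc {
	uint32_t pir[8];	/* 256 posted-interrupt request bits, one per vector */
	bool on;		/* outstanding notification */
	bool sn;		/* suppress notification */
};

/* Hypothetical stand-in for the few vCPU fields the query path looks at. */
struct model_vcpu {
	bool lapic_in_kernel;
	bool guest_apic_protected;	/* set at vCPU creation for TDX */
	bool extint_pending;
	int highest_irr;		/* readable vAPIC state, or -1 if none */
	struct model_pi_desc pi;
};

static bool model_pir_empty(const struct model_pi_desc *pi)
{
	for (size_t i = 0; i < 8; i++)
		if (pi->pir[i])
			return false;
	return true;
}

/* Assumed to mirror the shape of pi_has_pending_interrupt(). */
static bool model_pi_has_pending_interrupt(const struct model_vcpu *v)
{
	return v->pi.on || (v->pi.sn && !model_pir_empty(&v->pi));
}

/* Which vector is pending; always "none" (-1) for a protected APIC. */
static int model_apic_has_interrupt(const struct model_vcpu *v)
{
	if (!v->lapic_in_kernel)
		return -1;
	if (v->guest_apic_protected)
		return -1;
	return v->highest_irr;
}

static bool model_protected_apic_has_interrupt(const struct model_vcpu *v)
{
	assert(v->guest_apic_protected);	/* stand-in for KVM_BUG_ON() */
	return model_pi_has_pending_interrupt(v);
}

/* Whether anything is pending; protected APICs take the hook instead. */
static bool model_cpu_has_interrupt(const struct model_vcpu *v)
{
	if (v->extint_pending)
		return true;
	if (v->lapic_in_kernel && v->guest_apic_protected)
		return model_protected_apic_has_interrupt(v);
	return model_apic_has_interrupt(v) != -1;
}
```

In this model a protected vCPU never reports a vector through model_apic_has_interrupt(), even when `highest_irr` is set, yet model_cpu_has_interrupt() still reports "pending" once the ON flag is raised, which is the "_if_ but not _which_" distinction the changelog draws.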