Received: by 2002:a05:6358:a55:b0:ec:fcf4:3ecf with SMTP id 21csp535845rwb; Thu, 12 Jan 2023 09:04:12 -0800 (PST) X-Google-Smtp-Source: AMrXdXv6eIBh6anpokNuSZIsrnRAUesUvZEaXVIQjJBGQ4b9+Jzm/UFq+cgKriv0Ho7HHW7UQDuW X-Received: by 2002:a17:906:99c7:b0:7b5:860d:7039 with SMTP id s7-20020a17090699c700b007b5860d7039mr68232291ejn.10.1673543052570; Thu, 12 Jan 2023 09:04:12 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673543052; cv=none; d=google.com; s=arc-20160816; b=FBU5C0zoGe3JmChUbu0rdKRuxBQ17Vk0MLzk6iTrAysvOrWSavYCtKdRe2GFJhxkRf 2bffaFIbJwPsvCYPTLU6vZK+fjqmpFsz4ZF2XlHaL9/diogxz0QlBD1fKGUn2KiiVKOu 4fH1XChjtAgzXnF5eqTAbpouq8PyRoNkRL82OV+WPVdrJ68eq05qxmUcqddjxWpS6+75 DqIMOaYLp3nMe6YERawNRe4iE6MTgMknBibkrqw4kYaYHgDdfxpvRDYRmIHNYUrlvTOK 96jiK95NMgBfAVaHALPq3soIseUIKRJZN1wVxGeaw6EKJCozF1XPvigeaXaw2kt+Ec0B ue/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ZDF01bXaGCpplrqQln05GIaVpeq+YoS9OTrG7ZP6A4k=; b=A8Ssn4viUegqR5uvnxju70LNNCTe6umzyMHEJdSZPFQAp1AQg2nD3yDW17LwJeAD8p O/tncpm5R97LeS64GIk45z+nQa7L3rHhsX1ORtpWJUQwmttUutRQhBqdlqnDKaiTWviU O+LOYY8JVCuJpVokP3EMo0yMA1hf4Z+U+PTW1SeCRxrddJy28NpBiEtXtRiRGIw8Zdvq KpDZoCmzC+a/pjBiUQRCOvR24qmcO9+wLPQx9he48pMIwsUBmH0YDzyb2Sjb5Xeaj20h b6lof6+98q+nZ5xYMOn/HnNatqDBzuTkA1OM8mmlsSsQHzxJzKiiwbXJJOAo2aAqkTj2 Vk5Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iOnmcJWl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id m10-20020a50998a000000b0048811e4b073si17973681edb.95.2023.01.12.09.03.50; Thu, 12 Jan 2023 09:04:12 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iOnmcJWl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235143AbjALQho (ORCPT + 52 others); Thu, 12 Jan 2023 11:37:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238033AbjALQhN (ORCPT ); Thu, 12 Jan 2023 11:37:13 -0500 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 451A8B482; Thu, 12 Jan 2023 08:33:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673541216; x=1705077216; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7soibQrB1nAF56bS28Yak1RCzGPWITXucWYnK8opr8Q=; b=iOnmcJWlTg/oHaIn07QD6betMerHd/Sn+znC/GZzpMrueKyyDHZjy7bC b/opHkP7Cat7+I1HSaY+qwy3bokWV9mUcTa+bWeOQB9kHuklNZi6vZhTo 9ODG91Oq1tK42mw2hm3VS649STVXm9DcPsjzfbvXuKQOlP5UDEaQXOSI9 sKyJRdntwybj80pZuHvnWaQpUMWxPfmRy0v150g3iO/fPZwvx3sGQ1cRX 4Et8qJdf/eH4MK0XTd4UJSPj5Hi/ReFtpSJ6MC7yfncAtYmQsBY8SMD2g HF4Yft7FED4/zCrgVSIfJpygKWaHP/c3ehqpU8zy4pNbnTIibTbcS8094 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10588"; a="386089637" X-IronPort-AV: E=Sophos;i="5.97,211,1669104000"; d="scan'208";a="386089637" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jan 2023 08:33:33 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10588"; a="726372485" X-IronPort-AV: E=Sophos;i="5.97,211,1669104000"; d="scan'208";a="726372485" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jan 2023 08:33:33 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v11 075/113] KVM: TDX: Add support for find pending IRQ in a protected local APIC Date: Thu, 12 Jan 2023 08:32:23 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Sean Christopherson Add flag and hook to KVM's local APIC management to support determining whether or not a TDX guest as a pending IRQ. For TDX vCPUs, the virtual APIC page is owned by the TDX module and cannot be accessed by KVM. As a result, registers that are virtualized by the CPU, e.g. PPR, cannot be read or written by KVM. To deliver interrupts for TDX guests, KVM must send an IRQ to the CPU on the posted interrupt notification vector. And to determine if TDX vCPU has a pending interrupt, KVM must check if there is an outstanding notification. Return "no interrupt" in kvm_apic_has_interrupt() if the guest APIC is protected to short-circuit the various other flows that try to pull an IRQ out of the vAPIC, the only valid operation is querying _if_ an IRQ is pending, KVM can't do anything based on _which_ IRQ is pending. Intentionally omit sanity checks from other flows, e.g. PPR update, so as not to degrade non-TDX guests with unnecessary checks. A well-behaved KVM and userspace will never reach those flows for TDX guests, but reaching them is not fatal if something does go awry. Note, this doesn't handle interrupts that have been delivered to the vCPU but not yet recognized by the core, i.e. interrupts that are sitting in vmcs.GUEST_INTR_STATUS. Querying that state requires a SEAMCALL and will be supported in a future patch. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/irq.c | 3 +++ arch/x86/kvm/lapic.c | 3 +++ arch/x86/kvm/lapic.h | 2 ++ arch/x86/kvm/vmx/main.c | 11 +++++++++++ arch/x86/kvm/vmx/tdx.c | 6 ++++++ arch/x86/kvm/vmx/x86_ops.h | 2 ++ 8 files changed, 29 insertions(+) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 99ac85c3c8aa..aeff96543090 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -117,6 +117,7 @@ KVM_X86_OP_OPTIONAL(pi_update_irte) KVM_X86_OP_OPTIONAL(pi_start_assignment) KVM_X86_OP_OPTIONAL(apicv_post_state_restore) KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt) +KVM_X86_OP_OPTIONAL(protected_apic_has_interrupt) KVM_X86_OP_OPTIONAL(set_hv_timer) KVM_X86_OP_OPTIONAL(cancel_hv_timer) KVM_X86_OP(setup_mce) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 908911e6f0ba..cead782a48b6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1738,6 +1738,7 @@ struct kvm_x86_ops { void (*pi_start_assignment)(struct kvm *kvm); void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu); bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu); + bool (*protected_apic_has_interrupt)(struct kvm_vcpu *vcpu); int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, bool *expired); diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index b2c397dd2bc6..fd6af5530c32 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -100,6 +100,9 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v) if (kvm_cpu_has_extint(v)) return 1; + if (lapic_in_kernel(v) && v->arch.apic->guest_apic_protected) + return static_call(kvm_x86_protected_apic_has_interrupt)(v); + return kvm_apic_has_interrupt(v) != -1; /* LAPIC */ } EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index cfaf1d8c64ca..9e8d768ce83f 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2623,6 +2623,9 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu) if (!kvm_apic_present(vcpu)) return -1; + if (apic->guest_apic_protected) + return -1; + __apic_update_ppr(apic, &ppr); return apic_has_interrupt_for_ppr(apic, ppr); } diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h index 58c3242fcc7a..79075164e96b 100644 --- a/arch/x86/kvm/lapic.h +++ b/arch/x86/kvm/lapic.h @@ -66,6 +66,8 @@ struct kvm_lapic { bool sw_enabled; bool irr_pending; bool lvt0_in_nmi_mode; + /* Select registers in the vAPIC cannot be read/written. */ + bool guest_apic_protected; /* Number of bits set in ISR. */ s16 isr_count; /* The highest vector set in ISR; if -1 - invalid, must scan ISR. */ diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 7be68c3b345b..91696d5d183f 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -34,6 +34,9 @@ static __init int vt_hardware_setup(void) enable_tdx = enable_tdx && !tdx_hardware_setup(&vt_x86_ops); + if (!enable_tdx) + vt_x86_ops.protected_apic_has_interrupt = NULL; + if (enable_ept) kvm_mmu_set_ept_masks(enable_ept_ad_bits, cpu_has_vmx_ept_execute_only()); @@ -156,6 +159,13 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu) return vmx_vcpu_load(vcpu, cpu); } +static bool vt_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) +{ + KVM_BUG_ON(!is_td_vcpu(vcpu), vcpu->kvm); + + return tdx_protected_apic_has_interrupt(vcpu); +} + static void vt_flush_tlb_all(struct kvm_vcpu *vcpu) { if (is_td_vcpu(vcpu)) @@ -336,6 +346,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .sync_pir_to_irr = vmx_sync_pir_to_irr, .deliver_interrupt = vmx_deliver_interrupt, .dy_apicv_has_pending_interrupt = pi_has_pending_interrupt, + .protected_apic_has_interrupt = vt_protected_apic_has_interrupt, .set_tss_addr = vmx_set_tss_addr, .set_identity_map_addr = vmx_set_identity_map_addr, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 9158d266a086..3f444e138f52 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -474,6 +474,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) return -EINVAL; fpstate_set_confidential(&vcpu->arch.guest_fpu); + vcpu->arch.apic->guest_apic_protected = true; vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; @@ -512,6 +513,11 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) local_irq_enable(); } +bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) +{ + return pi_has_pending_interrupt(vcpu); +} + void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) { struct vcpu_tdx *tdx = to_tdx(vcpu); diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index dbbc27519637..32e045480bdb 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -161,6 +161,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu); void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu); void tdx_vcpu_put(struct kvm_vcpu *vcpu); void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu); +bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu); u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); @@ -190,6 +191,7 @@ static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTP static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {} static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {} static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {} +static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { return false; } static inline u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) { return 0; } static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } -- 2.25.1