Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp1362347rwb; Tue, 29 Nov 2022 12:30:33 -0800 (PST) X-Google-Smtp-Source: AA0mqf6aVkz5LLUdNWty9+pZqbrvAFxvUdskfqhfkYWNgfAh5M4HnaYviumd2tIOnxL0kNJn+TMO X-Received: by 2002:a17:90b:3c45:b0:20a:db08:8a8a with SMTP id pm5-20020a17090b3c4500b0020adb088a8amr66747195pjb.141.1669753833668; Tue, 29 Nov 2022 12:30:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669753833; cv=none; d=google.com; s=arc-20160816; b=SwLosUNhKeQ8w8dsSq9slnhY1dfh+wg/kdpU37c8D0t0mUXvb7fSmlwzDSbMp5ZXiw 9NNW/LwjgxJbqlZpFK+DuStWbo0eX495XAvwXzs41DAgfzuza9G2UVmYYDBB1SPpSo/J ie3ZraFX1MM3YwBufNvVqMFtaLJhKoSlY285zbK/saocfEew9r1Ui2QWYLu5pAG8pXk6 do02VtYGXtcYVeItY0UaXrRmksUrYcgfvhDM5m+KLd7FuEjgjS1nmLQFS7DQ9ABMClaW qcyjiXeNK3LRk32WoXBnTN0Ok/oOcIeUpzosg0Ohu9ecklDkzgCdKu9Z6dGJGdQujYqw Tujg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=OPg8L3jKUTetjU9lk920ZBLtdtJuDxmt2DbZj0atMtY=; b=Wrhis/GdmC00J1cebqlJjbqzwxQPSqDLsFKJgyACzmZ6S4kngkKDFe6NHkug9ZxrDj 1QhPkX+VS7HdnjGbxokICiJyK4/FDoA7oSu7HQde9IzFWRgV7Pb/9zXRl34vN2vt9cFT bRwvIuqsbOz6m36c/0loSQmf/lZidnYn/KsJScZ5S0J+5sBbBpp2HcH9zoRPrJLSmgTj 0XIj01Qw602MUTQgjUoq11ozfcX5TrQaR8eSEevPxnrxu1W9WOREmyYpmSzqCXXiIv3B yMagf6Z2cZCinH1p+U7cDT8cBwICJOAsonG3Wuql5GdUcCuFEK1yjicEevqKxd0Ktl1P iJgA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KlihBtPy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h8-20020a056a00230800b005748eade34dsi16505530pfh.278.2022.11.29.12.30.22; Tue, 29 Nov 2022 12:30:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KlihBtPy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236457AbiK2Tll (ORCPT + 84 others); Tue, 29 Nov 2022 14:41:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48718 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237119AbiK2Tkp (ORCPT ); Tue, 29 Nov 2022 14:40:45 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F3979FE0 for ; Tue, 29 Nov 2022 11:37:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750674; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OPg8L3jKUTetjU9lk920ZBLtdtJuDxmt2DbZj0atMtY=; b=KlihBtPydiHgF6pfaw70DrmeKs8gxSPe13GReK633VxiioNOS5nLofMftO8TSeVVfF66rJ HRxcKkd30d6ZrrE2Wi8uPDb+WWm83h3zmAkjAfi2r+Q0eE3Azi+YQQxek+94bARcge4BuA ENmpv0IY5G5HtQwLpdUHgonu2iXGBTs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-637-qVt9wRf9OFS2X7LYDsPYVA-1; Tue, 29 Nov 2022 14:37:52 -0500 X-MC-Unique: qVt9wRf9OFS2X7LYDsPYVA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A34A2800B23; Tue, 29 Nov 2022 19:37:51 +0000 (UTC) Received: from localhost.localdomain (unknown [10.35.206.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id CC7942027061; Tue, 29 Nov 2022 19:37:47 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Sandipan Das , Paolo Bonzini , Jim Mattson , Peter Zijlstra , Dave Hansen , Borislav Petkov , Pawan Gupta , Thomas Gleixner , Ingo Molnar , Josh Poimboeuf , Daniel Sneddon , Jiaxi Chen , Babu Moger , linux-kernel@vger.kernel.org, Jing Liu , Wyes Karny , x86@kernel.org, "H. Peter Anvin" , Sean Christopherson , Maxim Levitsky Subject: [PATCH v2 07/11] KVM: x86: add a delayed hardware NMI injection interface Date: Tue, 29 Nov 2022 21:37:13 +0200 Message-Id: <20221129193717.513824-8-mlevitsk@redhat.com> In-Reply-To: <20221129193717.513824-1-mlevitsk@redhat.com> References: <20221129193717.513824-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch adds two new vendor callbacks: - kvm_x86_get_hw_nmi_pending() - kvm_x86_set_hw_nmi_pending() Using those callbacks the KVM can take advantage of the hardware's accelerated delayed NMI delivery (currently vNMI on SVM). Once NMI is set to pending via this interface, it is assumed that the hardware will deliver the NMI on its own to the guest once all the x86 conditions for the NMI delivery are met. Note that the 'kvm_x86_set_hw_nmi_pending()' callback is allowed to fail, in which case a normal NMI injection will be attempted when NMI can be delivered (possibly by using a NMI window). With vNMI that can happen either if vNMI is already pending or if a nested guest is running. When the vNMI injection fails due to the 'vNMI is already pending' condition, the new NMI will be dropped unless the new NMI can be injected immediately, so no NMI window will be requested. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm-x86-ops.h | 2 ++ arch/x86/include/asm/kvm_host.h | 15 ++++++++++++- arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++---- 3 files changed, 48 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index abccd51dcfca1b..9e2db6cf7cc041 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -67,6 +67,8 @@ KVM_X86_OP(get_interrupt_shadow) KVM_X86_OP(patch_hypercall) KVM_X86_OP(inject_irq) KVM_X86_OP(inject_nmi) +KVM_X86_OP_OPTIONAL_RET0(get_hw_nmi_pending) +KVM_X86_OP_OPTIONAL_RET0(set_hw_nmi_pending) KVM_X86_OP(inject_exception) KVM_X86_OP(cancel_injection) KVM_X86_OP(interrupt_allowed) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 684a5519812fb2..46993ce61c92db 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -871,8 +871,13 @@ struct kvm_vcpu_arch { u64 tsc_scaling_ratio; /* current scaling ratio */ atomic_t nmi_queued; /* unprocessed asynchronous NMIs */ - unsigned nmi_pending; /* NMI queued after currently running handler */ + + unsigned int nmi_pending; /* + * NMI queued after currently running handler + * (not including a hardware pending NMI (e.g vNMI)) + */ bool nmi_injected; /* Trying to inject an NMI this entry */ + bool smi_pending; /* SMI queued after currently running handler */ u8 handling_intr_from_guest; @@ -1602,6 +1607,13 @@ struct kvm_x86_ops { int (*nmi_allowed)(struct kvm_vcpu *vcpu, bool for_injection); bool (*get_nmi_mask)(struct kvm_vcpu *vcpu); void (*set_nmi_mask)(struct kvm_vcpu *vcpu, bool masked); + + /* returns true, if a NMI is pending injection on hardware level (e.g vNMI) */ + bool (*get_hw_nmi_pending)(struct kvm_vcpu *vcpu); + + /* attempts make a NMI pending via hardware interface (e.g vNMI) */ + bool (*set_hw_nmi_pending)(struct kvm_vcpu *vcpu); + void (*enable_nmi_window)(struct kvm_vcpu *vcpu); void (*enable_irq_window)(struct kvm_vcpu *vcpu); void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr); @@ -1964,6 +1976,7 @@ int kvm_pic_set_irq(struct kvm_pic *pic, int irq, int irq_source_id, int level); void kvm_pic_clear_all(struct kvm_pic *pic, int irq_source_id); void kvm_inject_nmi(struct kvm_vcpu *vcpu); +int kvm_get_total_nmi_pending(struct kvm_vcpu *vcpu); void kvm_update_dr7(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 85d2a12c214dda..3c30e3f1106f79 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5103,7 +5103,7 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, events->interrupt.shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu); events->nmi.injected = vcpu->arch.nmi_injected; - events->nmi.pending = vcpu->arch.nmi_pending != 0; + events->nmi.pending = kvm_get_total_nmi_pending(vcpu) != 0; events->nmi.masked = static_call(kvm_x86_get_nmi_mask)(vcpu); /* events->sipi_vector is never valid when reporting to user space */ @@ -5191,9 +5191,12 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, vcpu->arch.nmi_injected = events->nmi.injected; if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) - vcpu->arch.nmi_pending = events->nmi.pending; + atomic_add(events->nmi.pending, &vcpu->arch.nmi_queued); + static_call(kvm_x86_set_nmi_mask)(vcpu, events->nmi.masked); + process_nmi(vcpu); + if (events->flags & KVM_VCPUEVENT_VALID_SIPI_VECTOR && lapic_in_kernel(vcpu)) vcpu->arch.apic->sipi_vector = events->sipi_vector; @@ -10008,6 +10011,10 @@ static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu, static void process_nmi(struct kvm_vcpu *vcpu) { unsigned limit = 2; + int nmi_to_queue = atomic_xchg(&vcpu->arch.nmi_queued, 0); + + if (!nmi_to_queue) + return; /* * x86 is limited to one NMI running, and one NMI pending after it. @@ -10015,13 +10022,34 @@ static void process_nmi(struct kvm_vcpu *vcpu) * Otherwise, allow two (and we'll inject the first one immediately). */ if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected) - limit = 1; + limit--; + + /* Also if there is already a NMI hardware queued to be injected, + * decrease the limit again + */ + if (static_call(kvm_x86_get_hw_nmi_pending)(vcpu)) + limit--; - vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0); + if (limit <= 0) + return; + + /* Attempt to use hardware NMI queueing */ + if (static_call(kvm_x86_set_hw_nmi_pending)(vcpu)) { + limit--; + nmi_to_queue--; + } + + vcpu->arch.nmi_pending += nmi_to_queue; vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit); kvm_make_request(KVM_REQ_EVENT, vcpu); } +/* Return total number of NMIs pending injection to the VM */ +int kvm_get_total_nmi_pending(struct kvm_vcpu *vcpu) +{ + return vcpu->arch.nmi_pending + static_call(kvm_x86_get_hw_nmi_pending)(vcpu); +} + void kvm_make_scan_ioapic_request_mask(struct kvm *kvm, unsigned long *vcpu_bitmap) { -- 2.26.3