Reply-To: Sean Christopherson
Date: Sat, 23 Jul 2022 00:51:28 +0000
In-Reply-To: <20220723005137.1649592-1-seanjc@google.com>
Message-Id: <20220723005137.1649592-16-seanjc@google.com>
Mime-Version: 1.0
References: <20220723005137.1649592-1-seanjc@google.com>
X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog
Subject: [PATCH v4 15/24] KVM: x86: Hoist nested event checks above event injection logic
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson, Maxim Levitsky, Oliver Upton, Peter Shier
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Perform nested event checks before re-injecting
exceptions/events into L2. If a pending exception causes VM-Exit to L1,
re-injecting events into vmcs02 is premature and wasted effort. Take care
to ensure events that need to be re-injected are still re-injected if
checking for nested events "fails", i.e. if KVM needs to force an
immediate entry+exit to complete the to-be-re-injected event.

Keep the "can_inject" logic the same for now; it too can be pushed below
the nested checks, but is a slightly riskier change (see past bugs about
events not being properly purged on nested VM-Exit).

Add and/or modify comments to better document the various interactions.
Of note is the comment regarding "blocking" previously injected NMIs and
IRQs if an exception is pending. The old comment isn't wrong strictly
speaking, but it failed to capture the reason why the logic even exists.

Signed-off-by: Sean Christopherson
Reviewed-by: Maxim Levitsky
---
 arch/x86/kvm/x86.c | 89 +++++++++++++++++++++++++++-------------------
 1 file changed, 53 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 041149c0cf02..046c8c2fbd8f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9670,53 +9670,70 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)

 static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 {
+	bool can_inject = !kvm_event_needs_reinjection(vcpu);
 	int r;
-	bool can_inject = true;

-	/* try to reinject previous events if any */
+	/*
+	 * Process nested events first, as nested VM-Exit supercedes event
+	 * re-injection. If there's an event queued for re-injection, it will
+	 * be saved into the appropriate vmc{b,s}12 fields on nested VM-Exit.
+	 */
+	if (is_guest_mode(vcpu))
+		r = kvm_check_nested_events(vcpu);
+	else
+		r = 0;

-	if (vcpu->arch.exception.injected) {
+	/*
+	 * Re-inject exceptions and events *especially* if immediate entry+exit
+	 * to/from L2 is needed, as any event that has already been injected
+	 * into L2 needs to complete its lifecycle before injecting a new event.
+	 *
+	 * Don't re-inject an NMI or interrupt if there is a pending exception.
+	 * This collision arises if an exception occurred while vectoring the
+	 * injected event, KVM intercepted said exception, and KVM ultimately
+	 * determined the fault belongs to the guest and queues the exception
+	 * for injection back into the guest.
+	 *
+	 * "Injected" interrupts can also collide with pending exceptions if
+	 * userspace ignores the "ready for injection" flag and blindly queues
+	 * an interrupt. In that case, prioritizing the exception is correct,
+	 * as the exception "occurred" before the exit to userspace. Trap-like
+	 * exceptions, e.g. most #DBs, have higher priority than interrupts.
+	 * And while fault-like exceptions, e.g. #GP and #PF, are the lowest
+	 * priority, they're only generated (pended) during instruction
+	 * execution, and interrupts are recognized at instruction boundaries.
+	 * Thus a pending fault-like exception means the fault occurred on the
+	 * *previous* instruction and must be serviced prior to recognizing any
+	 * new events in order to fully complete the previous instruction.
+	 */
+	if (vcpu->arch.exception.injected)
 		kvm_inject_exception(vcpu);
-		can_inject = false;
-	}
+	else if (vcpu->arch.exception.pending)
+		; /* see above */
+	else if (vcpu->arch.nmi_injected)
+		static_call(kvm_x86_inject_nmi)(vcpu);
+	else if (vcpu->arch.interrupt.injected)
+		static_call(kvm_x86_inject_irq)(vcpu, true);
+
 	/*
-	 * Do not inject an NMI or interrupt if there is a pending
-	 * exception. Exceptions and interrupts are recognized at
-	 * instruction boundaries, i.e. the start of an instruction.
-	 * Trap-like exceptions, e.g. #DB, have higher priority than
-	 * NMIs and interrupts, i.e. traps are recognized before an
-	 * NMI/interrupt that's pending on the same instruction.
-	 * Fault-like exceptions, e.g. #GP and #PF, are the lowest
-	 * priority, but are only generated (pended) during instruction
-	 * execution, i.e. a pending fault-like exception means the
-	 * fault occurred on the *previous* instruction and must be
-	 * serviced prior to recognizing any new events in order to
-	 * fully complete the previous instruction.
+	 * Exceptions that morph to VM-Exits are handled above, and pending
+	 * exceptions on top of injected exceptions that do not VM-Exit should
+	 * either morph to #DF or, sadly, override the injected exception.
 	 */
-	else if (!vcpu->arch.exception.pending) {
-		if (vcpu->arch.nmi_injected) {
-			static_call(kvm_x86_inject_nmi)(vcpu);
-			can_inject = false;
-		} else if (vcpu->arch.interrupt.injected) {
-			static_call(kvm_x86_inject_irq)(vcpu, true);
-			can_inject = false;
-		}
-	}
-
 	WARN_ON_ONCE(vcpu->arch.exception.injected &&
 		     vcpu->arch.exception.pending);

 	/*
-	 * Call check_nested_events() even if we reinjected a previous event
-	 * in order for caller to determine if it should require immediate-exit
-	 * from L2 to L1 due to pending L1 events which require exit
-	 * from L2 to L1.
+	 * Bail if immediate entry+exit to/from the guest is needed to complete
+	 * nested VM-Enter or event re-injection so that a different pending
+	 * event can be serviced (or if KVM needs to exit to userspace).
+	 *
+	 * Otherwise, continue processing events even if VM-Exit occurred. The
+	 * VM-Exit will have cleared exceptions that were meant for L2, but
+	 * there may now be events that can be injected into L1.
 	 */
-	if (is_guest_mode(vcpu)) {
-		r = kvm_check_nested_events(vcpu);
-		if (r < 0)
-			goto out;
-	}
+	if (r < 0)
+		goto out;

 	/* try to inject new event if pending */
 	if (vcpu->arch.exception.pending) {
-- 
2.37.1.359.gd136c6c3e2-goog
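
For readers who want to poke at the new ordering outside the kernel, the
following is a minimal userspace sketch of the control flow the patch
introduces: run the nested check first, then re-inject in priority order,
then bail only afterwards if the nested check returned an error. Everything
named toy_* or TOY_* is hypothetical and exists only for illustration; it is
not KVM code. The sketch also assumes the nested check has no side effects,
whereas a real nested VM-Exit would clear the injected state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the vCPU event state. */
struct toy_vcpu {
	bool guest_mode;		/* true if the vCPU is running L2 */
	bool exception_injected;
	bool exception_pending;
	bool nmi_injected;
	bool irq_injected;
	int nested_check_result;	/* value the toy nested check returns */
};

/* Which previously injected event, if any, gets re-injected. */
enum toy_action {
	TOY_NONE,
	TOY_REINJECT_EXCEPTION,
	TOY_REINJECT_NMI,
	TOY_REINJECT_IRQ,
};

/*
 * Mirrors the ordering in the patch: the nested check runs first, the
 * re-injection chain runs regardless of its result, and a negative result
 * is only acted on afterwards.
 */
static int toy_inject_pending_event(const struct toy_vcpu *v,
				    enum toy_action *action)
{
	int r;

	assert(v != NULL && action != NULL);

	/* Step 1: nested events are processed before any re-injection. */
	r = v->guest_mode ? v->nested_check_result : 0;

	/* Step 2: re-inject, but never an NMI/IRQ over a pending exception. */
	if (v->exception_injected)
		*action = TOY_REINJECT_EXCEPTION;
	else if (v->exception_pending)
		*action = TOY_NONE;
	else if (v->nmi_injected)
		*action = TOY_REINJECT_NMI;
	else if (v->irq_injected)
		*action = TOY_REINJECT_IRQ;
	else
		*action = TOY_NONE;

	/* Step 3: only now bail if the nested check asked for it. */
	return r < 0 ? r : 0;
}
```

The point of the sketch is that the re-injection decision in step 2 does not
depend on r, which is how an injected NMI is still re-injected when the
nested check "fails" and an immediate entry+exit is required.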