From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Sean Christopherson, Paolo Bonzini
Subject: [PATCH 5.17 131/146] KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
Date: Tue, 26 Apr 2022 10:22:06 +0200
Message-Id: <20220426081753.743120268@linuxfoundation.org>
In-Reply-To: <20220426081750.051179617@linuxfoundation.org>
References: <20220426081750.051179617@linuxfoundation.org>

From: Sean Christopherson

commit 7c69661e225cc484fbf44a0b99b56714a5241ae3 upstream.

Defer APICv updates that occur while L2 is active until nested VM-Exit,
i.e. until L1 regains control. vmx_refresh_apicv_exec_ctrl() assumes L1
is active and (a) stomps all over vmcs02 and (b) neglects to ever update
vmcs01. E.g. if vmcs12 doesn't enable the TPR shadow for L2 (and thus no
APICv controls), L1 performs nested VM-Enter with APICv inhibited, and
APICv becomes uninhibited while L2 is active, KVM will set various APICv
controls in vmcs02 and trigger a failed VM-Entry. The kicker is that,
unless running with nested_early_check=1, KVM blames L1 and chaos ensues.

In all cases, ignoring vmcs02 and always deferring the inhibition change
to vmcs01 is correct (or at least acceptable). The ABSENT and DISABLE
inhibitions cannot truly change while L2 is active (see below).

IRQ_BLOCKING can change, but it is firmly a best effort debug feature.
Furthermore, only L1's APIC is accelerated/virtualized to the full extent
possible, e.g. even if L1 passes through its APIC to L2, normal MMIO/MSR
interception will apply to the virtual APIC managed by KVM. The exception
is the SELF_IPI register when x2APIC is enabled, but that's an acceptable
hole.

Lastly, Hyper-V's Auto EOI can technically be toggled if L1 exposes the
MSRs to L2, but for that to work in any sane capacity, L1 would need to
pass through IRQs to L2 as well, and IRQs must be intercepted to enable
virtual interrupt delivery. I.e. exposing Auto EOI to L2 and enabling VID
for L2 are, for all intents and purposes, mutually exclusive.

Lack of dynamic toggling is also why this scenario is all but impossible
to encounter in KVM's current form. But a future patch will pend an APICv
update request _during_ vCPU creation to plug a race where a vCPU that's
being created doesn't get included in the "all vCPUs request" because
it's not yet visible to other vCPUs. If userspace restores L2 after VM
creation (hello, KVM selftests), the first KVM_RUN will occur while L2 is
active and thus service the APICv update request made during VM creation.
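
The deferral described above can be sketched outside the kernel as a small
state machine. This is only an illustrative model under stated assumptions,
not KVM code: every toy_* name is hypothetical, the two booleans stand in
for KVM_REQ_APICV_UPDATE and the new update_vmcs01_apicv_status flag, and
"vmcs01" is reduced to a single bit.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of one vCPU; not a KVM structure. */
struct toy_vcpu {
	bool guest_mode;            /* true while L2 is active */
	bool apicv_update_request;  /* stands in for KVM_REQ_APICV_UPDATE */
	bool deferred_apicv_update; /* stands in for update_vmcs01_apicv_status */
	bool apicv_active;          /* APICv state the vCPU should end up with */
	bool vmcs01_apicv;          /* APICv state currently programmed for L1 */
};

/* Refresh the controls, or remember to do so once L1 is active again. */
static void toy_refresh_apicv(struct toy_vcpu *v)
{
	if (v->guest_mode) {
		/* L2 is active: leave "vmcs02" alone and defer the update. */
		v->deferred_apicv_update = true;
		return;
	}
	v->vmcs01_apicv = v->apicv_active;
}

/* Service a pending update request, as the run loop would before entry. */
static void toy_service_requests(struct toy_vcpu *v)
{
	if (v->apicv_update_request) {
		v->apicv_update_request = false;
		toy_refresh_apicv(v);
	}
}

/* Nested VM-Exit: L1 regains control, so re-raise any deferred update. */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	v->guest_mode = false;
	if (v->deferred_apicv_update) {
		v->deferred_apicv_update = false;
		v->apicv_update_request = true;
	}
}
```

To mimic the "restore L2, then first KVM_RUN" case, start with guest_mode
and apicv_update_request both set: toy_service_requests() then only records
the deferral, and the update reaches the modelled vmcs01 on the first
toy_service_requests() call after toy_nested_vmexit().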

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
Message-Id: <20220420013732.3308816-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/kvm/vmx/nested.c |    5 +++++
 arch/x86/kvm/vmx/vmx.c    |    5 +++++
 arch/x86/kvm/vmx/vmx.h    |    1 +
 3 files changed, 11 insertions(+)

--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4618,6 +4618,11 @@ void nested_vmx_vmexit(struct kvm_vcpu *
 		kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 	}
 
+	if (vmx->nested.update_vmcs01_apicv_status) {
+		vmx->nested.update_vmcs01_apicv_status = false;
+		kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
+	}
+
 	if ((vm_exit_reason != -1) &&
 	    (enable_shadow_vmcs || evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)))
 		vmx->nested.need_vmcs12_to_shadow_sync = true;
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4182,6 +4182,11 @@ static void vmx_refresh_apicv_exec_ctrl(
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (is_guest_mode(vcpu)) {
+		vmx->nested.update_vmcs01_apicv_status = true;
+		return;
+	}
+
 	pin_controls_set(vmx, vmx_pin_based_exec_ctrl(vmx));
 	if (cpu_has_secondary_exec_ctrls()) {
 		if (kvm_vcpu_apicv_active(vcpu))
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -183,6 +183,7 @@ struct nested_vmx {
 	bool change_vmcs01_virtual_apic_mode;
 	bool reload_vmcs01_apic_access_page;
 	bool update_vmcs01_cpu_dirty_logging;
+	bool update_vmcs01_apicv_status;
 
 	/*
 	 * Enlightened VMCS has been enabled. It does not mean that L1 has to