From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Sean Christopherson, Paolo Bonzini
Subject: [PATCH 5.4 048/408] KVM: nVMX: Reset the segment cache when stuffing guest segs
Date: Tue, 27 Oct 2020 14:49:46 +0100
Message-Id: <20201027135457.294173409@linuxfoundation.org>
X-Mailer: git-send-email 2.29.1
In-Reply-To: <20201027135455.027547757@linuxfoundation.org>
References: <20201027135455.027547757@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

commit fc387d8daf3960c5e1bc18fa353768056f4fd394 upstream.

Explicitly reset the segment cache after stuffing guest segment regs in
prepare_vmcs02_rare().  Although the cache is reset when switching to
vmcs02, there is nothing that prevents KVM from re-populating the cache
prior to writing vmcs02 with vmcs12's values.  E.g. if the vCPU is
preempted after switching to vmcs02 but before prepare_vmcs02_rare(),
kvm_arch_vcpu_put() will dereference GUEST_SS_AR_BYTES via .get_cpl()
and cache the stale vmcs02 value.  While the current code base only
caches stale data in the preemption case, it's theoretically possible
future code could read a segment register during the nested flow itself,
i.e. this isn't technically illegal behavior in kvm_arch_vcpu_put(),
although it did introduce the bug.

This manifests as an unexpected nested VM-Enter failure when running
with unrestricted guest disabled if the above preemption case coincides
with L1 switching L2's CPL, e.g. when switching from an L2 vCPU at CPL3
to an L2 vCPU at CPL0.  stack_segment_valid() will see the new SS_SEL
but the old SS_AR_BYTES and incorrectly mark the guest state as invalid
due to SS.dpl != SS.rpl.
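To make the sequence concrete, the lazy cache can be modeled as a small
standalone program.  This is a simplified sketch, not KVM's actual
code: every name in it (seg_cache, vmcs_ss_ar, seg_cache_read_ss_ar)
and the example AR-byte values are hypothetical, chosen only to mirror
the CPL3-to-CPL0 scenario described above.

#include <stdint.h>
#include <stdio.h>

#define SS_AR_BIT (1u << 0)             /* one validity bit per cached field */

struct seg_cache {
	uint32_t bitmask;               /* which cached fields are valid */
	uint32_t ss_ar;                 /* cached copy of GUEST_SS_AR_BYTES */
};

static uint32_t vmcs_ss_ar;             /* stand-in for the live VMCS field */

/* Read through the cache: do the (modeled) VMREAD only on a miss. */
static uint32_t seg_cache_read_ss_ar(struct seg_cache *c)
{
	if (!(c->bitmask & SS_AR_BIT)) {
		c->ss_ar = vmcs_ss_ar;  /* "VMREAD" */
		c->bitmask |= SS_AR_BIT;
	}
	return c->ss_ar;
}

int main(void)
{
	struct seg_cache cache = { 0 };

	vmcs_ss_ar = 0xf3;              /* vmcs02 still holds a CPL3 SS */
	cache.bitmask = 0;              /* cache reset on switch to vmcs02 */

	/*
	 * Preemption hits here: a kvm_arch_vcpu_put()-style CPL query
	 * reads SS.AR and pins the stale vmcs02 value in the cache.
	 */
	(void)seg_cache_read_ss_ar(&cache);

	vmcs_ss_ar = 0x93;              /* prepare_vmcs02_rare() stuffs
	                                   vmcs12's CPL0 value */

	/* Without the fix, the next read hits the stale entry. */
	printf("cached SS.AR = %#x (live field is %#x)\n",
	       seg_cache_read_ss_ar(&cache), vmcs_ss_ar);

	cache.bitmask = 0;              /* the fix: invalidate after stuffing */
	printf("after reset  = %#x\n", seg_cache_read_ss_ar(&cache));
	return 0;
}

Run as-is, the first printf reports 0xf3 against a live value of 0x93;
only after the bitmask is cleared does the read return 0x93, which is
the effect the added line below has on the real cache.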
Don't bother updating the cache even though prepare_vmcs02_rare()
writes every segment.  With unrestricted guest, guest segments are
almost never read, let alone L2 guest segments.  On the other hand,
populating the cache requires a large number of memory writes, i.e.
it's unlikely to be a net win.  Updating the cache would be a win when
unrestricted guest is not supported, as guest_state_valid() will
immediately cache all segment registers.  But, nested virtualization
without unrestricted guest is dirt slow, saving some VMREADs won't
change that, and every CPU manufactured in the last decade supports
unrestricted guest.  In other words, the extra (minor) complexity isn't
worth the trouble.

Note, kvm_arch_vcpu_put() may see stale data when querying guest CPL
depending on when preemption occurs.  This is "ok" in that the usage
is imperfect by nature, i.e. it's used heuristically to improve
performance but doesn't affect functionality.  kvm_arch_vcpu_put()
could be "fixed" by also disabling preemption while loading segments,
but that's pointless and misleading as reading state from
kvm_sched_{in,out}() is guaranteed to see stale data in one form or
another.  E.g. even if all the usage of regs_avail is fixed to call
kvm_register_mark_available() after the associated state is set, the
individual state might still be stale with respect to the overall vCPU
state.  I.e. making functional decisions in an asynchronous hook is
doomed from the get-go.  Thankfully KVM doesn't do that.

Fixes: de63ad4cf4973 ("KVM: X86: implement the logic for spinlock optimization")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
Message-Id: <20200923184452.980-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/kvm/vmx/nested.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2231,6 +2231,8 @@ static void prepare_vmcs02_rare(struct v
 		vmcs_writel(GUEST_TR_BASE, vmcs12->guest_tr_base);
 		vmcs_writel(GUEST_GDTR_BASE, vmcs12->guest_gdtr_base);
 		vmcs_writel(GUEST_IDTR_BASE, vmcs12->guest_idtr_base);
+
+		vmx->segment_cache.bitmask = 0;
 	}

 	if (!hv_evmcs || !(hv_evmcs->hv_clean_fields &
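As a footnote to the failure mode above, the SS.dpl vs. SS.rpl
consistency rule that trips can be sketched as follows.  The bit
positions follow the architectural selector and AR-byte layouts, but
the function itself (ss_dpl_matches_rpl) is an illustrative stand-in,
not KVM's actual stack_segment_valid().

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for the SS.dpl == SS.rpl consistency rule. */
static bool ss_dpl_matches_rpl(uint16_t ss_sel, uint32_t ss_ar)
{
	uint8_t rpl = ss_sel & 0x3;        /* selector bits 1:0 */
	uint8_t dpl = (ss_ar >> 5) & 0x3;  /* AR-byte bits 6:5 */

	return dpl == rpl;
}

int main(void)
{
	/* fresh vmcs12 selector (RPL 0) vs. stale vmcs02 AR bytes (DPL 3) */
	printf("valid = %d\n", ss_dpl_matches_rpl(0x0018, 0xf3));
	return 0;
}

In the bug, SS_SEL already reflects vmcs12 (RPL 0) while the cached
SS_AR_BYTES still hold the vmcs02 CPL3 value (DPL 3), so the check
fails and guest state is wrongly marked invalid.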