From: Vitaly Kuznetsov
To: Sean Christopherson, Paolo Bonzini, Radim Krčmář
Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Reto Buerki
Subject: Re: [PATCH 0/2] KVM: nVMX: Bug fix for consuming stale vmcs02.GUEST_CR3
In-Reply-To: <20190926214302.21990-1-sean.j.christopherson@intel.com>
References: <20190926214302.21990-1-sean.j.christopherson@intel.com>
Date: Fri, 27 Sep 2019 14:12:15 +0200
Message-ID: <87o8z65468.fsf@vitty.brq.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Sean Christopherson writes:

> Reto Buerki reported a failure in a nested VMM when running with HLT
> interception disabled in L1. When putting L2 into HLT, KVM never actually
> enters L2 and instead cancels the nested run and pretends that VM-Enter to
> L2 completed and then exited on HLT (which KVM intercepted). Because KVM
> never actually runs L2, KVM skips the pending MMU update for L2 and so
> leaves a stale value in vmcs02.GUEST_CR3. If the next wake event for L2
> triggers a nested VM-Exit, KVM will refresh vmcs12->guest_cr3 from
> vmcs02.GUEST_CR3 and consume the stale value.
>
> Fix the issue by unconditionally writing vmcs02.GUEST_CR3 during nested
> VM-Enter instead of deferring the update to vmx_set_cr3(), and skip the
> update of GUEST_CR3 in vmx_set_cr3() when running L2. I.e. make the
> nested code fully responsible for vmcs02.GUEST_CR3.
>
> I really wanted to go with a different fix of handling this as a one-off
> case in the HLT flow (in nested_vmx_run()), and then following that up
> with a cleanup of VMX's CR3 handling, e.g. to do proper dirty tracking
> instead of having the nested code do manual VMREADs and VMWRITEs. I even
> went so far as to hide vcpu->arch.cr3 (put CR3 in vcpu->arch.regs), but
> things went south when I started working through the dirty tracking logic.
>
> Because EPT can be enabled *without* unrestricted guest, enabling EPT
> doesn't always mean GUEST_CR3 really is the guest CR3 (unlike SVM's NPT).
> And because the unrestricted guest handling of GUEST_CR3 is dependent on
> whether the guest has paging enabled, VMX can't even do a clean handoff
> based on unrestricted guest. In a nutshell, dynamically handling the
> transitions of GUEST_CR3 ownership in VMX is a nightmare, so fixing this
> purely within the context of nested VMX turned out to be the cleanest fix.
>
> Sean Christopherson (2):
>   KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter
>   KVM: VMX: Skip GUEST_CR3 VMREAD+VMWRITE if the VMCS is up-to-date
>

Series:
Tested-by: Vitaly Kuznetsov

-- 
Vitaly
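
For anyone skimming the thread, the failure and the fix described in the quoted cover letter can be illustrated with a toy model. This is only a sketch under loud assumptions: it is Python, not kernel code, and ToyNestedVcpu, nested_vm_enter(), nested_vm_exit(), pending_cr3 and the write_cr3_on_enter flag are all hypothetical names invented here. Only the ideas (vmcs02.GUEST_CR3, vmcs12->guest_cr3, the cancelled run on HLT, the deferred update) come from the cover letter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNestedVcpu:
    """Toy stand-in for the state in the cover letter; not KVM's real structures."""
    vmcs02_guest_cr3: int = 0           # models vmcs02.GUEST_CR3 (used to run L2)
    vmcs12_guest_cr3: int = 0           # models vmcs12->guest_cr3 (L1's view)
    pending_cr3: Optional[int] = None   # models the deferred MMU update


def nested_vm_enter(vcpu, l2_cr3, l2_halted, write_cr3_on_enter):
    """Model a nested VM-Enter. Returns True if L2 actually ran."""
    vcpu.vmcs12_guest_cr3 = l2_cr3
    if write_cr3_on_enter:
        # Behavior after the fix: the nested code writes the field itself.
        vcpu.vmcs02_guest_cr3 = l2_cr3
    else:
        # Behavior before the fix: the write is deferred until L2 runs.
        vcpu.pending_cr3 = l2_cr3
    if l2_halted:
        # The run is cancelled, so any deferred update is never applied.
        return False
    if vcpu.pending_cr3 is not None:
        vcpu.vmcs02_guest_cr3 = vcpu.pending_cr3
        vcpu.pending_cr3 = None
    return True


def nested_vm_exit(vcpu):
    """Model a nested VM-Exit: vmcs12's CR3 is refreshed from vmcs02."""
    vcpu.vmcs12_guest_cr3 = vcpu.vmcs02_guest_cr3
    return vcpu.vmcs12_guest_cr3


if __name__ == "__main__":
    STALE, FRESH = 0x1000, 0x2000   # arbitrary illustrative values
    for fixed in (False, True):
        vcpu = ToyNestedVcpu(vmcs02_guest_cr3=STALE)
        nested_vm_enter(vcpu, FRESH, l2_halted=True, write_cr3_on_enter=fixed)
        print("fixed" if fixed else "buggy", hex(nested_vm_exit(vcpu)))
```

With the deferred write and a halted L2, the modeled VM-Exit hands the old value back to vmcs12; with the unconditional write at VM-Enter it hands back the value L1 asked for. The model deliberately ignores EPT, unrestricted guest and paging state, which is where the cover letter says the real difficulty lies.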