From: David Matlack
Date: Fri, 4 Mar 2016 09:25:35 -0800
Subject: Re: [PATCH v2] KVM: VMX: disable PEBS before a guest entry
To: Radim Krčmář
Cc: "linux-kernel@vger.kernel.org", kvm list, Paolo Bonzini, Jiří Olša, stable@vger.kernel.org
In-Reply-To: <1457100522-24551-1-git-send-email-rkrcmar@redhat.com>

On Fri, Mar 4, 2016 at 6:08 AM, Radim Krčmář wrote:
> Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
> would crash if you decided to run a host command that uses PEBS, like
>   perf record -e 'cpu/mem-stores/pp' -a
>
> This happens because KVM is using VMX MSR switching to disable PEBS, but
> SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
> isn't safe:
>   When software needs to reconfigure PEBS facilities, it should allow a
>   quiescent period between stopping the prior event counting and setting
>   up a new PEBS event. The quiescent period is to allow any latent
>   residual PEBS records to complete its capture at their previously
>   specified buffer address (provided by IA32_DS_AREA).
>
> There might not be a quiescent period after the MSR switch, so a CPU
> ends up using host's MSR_IA32_DS_AREA to access an area in guest's
> memory. (Or MSR switching is just buggy on some models.)
>
> The guest can learn something about the host this way:
> If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
> in #PF where we leak host's MSR_IA32_DS_AREA through CR2.
>
> After that, a malicious guest can map and configure memory where
> MSR_IA32_DS_AREA is pointing and can therefore get an output from
> host's tracing.
>
> This is not a critical leak as the host must initiate with PEBS tracing
> and I have not been able to get a record from more than one instruction
> before vmentry in vmx_vcpu_run() (that place has most registers already
> overwritten with guest's).
>
> We could disable PEBS just few instructions before vmentry, but
> disabling it earlier shouldn't affect host tracing too much.
> We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
> optimization isn't worth its code, IMO.
>
> (If you are implementing PEBS for guests, be sure to handle the case
> where both host and guest enable PEBS, because this patch doesn't.)
>
> Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.")
> Cc:
> Reported-by: Jiří Olša
> Signed-off-by: Radim Krčmář

Reviewed-by: David Matlack

BTW the commit message is great. Thanks for including so much detail.

> ---
> v2
> - moved code to add_atomic_switch_msr, so the patch will work [David]
> - more appropriate "KVM: VMX:" in subject
> v1: http://www.spinics.net/lists/kvm/msg128808.html
>
> arch/x86/kvm/vmx.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 46154dac71e6..e5572696c9e3 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1822,6 +1822,13 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>  			return;
>  		}
>  		break;
> +	case MSR_IA32_PEBS_ENABLE:
> +		/* PEBS needs a quiescent period after being disabled (to write
> +		 * a record). Disabling PEBS through VMX MSR swapping doesn't
> +		 * provide that period, so a CPU could write host's record into
> +		 * guest's memory.
> +		 */
> +		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
>  	}
>
>  	for (i = 0; i < m->nr; ++i)
> --
> 2.7.2
>
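
For readers who want to see the ordering the patch relies on outside the kernel, here is a rough userspace sketch. It is not the vmx.c code: fake_wrmsrl(), fake_pebs_enable, struct switch_list and add_switch_msr() are hypothetical stand-ins invented for this illustration, and only the MSR index 0x3f1 comes from the SDM. The point it tries to show is that PEBS is turned off directly at the moment the MSR is registered for switching, which is well before the entry-time switch, rather than being left to the switch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR index; everything else below is a toy model. */
#define MSR_IA32_PEBS_ENABLE 0x3f1u
#define NR_AUTOLOAD_MSRS 8

/* Hypothetical stand-in for the hardware register, so this runs in userspace. */
static uint64_t fake_pebs_enable;

/* Hypothetical stand-in for the kernel's wrmsrl(); models a single MSR. */
static void fake_wrmsrl(uint32_t msr, uint64_t val)
{
	if (msr == MSR_IA32_PEBS_ENABLE)
		fake_pebs_enable = val;
}

/* One guest/host value pair to be swapped around guest entry and exit. */
struct switch_entry {
	uint32_t index;
	uint64_t guest_val;
	uint64_t host_val;
};

struct switch_list {
	unsigned nr;
	struct switch_entry e[NR_AUTOLOAD_MSRS];
};

/*
 * Register an MSR for switching. For PEBS_ENABLE, also clear the (fake)
 * register right away, so that disabling does not depend on the later
 * switch. Returns 0 on success, -1 if the list is full.
 */
static int add_switch_msr(struct switch_list *m, uint32_t msr,
			  uint64_t guest_val, uint64_t host_val)
{
	unsigned i;

	if (msr == MSR_IA32_PEBS_ENABLE)
		fake_wrmsrl(MSR_IA32_PEBS_ENABLE, 0);

	/* Reuse an existing slot for this MSR if there is one. */
	for (i = 0; i < m->nr; ++i)
		if (m->e[i].index == msr)
			break;

	if (i == NR_AUTOLOAD_MSRS)
		return -1;
	if (i == m->nr)
		m->nr++;

	m->e[i].index = msr;
	m->e[i].guest_val = guest_val;
	m->e[i].host_val = host_val;
	return 0;
}
```

Calling add_switch_msr(&m, MSR_IA32_PEBS_ENABLE, 0, host_val) on a zero-initialized list leaves fake_pebs_enable at 0 immediately, while the host value is still recorded in the list so that it can be restored on exit.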