Date: Wed, 24 Jan 2024 13:25:50 -0800
References: <20240124003858.3954822-1-mizhang@google.com> <20240124003858.3954822-2-mizhang@google.com>
Subject: Re: [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled
From: Sean Christopherson
To: Aaron Lewis
Cc: Mingwei Zhang, Paolo Bonzini, "H. Peter Anvin", kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org

On Wed, Jan 24, 2024, Aaron Lewis wrote:
> On Wed, Jan 24, 2024 at 7:49 AM Sean Christopherson wrote:
> >
> > On Wed, Jan 24, 2024, Mingwei Zhang wrote:
> > > Reset vcpu->arch.perf_capabilities to 0 if PDCM is disabled in guest cpuid.
> > > Without this, there is an issue in live migration. In particular, to
> > > migrate a VM with no PDCM enabled, VMM on the source is able to retrieve a
> > > non-zero value by reading the MSR_IA32_PERF_CAPABILITIES. However, VMM on
> > > the target is unable to set the value. This creates confusions on the user
> > > side.
> > >
> > > Fundamentally, it is because vcpu->arch.perf_capabilities as the cached
> > > value of MSR_IA32_PERF_CAPABILITIES is incorrect, and there is nothing
> > > wrong on the kvm_get_msr_common() which just reads
> > > vcpu->arch.perf_capabilities.
> > >
> > > Fix the issue by adding the reset code in kvm_vcpu_after_set_cpuid(), i.e.
> > > early in VM setup time.
> > >
> > > Cc: Aaron Lewis
> > > Signed-off-by: Mingwei Zhang
> > > ---
> > >  arch/x86/kvm/cpuid.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index adba49afb5fe..416bee03c42a 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -369,6 +369,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >       vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
> > >       vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
> > >
> > > +     /* Reset MSR_IA32_PERF_CAPABILITIES guest value to 0 if PDCM is off. */
> > > +     if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> > > +             vcpu->arch.perf_capabilities = 0;
> >
> > No, this is just papering over the underlying bug.
> > KVM shouldn't be stuffing
> > vcpu->arch.perf_capabilities without explicit writes from host userspace.  E.g
> > KVM_SET_CPUID{,2} is allowed multiple times, at which point KVM could clobber a
> > host userspace write to MSR_IA32_PERF_CAPABILITIES.  It's unlikely any userspace
> > actually does something like that, but KVM overwriting guest state is almost
> > never a good thing.
> >
> > I've been meaning to send a patch for a long time (IIRC, Aaron also ran into this?).
> > KVM needs to simply not stuff vcpu->arch.perf_capabilities.  I believe we are
> > already fudging around this in our internal kernels, so I don't think there's a
> > need to carry a hack-a-fix for the destination kernel.
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 27e23714e960..fdef9d706d61 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12116,7 +12116,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >
> >         kvm_async_pf_hash_reset(vcpu);
> >
> > -       vcpu->arch.perf_capabilities = kvm_caps.supported_perf_cap;
>
> Yeah, that will fix the issue we are seeing.  The only thing that's
> not clear to me is if userspace should expect KVM to set this or if
> KVM should expect userspace to set this.  How is that generally
> decided?

By "this", you mean the effective RESET value for vcpu->arch.perf_capabilities?
To be consistent with KVM's CPUID module at vCPU creation, which is completely
empty (vCPU has no PMU and no PDCM support) KVM *must* zero
vcpu->arch.perf_capabilities.

If userspace wants a non-zero value, then userspace needs to set CPUID to enable
PDCM and set MSR_IA32_PERF_CAPABILITIES.

MSR_IA32_ARCH_CAPABILITIES is in the same boat, e.g. a vCPU without
X86_FEATURE_ARCH_CAPABILITIES can end up seeing a non-zero MSR value.  That too
should be excised.

In a perfect world, KVM would also zero-initialize vcpu->arch.msr_platform_info,
but that one is less obviously broken and also less obviously safe to remove.

  commit e53d88af63ab4104e1226b8f9959f1e9903da10b
  Author:     Jim Mattson
  AuthorDate: Tue Oct 30 12:20:21 2018 -0700
  Commit:     Paolo Bonzini
  CommitDate: Fri Dec 14 18:00:01 2018 +0100

    kvm: x86: Don't modify MSR_PLATFORM_INFO on vCPU reset

    If userspace has provided a different value for this MSR (e.g with the
    turbo bits set), the userspace-provided value should survive a vCPU
    reset. For backwards compatibility, MSR_PLATFORM_INFO is initialized
    in kvm_arch_vcpu_setup.

    Signed-off-by: Jim Mattson
    Reviewed-by: Drew Schmitt
    Cc: Abhiroop Dabral
    Signed-off-by: Paolo Bonzini

In other words, KVM shouldn't define the vCPU model beyond the absolute bare
minimum that is required by the x86 architecture (as of P6 CPUs, which is more
or less the oldest CPU KVM can reasonably virtualize without carrying useless
code).
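
For anyone following along, here is a toy model of the get/set asymmetry
discussed above. It is a minimal C sketch, not KVM code: every name in it
(toy_vcpu, toy_set_perf_cap(), TOY_SUPPORTED_PERF_CAP, ...) is hypothetical,
the capability mask is arbitrary, and the write-side check (refuse a non-zero
value when PDCM is absent from guest CPUID) is an assumption based on how the
thread describes the destination behaving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary stand-in for what the host can virtualize (assumption). */
#define TOY_SUPPORTED_PERF_CAP UINT64_C(0x7)

/* Toy vCPU: hypothetical, not KVM's struct kvm_vcpu. */
struct toy_vcpu {
    bool     guest_has_pdcm;     /* stands in for PDCM in guest CPUID */
    uint64_t perf_capabilities;  /* stands in for the cached MSR value */
};

/*
 * Create a vCPU with an empty CPUID model. stuff_default == true mimics
 * pre-loading the supported value at creation; false zero-initializes.
 */
static struct toy_vcpu toy_vcpu_create(bool stuff_default)
{
    struct toy_vcpu v = { .guest_has_pdcm = false, .perf_capabilities = 0 };

    if (stuff_default)
        v.perf_capabilities = TOY_SUPPORTED_PERF_CAP;
    return v;
}

/* Host-initiated read: returns the cached value unconditionally. */
static uint64_t toy_get_perf_cap(const struct toy_vcpu *v)
{
    return v->perf_capabilities;
}

/*
 * Host-initiated write. Returns 0 on success, -1 if rejected.
 * Assumed rule: non-zero needs PDCM, and no unsupported bits may be set.
 */
static int toy_set_perf_cap(struct toy_vcpu *v, uint64_t data)
{
    if (data && !v->guest_has_pdcm)
        return -1;
    if (data & ~TOY_SUPPORTED_PERF_CAP)
        return -1;
    v->perf_capabilities = data;
    return 0;
}

/* Save from src and restore into dst: CPUID first, then the MSR. */
static int toy_migrate(const struct toy_vcpu *src, struct toy_vcpu *dst)
{
    assert(src && dst);
    dst->guest_has_pdcm = src->guest_has_pdcm;
    return toy_set_perf_cap(dst, toy_get_perf_cap(src));
}
```

With stuffing at creation and PDCM off, the read on the source returns a
non-zero value and toy_migrate() returns -1. With zero-initialization the same
migration succeeds, and a non-zero value only ever exists after userspace has
enabled PDCM and written the MSR itself.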