From: Mingwei Zhang
Date: Mon, 2 Oct 2023 12:02:47 -0700
Subject: Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event
To: David Dunn
Cc: Ingo Molnar, Peter Zijlstra, Sean Christopherson, Dapeng Mi,
    Paolo Bonzini, Arnaldo Carvalho de Melo, Kan Liang, Like Xu,
    Mark Rutland, Alexander Shishkin, Jiri Olsa,
    Namhyung Kim, Ian Rogers, Adrian Hunter, kvm@vger.kernel.org,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
    Zhenyu Wang, Zhang Xiong, Lv Zhiyuan, Yang Weijiang, Dapeng Mi,
    Jim Mattson, Thomas Gleixner

On Mon, Oct 2, 2023 at 8:23 AM David Dunn wrote:
>
> On Mon, Oct 2, 2023 at 6:30 AM Ingo Molnar wrote:
> >
> >
> > The host OS shouldn't offer facilities that severely limit its own capabilities,
> > when there's a better solution. We don't give the FPU to apps exclusively either,
> > it would be insanely stupid for a platform to do that.
> >
>
> If you think of the guest VM as a usermode application (which it
> effectively is), the analogous situation is that there is no way to
> tell the usermode application which portions of the FPU state might be
> used by the kernel without context switching. Although the kernel can
> and does use FPU state, it doesn't zero out a portion of that state
> whenever the kernel needs to use the FPU.
>
> Today there is no way for a guest to dynamically adjust which PMU
> state is valid or invalid. And this changes based on usage by other
> commands run on the host. As observed by perf subsystem running in
> the guest kernel, this looks like counters that simply zero out and
> stop counting at random.
>
> I think the request here is that there be a way for KVM to be able to
> tell the guest kernel (running the perf subsystem) that it has a
> functional HW PMU. And for that to be true. This doesn't mean taking
> away the use of the PMU any more than exposing the FPU to usermode
> applications means taking away the FPU from the kernel. But it does
> mean that when entering the KVM run loop, the host perf system needs
> to context switch away the host PMU state and allow KVM to load the
> guest PMU state. And much like the FPU situation, the portion of the
> host kernel that runs between the context switch to the KVM thread and
> VMENTER to the guest cannot use the PMU.
>
> This obviously should be a policy set by the host owner. They are
> deliberately giving up the ability to profile that small portion of
> the host (KVM VCPU thread cannot be profiled) in return to providing a
> full set of perf functionality to the guest kernel.
>

+1

I was pretty confused until I read this one. Pass-through vPMU for the
guest VM does not conflict with the host PMU software. All we need is
to accept that host PMU software (the perf subsystem in Linux) can
co-exist with pass-through vPMU in KVM. Both could work directly on the
hardware PMU, operating the registers etc.

To achieve this, I think what we are really asking of the perf
subsystem in Linux is two things:

- Full context switch for the hardware PMU. Currently, the perf
  subsystem is the exclusive owner of this piece of hardware, so this
  code is missing.
- NMI sharing or NMI control transfer. Either KVM could raise its own
  NMI handler and get control transferred, or Linux promotes the
  existing NMI handler to serve two entities in the kernel.

Once the above is achieved, I believe KVM and the perf subsystem in
Linux could share the hardware PMU harmoniously, instead of forcing the
former to be a client of the latter.
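
As a rough illustration of the two items above, here is a toy user-space
model in Python. It is not kernel code and not what the patch series
implements; every name in it (HardwarePmu, Vcpu, route_pmi, NUM_COUNTERS)
is hypothetical and exists only for this sketch. It assumes the PMU state
is just a handful of counters, that the full state is swapped on guest
entry and exit, and that the PMU interrupt is routed by whoever currently
owns the hardware.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NUM_COUNTERS = 4  # hypothetical; real PMUs differ in number and kind


@dataclass
class PmuState:
    """Software copy of the PMU registers (toy model: counters only)."""
    counters: List[int] = field(default_factory=lambda: [0] * NUM_COUNTERS)


class HardwarePmu:
    """Stand-in for the physical PMU; exactly one owner at a time."""

    def __init__(self) -> None:
        self.regs = PmuState()
        self.owner = "host"

    def save(self) -> PmuState:
        # Snapshot the live registers into a software copy.
        return PmuState(list(self.regs.counters))

    def load(self, state: PmuState, owner: str) -> None:
        # Overwrite the live registers and record the new owner.
        self.regs = PmuState(list(state.counters))
        self.owner = owner

    def count(self, n: int, counter: int = 0) -> None:
        # Events are credited to whichever state is currently loaded.
        self.regs.counters[counter] += n


class Vcpu:
    """Hypothetical vCPU that swaps the whole PMU around guest execution."""

    def __init__(self, pmu: HardwarePmu) -> None:
        self.pmu = pmu
        self.guest_state = PmuState()
        self.host_state: Optional[PmuState] = None

    def enter_guest(self) -> None:
        # Item 1: full context switch, host state out and guest state in.
        self.host_state = self.pmu.save()
        self.pmu.load(self.guest_state, "guest")

    def exit_guest(self) -> None:
        # Reverse switch: guest state out, host state back in.
        assert self.host_state is not None, "exit_guest without enter_guest"
        self.guest_state = self.pmu.save()
        self.pmu.load(self.host_state, "host")
        self.host_state = None


def route_pmi(pmu: HardwarePmu) -> str:
    """Item 2: pick the handler for a PMU interrupt based on the owner."""
    return "kvm" if pmu.owner == "guest" else "perf"
```

In this model the host counters simply do not advance while the guest
state is loaded, which corresponds to the host giving up profiling of
that window, and the guest's counts survive across exits because they
are saved and reloaded rather than zeroed.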

To step back a little bit, we are not asking about feasibility, since
KVM and the perf subsystem sharing the hardware PMU is already a
reality because of TDX/SEV-SNP. So I think all of this is just a draft
proposal to make the sharing clean and efficient.

Thanks.
-Mingwei

> Dave Dunn