Date: Wed, 4 Oct 2023 13:43:50 -0700
References: <20230927113312.GD21810@noisy.programming.kicks-ass.net>
 <20230929115344.GE6282@noisy.programming.kicks-ass.net>
 <20231002115718.GB13957@noisy.programming.kicks-ass.net>
 <20231002204017.GB27267@noisy.programming.kicks-ass.net>
Subject: Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event
From: Sean Christopherson
To: Mingwei Zhang
Cc: Peter Zijlstra, Ingo Molnar, Dapeng Mi, Paolo Bonzini,
 Arnaldo Carvalho de Melo, Kan Liang, Like Xu, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
 kvm@vger.kernel.org, linux-perf-users@vger.kernel.org,
 linux-kernel@vger.kernel.org, Zhenyu Wang, Zhang Xiong, Lv Zhiyuan,
 Yang Weijiang, Jim Mattson, David Dunn, Thomas Gleixner
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 03, 2023, Mingwei Zhang wrote:
> On Mon, Oct 2, 2023 at 5:56 PM Sean Christopherson wrote:
> > The "when" is what's important.  If KVM took a literal interpretation of
> > "exclude guest" for pass-through MSRs, then KVM would context switch all those
> > MSRs twice for every VM-Exit=>VM-Enter roundtrip, even when the VM-Exit isn't a
> > reschedule IRQ to schedule in a different task (or vCPU).  The overhead to save
> > all the host/guest MSRs and load all of the guest/host MSRs *twice* for every
> > VM-Exit would be a non-starter.  E.g. simple VM-Exits are completely handled in
> > <1500 cycles, and "fastpath" exits are something like half that.  Switching all
> > the MSRs is likely 1000+ cycles, if not double that.
>
> Hi Sean,
>
> Sorry, I have no intention to interrupt the conversation, but this is
> slightly confusing to me.
>
> I remember when doing AMX, we added gigantic 8KB memory in the FPU
> context switch. That works well in Linux today. Why can't we do the
> same for PMU? Assuming we context switch all counters, selectors and
> global stuff there?

That's what we (Google folks) are proposing.  However, there are significant
side effects if KVM context switches PMU outside of vcpu_run(), whereas the FPU
doesn't suffer the same problems.

Keeping the guest FPU resident for the duration of vcpu_run() is, in terms of
functionality, completely transparent to the rest of the kernel.  From the
kernel's perspective, the guest FPU is just a variation of a userspace FPU, and
the kernel is already designed to save/restore userspace/guest FPU state when
the kernel wants to use the FPU for whatever reason.  And crucially, kernel FPU
usage is explicit and contained, e.g. see kernel_fpu_{begin,end}(), and comes
with mechanisms for KVM to detect when the guest FPU needs to be reloaded (see
TIF_NEED_FPU_LOAD).

The PMU is a completely different story.  PMU usage, a.k.a. perf, by design is
"always running".  KVM can't transparently stop host usage of the PMU, as
disabling host PMU usage stops perf events from counting/profiling whatever it
is they're supposed to profile.

Today, KVM minimizes the "downtime" of host PMU usage by context switching PMU
state at VM-Enter and VM-Exit, or at least as close as possible, e.g. for LBRs
and Intel PT.

What we are proposing would *significantly* increase the downtime, to the point
where it would almost be unbounded in some paths, e.g. if KVM faults in a page,
gup() could go swap in memory from disk, install PTEs, and so on and so forth.
If the host is trying to profile something related to swap or memory management,
they're out of luck.
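
To put rough numbers on the downtime vs. overhead tradeoff, here is a small
userspace model, not kernel code.  It treats one vcpu_run() call as a list of
phases and assumes host perf events stop counting whenever guest PMU state is
loaded.  Every name below (struct phase, host_pmu_downtime(),
pmu_switch_overhead(), example_run) is hypothetical, the trace is made up, and
charging switch_cost once per direction is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one vcpu_run() call; none of this is kernel code. */
enum phase_kind {
	PHASE_GUEST,		/* guest executing, between VM-Enter and VM-Exit */
	PHASE_HOST_IN_RUN,	/* host code inside vcpu_run(), e.g. exit handling */
};

enum switch_policy {
	SWITCH_AT_VM_ENTER_EXIT,	/* swap PMU state around each guest phase */
	SWITCH_AT_VCPU_RUN,		/* swap once entering and once leaving vcpu_run() */
};

struct phase {
	enum phase_kind kind;
	uint64_t cycles;
};

/* Cycles during which host perf events are assumed not to be counting. */
uint64_t host_pmu_downtime(const struct phase *p, size_t n,
			   enum switch_policy policy)
{
	uint64_t down = 0;
	size_t i;

	assert(p || !n);
	for (i = 0; i < n; i++) {
		/* Per-entry switching keeps host state live outside the guest. */
		if (policy == SWITCH_AT_VM_ENTER_EXIT &&
		    p[i].kind != PHASE_GUEST)
			continue;
		down += p[i].cycles;
	}
	return down;
}

/* Cycles spent saving/loading PMU MSRs, at switch_cost per direction. */
uint64_t pmu_switch_overhead(const struct phase *p, size_t n,
			     enum switch_policy policy, uint64_t switch_cost)
{
	uint64_t round_trips = 0;
	size_t i;

	assert(p || !n);
	if (policy == SWITCH_AT_VCPU_RUN)
		return n ? 2 * switch_cost : 0;

	/* One save/load pair on the way in, one on the way out, per guest phase. */
	for (i = 0; i < n; i++)
		round_trips += (p[i].kind == PHASE_GUEST);
	return 2 * switch_cost * round_trips;
}

/* Made-up trace: two exits, the second one faulting in a swapped-out page. */
static const struct phase example_run[] = {
	{ PHASE_GUEST,		10000 },
	{ PHASE_HOST_IN_RUN,	1500 },		/* simple exit */
	{ PHASE_GUEST,		10000 },
	{ PHASE_HOST_IN_RUN,	5000000 },	/* hypothetical swap-in via gup() */
	{ PHASE_GUEST,		10000 },
};
#define EXAMPLE_RUN_LEN (sizeof(example_run) / sizeof(example_run[0]))
```

With a switch_cost of 1000 cycles, the made-up trace gives 30000 cycles of host
downtime and 6000 cycles of switch overhead when switching at VM-Enter/VM-Exit,
versus 5031500 cycles of downtime and 2000 cycles of overhead when switching at
vcpu_run() boundaries.  The overhead drops, but the downtime now includes
whatever the host does inside vcpu_run(), which is the part that is effectively
unbounded.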