Date: Wed, 4 Oct 2023 13:21:52 +0200
From: Peter Zijlstra
To: Sean Christopherson
Cc: Ingo Molnar, Dapeng Mi, Paolo Bonzini, Arnaldo Carvalho de Melo,
    Kan Liang, Like Xu, Mark Rutland, Alexander Shishkin, Jiri Olsa,
    Namhyung Kim, Ian Rogers, Adrian Hunter, kvm@vger.kernel.org,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
    Zhenyu Wang, Zhang Xiong, Lv Zhiyuan, Yang Weijiang, Jim Mattson,
    David Dunn, Mingwei Zhang, Thomas Gleixner
Subject: Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event
Message-ID: <20231004112152.GA5947@noisy.programming.kicks-ass.net>

On Tue, Oct 03, 2023 at 08:23:26AM -0700, Sean Christopherson wrote:
> On Tue, Oct 03, 2023, Peter Zijlstra wrote:
> > On Mon, Oct 02, 2023 at 05:56:28PM -0700, Sean Christopherson wrote:
> > > Well drat, that there would have saved a wee bit of frustration. Better late
> > > than never though, that's for sure.
> > >
> > > Just to double confirm: keeping guest PMU state loaded until the vCPU is scheduled
> > > out or KVM exits to userspace, would mean that host perf events won't be active
> > > for potentially large swaths of non-KVM code. Any function calls or event/exception
> > > handlers that occur within the context of ioctl(KVM_RUN) would run with host
> > > perf events disabled.
> >
> > Hurmph, that sounds sub-optimal, earlier you said <1500 cycles, this all
> > sounds like a ton more.
> >
> > /me frobs around the kvm code some...
> >
> > Are we talking about exit_fastpath loop in vcpu_enter_guest() ? That
> > seems to run with IRQs disabled, so at most you can trigger a #PF or
> > something, which will then trip an exception fixup because you can't run
> > #PF with IRQs disabled etc..
> >
> > That seems fine. That is, a theoretical kvm_x86_handle_enter_irqoff()
> > coupled with the existing kvm_x86_handle_exit_irqoff() seems like
> > reasonable solution from where I'm sitting. That also more or less
> > matches the FPU state save/restore AFAICT.
> >
> > Or are you talking about the whole of vcpu_run() ? That seems like a
> > massive amount of code, and doesn't look like anything I'd call a
> > fast-path. Also, much of that loop has preemption enabled...
>
> The whole of vcpu_run(). And yes, much of it runs with preemption enabled. KVM
> uses preempt notifiers to context switch state if the vCPU task is scheduled
> out/in, we'd use those hooks to swap PMU state.
>
> Jumping back to the exception analogy, not all exits are equal. For "simple" exits
> that KVM can handle internally, the roundtrip is <1500. The exit_fastpath loop is
> roughly half that.
>
> But for exits that are more complex, e.g. if the guest hits the equivalent of a
> page fault, the cost of handling the page fault can vary significantly. It might
> be <1500, but it might also be 10x that if handling the page fault requires faulting
> in a new page in the host.
>
> We don't want to get too aggressive with moving stuff into the exit_fastpath loop,
> because doing too much work with IRQs disabled can cause latency problems for the
> host. This isn't much of a concern for slice-of-hardware setups, but would be
> quite problematic for other use cases.
>
> And except for obviously slow paths (from the guest's perspective), extra latency
> on any exit can be problematic. E.g. even if we got to the point where KVM handles
> 99% of exits the fastpath (may or may not be feasible), a not-fastpath exit at an
> inopportune time could throw off the guest's profiling results, introduce unacceptable
> jitter, etc.

I'm confused... the PMU must not be running after vm-exit. It must not
be able to profile the host. So what jitter are you talking about?

Even if we persist the MSR contents, the PMU itself must be disabled on
vm-exit and enabled on vm-enter. If not by hardware then by software
poking at the global ctrl msr.

I also don't buy the latency argument, we already do full and complete
PMU rewrites with IRQs disabled in the context switch path. And as
mentioned elsewhere, the whole AMX thing has an 8k copy stuck in the FPU
save/restore.

I would much prefer we keep the PMU swizzle inside the IRQ disabled
region of vcpu_enter_guest(). That's already a ton better than you have
today.
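
For illustration only, the ordering argued for above can be modelled in
userspace roughly as follows. This is a sketch under loud assumptions, not
kernel code: every name in it (toy_cpu, toy_pmu_state, toy_run,
toy_pmu_swizzle, toy_enter_guest) is made up for the example, global_ctrl is
a plain variable standing in for the global ctrl MSR, and a single integer
stands in for all counter state. The point it tries to show is that both
swaps sit inside one IRQ-disabled region, and that the PMU is switched off
before state is touched, so the guest's events never count host code and
the host's events are live again as soon as the region ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one CPU. None of these names are real kernel/KVM interfaces. */
struct toy_cpu {
	bool     irqs_enabled;
	uint64_t global_ctrl;	/* stand-in for the PMU global ctrl MSR */
	uint64_t counter;	/* a single stand-in counter */
};

/* PMU state that is not currently loaded on the CPU. */
struct toy_pmu_state {
	uint64_t global_ctrl;
	uint64_t counter;
};

/* Advance time: the counter only moves while the PMU is enabled. */
static void toy_run(struct toy_cpu *cpu, uint64_t cycles)
{
	if (cpu->global_ctrl & 1)
		cpu->counter += cycles;
}

/* Save the loaded state into *out, then load *in. IRQs must be off. */
static void toy_pmu_swizzle(struct toy_cpu *cpu, struct toy_pmu_state *out,
			    const struct toy_pmu_state *in)
{
	assert(!cpu->irqs_enabled);
	out->global_ctrl = cpu->global_ctrl;
	cpu->global_ctrl = 0;			/* PMU off before touching counters */
	out->counter = cpu->counter;
	cpu->counter = in->counter;
	cpu->global_ctrl = in->global_ctrl;	/* PMU back on for the new owner */
}

/* Guest entry/exit with both swaps inside one IRQ-disabled region. */
static void toy_enter_guest(struct toy_cpu *cpu, struct toy_pmu_state *host,
			    struct toy_pmu_state *guest, uint64_t guest_cycles)
{
	cpu->irqs_enabled = false;
	toy_pmu_swizzle(cpu, host, guest);	/* vm-enter: host out, guest in */
	toy_run(cpu, guest_cycles);		/* guest executes */
	toy_pmu_swizzle(cpu, guest, host);	/* vm-exit: guest out, host in */
	cpu->irqs_enabled = true;
}
```

Driving this from a main() with a host counter that has already seen 50
cycles, a 1000-cycle guest run, and 300 cycles of exit handling afterwards
leaves the host counter at 350 and the guest's saved counter at 1000:
neither side observes the other's cycles.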