Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7748BC54EAA for ; Fri, 27 Jan 2023 22:46:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230149AbjA0Wq1 (ORCPT ); Fri, 27 Jan 2023 17:46:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231890AbjA0WqZ (ORCPT ); Fri, 27 Jan 2023 17:46:25 -0500 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 181514A1F2 for ; Fri, 27 Jan 2023 14:46:24 -0800 (PST) Received: by mail-pl1-x633.google.com with SMTP id jl3so6410317plb.8 for ; Fri, 27 Jan 2023 14:46:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=ZEYnhr/B5b9mHUoTu3/hsSx7bXw1ljXohhUlX7HpfeQ=; b=VX7UwjHqDndiT8deGr2yNk6PuKZhLEkrMsCxTDrbAD+YJvACNeKBitXSo7rWBR8s3A hex+j/TahRdLE2+iXhdd/UZb1N5ZP4YjQpuOtlHkxjhl0l+XHWZNkWbRhb2cynLezN9V 1hWvNC6omWu32vQ0CgIM9Qu5o3NtRQfbjLQhTrtDyqqd/LoN6zYDH+S8qtTm+VsIyLvJ 66/r+PgaIWU5CQzt4XpAaNeEvNOtYUug32Sw7NpOk6GGqz/CB5yY9JiBRQB/YyScad1V EmguIJVtWtaW2iLlKox6drY7m71gggdbgQHupqAv7gTX8JPpUS6O06qWYIU6BrEeKrB3 4Dew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=ZEYnhr/B5b9mHUoTu3/hsSx7bXw1ljXohhUlX7HpfeQ=; b=VPlqhn05AHPVxhZYstIO3G0TjMZTCFsJTy3UYE2KwCeLNZiBuTDyjfjMq+3WiZYEFV 5Ok4HD06R87OxZo2EyeglwogH4uBDbTNzhRPewnZmvW0FFdWIZjFInSQ2JkwQ5iprLvj HoRJDW+9Pdw4B9RAxxbhv7zY5pyvYwUNi3Q7tlsXGesg61FMDZ+lWVEAyNEFK87lKmje Zhr3SqqVGkUs1aScXHvzaPpvLU8T5n7Mlc4KdvS5uXEr7wzfP6RSPm6EUkqI7VZ2JDD/ 2zTToU5foKpO02haea1qmPHw7/1fSgXZrjYAIQaz0vJaR5o7cFBXmwVzPvhMBKGtLnwU Jt3w== X-Gm-Message-State: AO0yUKXONgK//hWbc6B61kKL/0EJgNJvZul5Oo/9Dczaaf8dnoybR/No W5ZH9GEpzods4AcVGgDQjsjeZQ== X-Google-Smtp-Source: AK7set8OSkfemmivnzt+FaTqk7pVCDdsxRS5FakopzxPZyWWlPwJdqWTOnIrlHupwNkFnDayOuy2BQ== X-Received: by 2002:a17:90a:6949:b0:219:f970:5119 with SMTP id j9-20020a17090a694900b00219f9705119mr170655pjm.1.1674859583405; Fri, 27 Jan 2023 14:46:23 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id c71-20020a63354a000000b004cffa8c0227sm2802295pga.23.2023.01.27.14.46.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 27 Jan 2023 14:46:22 -0800 (PST) Date: Fri, 27 Jan 2023 22:46:18 +0000 From: Sean Christopherson To: Yang Weijiang Cc: pbonzini@redhat.com, jmattson@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, like.xu.linux@gmail.com, kan.liang@linux.intel.com, wei.w.wang@intel.com Subject: Re: [PATCH v2 00/15] Introduce Architectural LBR for vPMU Message-ID: References: <20221125040604.5051-1-weijiang.yang@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221125040604.5051-1-weijiang.yang@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 24, 2022, Yang Weijiang wrote: > Intel CPU model-specific LBR(Legacy LBR) has evolved to Architectural > LBR(Arch LBR [0]), it's the replacement of legacy LBR on new platforms. > The native support patches were merged into 5.9 kernel tree, and this > patch series is to enable Arch LBR in vPMU so that guest can benefit > from the feature. > > The main advantages of Arch LBR are [1]: > - Faster context switching due to XSAVES support and faster reset of > LBR MSRs via the new DEPTH MSR > - Faster LBR read for a non-PEBS event due to XSAVES support, which > lowers the overhead of the NMI handler. > - Linux kernel can support the LBR features without knowing the model > number of the current CPU. > > From end user's point of view, the usage of Arch LBR is the same as > the Legacy LBR that has been merged in the mainline. > > Note, in this series, there's one restriction for guest Arch LBR, i.e., > guest can only set its LBR record depth the same as host's. This is due > to the special behavior of MSR_ARCH_LBR_DEPTH: > 1) On write to the MSR, it'll reset all Arch LBR recording MSRs to 0s. > 2) XRSTORS resets all record MSRs to 0s if the saved depth mismatches > MSR_ARCH_LBR_DEPTH. > Enforcing the restriction keeps KVM Arch LBR vPMU working flow simple > and straightforward. > > Paolo refactored the old series and the resulting patches became the > base of this new series, therefore he's the author of some patches. To be very blunt, this series is a mess. I don't want to point fingers as there is plenty of blame to go around. The existing LBR support is a confusing mess, vPMU as a whole has been neglected for too long, review feedback has been relatively non-existent, and I'm sure some of the mess is due to Paolo trying to hastily fix things up back when this was temporarily queued. However, for arch LBR support to be merged, things need to change. First and foremost, the existing LBR support needs to be documented. Someone, I don't care who, needs to provide a detailed writeup of the contract between KVM and perf. Specifically, I want to know: 1. When exactly is perf allowed to take control of LBR MRS. Task switch? IRQ? NMI? 2. What is the expected behavior when perf is using LBRs? Is the guest supposed to be traced? 3. Why does KVM snapshot DEBUGCTL with IRQs enabled, but disables IRQs when accessing LBR MSRs? It doesn't have to be polished, e.g. I'll happily wordsmith things into proper documentation, but I want to have a very clear understanding of how LBR support is _intended_ to function and how it all _actually_ functions without having to make guesses. And depending on the answers, I want to revisit KVM's LBR implementation before tackling arch LBRs. Letting perf usurp LBRs while KVM has the vCPU loaded is frankly ridiculous. Just have perf set a flag telling KVM that it needs to take control of LBRs and have KVM service the flag as a request or something. Stealing the LBRs back in IRQ context adds a stupid amount of complexity without much value, e.g. waiting a few branches for KVM to get to a safe place isn't going to meaningfully change the traces. If that can't actually happen, then why on earth does KVM need to disable IRQs to read MSRs? And AFAICT, since KVM unconditionally loads the guest's DEBUGCTL, whether or not guest branches show up in the LBRs when the host is tracing is completely up to the whims of the guest. If that's correct, then again, what's the point of the dance between KVM and perf? Beyond the "how does this work" issues, there needs to be tests. At the absolute minimum, there needs to be selftests showing that this stuff actually works, that save/restore (migration) works, that the MSRs can/can't be accessed when guest CPUID is (in)correctly configured, etc. And I would really, really like to have tests that force contention between host and guests, e.g. to make sure that KVM isn't leaking host state or outright exploding, but I can understand that those types of tests would be very difficult to write. I've pushed a heavily reworked, but definitely broken, version to git@github.com:sean-jc/linux.git x86/arch_lbrs It compiles, but it's otherwise untested and there are known gaps. E.g. I omitted toggling load+clear of ARCH_LBR_CTL because I couldn't figure out the intended behavior.