Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp620195pxb; Tue, 2 Feb 2021 13:26:04 -0800 (PST) X-Google-Smtp-Source: ABdhPJxreRn50YdqjHJ0w9Bs3RBRdCvuKT6Ul6h4T7LViYgWt//FJ07809h036aaxrnUIOCxmhjT X-Received: by 2002:a17:906:57d4:: with SMTP id u20mr8086979ejr.247.1612301164327; Tue, 02 Feb 2021 13:26:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1612301164; cv=none; d=google.com; s=arc-20160816; b=qd8dSWWfKV08+L4bJ7SljJvBGlYKxZrK189DOT9CfUxHyOjcWZtJq+2Ur3IviI80t2 +vIUdP9IPBf5mLaxb+7J2IF201hB30teEjsjnH65NeBQT5kX34tEAjZHhqrVa9xjcKhO OVxm59kZd1IWR81GtcBste6sfX3I2cceCxdAtKrGW2CuniuPXN8Koo8GzhAXkn/RKOXS XZiKIMEEXywyE24u0XZyM904NhcoaF7I/NweJi/128/qnxAX6wLYMi+ZN4OK51CTby+A YI6SOf3+eQTGQ9mkB5HGJDrzLoVlmGM/l9GXrfYncQyE5uQfisI7jRi+gwgvcmZ9/Dn+ ks5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=NZRr21VwPb+gddA+LsXP9sowDhSK8vzx4R4PPP1BuY4=; b=ryZDLd3rdYJ2DcckF63VgX34xKXO6KduQvgZLKjjA1H31oC5wCrXlP3cQQ48n0WR+p lgaepB4QFy9tuEJysZrTh9tktpeYN6XaG2GGFezmpSBJrIrdFL4oGtMUSlrMpy2M/yvO je+LbnuaheIsAYNdhruFcWTxMsKS1u+xlC2CLHv7ZGk7eUeRIaWBINvvZQ+gHCHtSzy0 loaP1pXOznyTrKeTK2p/ujoy9zPoMp4oBYD5dJfIuvq3mJa1FX5tAF7FplemiXLT7bbo HIuBlbQfSW1LaCDx6CMFkaeyXvQSpWmqD6HOsQmaylGtn65jDOtjn02ZucBoFVL+m0rg nRHg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BLdU5lio; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id q26si34417eja.258.2021.02.02.13.25.39; Tue, 02 Feb 2021 13:26:04 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BLdU5lio; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231470AbhBBMjg (ORCPT + 99 others); Tue, 2 Feb 2021 07:39:36 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:22210 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230377AbhBBMjb (ORCPT ); Tue, 2 Feb 2021 07:39:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612269482; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NZRr21VwPb+gddA+LsXP9sowDhSK8vzx4R4PPP1BuY4=; b=BLdU5lioZPqal/0DTKGsvWF9csI95kWKNw3KPF8qmL4CvIXpE+TP2w92oUDFcl5tzoy8mg 1Gw4TKkfe4cI6w3Xo/qrAi78oxUrAIc2E0eHtbjgCF9nucs4G+pCDOmtui73VvYGvvTYcF 0We4gRBquhYxF5z5khgBEA/RnKkk6Fg= Received: from mail-ed1-f71.google.com (mail-ed1-f71.google.com [209.85.208.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-383-198W8QOfP0-oy8_WPc5UYw-1; Tue, 02 Feb 2021 07:38:00 -0500 X-MC-Unique: 198W8QOfP0-oy8_WPc5UYw-1 Received: by mail-ed1-f71.google.com with SMTP id o8so9529306edh.12 for ; Tue, 02 Feb 2021 04:38:00 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=NZRr21VwPb+gddA+LsXP9sowDhSK8vzx4R4PPP1BuY4=; b=Ih5QbtiL7D+z4A1gHppqf4+C3l1wKaILD0Kni75V2BbleRL+cwX0ABLuyI1MZ07dOI MY4obdBQNWtDoPkuRfAXIMIWnkzF0a56yzWcBwXaWDkBXJr6X3keJGSm5EfIqd69/IBO cMGjNLazbmtibe7tyQwld29fcQDdOqvQYkEg978NWxPG80pEs1VLbhzAi0YY+itDc9Nd Tauoguw5SOMQOJig33a8RGKRGRv6tXVos60D7ttCAE7CsWwNlhw+/TxjBprgL7Onn8WE j6JZYahLnK6r8Y2iMKBCOExJykUwwkEE+aGA1SI5HmAFIq+UbEJYzCMo0B436s+KEm/J uSwA== X-Gm-Message-State: AOAM531hQkjuMwQ08JmCPkjf6xeRHR/W//pyxwldJm4uHwSz4Qkld5hU OODOoJBtbrJbqRcA+GNq3XIF865Q/0v2lCT1TOBuNdy8JuXuFAYV1BopbMbED1GB8pTAnsTUPXf hhKlS4HAdmkhbkWhn2pnL5xTUL/SWxMDb9EK+GZGPciFi6zz4UkeuhTAadRESpDsLFk/D8BzSHW VP X-Received: by 2002:a17:906:6407:: with SMTP id d7mr21780948ejm.133.1612269479064; Tue, 02 Feb 2021 04:37:59 -0800 (PST) X-Received: by 2002:a17:906:6407:: with SMTP id d7mr21780920ejm.133.1612269478753; Tue, 02 Feb 2021 04:37:58 -0800 (PST) Received: from ?IPv6:2001:b07:6468:f312:c8dd:75d4:99ab:290a? ([2001:b07:6468:f312:c8dd:75d4:99ab:290a]) by smtp.gmail.com with ESMTPSA id v25sm9597237ejw.21.2021.02.02.04.37.56 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 02 Feb 2021 04:37:57 -0800 (PST) Subject: Re: [PATCH v14 00/11] KVM: x86/pmu: Guest Last Branch Recording Enabling To: Like Xu , Sean Christopherson Cc: Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, kan.liang@intel.com, alex.shi@linux.alibaba.com, kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org References: <20210201051039.255478-1-like.xu@linux.intel.com> From: Paolo Bonzini Message-ID: <196ffa8c-4027-724a-ca29-f1db69f48ab7@redhat.com> Date: Tue, 2 Feb 2021 13:37:56 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <20210201051039.255478-1-like.xu@linux.intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/02/21 06:10, Like Xu wrote: > Hi geniuses, > > Please help review this new version which enables the guest LBR. > > We already upstreamed the guest LBR support in the host perf, please > check more details in each commit and feel free to test and comment. > > QEMU part: https://lore.kernel.org/qemu-devel/20210201045453.240258-1-like.xu@linux.intel.com > kvm-unit-tests: https://lore.kernel.org/kvm/20210201045751.243231-1-like.xu@linux.intel.com > > v13-v14 Changelog: > - Rewrite crud about vcpu->arch.perf_capabilities; > - Add PERF_CAPABILITIES testcases to tools/testing/selftests/kvm; > - Add basic LBR testcases to the kvm-unit-tests (w/ QEMU patches); > - Apply rewritten commit log from Paolo; > - Queued the first patch "KVM: x86: Move common set/get handler ..."; > - Rename 'already_passthrough' to 'msr_passthrough'; > - Check the values of MSR_IA32_PERF_CAPABILITIES early; > - Call kvm_x86_ops.pmu_ops->cleanup() always and drop extra_cleanup; > - Use INTEL_PMC_IDX_FIXED_VLBR directly; > - Fix a bug in the vmx_get_perf_capabilities(); > > Previous: > https://lore.kernel.org/kvm/20210108013704.134985-1-like.xu@linux.intel.com/ Queued, thanks. There were some conflicts with the bus lock detection patches, so I had to tweak a bit the DEBUGCTL MSR handling. Paolo > --- > > The last branch recording (LBR) is a performance monitor unit (PMU) > feature on Intel processors that records a running trace of the most > recent branches taken by the processor in the LBR stack. This patch > series is going to enable this feature for plenty of KVM guests. > > with this patch set, the following error will be gone forever and cloud > developers can better understand their programs with less profiling overhead: > > $ perf record -b lbr ${WORKLOAD} > or $ perf record --call-graph lbr ${WORKLOAD} > Error: > cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' > > The user space could configure whether it's enabled or not for each > guest via MSR_IA32_PERF_CAPABILITIES msr. As a first step, a guest > could only enable LBR feature if its cpu model is the same as the > host since the LBR feature is still one of model specific features. > > If it's enabled on the guest, the guest LBR driver would accesses the > LBR MSR (including IA32_DEBUGCTLMSR and records MSRs) as host does. > The first guest access on the LBR related MSRs is always interceptible. > The KVM trap would create a special LBR event (called guest LBR event) > which enables the callstack mode and none of hardware counter is assigned. > The host perf would enable and schedule this event as usual. > > Guest's first access to a LBR registers gets trapped to KVM, which > creates a guest LBR perf event. It's a regular LBR perf event which gets > the LBR facility assigned from the perf subsystem. Once that succeeds, > the LBR stack msrs are passed through to the guest for efficient accesses. > However, if another host LBR event comes in and takes over the LBR > facility, the LBR msrs will be made interceptible, and guest following > accesses to the LBR msrs will be trapped and meaningless. > > Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) in > VMX transition brings too excessive overhead to frequent vmx transition > itself, the guest LBR event would help save/restore the LBR stack msrs > during the context switching with the help of native LBR event callstack > mechanism, including LBR_SELECT msr. > > If the guest no longer accesses the LBR-related MSRs within a scheduling > time slice and the LBR enable bit is unset, vPMU would release its guest > LBR event as a normal event of a unused vPMC and the pass-through > state of the LBR stack msrs would be canceled. > > --- > > LBR testcase: > echo 1 > /proc/sys/kernel/watchdog > echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent > echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate > echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent > perf record -b ./br_instr a > (perf record --call-graph lbr ./br_instr a) > > - Perf report on the host: > Samples: 72K of event 'cycles', Event count (approx.): 72512 > Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles > 12.12% br_instr br_instr [.] cmp_end [.] lfsr_cond 1 > 11.05% br_instr br_instr [.] lfsr_cond [.] cmp_end 5 > 8.81% br_instr br_instr [.] lfsr_cond [.] cmp_end 4 > 5.04% br_instr br_instr [.] cmp_end [.] lfsr_cond 20 > 4.92% br_instr br_instr [.] lfsr_cond [.] cmp_end 6 > 4.88% br_instr br_instr [.] cmp_end [.] lfsr_cond 6 > 4.58% br_instr br_instr [.] cmp_end [.] lfsr_cond 5 > > - Perf report on the guest: > Samples: 92K of event 'cycles', Event count (approx.): 92544 > Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles > 12.03% br_instr br_instr [.] cmp_end [.] lfsr_cond 1 > 11.09% br_instr br_instr [.] lfsr_cond [.] cmp_end 5 > 8.57% br_instr br_instr [.] lfsr_cond [.] cmp_end 4 > 5.08% br_instr br_instr [.] lfsr_cond [.] cmp_end 6 > 5.06% br_instr br_instr [.] cmp_end [.] lfsr_cond 20 > 4.87% br_instr br_instr [.] cmp_end [.] lfsr_cond 6 > 4.70% br_instr br_instr [.] cmp_end [.] lfsr_cond 5 > > Conclusion: the profiling results on the guest are similar to that on the host. > > Like Xu (11): > KVM: x86/vmx: Make vmx_set_intercept_for_msr() non-static > KVM: x86/pmu: Set up IA32_PERF_CAPABILITIES if PDCM bit is available > KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled > KVM: vmx/pmu: Expose DEBUGCTLMSR_LBR in the MSR_IA32_DEBUGCTLMSR > KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR > KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE > KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation > KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI > KVM: vmx/pmu: Release guest LBR event via lazy release mechanism > KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES > selftests: kvm/x86: add test for pmu msr MSR_IA32_PERF_CAPABILITIES > > arch/x86/kvm/pmu.c | 8 +- > arch/x86/kvm/pmu.h | 2 + > arch/x86/kvm/vmx/capabilities.h | 19 +- > arch/x86/kvm/vmx/pmu_intel.c | 281 +++++++++++++++++- > arch/x86/kvm/vmx/vmx.c | 55 +++- > arch/x86/kvm/vmx/vmx.h | 28 ++ > arch/x86/kvm/x86.c | 2 +- > tools/testing/selftests/kvm/.gitignore | 1 + > tools/testing/selftests/kvm/Makefile | 1 + > .../selftests/kvm/x86_64/vmx_pmu_msrs_test.c | 149 ++++++++++ > 10 files changed, 524 insertions(+), 22 deletions(-) > create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_pmu_msrs_test.c >