From: Like Xu
To: Paolo Bonzini
Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu
Subject: [PATCH v12 00/11] Guest Last Branch Recording Enabling
Date: Sat, 13 Jun 2020 16:09:45 +0800
Message-Id: <20200613080958.132489-1-like.xu@linux.intel.com>
X-Mailer: git-send-email 2.21.3
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

Please help review this new version for the kernel 5.9 release.

You can now apply the last two qemu-devel patches to upstream QEMU
and try the guest LBR feature with the '-cpu host' command line option.

v11->v12 Changelog:
- apply "Signed-off-by" from PeterZ and his code for the perf subsystem;
- add validity checks before exposing LBR via MSR_IA32_PERF_CAPABILITIES;
- refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
- reorder "perf_event_attr" fields according to how they're declared;
- replace event_is_oncpu() with "event->state" check;
- make LBR emulation specific to vmx rather than x86 generic;
- move pass-through LBR code to vmx.c instead of pmu_intel.c;
- add vmx_lbr_en/disable_passthrough layer to make code readable;
- rewrite pmu availability check with vmx_passthrough_lbr_msrs();

You may check more details in each commit.

Previous:
https://lore.kernel.org/kvm/20200514083054.62538-1-like.xu@linux.intel.com/

---

Last branch recording (LBR) is a performance monitoring unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. This patch
series enables this feature for KVM guests.

Userspace can configure whether it is enabled for each guest via the
MSR_IA32_PERF_CAPABILITIES MSR. As a first step, a guest can enable the
LBR feature only if its CPU model is the same as the host's, since LBR
is still a model-specific feature.

If it's enabled on the guest, the guest LBR driver accesses the LBR
MSRs (including IA32_DEBUGCTLMSR and the record MSRs) as the host does.
The first guest access to the LBR-related MSRs is always intercepted.
The KVM trap handler then creates a special LBR event (called the guest
LBR event) which enables callstack mode and has no hardware counter
assigned.
Host perf enables and schedules this event as usual.

The guest's first access to an LBR register is trapped by KVM, which
creates a guest LBR perf event. It's a regular LBR perf event which gets
the LBR facility assigned from the perf subsystem. Once that succeeds,
the LBR stack MSRs are passed through to the guest for efficient
accesses. However, if another host LBR event comes in and takes over the
LBR facility, the LBR MSRs are made interceptible again, and the guest's
subsequent accesses to the LBR MSRs are trapped and meaningless.

Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) on
every VMX transition would add excessive overhead to the frequent VMX
transitions themselves, the guest LBR event instead saves/restores the
LBR stack MSRs (including the LBR_SELECT MSR) during context switches,
using the native LBR event callstack mechanism.

If the guest no longer accesses the LBR-related MSRs within a scheduling
time slice and the LBR enable bit is unset, the vPMU releases its guest
LBR event as a normal event of an unused vPMC, and the pass-through
state of the LBR stack MSRs is canceled.

---

LBR testcase:
echo 1 > /proc/sys/kernel/watchdog
echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
./perf record -b ./br_instr a

- Perf report on the host:
Samples: 72K of event 'cycles', Event count (approx.): 72512
Overhead  Command   Source Shared Object  Source Symbol  Target Symbol  Basic Block Cycles
  12.12%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   1
  11.05%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     5
   8.81%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     4
   5.04%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond  20
   4.92%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     6
   4.88%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   6
   4.58%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   5

- Perf report on the guest:
Samples: 92K of event 'cycles', Event count (approx.): 92544
Overhead  Command   Source Shared Object  Source Symbol  Target Symbol  Basic Block Cycles
  12.03%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   1
  11.09%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     5
   8.57%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     4
   5.08%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end     6
   5.06%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond  20
   4.87%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   6
   4.70%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond   5

Conclusion: the profiling results on the guest are similar to those on
the host.

Like Xu (10):
  perf/x86/core: Refactor hw->idx checks and cleanup
  perf/x86/lbr: Add interface to get LBR information
  perf/x86: Add constraint to create guest LBR event without hw counter
  perf/x86: Keep LBR records unchanged in host context for guest usage
  KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
  KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emualtion
  KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
  KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
  KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
  KVM: vmx/pmu: Release guest LBR event via lazy release mechanism

Wei Wang (1):
  perf/x86: Fix variable types for LBR registers

Qemu-devel:
  target/i386: define a MSR based feature word - FEAT_PERF_CAPABILITIES
  target/i386: add -cpu,lbr=true support to enable guest LBR

 arch/x86/events/core.c            |  26 +--
 arch/x86/events/intel/core.c      | 109 ++++++++-----
 arch/x86/events/intel/lbr.c       |  51 +++++-
 arch/x86/events/perf_event.h      |   8 +-
 arch/x86/include/asm/perf_event.h |  34 +++-
 arch/x86/kvm/pmu.c                |  12 +-
 arch/x86/kvm/pmu.h                |   5 +
 arch/x86/kvm/vmx/capabilities.h   |  23 ++-
 arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
 arch/x86/kvm/vmx/vmx.h            |  17 ++
 arch/x86/kvm/x86.c                |  13 --
 12 files changed, 559 insertions(+), 78 deletions(-)
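
For readers who want a compact picture of the guest LBR event lifecycle
described earlier in this letter (lazy creation on the first trapped
access, pass-through while the event owns the LBR facility, interception
when a host LBR event takes over, lazy release after an idle time
slice), here is a toy Python model. It is only a sketch of the prose
under simplifying assumptions: the class, the method names and the
explicit end_of_time_slice() call are hypothetical and do not correspond
to any KVM or perf code in this series.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MsrMode(Enum):
    INTERCEPTED = auto()   # guest accesses trap to the hypervisor
    PASSTHROUGH = auto()   # guest accesses reach the LBR MSRs directly


@dataclass
class ToyVcpu:
    """Hypothetical model of one vCPU; none of these names exist in KVM."""
    lbr_enable_bit: bool = False        # stands in for the LBR enable bit
    has_guest_lbr_event: bool = False   # a guest LBR perf event exists
    host_owns_lbr: bool = False         # a host LBR event took the facility
    accessed_this_slice: bool = False   # guest touched an LBR MSR this slice
    mode: MsrMode = MsrMode.INTERCEPTED

    def _refresh_mode(self) -> None:
        # Pass-through is only valid while the guest event holds the facility.
        usable = self.has_guest_lbr_event and not self.host_owns_lbr
        self.mode = MsrMode.PASSTHROUGH if usable else MsrMode.INTERCEPTED

    def guest_access_lbr_msr(self) -> bool:
        """Model one guest access; return True if the access is meaningful."""
        self.accessed_this_slice = True
        if self.mode is MsrMode.INTERCEPTED:
            # Trapped access: create the guest LBR event lazily.
            self.has_guest_lbr_event = True
            self._refresh_mode()
        return self.mode is MsrMode.PASSTHROUGH

    def host_lbr_event_scheduled(self, active: bool) -> None:
        """A host LBR event takes over (or gives back) the LBR facility."""
        self.host_owns_lbr = active
        self._refresh_mode()

    def end_of_time_slice(self) -> None:
        """Lazy release: drop the event if it was idle and LBR is disabled."""
        idle = not self.accessed_this_slice and not self.lbr_enable_bit
        if self.has_guest_lbr_event and idle:
            self.has_guest_lbr_event = False
        self._refresh_mode()
        self.accessed_this_slice = False
```

In this model a fresh ToyVcpu starts intercepted, switches to
pass-through after its first guest_access_lbr_msr(), falls back to
interception while host_lbr_event_scheduled(True) is in effect, and
loses its event after one full time slice with no access and the enable
bit unset.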
--
2.21.3