From: Like Xu
To: Peter Zijlstra, Paolo Bonzini
Cc: Sean Christopherson, Jim Mattson, Wanpeng Li, Alexander Shishkin,
    Arnaldo Carvalho de Melo, Borislav Petkov, Ingo Molnar, Jiri Olsa,
    Joerg Roedel, Namhyung Kim, Thomas Gleixner, Vitaly Kuznetsov,
    kan.liang@intel.com, wei.w.wang@intel.com,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH v4 0/6] KVM: x86/vPMU: Efficiency optimization by reusing last created perf_event
Date: Sun, 27 Oct 2019 18:52:37 +0800
Message-Id: <20191027105243.34339-1-like.xu@linux.intel.com>

Hi Community,

For the perf subsystem, please help review the first two patches.
For the kvm subsystem, please help review the last four patches.

This patch series improves vPMU efficiency for guests, measured mainly
by guest NMI handler latency in basic perf usages [1][2] with a
hardware PMU. It is not a passthrough solution; it builds on the
legacy vPMU implementation.

With this optimization, the average latency of the guest NMI handler
is reduced from 104923 ns to 48393 ns (~2.16x speedup on CLX-AP with
5.4-rc4, w/ perf_v4_pmi=n). If the host disables the watchdog, the
minimum latency of the guest NMI handler could be sped up by ~3413x
and the average by ~786x. The run time of a workload with perf
attached inside the guest could also be reduced significantly.

The general idea (defined in patch 5/6) is to reuse the last created
event for the same vPMC when the newly requested config is exactly the
same as current_config (used by the last pmc_reprogram_counter()) AND
the new event period is appropriate and accepted (via
perf_event_period() in patch 1/6). Before the perf_event is reused, it
is kept disabled until it is suitable for reuse, and a hardware
counter is then assigned again to serve the vPMC.

If a disabled perf_event is no longer reused, a lazy release mechanism
(defined in patch 6/6) takes over. In short, the disabled perf_events
are released in kvm_pmu_handle_event() when the vcpu is next scheduled
in, provided the guest did not WRMSR the corresponding MSRs during the
last sched time slice. In kvm_arch_sched_in(), KVM_REQ_PMU is
requested if pmu->event_count has not dropped to zero, and
kvm_pmu_cleanup() then runs only once per sched time slice, so the
overhead stays very limited.

Please check each commit for more details and share your comments with us.
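
For readers skimming this cover letter, the reuse check and the lazy
release described above can be modeled roughly as follows. This is a
userspace sketch under stated assumptions, not code from the series:
struct vpmc_model, struct vpmu_model, period_accepted(), vpmu_program()
and vpmu_cleanup() are hypothetical names, the "non-zero period" rule
is a placeholder for the real period check, and a plain bitmask stands
in for the pmc_in_use bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VPMC 4  /* arbitrary size for this model */

/* Hypothetical model of one vPMC; not the kernel's struct kvm_pmc. */
struct vpmc_model {
    bool     has_event;      /* a perf_event (possibly paused) is attached */
    uint64_t current_config; /* config used by the last create path */
    uint64_t period;         /* sample period last programmed */
    unsigned creations;      /* how often the expensive create path ran */
};

struct vpmu_model {
    struct vpmc_model pmc[NR_VPMC];
    uint32_t pmc_in_use;     /* bit i: guest programmed vPMC i this slice */
};

/* Assumed acceptance rule: any non-zero period. The real check is in perf. */
static bool period_accepted(uint64_t period)
{
    return period != 0;
}

/* Program vPMC idx. Returns true when the existing event was reused. */
static bool vpmu_program(struct vpmu_model *pmu, unsigned idx,
                         uint64_t config, uint64_t period)
{
    assert(idx < NR_VPMC);
    struct vpmc_model *pmc = &pmu->pmc[idx];

    pmu->pmc_in_use |= 1u << idx;

    if (pmc->has_event && pmc->current_config == config &&
        period_accepted(period)) {
        pmc->period = period;   /* cheap path: only the period changes */
        return true;
    }

    /* Expensive path: replace any old event with a freshly created one. */
    pmc->has_event = true;
    pmc->current_config = config;
    pmc->period = period;
    pmc->creations++;
    return false;
}

/* Run once per time slice: release events of vPMCs the guest left untouched. */
static unsigned vpmu_cleanup(struct vpmu_model *pmu)
{
    unsigned released = 0;

    for (unsigned i = 0; i < NR_VPMC; i++) {
        if (pmu->pmc[i].has_event && !(pmu->pmc_in_use & (1u << i))) {
            pmu->pmc[i].has_event = false;
            released++;
        }
    }
    pmu->pmc_in_use = 0;        /* start tracking the next slice */
    return released;
}
```

In this model, programming the same config twice takes the create path
only once, and an event is released only after a full slice passes
without the guest touching its vPMC. The actual locking, counter
assignment and request handling are in patches 5/6 and 6/6.
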

Thanks,
Like Xu

---
[1] multiplexing sampling mode usage:
perf record -e \
    `perf list | grep Hardware | grep event |\
    awk '{print $1}' | head -n 10 |tr '\n' ',' | sed 's/,$//' ` ./ftest
[2] single event count mode usage:
perf stat -e branch-misses ./ftest

---
Changes in v4:
- s/rdpmc_idx/rdpmc_ecx/g (Jim Mattson)
- make *_msr_idx_to_pmc static (kbuild test robot)

Changes in v3:
- optimize perf_event_pause() for no child event
- rename programed_config to current_config
- rename lazy_release_ctrl to pmc_in_use
- rename kvm_pmu_ops callbacks from msr_idx to rdpmc_idx
- add a new kvm_pmu_ops callback msr_idx_to_pmc
- use DECLARE_BITMAP to declare bitmap
- set up a bitmap 'pmu->all_valid_pmc_idx'
- move kvm_pmu_cleanup to kvm_pmu_handle_event
- update performance data based on 5.4-rc4 on CLX-AP

Changes in v2:
- use perf_event_pause() to disable, read, reset by only one lock;
- use __perf_event_read_value() after _perf_event_disable();
- replace bitfields with 'u8 event_count; bool need_cleanup;';
- refine comments and commit messages;
- fix two issues reported by kbuild test robot for ARCH=[nds32|sh]

v3: https://lore.kernel.org/kvm/20191021160651.49508-1-like.xu@linux.intel.com/
v2: https://lore.kernel.org/kvm/20191013091533.12971-1-like.xu@linux.intel.com/
v1: https://lore.kernel.org/kvm/20190930072257.43352-1-like.xu@linux.intel.com/

Like Xu (6):
  perf/core: Provide a kernel-internal interface to recalibrate event period
  perf/core: Provide a kernel-internal interface to pause perf_event
  KVM: x86/vPMU: Rename pmu_ops callbacks from msr_idx to rdpmc_ecx
  KVM: x86/vPMU: Introduce a new kvm_pmu_ops->msr_idx_to_pmc callback
  KVM: x86/vPMU: Reuse perf_event to avoid unnecessary pmc_reprogram_counter
  KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

 arch/x86/include/asm/kvm_host.h |  19 ++++++
 arch/x86/kvm/pmu.c              | 112 ++++++++++++++++++++++++++++++--
 arch/x86/kvm/pmu.h              |  23 +++++--
 arch/x86/kvm/pmu_amd.c          |  24 +++++--
 arch/x86/kvm/vmx/pmu_intel.c    |  29 +++++++--
 arch/x86/kvm/x86.c              |   8 ++-
 include/linux/perf_event.h      |  10 +++
 kernel/events/core.c            |  46 +++++++++++--
 8 files changed, 240 insertions(+), 31 deletions(-)

--
2.21.0