From: Like Xu
To: Paolo Bonzini, kvm@vger.kernel.org, rkrcmar@redhat.com, sean.j.christopherson@intel.com, vkuznets@redhat.com, peterz@infradead.org, Jim Mattson
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, ak@linux.intel.com, wei.w.wang@intel.com, kan.liang@intel.com, like.xu@intel.com, ehankland@google.com, arbel.moshe@oracle.com, linux-kernel@vger.kernel.org
Subject: [PATCH 0/3] KVM: x86/vPMU: Efficiency optimization by reusing last created perf_event
Date: Mon, 30 Sep 2019 15:22:54 +0800
Message-Id: <20190930072257.43352-1-like.xu@linux.intel.com>

Hi Paolo & Community:

The Performance Monitoring Unit (PMU) is designed to monitor
microarchitectural events, which helps in analyzing how an application
or operating system performs on the processor. In KVM/x86, the version 2
Architectural PMU has been enabled on Intel and AMD hosts.

This patch series improves vPMU efficiency for guest perf users, measured
mainly by guest NMI handler latency for basic perf usage [1][2][3][4] with
the hardware PMU. It is not a passthrough solution; it builds on the legacy
vPMU implementation (in place since 2011) and is backport-friendly.

The general idea (defined in patch 2/3) is to reuse the last created
perf_event for the same vPMC when the newly requested config is exactly
the same as the last programmed config (used by pmc_reprogram_counter())
AND the new event period is appropriate and accepted (via
perf_event_period() in patch 1/3). Until the perf_event is reused, it
stays disabled; on reuse it is assigned a hw-counter again to serve the
vPMC.

If a disabled perf_event is no longer reused, a lazy release mechanism
(defined in patch 3/3) takes over. In short, it releases the disabled
perf_events on the first call of vcpu_enter_guest() after the vcpu is next
scheduled in, if its MSRs were not accessed during the last sched time
slice. The bitmap pmu->lazy_release_ctrl is added to track this.
kvm_pmu_cleanup() is called only on the first run of vcpu_enter_guest()
after the vcpu's sched_in, so its overhead is very limited.

With this optimization, the average latency of the guest NMI handler is
reduced from 99450 ns to 56195 ns (1.76x speedup on CLX-AP with v5.3).
If the host disables the watchdog (echo 0 > /proc/sys/kernel/watchdog),
the minimum latency of the guest NMI handler can be sped up by 2994x and
the average latency by 685x.
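
For readers who want the reuse and lazy-release rules in executable form, here is a minimal user-space sketch in Python. It is a toy model, not the kernel code in this series: the names PerfEvent, VPmc, VPmu, period_ok(), msr_access(), sched_in() and enter_guest() are all hypothetical, the "period is accepted" check is reduced to "period > 0" as an assumption, and the integer bitmap only loosely mirrors the role of pmu->lazy_release_ctrl.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerfEvent:
    """Hypothetical stand-in for a host perf_event."""
    config: int
    period: int
    enabled: bool = True


def period_ok(period: int) -> bool:
    """Assumption: a period is 'accepted' when it is positive."""
    return period > 0


@dataclass
class VPmc:
    """Toy virtual counter that remembers its last created event."""
    idx: int
    event: Optional[PerfEvent] = None
    creations: int = 0  # number of times a fresh event had to be created

    def reprogram(self, config: int, period: int) -> bool:
        """Return True when the last created event was reused."""
        ev = self.event
        if ev is not None and ev.config == config and period_ok(period):
            ev.period = period   # only recalibrate the period
            ev.enabled = True    # re-enable instead of recreating
            return True
        self.event = PerfEvent(config, period)  # slow path: new event
        self.creations += 1
        return False

    def pause(self) -> None:
        """Disable the event but keep it around for possible reuse."""
        if self.event is not None:
            self.event.enabled = False


class VPmu:
    """Toy vPMU with a bitmap of counters touched in the current slice."""

    def __init__(self, nr_counters: int) -> None:
        self.pmcs: List[VPmc] = [VPmc(i) for i in range(nr_counters)]
        self.touched = 0           # bit i set: MSRs of vPMC i were accessed
        self.need_cleanup = False  # set on sched_in, consumed on guest entry

    def msr_access(self, idx: int, config: int, period: int) -> bool:
        self.touched |= 1 << idx
        return self.pmcs[idx].reprogram(config, period)

    def sched_in(self) -> None:
        self.need_cleanup = True

    def enter_guest(self) -> None:
        """Release disabled, untouched events once per sched_in."""
        if not self.need_cleanup:
            return
        self.need_cleanup = False
        for pmc in self.pmcs:
            untouched = ((self.touched >> pmc.idx) & 1) == 0
            if pmc.event is not None and not pmc.event.enabled and untouched:
                pmc.event = None  # lazy release
        self.touched = 0  # start tracking the next time slice
```

In this model, reprogramming a counter with an unchanged config only updates the period and re-enables the existing event, while a changed config falls back to creating a new one. A paused event survives one guest entry if its counter was touched during the previous slice and is dropped on the next entry after a slice with no access.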

The run time of a workload with perf attached inside the guest can be
reduced significantly with this optimization.

Please check each commit for more details and share your comments with us.

Thanks,
Like Xu

---
[1] multiplexing sampling usage: time perf record -e \
    `perf list | grep Hardware | grep event |\
    awk '{print $1}' | head -n 10 |tr '\n' ',' | sed 's/,$//' ` ./ftest
[2] one gp counter sampling usage: perf record -e branch-misses ./ftest
[3] one fixed counter sampling usage: perf record -e instructions ./ftest
[4] event count usage: perf stat -e branch-misses ./ftest

Like Xu (3):
  perf/core: Provide a kernel-internal interface to recalibrate event period
  KVM: x86/vPMU: Reuse perf_event to avoid unnecessary pmc_reprogram_counter
  KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

 arch/x86/include/asm/kvm_host.h | 10 ++++
 arch/x86/kvm/pmu.c              | 88 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/pmu.h              | 15 +++++-
 arch/x86/kvm/pmu_amd.c          | 14 ++++++
 arch/x86/kvm/vmx/pmu_intel.c    | 27 ++++++++++
 arch/x86/kvm/x86.c              |  6 +++
 include/linux/perf_event.h      |  5 ++
 kernel/events/core.c            | 28 ++++++++---
 8 files changed, 182 insertions(+), 11 deletions(-)

-- 
2.21.0