From: Like Xu
To: peterz@infradead.org, Paolo Bonzini, Sean Christopherson
Cc: andi@firstfloor.org, kan.liang@linux.intel.com, wei.w.wang@intel.com,
 eranian@google.com, liuxiangdong5@huawei.com, Vitaly Kuznetsov, Wanpeng Li,
 Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, Like Xu, Andi Kleen
Subject: [PATCH v5 02/16] perf/x86/intel: Handle guest PEBS overflow PMI for
 KVM guest
Date: Thu, 15 Apr 2021 11:20:02 +0800
Message-Id: <20210415032016.166201-3-like.xu@linux.intel.com>
In-Reply-To: <20210415032016.166201-1-like.xu@linux.intel.com>
References: <20210415032016.166201-1-like.xu@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

With PEBS virtualization, the guest PEBS records are delivered to the
guest DS, and the host PMI handler uses perf_guest_cbs->is_in_guest()
to determine whether the PMI comes from guest code, as is already done
for Intel PT.

No matter how many guest PEBS counters have overflowed, triggering a
single fake event is enough. The fake event causes the KVM PMI callback
to be called, which injects the PEBS overflow PMI into the guest.

KVM may inject the PMI with BUFFER_OVF set even if the guest DS is
empty. This should be harmless: the guest PEBS handler retrieves the
correct information from its own PEBS record buffer.

Originally-by: Andi Kleen
Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Signed-off-by: Like Xu
---
 arch/x86/events/intel/core.c | 40 ++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 591d60cc8436..021658df1feb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2747,6 +2747,43 @@ static void intel_pmu_reset(void)
 	local_irq_restore(flags);
 }
 
+/*
+ * We may be running with guest PEBS events created by KVM, and the
+ * PEBS records are logged into the guest's DS and invisible to host.
+ *
+ * In the case of guest PEBS overflow, we only trigger a fake event
+ * to emulate the PEBS overflow PMI for guest PEBS counters in KVM.
+ * The guest will then vm-entry and check the guest DS area to read
+ * the guest PEBS records.
+ *
+ * The contents and other behavior of the guest event do not matter.
+ */
+static void x86_pmu_handle_guest_pebs(struct pt_regs *regs,
+				      struct perf_sample_data *data)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 guest_pebs_idxs = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+	struct perf_event *event = NULL;
+	int bit;
+
+	if (!x86_pmu.pebs_active || !guest_pebs_idxs)
+		return;
+
+	for_each_set_bit(bit, (unsigned long *)&guest_pebs_idxs,
+			 INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+		event = cpuc->events[bit];
+		if (!event->attr.precise_ip)
+			continue;
+
+		perf_sample_data_init(data, 0, event->hw.last_period);
+		if (perf_event_overflow(event, data, regs))
+			x86_pmu_stop(event, 0);
+
+		/* Injecting one fake event is enough. */
+		break;
+	}
+}
+
 static int handle_pmi_common(struct pt_regs *regs, u64 status)
 {
 	struct perf_sample_data data;
@@ -2797,6 +2834,9 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		u64 pebs_enabled = cpuc->pebs_enabled;
 
 		handled++;
+		if (x86_pmu.pebs_vmx && perf_guest_cbs &&
+		    perf_guest_cbs->is_in_guest())
+			x86_pmu_handle_guest_pebs(regs, &data);
 		x86_pmu.drain_pebs(regs, &data);
 		status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
 
-- 
2.30.2
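
For readers who want to see the selection logic in isolation, here is a
minimal user-space sketch of how x86_pmu_handle_guest_pebs() picks the
counter that receives the single fake event. It is not kernel code and is
not part of the patch. The names sketch_counter and
sketch_pick_guest_pebs_counter are hypothetical, the "precise" flag stands
in for event->attr.precise_ip, and the sketch assumes that
intel_ctrl_host_mask marks counters reserved for host-only events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-counter event state. */
struct sketch_counter {
	bool precise;	/* models event->attr.precise_ip != 0 */
};

/*
 * Return the index of the counter that would receive the single fake
 * overflow event, or -1 if no guest PEBS counter qualifies.
 *
 * Assumption: host_only_mask plays the role of intel_ctrl_host_mask,
 * i.e. its set bits are counters that must not be treated as guest ones.
 */
static int sketch_pick_guest_pebs_counter(uint64_t pebs_enabled,
					  uint64_t host_only_mask,
					  const struct sketch_counter *counters,
					  int nr_counters)
{
	/* PEBS-enabled counters that are not host-only. */
	uint64_t guest_idxs = pebs_enabled & ~host_only_mask;
	int bit;

	assert(nr_counters >= 0 && nr_counters <= 64);

	for (bit = 0; bit < nr_counters; bit++) {
		if (!(guest_idxs & (UINT64_C(1) << bit)))
			continue;	/* not a guest PEBS counter */
		if (!counters[bit].precise)
			continue;	/* not a PEBS (precise) event */
		/* One fake event is enough, so stop at the first match. */
		return bit;
	}

	return -1;
}
```

With pebs_enabled = 0xb and host_only_mask = 0x1, the guest candidates are
counters 1 and 3. If counter 1 is not precise, the sketch returns 3 and
never looks further, which mirrors the unconditional break in the patch.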