Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp3127020pxj; Mon, 10 May 2021 19:44:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJydeTfl0F93qLhoiyg6xr58D6LekMshD/TBZGepAyXvlprfBgmNO+QyRpYDLI4OODfDIM8L X-Received: by 2002:a17:907:c15:: with SMTP id ga21mr18182052ejc.141.1620701096823; Mon, 10 May 2021 19:44:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620701096; cv=none; d=google.com; s=arc-20160816; b=t6gYjk2sOGLCLvbQT/j3z2S/Ormp4a23RwnSpBds1qjy8sPHRejBuERGC7Ue+VKS6H 7D5AyaE/UN1Pvytfr330cNsaQKvwWdpbZ0UYlWd9XJYLiiFJgo42BbpeDGBOhrq5L2or swMMmkUtg1g1EMqPmxjJGtd1BDRwlt/4ves+MK7xDEAgbB/bKlZINNdzQotXhytV1TwG /d7erAZjtGoyYiAPbRY/wlq6C/E0vCyRnhr/YzA/6Gj3rFgy6Z5326xeG+GsdbFjJv7J h7AAw/7czueHJKw1Bn3zjQFfJ9kr/rJvYgShCrsri0ltLYtCJgKF60smUYLW49scL4Qb OlgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :ironport-sdr:ironport-sdr; bh=aELJZsYVkH/WQ9T5MFSPq43pqazFYi0zg3T2eqL/pzY=; b=EkANRRn8SqJcrC3eOOTjNvlo0aojFfVISVUoy3UUT1BXvDbf3Z7SZrYYqe84zFe0DX Po9MknGeDEfBacbg0pTkfd/BE9hniDRp2FMwD0lkvAmU/F3amJSCOkLt9S+nzWHBxnRo jFfbddwiZvmw0y2uzC9FE20pYe73Fe0mnZwnn00wFCD3Qkyc7BI5gvhPIWKVpUMSBbPC tMKT1ofpd0v7XPs4uvknUQLKZt94/Ti7kQB/LELlzsDHuv50EX6Ph55XNv/YgW8CNOlF 4RtAGYrhfVSP/TEalVXpNEnWQtzmFSaxUYGaAsBJaFBnaoMixMjL5JISQH4A2Hz4ZNCE T96A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id v5si9396335edx.142.2021.05.10.19.44.22; Mon, 10 May 2021 19:44:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230095AbhEKCoN (ORCPT + 99 others); Mon, 10 May 2021 22:44:13 -0400 Received: from mga03.intel.com ([134.134.136.65]:7532 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230018AbhEKCoM (ORCPT ); Mon, 10 May 2021 22:44:12 -0400 IronPort-SDR: 75ZmY8NnigbNqDeca9KNY4L7kFxhEFhHuZPIZ3K8sw2/lkRwDFae8aHTKxqoV65//FxPwP9clv cmzNDCOuU1AA== X-IronPort-AV: E=McAfee;i="6200,9189,9980"; a="199391176" X-IronPort-AV: E=Sophos;i="5.82,290,1613462400"; d="scan'208";a="199391176" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 May 2021 19:43:06 -0700 IronPort-SDR: 9DKVAqKpJuk5co7yGXSQGycLndL1zG4LLuFoPoVKvkB19GL8y9tc7fsLbn3YUweICAqT33MyLy CIZbouJMdZWg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,290,1613462400"; d="scan'208";a="468591624" Received: from clx-ap-likexu.sh.intel.com ([10.239.48.108]) by fmsmga002.fm.intel.com with ESMTP; 10 May 2021 19:43:03 -0700 From: Like Xu To: Peter Zijlstra , Paolo Bonzini Cc: Borislav Petkov , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , weijiang.yang@intel.com, Kan Liang , ak@linux.intel.com, wei.w.wang@intel.com, eranian@google.com, liuxiangdong5@huawei.com, linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v6 02/16] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest Date: Tue, 11 May 2021 10:42:00 +0800 Message-Id: <20210511024214.280733-3-like.xu@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210511024214.280733-1-like.xu@linux.intel.com> References: <20210511024214.280733-1-like.xu@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org With PEBS virtualization, the guest PEBS records get delivered to the guest DS, and the host pmi handler uses perf_guest_cbs->is_in_guest() to distinguish whether the PMI comes from the guest code like Intel PT. No matter how many guest PEBS counters are overflowed, only triggering one fake event is enough. The fake event causes the KVM PMI callback to be called, thereby injecting the PEBS overflow PMI into the guest. KVM may inject the PMI with BUFFER_OVF set, even if the guest DS is empty. That should really be harmless. Thus guest PEBS handler would retrieve the correct information from its own PEBS records buffer. Originally-by: Andi Kleen Co-developed-by: Kan Liang Signed-off-by: Kan Liang Signed-off-by: Like Xu --- arch/x86/events/intel/core.c | 40 ++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index b6e45ee10e16..092ecacf8345 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2780,6 +2780,43 @@ static void intel_pmu_reset(void) local_irq_restore(flags); } +/* + * We may be running with guest PEBS events created by KVM, and the + * PEBS records are logged into the guest's DS and invisible to host. + * + * In the case of guest PEBS overflow, we only trigger a fake event + * to emulate the PEBS overflow PMI for guest PBES counters in KVM. + * The guest will then vm-entry and check the guest DS area to read + * the guest PEBS records. + * + * The contents and other behavior of the guest event do not matter. + */ +static void x86_pmu_handle_guest_pebs(struct pt_regs *regs, + struct perf_sample_data *data) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + u64 guest_pebs_idxs = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask; + struct perf_event *event = NULL; + int bit; + + if (!x86_pmu.pebs_active || !guest_pebs_idxs) + return; + + for_each_set_bit(bit, (unsigned long *)&guest_pebs_idxs, + INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) { + event = cpuc->events[bit]; + if (!event->attr.precise_ip) + continue; + + perf_sample_data_init(data, 0, event->hw.last_period); + if (perf_event_overflow(event, data, regs)) + x86_pmu_stop(event, 0); + + /* Inject one fake event is enough. */ + break; + } +} + static int handle_pmi_common(struct pt_regs *regs, u64 status) { struct perf_sample_data data; @@ -2831,6 +2868,9 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status) u64 pebs_enabled = cpuc->pebs_enabled; handled++; + if (x86_pmu.pebs_vmx && perf_guest_cbs && + perf_guest_cbs->is_in_guest()) + x86_pmu_handle_guest_pebs(regs, &data); x86_pmu.drain_pebs(regs, &data); status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI; -- 2.31.1