From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Stephane Eranian, Alexander Shishkin, Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Vince Weaver, kan.liang@intel.com, Ingo Molnar, Sasha Levin
Subject: [PATCH 4.14 303/496] perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
Date:
 Mon, 28 May 2018 12:01:28 +0200
Message-Id: <20180528100332.582354084@linuxfoundation.org>
In-Reply-To: <20180528100319.498712256@linuxfoundation.org>
References: <20180528100319.498712256@linuxfoundation.org>
X-stable: review
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Stephane Eranian

[ Upstream commit 71eb9ee9596d8df3d5723c3cfc18774c6235e8b1 ]

This patch fixes a bug in how pebs->real_ip is handled in the PEBS
handler. real_ip only exists on Haswell and later processors. It is
the eventing IP, i.e., where the event occurred, as opposed to
pebs->ip, which is the PEBS interrupt IP and is always off by one
instruction.

The problem is that real_ip, just like pebs->ip, needs to be fixed up
because PEBS does not record all the machine state registers, and in
particular the code segment (cs). This is why we have the
set_linear_ip() function. The bug was that set_linear_ip() was only
applied to pebs->ip and not to pebs->real_ip.

We have profiles which ran into invalid callstacks because of this.
Here is an example:

 ..... 0: ffffffffffffff80 recent entry, marker kernel v
 ..... 1: 000000000040044d <= user address in kernel space!
 ..... 2: fffffffffffffe00 marker enter user v
 ..... 3: 000000000040044d
 ..... 4: 00000000004004b6 oldest entry

Debugging output in get_perf_callchain():

 [ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0

The problem is that the kernel entry in 1: points to a user-level
address. How can that be?

The reason is that with PEBS sampling, the instruction that caused the
event and the instruction the CPU was executing when the interrupt was
posted may be far apart. Sometime during that window, the privilege
level may change.
This happens, for instance, when the PEBS sample is taken close to a
kernel entry point. Here the PEBS eventing IP (real_ip) captured a
user-level instruction, but by the time the PMU interrupt fired, the
processor had already entered kernel space. This is why the debug
output shows a user address with user_mode() false.

The problem comes from PEBS not recording the code segment (cs)
register. On x86_64, this register is used to determine whether
execution is in kernel or user space. This is okay because the kernel
has a software workaround called set_linear_ip(). But the issue in
setup_pebs_sample_data() is that set_linear_ip() is never called on the
real_ip value when it is available (Haswell and later) and
precise_ip > 1.

This patch fixes the problem and eliminates the callchain discrepancy.

The patch restructures the code around set_linear_ip() to minimize the
number of times the IP has to be set.

Signed-off-by: Stephane Eranian
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Vince Weaver
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/events/intel/ds.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1150,6 +1150,7 @@ static void setup_pebs_sample_data(struc
 	if (pebs == NULL)
 		return;
 
+	regs->flags &= ~PERF_EFLAGS_EXACT;
 	sample_type = event->attr.sample_type;
 	dsrc = sample_type & PERF_SAMPLE_DATA_SRC;
 
@@ -1194,7 +1195,6 @@ static void setup_pebs_sample_data(struc
 	 */
 	*regs = *iregs;
 	regs->flags = pebs->flags;
-	set_linear_ip(regs, pebs->ip);
 
 	if (sample_type & PERF_SAMPLE_REGS_INTR) {
 		regs->ax = pebs->ax;
@@ -1230,13 +1230,22 @@ static void setup_pebs_sample_data(struc
 #endif
 	}
 
-	if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
-		regs->ip = pebs->real_ip;
-		regs->flags |= PERF_EFLAGS_EXACT;
-	} else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(regs))
-		regs->flags |= PERF_EFLAGS_EXACT;
-	else
-		regs->flags &= ~PERF_EFLAGS_EXACT;
+	if (event->attr.precise_ip > 1) {
+		/* Haswell and later have the eventing IP, so use it: */
+		if (x86_pmu.intel_cap.pebs_format >= 2) {
+			set_linear_ip(regs, pebs->real_ip);
+			regs->flags |= PERF_EFLAGS_EXACT;
+		} else {
+			/* Otherwise use PEBS off-by-1 IP: */
+			set_linear_ip(regs, pebs->ip);
+
+			/* ... and try to fix it up using the LBR entries: */
+			if (intel_pmu_pebs_fixup_ip(regs))
+				regs->flags |= PERF_EFLAGS_EXACT;
+		}
+	} else
+		set_linear_ip(regs, pebs->ip);
+
 
 	if ((sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR)) &&
 	    x86_pmu.intel_cap.pebs_format >= 1)
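
For readers who want to see why assigning real_ip directly produces the
"user address in kernel space" entry, here is a minimal userspace sketch
of the idea behind set_linear_ip(). It is not the kernel code: the struct,
the model_* function names and the MODEL_* constants are hypothetical
stand-ins invented for illustration. The kernel code segment value 0x10
matches the regs->cs=10 debug output above; the user code segment value
0x33 and the "top address bit set means kernel" test are assumptions
about a typical x86_64 Linux configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the x86_64 code segment selectors. */
#define MODEL_KERNEL_CS 0x10u /* matches regs->cs=10 in the debug output */
#define MODEL_USER_CS   0x33u /* assumed user code segment selector */

/* Simplified model of the two register fields involved; not struct pt_regs. */
struct model_regs {
	uint64_t ip;
	uint64_t cs;
};

/* Assumption: kernel addresses occupy the upper half (top bit set). */
bool model_kernel_ip(uint64_t ip)
{
	return (ip >> 63) != 0;
}

/* Privilege is read from the low two bits of cs; ring 3 means user mode. */
bool model_user_mode(const struct model_regs *regs)
{
	return (regs->cs & 3u) == 3u;
}

/* Store ip and derive a matching cs from the address itself. */
void model_set_linear_ip(struct model_regs *regs, uint64_t ip)
{
	regs->cs = model_kernel_ip(ip) ? MODEL_KERNEL_CS : MODEL_USER_CS;
	regs->ip = ip;
}

/* Replays the example: user IP 0x40044d sampled, interrupt taken in kernel. */
void model_replay_example(void)
{
	/* cs starts out as the kernel selector, as in the debug output. */
	struct model_regs regs = { .ip = 0, .cs = MODEL_KERNEL_CS };

	/* Old behaviour: real_ip overwrites ip, cs keeps its earlier value. */
	regs.ip = 0x40044dULL;
	assert(!model_user_mode(&regs)); /* user address, yet reported as kernel */

	/* Fixed behaviour: cs is recomputed from the address. */
	model_set_linear_ip(&regs, 0x40044dULL);
	assert(model_user_mode(&regs));
}
```

Calling model_replay_example() exercises both paths: the first assertion
reproduces the inconsistent state from the debug output (ip=40044d with
user_mode false), and the second shows that deriving cs from the address
restores a consistent view.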