Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp1911353imm; Sat, 23 Jun 2018 05:29:04 -0700 (PDT) X-Google-Smtp-Source: ADUXVKK0EatGZQUBbPj9EX91BxxWsSeS9p86kwHtUYBZUvv0fiMWnEqDUt/2tNGP7LncG5ehPLC6 X-Received: by 2002:a63:6ec1:: with SMTP id j184-v6mr4727945pgc.332.1529756944117; Sat, 23 Jun 2018 05:29:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529756943; cv=none; d=google.com; s=arc-20160816; b=lLqS1AbddpW+dbG1BlU1+4Ds/S30SVnnLQzfXZvBZN4RijxM9ZJBwdDOwk/Wm0d0+B 5TDCURYGq7muuqrQ8oOhsNGp5dhxylFyZZzA6ctkFAmh1gmfxc3Wr20rJZOLOB33e7Uc iT/GJMlYSLimmP8OCwqtb2OifaYMGsKuUXjSlhQXNiJ17EhCkxt0py1jOp2A3ucVFnhO 7u9VSCBg52mc03B3Tl/mihugb6snl6qC5bcmuJzTa1xPpmI+JM8DBdezJk61fAnzctev 7zpw1XSiTHU1KArcKmbZlSDC6bIvyg03d395gW1NLhYUeOX7WuNnsDhP5d/URBPZKtyf z6dQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-disposition :content-transfer-encoding:mime-version:robot-unsubscribe:robot-id :git-commit-id:subject:to:references:in-reply-to:reply-to:cc :message-id:from:date:arc-authentication-results; bh=jhpfQ9bGUeycG3ciWBS69DYEDn+ZcXv9kf/C73761I0=; b=q/ulVY9lGZXAUIOJj3m4S0OzpKNkTb+pQYMkMx4KzU99oSU3N50Khqrr09JLwrJeGK a4rBEbE923d22JBQVfHfKglf6QZvjUMElbR9dEkc2plqMxfvs6m7rLOVOYskqCIGEEcz hDiibic5b6RY1B3SwXkwbo8AXTiTQqn2LQywaG59cll1unehtVswtnV7r1oN/J7zeN8B 417HP7+zaKrwxoZPjwtM/zP0jOd7F3jvCX3XBsUCqdRTdFi4eQBz6G1vxUMr5wJbvH5d DQSHMEqKi0ijp0qVFdu95aOEsg664HJdEpG+m/J5vNl6H0dMgjYm6yIvy5StFXD96UUP 6rDg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f18-v6si7951956pgt.283.2018.06.23.05.28.49; Sat, 23 Jun 2018 05:29:03 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752331AbeFWM1H (ORCPT + 99 others); Sat, 23 Jun 2018 08:27:07 -0400 Received: from terminus.zytor.com ([198.137.202.136]:41739 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751640AbeFWM1G (ORCPT ); Sat, 23 Jun 2018 08:27:06 -0400 Received: from terminus.zytor.com (localhost [127.0.0.1]) by terminus.zytor.com (8.15.2/8.15.2) with ESMTPS id w5NCR0mu467331 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 23 Jun 2018 05:27:00 -0700 Received: (from tipbot@localhost) by terminus.zytor.com (8.15.2/8.15.2/Submit) id w5NCR0ri467328; Sat, 23 Jun 2018 05:27:00 -0700 Date: Sat, 23 Jun 2018 05:27:00 -0700 X-Authentication-Warning: terminus.zytor.com: tipbot set sender to tipbot@zytor.com using -f From: tip-bot for Reinette Chatre Message-ID: Cc: mingo@kernel.org, hpa@zytor.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, tglx@linutronix.de Reply-To: hpa@zytor.com, mingo@kernel.org, tglx@linutronix.de, linux-kernel@vger.kernel.org, reinette.chatre@intel.com In-Reply-To: <36c1414e9bd17c3faf440f32b644b9c879bcbae2.1529706536.git.reinette.chatre@intel.com> References: <36c1414e9bd17c3faf440f32b644b9c879bcbae2.1529706536.git.reinette.chatre@intel.com> To: linux-tip-commits@vger.kernel.org Subject: [tip:x86/cache] x86/intel_rdt: Support L3 cache performance event of Broadwell Git-Commit-ID: 5c3968994582c1265cdb81884533371cbe868a93 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, DATE_IN_FUTURE_96_Q autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on terminus.zytor.com Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: 5c3968994582c1265cdb81884533371cbe868a93 Gitweb: https://git.kernel.org/tip/5c3968994582c1265cdb81884533371cbe868a93 Author: Reinette Chatre AuthorDate: Fri, 22 Jun 2018 15:42:29 -0700 Committer: Thomas Gleixner CommitDate: Sat, 23 Jun 2018 13:03:52 +0200 x86/intel_rdt: Support L3 cache performance event of Broadwell Broadwell microarchitecture supports pseudo-locking. Add support for the L3 cache related performance events of these systems so that the success of pseudo-locking can be measured more accurately on these platforms. Signed-off-by: Reinette Chatre Signed-off-by: Thomas Gleixner Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/36c1414e9bd17c3faf440f32b644b9c879bcbae2.1529706536.git.reinette.chatre@intel.com --- arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 56 +++++++++++++++++++++++ arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h | 10 ++++ 2 files changed, 66 insertions(+) diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c index aa9fa8a16875..5f674e042b35 100644 --- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c +++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c @@ -847,6 +847,8 @@ static int measure_cycles_lat_fn(void *_plr) static int measure_cycles_perf_fn(void *_plr) { + unsigned long long l3_hits = 0, l3_miss = 0; + u64 l3_hit_bits = 0, l3_miss_bits = 0; struct pseudo_lock_region *plr = _plr; unsigned long long l2_hits, l2_miss; u64 l2_hit_bits, l2_miss_bits; @@ -880,6 +882,16 @@ static int measure_cycles_perf_fn(void *_plr) * L2_HIT 02H * L1_MISS 08H * L2_MISS 10H + * + * On Broadwell Microarchitecture the MEM_LOAD_UOPS_RETIRED event + * has two "no fix" errata associated with it: BDM35 and BDM100. On + * this platform we use the following events instead: + * L2_RQSTS 24H (Documented in https://download.01.org/perfmon/BDW/) + * REFERENCES FFH + * MISS 3FH + * LONGEST_LAT_CACHE 2EH (Documented in SDM) + * REFERENCE 4FH + * MISS 41H */ /* @@ -898,6 +910,14 @@ static int measure_cycles_perf_fn(void *_plr) l2_hit_bits = (0x52ULL << 16) | (0x2 << 8) | 0xd1; l2_miss_bits = (0x52ULL << 16) | (0x10 << 8) | 0xd1; break; + case INTEL_FAM6_BROADWELL_X: + /* On BDW the l2_hit_bits count references, not hits */ + l2_hit_bits = (0x52ULL << 16) | (0xff << 8) | 0x24; + l2_miss_bits = (0x52ULL << 16) | (0x3f << 8) | 0x24; + /* On BDW the l3_hit_bits count references, not hits */ + l3_hit_bits = (0x52ULL << 16) | (0x4f << 8) | 0x2e; + l3_miss_bits = (0x52ULL << 16) | (0x41 << 8) | 0x2e; + break; default: goto out; } @@ -914,9 +934,21 @@ static int measure_cycles_perf_fn(void *_plr) pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x0); pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0, 0x0); pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 1, 0x0); + if (l3_hit_bits > 0) { + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x0); + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3, 0x0); + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 2, 0x0); + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 3, 0x0); + } /* Set and enable the L2 counters */ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, l2_hit_bits); pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits); + if (l3_hit_bits > 0) { + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2, + l3_hit_bits); + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3, + l3_miss_bits); + } mem_r = plr->kmem; size = plr->size; line_size = plr->line_size; @@ -934,11 +966,35 @@ static int measure_cycles_perf_fn(void *_plr) l2_hit_bits & ~(0x40ULL << 16)); pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits & ~(0x40ULL << 16)); + if (l3_hit_bits > 0) { + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2, + l3_hit_bits & ~(0x40ULL << 16)); + pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3, + l3_miss_bits & ~(0x40ULL << 16)); + } l2_hits = native_read_pmc(0); l2_miss = native_read_pmc(1); + if (l3_hit_bits > 0) { + l3_hits = native_read_pmc(2); + l3_miss = native_read_pmc(3); + } wrmsr(MSR_MISC_FEATURE_CONTROL, 0x0, 0x0); local_irq_enable(); + /* + * On BDW we count references and misses, need to adjust. Sometimes + * the "hits" counter is a bit more than the references, for + * example, x references but x + 1 hits. To not report invalid + * hit values in this case we treat that as misses eaqual to + * references. + */ + if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X) + l2_hits -= (l2_miss > l2_hits ? l2_hits : l2_miss); trace_pseudo_lock_l2(l2_hits, l2_miss); + if (l3_hit_bits > 0) { + if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X) + l3_hits -= (l3_miss > l3_hits ? l3_hits : l3_miss); + trace_pseudo_lock_l3(l3_hits, l3_miss); + } out: plr->thread_done = 1; diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h index efad50d2ee2f..2c041e6d9f05 100644 --- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h +++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h @@ -25,6 +25,16 @@ TRACE_EVENT(pseudo_lock_l2, TP_printk("hits=%llu miss=%llu", __entry->l2_hits, __entry->l2_miss)); +TRACE_EVENT(pseudo_lock_l3, + TP_PROTO(u64 l3_hits, u64 l3_miss), + TP_ARGS(l3_hits, l3_miss), + TP_STRUCT__entry(__field(u64, l3_hits) + __field(u64, l3_miss)), + TP_fast_assign(__entry->l3_hits = l3_hits; + __entry->l3_miss = l3_miss;), + TP_printk("hits=%llu miss=%llu", + __entry->l3_hits, __entry->l3_miss)); + #endif /* _TRACE_PSEUDO_LOCK_H */ #undef TRACE_INCLUDE_PATH