From: Reinette Chatre
To: tglx@linutronix.de, fenghua.yu@intel.com, tony.luck@intel.com, vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com, jithu.joseph@intel.com, dave.hansen@intel.com, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org, Reinette Chatre
Subject: [PATCH V7 38/41] x86/intel_rdt: Support L3 cache performance event of Broadwell
Date: Fri, 22 Jun 2018 15:42:29 -0700
Message-Id: <36c1414e9bd17c3faf440f32b644b9c879bcbae2.1529706536.git.reinette.chatre@intel.com>
X-Mailer: git-send-email 2.17.0
X-Mailing-List: linux-kernel@vger.kernel.org

Broadwell microarchitecture supports pseudo-locking. Add support for
the L3 cache related performance events of these systems so that
the success of pseudo-locking can be measured more accurately on these
platforms.

Signed-off-by: Reinette Chatre
Signed-off-by: Thomas Gleixner
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/5b91247c6ea44df78ddb18a2d488b86bbd20898c.1527593971.git.reinette.chatre@intel.com
---
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c   | 56 +++++++++++++++++++
 .../kernel/cpu/intel_rdt_pseudo_lock_event.h  | 10 ++++
 2 files changed, 66 insertions(+)

diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index acaec07134c7..17ed2e9d4551 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -847,6 +847,8 @@ static int measure_cycles_lat_fn(void *_plr)
 
 static int measure_cycles_perf_fn(void *_plr)
 {
+	unsigned long long l3_hits = 0, l3_miss = 0;
+	u64 l3_hit_bits = 0, l3_miss_bits = 0;
 	struct pseudo_lock_region *plr = _plr;
 	unsigned long long l2_hits, l2_miss;
 	u64 l2_hit_bits, l2_miss_bits;
@@ -880,6 +882,16 @@ static int measure_cycles_perf_fn(void *_plr)
 	 *   L2_HIT   02H
 	 *   L1_MISS  08H
 	 *   L2_MISS  10H
+	 *
+	 * On Broadwell Microarchitecture the MEM_LOAD_UOPS_RETIRED event
+	 * has two "no fix" errata associated with it: BDM35 and BDM100. On
+	 * this platform we use the following events instead:
+	 *  L2_RQSTS 24H (Documented in https://download.01.org/perfmon/BDW/)
+	 *       REFERENCES FFH
+	 *       MISS       3FH
+	 *  LONGEST_LAT_CACHE 2EH (Documented in SDM)
+	 *       REFERENCE 4FH
+	 *       MISS      41H
 	 */
 
 	/*
@@ -898,6 +910,14 @@ static int measure_cycles_perf_fn(void *_plr)
 		l2_hit_bits = (0x52ULL << 16) | (0x2 << 8) | 0xd1;
 		l2_miss_bits = (0x52ULL << 16) | (0x10 << 8) | 0xd1;
 		break;
+	case INTEL_FAM6_BROADWELL_X:
+		/* On BDW the l2_hit_bits count references, not hits */
+		l2_hit_bits = (0x52ULL << 16) | (0xff << 8) | 0x24;
+		l2_miss_bits = (0x52ULL << 16) | (0x3f << 8) | 0x24;
+		/* On BDW the l3_hit_bits count references, not hits */
+		l3_hit_bits = (0x52ULL << 16) | (0x4f << 8) | 0x2e;
+		l3_miss_bits = (0x52ULL << 16) | (0x41 << 8) | 0x2e;
+		break;
 	default:
 		goto out;
 	}
@@ -914,9 +934,21 @@ static int measure_cycles_perf_fn(void *_plr)
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x0);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0, 0x0);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 1, 0x0);
+	if (l3_hit_bits > 0) {
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x0);
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3, 0x0);
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 2, 0x0);
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 3, 0x0);
+	}
 	/* Set and enable the L2 counters */
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, l2_hit_bits);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits);
+	if (l3_hit_bits > 0) {
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2,
+				      l3_hit_bits);
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3,
+				      l3_miss_bits);
+	}
 	mem_r = plr->kmem;
 	size = plr->size;
 	line_size = plr->line_size;
@@ -934,11 +966,35 @@ static int measure_cycles_perf_fn(void *_plr)
 			      l2_hit_bits & ~(0x40ULL << 16));
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1,
 			      l2_miss_bits & ~(0x40ULL << 16));
+	if (l3_hit_bits > 0) {
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2,
+				      l3_hit_bits & ~(0x40ULL << 16));
+		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3,
+				      l3_miss_bits & ~(0x40ULL << 16));
+	}
 	l2_hits = native_read_pmc(0);
 	l2_miss = native_read_pmc(1);
+	if (l3_hit_bits > 0) {
+		l3_hits = native_read_pmc(2);
+		l3_miss = native_read_pmc(3);
+	}
 	wrmsr(MSR_MISC_FEATURE_CONTROL, 0x0, 0x0);
 	local_irq_enable();
+	/*
+	 * On BDW we count references and misses, need to adjust. Sometimes
+	 * the "hits" counter is a bit more than the references, for
+	 * example, x references but x + 1 hits. To not report invalid
+	 * hit values in this case we treat that as misses equal to
+	 * references.
+	 */
+	if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X)
+		l2_hits -= (l2_miss > l2_hits ? l2_hits : l2_miss);
 	trace_pseudo_lock_l2(l2_hits, l2_miss);
+	if (l3_hit_bits > 0) {
+		if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X)
+			l3_hits -= (l3_miss > l3_hits ? l3_hits : l3_miss);
+		trace_pseudo_lock_l3(l3_hits, l3_miss);
+	}
 
 out:
 	plr->thread_done = 1;
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
index efad50d2ee2f..2c041e6d9f05 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
@@ -25,6 +25,16 @@ TRACE_EVENT(pseudo_lock_l2,
 	    TP_printk("hits=%llu miss=%llu",
 		      __entry->l2_hits, __entry->l2_miss));
 
+TRACE_EVENT(pseudo_lock_l3,
+	    TP_PROTO(u64 l3_hits, u64 l3_miss),
+	    TP_ARGS(l3_hits, l3_miss),
+	    TP_STRUCT__entry(__field(u64, l3_hits)
+			     __field(u64, l3_miss)),
+	    TP_fast_assign(__entry->l3_hits = l3_hits;
+			   __entry->l3_miss = l3_miss;),
+	    TP_printk("hits=%llu miss=%llu",
+		      __entry->l3_hits, __entry->l3_miss));
+
 #endif /* _TRACE_PSEUDO_LOCK_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
2.17.0