Commit 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h"),
claimed L2 misses were unsupported, due to them not being found in its
referenced documentation, whose link has now moved [1].
That old documentation listed PMCx064 unit mask bit 3 as:
"LsRdBlkC: LS Read Block C S L X Change to X Miss."
and bit 0 as:
"IcFillMiss: IC Fill Miss"
We now have new public documentation [2] with improved descriptions, that
clearly indicate what events those unit mask bits represent:
Bit 3 now clearly states:
"LsRdBlkC: Data Cache Req Miss in L2 (all types)"
and bit 0 is:
"IcFillMiss: Instruction Cache Req Miss in L2."
So we can now add support for L2 misses in perf's genericised events as
PMCx064 with both the above unit masks.
[1] The commit's original documentation reference, "Processor Programming
Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors",
originally available here:
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
is now available here:
https://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Models_00h-0Fh.pdf
[2] "Processor Programming Reference (PPR) for Family 17h Model 31h,
Revision B0 Processors", available here:
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Babu Moger <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Frank van der Linden <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang Rui <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luwei Kang <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reported-by: Babu Moger <[email protected]>
Tested-by: Babu Moger <[email protected]>
Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
Signed-off-by: Kim Phillips <[email protected]>
---
RESENDing because I wasn't sure if the original version of this patch would get
ignored because it was sent with "[PATCH internal v2]" in the subject line:
https://lkml.org/lkml/2020/1/8/894
FWIW, I updated the Cc list to merge with those in patch 2/2 of this series.
arch/x86/events/amd/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 1f22b6bbda68..39eb276d0277 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -250,6 +250,7 @@ static const u64 amd_f17h_perfmon_event_map[PERF_COUNT_HW_MAX] =
[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
[PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x0964,
[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287,
--
2.24.1
commit aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions
Retired) performance counter") added support for 'perf -e msr/irperf/',
but when exercised, we always get a 0 count:
BEFORE:
$ sudo perf stat -e instructions,msr/irperf/ true
Performance counter stats for 'true':
624,833 instructions
# 0.00 stalled cycles per insn
0 msr/irperf/
It turns out it simply needs its enable bit - HWCR bit 30 - set. This patch
does just that.
Enablement is restricted to all machines advertising IRPERF capability,
except those susceptible to an erratum that makes the IRPERF return
bad values.
That erratum occurs in Family 17h models 00-1fh [1], but not in F17h
models 20h and above [2].
AFTER (on a family 17h model 31h machine):
$ sudo perf stat -e instructions,msr/irperf/ true
Performance counter stats for 'true':
621,690 instructions
# 0.00 stalled cycles per insn
622,490 msr/irperf/
[1] "Revision Guide for AMD Family 17h Models 00h-0Fh Processors",
currently available here:
https://www.amd.com/system/files/TechDocs/55449_Fam_17h_M_00h-0Fh_Rev_Guide.pdf
[2] "Revision Guide for AMD Family 17h Models 30h-3Fh Processors",
currently available here:
https://developer.amd.com/wp-content/resources/56323-PUB_0.74.pdf
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Babu Moger <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Frank van der Linden <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang Rui <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luwei Kang <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter")
Signed-off-by: Kim Phillips <[email protected]>
---
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/amd.c | 9 +++++++++
2 files changed, 11 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 084e98da04a7..717660f82f8f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -512,6 +512,8 @@
#define MSR_K7_HWCR 0xc0010015
#define MSR_K7_HWCR_SMMLOCK_BIT 0
#define MSR_K7_HWCR_SMMLOCK BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
+#define MSR_K7_HWCR_IRPERF_EN_BIT 30
+#define MSR_K7_HWCR_IRPERF_EN BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
#define MSR_K7_FID_VID_CTL 0xc0010041
#define MSR_K7_FID_VID_STATUS 0xc0010042
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 90f75e515876..765b2b6672eb 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -978,6 +978,15 @@ static void init_amd(struct cpuinfo_x86 *c)
/* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */
if (!cpu_has(c, X86_FEATURE_XENPV))
set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
+
+ /*
+ * Turn on the Instructions Retired free counter on machines not
+ * susceptible to erratum #1054 "Instructions Retired Performance
+ * Counter May Be Inaccurate"
+ */
+ if (cpu_has(c, X86_FEATURE_IRPERF) &&
+ !(c->x86 == 0x17 && c->x86_model <= 0x1f))
+ msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_IRPERF_EN_BIT);
}
#ifdef CONFIG_X86_32
--
2.24.1
> + if (cpu_has(c, X86_FEATURE_IRPERF) &&
> + !(c->x86 == 0x17 && c->x86_model <= 0x1f))
Such checks are normally through a x86_pmu lag or possibly a X86_BUG_* flag
> + msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_IRPERF_EN_BIT);
> }
>
> #ifdef CONFIG_X86_32
> --
> 2.24.1
>