The patch series below applies to AMD family 0x15 CPUs. It addresses a
warning that shows up consistently at every boot and every resume:

  "[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

The warning is misleading: the cause is not a firmware bug but the
"MC4_MISC thresholding quirk" not being applied appropriately.

Shirish S (3):
  x86/mce/amd: apply MC4_MISC thresholding to all models of family 15
  x86/mce/amd: carve out MC4_MISC thresholding quirk
  x86/mce/amd: apply MC4_MISC thresholding quirk in resume path

 arch/x86/include/asm/mce.h     |  1 +
 arch/x86/kernel/cpu/mce/amd.c  |  6 ++++
 arch/x86/kernel/cpu/mce/core.c | 65 +++++++++++++++++++++++-------------------
 3 files changed, 42 insertions(+), 30 deletions(-)

--
2.7.4

It's evident from various forums and logs that MC4_MISC thresholding is not
supported on family 0x15 processors, hence skip the x86_model check
when applying the quirk.

Changelog[v2]:
- reword commit message to adhere to coding standards
- remove check of model range
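
For illustration only, the effect of dropping the model range can be
sketched in plain userspace C. struct cpu_id and both helper names are
hypothetical stand-ins for c->x86 and c->x86_model, not kernel code, and
the model numbers used with them are just example values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct cpuinfo_x86 fields the check reads. */
struct cpu_id {
	unsigned int family;	/* c->x86 */
	unsigned int model;	/* c->x86_model */
};

/* Check before this patch: family 0x15, models 0x10-0x1f only. */
bool quirk_applies_old(const struct cpu_id *c)
{
	assert(c != NULL);
	return c->family == 0x15 && c->model >= 0x10 && c->model <= 0x1f;
}

/* Check after this patch: every family 0x15 model. */
bool quirk_applies_new(const struct cpu_id *c)
{
	assert(c != NULL);
	return c->family == 0x15;
}
```

With these helpers, a family 0x15 part whose model lies outside 0x10-0x1f
is skipped by the old check and covered by the new one, while other
families are unaffected by either.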
Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..d0c5416 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 			mce_flags.overflow_recov = 1;
 
 		/*
-		 * Turn off MC4_MISC thresholding banks on those models since
+		 * Turn off MC4_MISC thresholding banks on all models since
 		 * they're not supported there.
 		 */
-		if (c->x86 == 0x15 &&
-		    (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+		if (c->x86 == 0x15) {
 			int i;
 			u64 hwcr;
 			bool need_toggle;
--
2.7.4
There are two code paths leading to mce_amd_feature_init(), as shown below.

1) S5 -> S0: (boot)
   secondary_startup_64 -> start_kernel -> identify_boot_cpu ->
   identify_cpu ->
   mcheck_cpu_init (calls __mcheck_cpu_apply_quirks before) ->
   mce_amd_feature_init

2) S3 -> S0: (resume)
   syscore_resume -> mce_syscore_resume -> mce_amd_feature_init

It's clear that the quirks are not applied in the S3 -> S0 path, hence apply
them there as well.

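
The difference between the two paths can be mimicked with a small
userspace model. This is a sketch under stated assumptions, not kernel
code: struct sim_cpu and all sim_* names are hypothetical, and the model
assumes that the CntP bit is set again after power-on and after S3, and
that a set CntP bit is what leads mce_amd_feature_init() to the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_CNTP (1ULL << 62)	/* CntP bit, the one the quirk clears */

/* Hypothetical model of one CPU: just the MC4_MISC value. */
struct sim_cpu {
	uint64_t mc4_misc;
};

/* Assumption: CntP is set again after power-on and after S3. */
static void sim_hw_reset(struct sim_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->mc4_misc |= SIM_CNTP;
}

static void sim_apply_quirk(struct sim_cpu *cpu)
{
	cpu->mc4_misc &= ~SIM_CNTP;
}

/* Stand-in for mce_amd_feature_init(); true means it still sees CntP set. */
static bool sim_feature_init(struct sim_cpu *cpu, bool quirk_in_init)
{
	if (quirk_in_init)
		sim_apply_quirk(cpu);
	return (cpu->mc4_misc & SIM_CNTP) != 0;
}

/* S5 -> S0: quirks run in mcheck_cpu_init() before feature init. */
bool sim_boot(struct sim_cpu *cpu, bool quirk_in_init)
{
	sim_hw_reset(cpu);
	sim_apply_quirk(cpu);
	return sim_feature_init(cpu, quirk_in_init);
}

/* S3 -> S0: mce_syscore_resume() reaches feature init directly. */
bool sim_resume(struct sim_cpu *cpu, bool quirk_in_init)
{
	sim_hw_reset(cpu);
	return sim_feature_init(cpu, quirk_in_init);
}
```

In this model, sim_resume() only stops reporting a set CntP bit once the
quirk is applied from within the feature-init stand-in, which is the
change this patch makes.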
Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/amd.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..ab1b12a 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -552,6 +552,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	unsigned int bank, block, cpu = smp_processor_id();
 	int offset = -1;
 
+	/*
+	 * mcheck_cpu_init() is not called on the mce_syscore_resume() path,
+	 * hence re-apply the quirk here to be on the safe side.
+	 */
+	quirk_fam15_mc4_misc_thresholding();
+
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (mce_flags.smca)
 			smca_configure(bank, cpu);
--
2.7.4
The MC4_MISC thresholding quirk needs to be applied during both the
S5 -> S0 and S3 -> S0 state transitions, which follow different code paths.
Carve it out into its own function so that it can be applied in both
scenarios.

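
The save/toggle/clear/restore sequence that the carved-out function
performs can be tried out in userspace against simulated registers. This
is only a sketch: struct sim_msrs, sim_write_misc() and sim_quirk() are
hypothetical names, and the rule that a write to MC4_MISC only sticks
while McStatusWrEn (HWCR bit 18) is set is a modelling assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HWCR_MCSTATUSWREN	(1ULL << 18)	/* McStatusWrEn in HWCR */
#define MISC_CNTP		(1ULL << 62)	/* CntP in MC4_MISCx */

/* Simulated registers; they stand in for the real MSRs. */
struct sim_msrs {
	uint64_t hwcr;
	uint64_t mc4_misc[2];	/* MC4_MISC0, MC4_MISC1 */
};

/* Modelling assumption: the write is dropped unless McStatusWrEn is set. */
static void sim_write_misc(struct sim_msrs *m, size_t i, uint64_t val)
{
	assert(i < 2);
	if (m->hwcr & HWCR_MCSTATUSWREN)
		m->mc4_misc[i] = val;
}

/* Mirrors the order of operations of the carved-out quirk. */
void sim_quirk(struct sim_msrs *m, unsigned int family)
{
	uint64_t saved_hwcr;
	bool need_toggle;
	size_t i;

	if (family != 0x15)
		return;

	saved_hwcr = m->hwcr;
	need_toggle = !(saved_hwcr & HWCR_MCSTATUSWREN);

	/* Enable writes only if they were not enabled already. */
	if (need_toggle)
		m->hwcr = saved_hwcr | HWCR_MCSTATUSWREN;

	for (i = 0; i < 2; i++)
		sim_write_misc(m, i, m->mc4_misc[i] & ~MISC_CNTP);

	/* Put HWCR back exactly as it was found. */
	if (need_toggle)
		m->hwcr = saved_hwcr;
}
```

Keeping the family check inside the function means a caller on either
path can invoke it unconditionally, which is what lets the boot and
resume paths share it.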
Signed-off-by: Shirish S <[email protected]>
---
arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mce/core.c | 64 +++++++++++++++++++++++-------------------
2 files changed, 36 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c1a812b..328b65c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr);
static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
#endif
+void quirk_fam15_mc4_misc_thresholding(void);
static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c5416..51f61cf 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1570,6 +1570,39 @@ static void quirk_sandybridge_ifu(int bank, struct mce *m, struct pt_regs *regs)
 	m->cs = regs->cs;
 }
 
+/*
+ * Turn off MC4_MISC thresholding banks on all family 15 models since
+ * they're not supported there.
+ */
+void quirk_fam15_mc4_misc_thresholding(void)
+{
+	if (boot_cpu_data.x86 == 0x15) {
+		int i;
+		u64 hwcr;
+		bool need_toggle;
+		u32 msrs[] = {
+			0x00000413, /* MC4_MISC0 */
+			0xc0000408, /* MC4_MISC1 */
+		};
+
+		rdmsrl(MSR_K7_HWCR, hwcr);
+
+		/* McStatusWrEn has to be set */
+		need_toggle = !(hwcr & BIT(18));
+
+		if (need_toggle)
+			wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+		/* Clear CntP bit safely */
+		for (i = 0; i < ARRAY_SIZE(msrs); i++)
+			msr_clear_bit(msrs[i], 62);
+
+		/* restore old settings */
+		if (need_toggle)
+			wrmsrl(MSR_K7_HWCR, hwcr);
+	}
+}
+
 /* Add per CPU specific workarounds here */
 static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 {
@@ -1611,35 +1644,8 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 		if (c->x86 == 0x15 && c->x86_model <= 0xf)
 			mce_flags.overflow_recov = 1;
 
-		/*
-		 * Turn off MC4_MISC thresholding banks on all models since
-		 * they're not supported there.
-		 */
-		if (c->x86 == 0x15) {
-			int i;
-			u64 hwcr;
-			bool need_toggle;
-			u32 msrs[] = {
-				0x00000413, /* MC4_MISC0 */
-				0xc0000408, /* MC4_MISC1 */
-			};
-
-			rdmsrl(MSR_K7_HWCR, hwcr);
-
-			/* McStatusWrEn has to be set */
-			need_toggle = !(hwcr & BIT(18));
-
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
-			/* Clear CntP bit safely */
-			for (i = 0; i < ARRAY_SIZE(msrs); i++)
-				msr_clear_bit(msrs[i], 62);
-
-			/* restore old settings */
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr);
-		}
+		quirk_fam15_mc4_misc_thresholding();
+
 	}
 
 	if (c->x86_vendor == X86_VENDOR_INTEL) {
--
2.7.4
Commit-ID: c95b323dcd3598dd7ef5005d6723c1ba3b801093
Gitweb: https://git.kernel.org/tip/c95b323dcd3598dd7ef5005d6723c1ba3b801093
Author: Shirish S <[email protected]>
AuthorDate: Thu, 10 Jan 2019 07:54:40 +0000
Committer: Borislav Petkov <[email protected]>
CommitDate: Tue, 15 Jan 2019 18:11:38 +0100
x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
MC4_MISC thresholding is not supported on any family 0x15 processor,
hence skip the x86_model check when applying the quirk.
[ bp: massage commit message. ]
Signed-off-by: Shirish S <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c7225cb1b..d0c54160b439 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 			mce_flags.overflow_recov = 1;
 
 		/*
-		 * Turn off MC4_MISC thresholding banks on those models since
+		 * Turn off MC4_MISC thresholding banks on all models since
 		 * they're not supported there.
 		 */
-		if (c->x86 == 0x15 &&
-		    (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+		if (c->x86 == 0x15) {
 			int i;
 			u64 hwcr;
 			bool need_toggle;