2019-01-10 07:56:07

by Shirish S

Subject: [PATCH 0/3] x86/mce/amd: apply missing quirks to family 15 models (v2)

The patch series below applies to AMD family 0x15 CPUs and addresses a
warning that shows up consistently:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

at every boot and every resume. The message is misleading: the cause is not a
firmware bug but the "MC4_MISC thresholding quirk" not being applied appropriately.

Shirish S (3):
x86/mce/amd: apply MC4_MISC thresholding to all models of family 15
x86/mce/amd: carve out MC4_MISC thresholding quirk
x86/mce/amd: apply MC4_MISC thresholding quirk in resume path

arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mce/amd.c | 6 ++++
arch/x86/kernel/cpu/mce/core.c | 65 +++++++++++++++++++++++-------------------
3 files changed, 42 insertions(+), 30 deletions(-)

--
2.7.4



2019-01-10 07:56:14

by Shirish S

Subject: [PATCH 1/3] x86/mce/amd: apply MC4_MISC thresholding to all models of family 15

It's evident from various forums and logs that MC4_MISC thresholding is not
supported on family 0x15 processors, hence skip the x86_model check
when applying the quirk.

Changelog[v2]:
- reword commit message to adhere to coding standards
- remove check of model range

Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..d0c5416 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
mce_flags.overflow_recov = 1;

/*
- * Turn off MC4_MISC thresholding banks on those models since
+ * Turn off MC4_MISC thresholding banks on all models since
* they're not supported there.
*/
- if (c->x86 == 0x15 &&
- (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+ if (c->x86 == 0x15) {
int i;
u64 hwcr;
bool need_toggle;
--
2.7.4
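
For illustration only, the effect of dropping the model-range check can be
sketched in user-space C. This is not kernel code: struct cpu_id and the
function names are hypothetical stand-ins for the c->x86 and c->x86_model
fields, and the sample model and family numbers outside the old range are
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct cpuinfo_x86 fields used by the check. */
struct cpu_id {
	unsigned int family;	/* corresponds to c->x86 */
	unsigned int model;	/* corresponds to c->x86_model */
};

/* Condition before the patch: family 0x15, models 0x10..0x1f only. */
static bool quirk_applies_old(const struct cpu_id *c)
{
	return c->family == 0x15 && c->model >= 0x10 && c->model <= 0x1f;
}

/* Condition after the patch: every family 0x15 model. */
static bool quirk_applies_new(const struct cpu_id *c)
{
	return c->family == 0x15;
}

/* Walk a few sample IDs and confirm where the two checks differ. */
static void compare_checks(void)
{
	const struct cpu_id in_range = { 0x15, 0x10 };
	const struct cpu_id out_of_range = { 0x15, 0x60 };	/* arbitrary sample model */
	const struct cpu_id other_family = { 0x17, 0x10 };	/* arbitrary non-0x15 family */

	/* Models inside the old range are covered by both checks. */
	assert(quirk_applies_old(&in_range) && quirk_applies_new(&in_range));
	/* Models outside the old range are only covered after the patch. */
	assert(!quirk_applies_old(&out_of_range) && quirk_applies_new(&out_of_range));
	/* Other families are unaffected either way. */
	assert(!quirk_applies_old(&other_family) && !quirk_applies_new(&other_family));
}
```

The only behavioral change is for family 0x15 models outside 0x10..0x1f,
which previously skipped the quirk.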


2019-01-10 07:57:18

by Shirish S

Subject: [PATCH 3/3] x86/mce/amd: apply MC4_MISC thresholding quirk in resume path

Two code paths lead to mce_amd_feature_init():
1) S5 -> S0: (boot)
secondary_startup_64 -> start_kernel -> identify_boot_cpu ->
identify_cpu ->
mcheck_cpu_init (calls __mcheck_cpu_apply_quirks before) ->
mce_amd_feature_init

2) S3 -> S0: (resume)
syscore_resume -> mce_syscore_resume -> mce_amd_feature_init

It's clear that the quirks are not applied in the S3 -> S0 path, hence apply
them there as well.

Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/amd.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..ab1b12a 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -552,6 +552,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;

+ /*
+ * mcheck_cpu_init() is not called on the mce_syscore_resume() path,
+ * hence re-apply the quirk here, to be on the safe side.
+ */
+ quirk_fam15_mc4_misc_thresholding();
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
--
2.7.4
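
A toy user-space model of the two paths described in the commit message may
help show why the resume path needs the extra call. Everything here is a
simplified sketch: the sim_* names are hypothetical, "CntP still set" is used
as a proxy for the condition that leads to the warning, and the model assumes
that the MC4_MISC register state is back at its default after S3, which the
patch description implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated hardware state: true while CntP is still set in MC4_MISCn. */
static bool cntp_set = true;

/* Stand-in for quirk_fam15_mc4_misc_thresholding(): clears CntP. */
static void sim_apply_quirk(void)
{
	cntp_set = false;
}

/*
 * Stand-in for mce_amd_feature_init(). Returns true if it would still see
 * CntP set (the proxy for the warning). reapply_quirk models patch 3/3.
 */
static bool sim_feature_init(bool reapply_quirk)
{
	if (reapply_quirk)
		sim_apply_quirk();
	return cntp_set;
}

/* S5 -> S0: quirks run first (via __mcheck_cpu_apply_quirks), then feature init. */
static bool sim_boot(bool patched)
{
	sim_apply_quirk();
	assert(!cntp_set);	/* quirk already applied on the boot path */
	return sim_feature_init(patched);
}

/* S3 -> S0: mce_syscore_resume goes straight to feature init. */
static bool sim_resume(bool patched)
{
	cntp_set = true;	/* assumption: register state is lost across S3 */
	return sim_feature_init(patched);
}
```

In this model the boot path is clean with or without the patch, while the
resume path only stops seeing CntP once the quirk is re-applied inside the
feature-init step.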


2019-01-10 07:58:01

by Shirish S

Subject: [PATCH 2/3] x86/mce/amd: carve out MC4_MISC thresholding quirk

The MC4_MISC thresholding quirk needs to be applied during both the S5 -> S0
and S3 -> S0 state transitions, which follow different code paths. Carve it
out into its own function so that it can be called from both.

Signed-off-by: Shirish S <[email protected]>
---
arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mce/core.c | 64 +++++++++++++++++++++++-------------------
2 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c1a812b..328b65c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr);
static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
#endif
+void quirk_fam15_mc4_misc_thresholding(void);

static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c5416..51f61cf 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1570,6 +1570,39 @@ static void quirk_sandybridge_ifu(int bank, struct mce *m, struct pt_regs *regs)
m->cs = regs->cs;
}

+/*
+ * Turn off MC4_MISC thresholding banks on all family 15 models since
+ * they're not supported there.
+ */
+void quirk_fam15_mc4_misc_thresholding(void)
+{
+ if (boot_cpu_data.x86 == 0x15) {
+ int i;
+ u64 hwcr;
+ bool need_toggle;
+ u32 msrs[] = {
+ 0x00000413, /* MC4_MISC0 */
+ 0xc0000408, /* MC4_MISC1 */
+ };
+
+ rdmsrl(MSR_K7_HWCR, hwcr);
+
+ /* McStatusWrEn has to be set */
+ need_toggle = !(hwcr & BIT(18));
+
+ if (need_toggle)
+ wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+ /* Clear CntP bit safely */
+ for (i = 0; i < ARRAY_SIZE(msrs); i++)
+ msr_clear_bit(msrs[i], 62);
+
+ /* restore old settings */
+ if (need_toggle)
+ wrmsrl(MSR_K7_HWCR, hwcr);
+ }
+}
+
/* Add per CPU specific workarounds here */
static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
@@ -1611,35 +1644,8 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
if (c->x86 == 0x15 && c->x86_model <= 0xf)
mce_flags.overflow_recov = 1;

- /*
- * Turn off MC4_MISC thresholding banks on all models since
- * they're not supported there.
- */
- if (c->x86 == 0x15) {
- int i;
- u64 hwcr;
- bool need_toggle;
- u32 msrs[] = {
- 0x00000413, /* MC4_MISC0 */
- 0xc0000408, /* MC4_MISC1 */
- };
-
- rdmsrl(MSR_K7_HWCR, hwcr);
-
- /* McStatusWrEn has to be set */
- need_toggle = !(hwcr & BIT(18));
-
- if (need_toggle)
- wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
- /* Clear CntP bit safely */
- for (i = 0; i < ARRAY_SIZE(msrs); i++)
- msr_clear_bit(msrs[i], 62);
-
- /* restore old settings */
- if (need_toggle)
- wrmsrl(MSR_K7_HWCR, hwcr);
- }
+ quirk_fam15_mc4_misc_thresholding();
+
}

if (c->x86_vendor == X86_VENDOR_INTEL) {
--
2.7.4
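
The read/toggle/clear/restore sequence in the carved-out function can be
sketched against a fake register file. This is a user-space illustration, not
the kernel implementation: struct sim_msrs, sim_quirk() and the macro names
are hypothetical, plain memory stands in for rdmsrl()/wrmsrl()/msr_clear_bit(),
and only the bit positions (18 for McStatusWrEn, 62 for CntP) come from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_BIT(n)		(1ULL << (n))
#define SIM_MCSTATUSWREN	SIM_BIT(18)	/* HWCR bit named in the patch comment */
#define SIM_CNTP		SIM_BIT(62)	/* MC4_MISCn bit cleared by the quirk */

/* Hypothetical fake register file standing in for the real MSRs. */
struct sim_msrs {
	uint64_t hwcr;		/* MSR_K7_HWCR */
	uint64_t mc4_misc[2];	/* MC4_MISC0, MC4_MISC1 */
};

static void sim_quirk(struct sim_msrs *m)
{
	uint64_t hwcr = m->hwcr;			/* rdmsrl() in the kernel */
	bool need_toggle = !(hwcr & SIM_MCSTATUSWREN);
	size_t i;

	/* Temporarily set McStatusWrEn if it was not already set. */
	if (need_toggle)
		m->hwcr = hwcr | SIM_MCSTATUSWREN;

	/* The MISC registers are only modified while McStatusWrEn is set. */
	assert(m->hwcr & SIM_MCSTATUSWREN);
	for (i = 0; i < sizeof(m->mc4_misc) / sizeof(m->mc4_misc[0]); i++)
		m->mc4_misc[i] &= ~SIM_CNTP;

	/* Restore the original HWCR value if it was changed above. */
	if (need_toggle)
		m->hwcr = hwcr;
}
```

After sim_quirk() returns, HWCR holds exactly the value it had on entry and
every other bit in the two MISC registers is untouched; only bit 62 changes.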


Subject: [tip:ras/core] x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models

Commit-ID: c95b323dcd3598dd7ef5005d6723c1ba3b801093
Gitweb: https://git.kernel.org/tip/c95b323dcd3598dd7ef5005d6723c1ba3b801093
Author: Shirish S <[email protected]>
AuthorDate: Thu, 10 Jan 2019 07:54:40 +0000
Committer: Borislav Petkov <[email protected]>
CommitDate: Tue, 15 Jan 2019 18:11:38 +0100

x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models

MC4_MISC thresholding is not supported on any family 0x15 processor,
hence skip the x86_model check when applying the quirk.

[ bp: massage commit message. ]

Signed-off-by: Shirish S <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c7225cb1b..d0c54160b439 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
mce_flags.overflow_recov = 1;

/*
- * Turn off MC4_MISC thresholding banks on those models since
+ * Turn off MC4_MISC thresholding banks on all models since
* they're not supported there.
*/
- if (c->x86 == 0x15 &&
- (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+ if (c->x86 == 0x15) {
int i;
u64 hwcr;
bool need_toggle;