2019-01-16 22:42:19

by Shirish S

Subject: [PATCH 0/2] x86/mce/amd: apply missing quirks to family 15 models (v2)

The patch series below applies to AMD family 15h (0x15) CPUs and addresses a
consistent warning of:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

at every boot and every resume, which is misleading because the cause is not a
firmware bug but the "MC4_MISC thresholding quirk" not being appropriately
applied.

Shirish S (2):
x86/mce/amd: apply MC4_MISC thresholding to all models of family 15
x86/mce/amd: carve out MC4_MISC thresholding quirk

arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mce/core.c | 30 ------------------------------
2 files changed, 34 insertions(+), 30 deletions(-)

--
2.7.4



2019-01-16 22:42:08

by Shirish S

Subject: [PATCH 1/2] x86/mce/amd: apply MC4_MISC thresholding to all models of family 15

It's evident from various forums and logs that MC4_MISC thresholding is not
supported on family 15h (0x15) processors, so skip the x86_model check
when applying the quirk.

Changelog[v2]:
- reword commit message to adhere to coding standards
- remove check of model range

Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..d0c5416 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 			mce_flags.overflow_recov = 1;

 		/*
-		 * Turn off MC4_MISC thresholding banks on those models since
+		 * Turn off MC4_MISC thresholding banks on all models since
 		 * they're not supported there.
 		 */
-		if (c->x86 == 0x15 &&
-		    (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+		if (c->x86 == 0x15) {
 			int i;
 			u64 hwcr;
 			bool need_toggle;
--
2.7.4
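
The effect of dropping the model check can be illustrated with a small
user-space sketch. This is not kernel code: struct cpu_id,
quirk_applies_old() and quirk_applies_new() are hypothetical names, and the
struct is an assumed, trimmed-down stand-in for struct cpuinfo_x86.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct cpuinfo_x86. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/* Check before this patch: family 0x15, models 0x10-0x1f only. */
static bool quirk_applies_old(const struct cpu_id *c)
{
	return c->family == 0x15 && c->model >= 0x10 && c->model <= 0x1f;
}

/* Check after this patch: every family 0x15 model. */
static bool quirk_applies_new(const struct cpu_id *c)
{
	return c->family == 0x15;
}
```

In this sketch, a family 0x15 part whose model lies outside 0x10-0x1f (for
example an illustrative model 0x70) is skipped by the old check and covered by
the new one, while other families are unaffected by either.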


2019-01-16 22:44:35

by Shirish S

Subject: [PATCH 2/2] x86/mce/amd: carve out MC4_MISC thresholding quirk

The MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
S3 -> S0 state transitions, which follow different code paths. Hence,
carve it out and move it to mce_amd_feature_init(), which is the converging
point of both code paths.

Changelog[v2]:
- move the quirk to mce/amd.c

Signed-off-by: Shirish S <[email protected]>
---
arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mce/core.c | 29 -----------------------------
2 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..f6a5c96 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -545,6 +545,33 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 	return offset;
 }

+void mc4_misc_thresholding_quirk(void)
+{
+	int i;
+	u64 hwcr;
+	bool need_toggle;
+	u32 msrs[] = {
+		0x00000413, /* MC4_MISC0 */
+		0xc0000408, /* MC4_MISC1 */
+	};
+
+	rdmsrl(MSR_K7_HWCR, hwcr);
+
+	/* McStatusWrEn has to be set */
+	need_toggle = !(hwcr & BIT(18));
+
+	if (need_toggle)
+		wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+	/* Clear CntP bit safely */
+	for (i = 0; i < ARRAY_SIZE(msrs); i++)
+		msr_clear_bit(msrs[i], 62);
+
+	/* restore old settings */
+	if (need_toggle)
+		wrmsrl(MSR_K7_HWCR, hwcr);
+}
+
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
@@ -552,6 +579,13 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	unsigned int bank, block, cpu = smp_processor_id();
 	int offset = -1;

+	/*
+	 * Turn off MC4_MISC thresholding banks on all family 15 models since
+	 * they're not supported there.
+	 */
+	if (c->x86 == 0x15)
+		mc4_misc_thresholding_quirk();
+
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (mce_flags.smca)
 			smca_configure(bank, cpu);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c5416..6063ae2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1611,35 +1611,6 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 		if (c->x86 == 0x15 && c->x86_model <= 0xf)
 			mce_flags.overflow_recov = 1;

-		/*
-		 * Turn off MC4_MISC thresholding banks on all models since
-		 * they're not supported there.
-		 */
-		if (c->x86 == 0x15) {
-			int i;
-			u64 hwcr;
-			bool need_toggle;
-			u32 msrs[] = {
-				0x00000413, /* MC4_MISC0 */
-				0xc0000408, /* MC4_MISC1 */
-			};
-
-			rdmsrl(MSR_K7_HWCR, hwcr);
-
-			/* McStatusWrEn has to be set */
-			need_toggle = !(hwcr & BIT(18));
-
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
-			/* Clear CntP bit safely */
-			for (i = 0; i < ARRAY_SIZE(msrs); i++)
-				msr_clear_bit(msrs[i], 62);
-
-			/* restore old settings */
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr);
-		}
 	}

 	if (c->x86_vendor == X86_VENDOR_INTEL) {
--
2.7.4
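
The read-modify-write sequence of the quirk (set McStatusWrEn if needed,
clear CntP in both MC4_MISC registers, then restore HWCR) can be tried out in
user space with a simulated register file. This is only a sketch under stated
assumptions: the sim_* names and the array-backed "MSRs" are hypothetical and
stand in for rdmsrl()/wrmsrl()/msr_clear_bit(), and the family check is folded
into the function for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* Hypothetical stand-in for the MSR space; not real MSR addresses. */
enum { SIM_HWCR, SIM_MC4_MISC0, SIM_MC4_MISC1, SIM_NR_MSRS };

static uint64_t sim_msr[SIM_NR_MSRS];

static uint64_t sim_rdmsr(int idx)
{
	return sim_msr[idx];
}

static void sim_wrmsr(int idx, uint64_t val)
{
	sim_msr[idx] = val;
}

/* Mirrors the patch's sequence on the simulated registers. */
static void sim_mc4_misc_quirk(unsigned int family)
{
	static const int misc[] = { SIM_MC4_MISC0, SIM_MC4_MISC1 };
	uint64_t hwcr;
	bool need_toggle;
	size_t i;

	if (family != 0x15)
		return;

	hwcr = sim_rdmsr(SIM_HWCR);

	/* McStatusWrEn (bit 18) has to be set while the bits are cleared */
	need_toggle = !(hwcr & BIT_ULL(18));
	if (need_toggle)
		sim_wrmsr(SIM_HWCR, hwcr | BIT_ULL(18));

	/* Clear CntP (bit 62), leaving all other bits untouched */
	for (i = 0; i < sizeof(misc) / sizeof(misc[0]); i++)
		sim_wrmsr(misc[i], sim_rdmsr(misc[i]) & ~BIT_ULL(62));

	/* Restore the original HWCR value */
	if (need_toggle)
		sim_wrmsr(SIM_HWCR, hwcr);
}
```

Calling sim_mc4_misc_quirk(0x15) with bit 62 set in both simulated MC4_MISC
registers leaves that bit cleared and the simulated HWCR value unchanged; any
other family value makes the function return without touching the registers.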


2019-01-17 07:04:47

by Borislav Petkov

Subject: Re: [PATCH 2/2] x86/mce/amd: carve out MC4_MISC thresholding quirk

On Wed, Jan 16, 2019 at 03:10:40PM +0000, S, Shirish wrote:
> MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
> S3 -> S0 state transitions, which follow different code paths, hence
> carve it out and move it to mce_amd_feature_init(), which is the converging
> point of both code paths.
>
> Changelog[v2]:
> - move the quirk to mce/amd.c

For future reference: changelog lines ...

> Signed-off-by: Shirish S <[email protected]>
> ---

... land below this line so that they are not part of the commit
message.

> arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/mce/core.c | 29 -----------------------------
> 2 files changed, 34 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> index 89298c8..f6a5c96 100644
> --- a/arch/x86/kernel/cpu/mce/amd.c
> +++ b/arch/x86/kernel/cpu/mce/amd.c
> @@ -545,6 +545,33 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
> return offset;
> }
>
> +void mc4_misc_thresholding_quirk(void)

Functions should have a verb in their names.

I've fixed those and other issues now but make sure you take a look at
those nice documents here, for future reference, when preparing other
patches:

https://www.kernel.org/doc/html/latest/process/index.html

Thx.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Subject: [tip:ras/core] x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk

Commit-ID: 30aa3d26edb0f3d7992757287eec0ca588a5c259
Gitweb: https://git.kernel.org/tip/30aa3d26edb0f3d7992757287eec0ca588a5c259
Author: Shirish S <[email protected]>
AuthorDate: Wed, 16 Jan 2019 15:10:40 +0000
Committer: Borislav Petkov <[email protected]>
CommitDate: Wed, 16 Jan 2019 19:42:00 +0100

x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk

The MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
S3 -> S0 state transitions, which follow different code paths. Carve it
out into a separate function and call it from mce_amd_feature_init(), where
the two code paths of the state transitions converge.

[ bp: massage commit message and the carved out function. ]

Signed-off-by: Shirish S <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Yazen Ghannam <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/mce/amd.c | 36 ++++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mce/core.c | 29 -----------------------------
2 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c83de53..ed3327342b40 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -545,6 +545,40 @@ out:
 	return offset;
 }

+/*
+ * Turn off MC4_MISC thresholding banks on all family 0x15 models since
+ * they're not supported there.
+ */
+void disable_err_thresholding(struct cpuinfo_x86 *c)
+{
+	int i;
+	u64 hwcr;
+	bool need_toggle;
+	u32 msrs[] = {
+		0x00000413, /* MC4_MISC0 */
+		0xc0000408, /* MC4_MISC1 */
+	};
+
+	if (c->x86 != 0x15)
+		return;
+
+	rdmsrl(MSR_K7_HWCR, hwcr);
+
+	/* McStatusWrEn has to be set */
+	need_toggle = !(hwcr & BIT(18));
+
+	if (need_toggle)
+		wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+	/* Clear CntP bit safely */
+	for (i = 0; i < ARRAY_SIZE(msrs); i++)
+		msr_clear_bit(msrs[i], 62);
+
+	/* restore old settings */
+	if (need_toggle)
+		wrmsrl(MSR_K7_HWCR, hwcr);
+}
+
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
@@ -552,6 +586,8 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	unsigned int bank, block, cpu = smp_processor_id();
 	int offset = -1;

+	disable_err_thresholding(c);
+
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (mce_flags.smca)
 			smca_configure(bank, cpu);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c54160b439..6063ae2376b2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1611,35 +1611,6 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 		if (c->x86 == 0x15 && c->x86_model <= 0xf)
 			mce_flags.overflow_recov = 1;

-		/*
-		 * Turn off MC4_MISC thresholding banks on all models since
-		 * they're not supported there.
-		 */
-		if (c->x86 == 0x15) {
-			int i;
-			u64 hwcr;
-			bool need_toggle;
-			u32 msrs[] = {
-				0x00000413, /* MC4_MISC0 */
-				0xc0000408, /* MC4_MISC1 */
-			};
-
-			rdmsrl(MSR_K7_HWCR, hwcr);
-
-			/* McStatusWrEn has to be set */
-			need_toggle = !(hwcr & BIT(18));
-
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
-			/* Clear CntP bit safely */
-			for (i = 0; i < ARRAY_SIZE(msrs); i++)
-				msr_clear_bit(msrs[i], 62);
-
-			/* restore old settings */
-			if (need_toggle)
-				wrmsrl(MSR_K7_HWCR, hwcr);
-		}
 	}

 	if (c->x86_vendor == X86_VENDOR_INTEL) {

2019-01-17 09:09:43

by S

Subject: Re: [PATCH 2/2] x86/mce/amd: carve out MC4_MISC thresholding quirk


On 1/17/2019 12:55 AM, Borislav Petkov wrote:
> On Wed, Jan 16, 2019 at 03:10:40PM +0000, S, Shirish wrote:
>> MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
>> S3 -> S0 state transitions, which follow different code paths, hence
>> carve it out and move it to mce_amd_feature_init(), which is the converging
>> point of both code paths.
>>
>> Changelog[v2]:
>> - move the quirk to mce/amd.c
> For future reference: changelog lines ...
>
>> Signed-off-by: Shirish S <[email protected]>
>> ---
> ... land below this line so that they are not part of the commit
> message.
Understood.
>> arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++
>> arch/x86/kernel/cpu/mce/core.c | 29 -----------------------------
>> 2 files changed, 34 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
>> index 89298c8..f6a5c96 100644
>> --- a/arch/x86/kernel/cpu/mce/amd.c
>> +++ b/arch/x86/kernel/cpu/mce/amd.c
>> @@ -545,6 +545,33 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
>> return offset;
>> }
>>
>> +void mc4_misc_thresholding_quirk(void)
> Functions should have a verb in their names.
>
> I've fixed those and other issues now but make sure you take a look at
> those nice documents here, for future reference, when preparing other
> patches:
>
> https://www.kernel.org/doc/html/latest/process/index.html

Thanks for the support. I shall follow this practice from the next patch onwards.

Regards,

Shirish S

> Thx.
>
--
Regards,
Shirish S

2019-01-17 10:12:03

by Borislav Petkov

Subject: Re: [PATCH 2/2] x86/mce/amd: carve out MC4_MISC thresholding quirk

On Thu, Jan 17, 2019 at 06:06:06AM +0000, S wrote:
> Thanks for the support. I shall follow this practice from the next patch onwards.

Ok, but this mail is still broken. I see:

From: S
To: Borislav Petkov <[email protected]>, "S, Shirish" <[email protected]>
CC: Thomas Gleixner <[email protected]>, Ingo Molnar <[email protected]>,

So I leave it up to you to pay attention to it and to fix it - I think
I've pointed it out to you enough times already.

In case you're wondering why you're not getting replies to your mails,
you'll know why.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.