2016-11-18 00:36:02

by Tony Luck

Subject: [PATCH 2/2] mcelog: Print the PPIN in machine check records when it is available

From: Tony Luck <[email protected]>

Intel Xeons from Ivy Bridge onwards support a processor identification
number. Kernels v4.9 and higher include it in the "mce" record.

Signed-off-by: Tony Luck <[email protected]>
---
mcelog.c | 3 +++
mcelog.h | 3 +++
2 files changed, 6 insertions(+)

diff --git a/mcelog.c b/mcelog.c
index 7214a0d23f65..e79996db9b5b 100644
--- a/mcelog.c
+++ b/mcelog.c
@@ -441,6 +441,9 @@ static void dump_mce(struct mce *m, unsigned recordlen)
if (n > 0)
Wprintf("\n");

+ if (recordlen >= offsetof(struct mce, ppin) && m->ppin)
+ n += Wprintf("PPIN %llx\n", m->ppin);
+
if (recordlen >= offsetof(struct mce, cpuid) && m->cpuid) {
u32 fam, mod;
parse_cpuid(m->cpuid, &fam, &mod);
diff --git a/mcelog.h b/mcelog.h
index 254b3a092fba..9a54077e5474 100644
--- a/mcelog.h
+++ b/mcelog.h
@@ -31,6 +31,9 @@ struct mce {
__u32 socketid; /* CPU socket ID */
__u32 apicid; /* CPU initial apic ID */
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
+ __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
+ __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
};

#define X86_VENDOR_INTEL 0
--
2.7.4


2016-11-18 00:36:11

by Tony Luck

Subject: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

From: Tony Luck <[email protected]>

Intel Xeons from Ivy Bridge onwards support a processor identification
number. On systems that have it, include it in the machine check record.
I'm told that this would be helpful for users that run large data centers
with multi-socket servers to keep track of which CPUs are seeing errors.

Signed-off-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 78f3760ca1f2..710273c617b8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,10 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)

/* Intel MSRs. Some also available on other CPUs */
+
+#define MSR_PPIN_CTL 0x0000004e
+#define MSR_PPIN 0x0000004f
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 69a6e07e3149..eb6247a7009b 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -28,6 +28,7 @@ struct mce {
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
__u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
__u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
};

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a7fdf453d895..eb9ce5023da3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -43,6 +43,7 @@
#include <linux/export.h>
#include <linux/jump_label.h>

+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/traps.h>
#include <asm/tlbflush.h>
@@ -122,6 +123,9 @@ static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
*/
ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain);

+/* Some Intel Xeons support per socket protected processor inventory number */
+static bool have_ppin;
+
/* Do initial initialization of a struct mce */
void mce_setup(struct mce *m)
{
@@ -135,6 +139,8 @@ void mce_setup(struct mce *m)
m->socketid = cpu_data(m->extcpu).phys_proc_id;
m->apicid = cpu_data(m->extcpu).initial_apicid;
rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+ if (have_ppin)
+ rdmsrl(MSR_PPIN, m->ppin);
}

DEFINE_PER_CPU(struct mce, injectm);
@@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str)
}
__setup("mce", mcheck_enable);

+static void mcheck_intel_ppin_init(void)
+{
+ unsigned long long msr_ppin_ctl;
+
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return;
+ switch (boot_cpu_data.x86_model) {
+ case INTEL_FAM6_IVYBRIDGE_X:
+ case INTEL_FAM6_HASWELL_X:
+ case INTEL_FAM6_BROADWELL_XEON_D:
+ case INTEL_FAM6_BROADWELL_X:
+ case INTEL_FAM6_SKYLAKE_X:
+ if (rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl))
+ return;
+ if (msr_ppin_ctl == 1) {
+ pr_info("PPIN available but disabled\n");
+ return;
+ }
+ /* if PPIN is disabled, but not locked, try to enable */
+ if (msr_ppin_ctl == 0) {
+ wrmsrl_safe(MSR_PPIN_CTL, 2);
+ rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl);
+ }
+ if (msr_ppin_ctl == 2)
+ have_ppin = 1;
+ }
+}
+
int __init mcheck_init(void)
{
+ mcheck_intel_ppin_init();
mcheck_intel_therm_init();
mce_register_decode_chain(&mce_srao_nb);
mcheck_vendor_init_severity();
--
2.7.4

2016-11-18 13:00:29

by Borislav Petkov

Subject: Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

On Thu, Nov 17, 2016 at 04:35:48PM -0800, Luck, Tony wrote:
> From: Tony Luck <[email protected]>
>
> Intel Xeons from Ivy Bridge onwards support a processor identification
> number. On systems that have it, include it in the machine check record.
> I'm told that this would be helpful for users that run large data centers
> with multi-socket servers to keep track of which CPUs are seeing errors.
>
> Signed-off-by: Tony Luck <[email protected]>
> ---
> arch/x86/include/asm/msr-index.h | 4 ++++
> arch/x86/include/uapi/asm/mce.h | 1 +
> arch/x86/kernel/cpu/mcheck/mce.c | 35 +++++++++++++++++++++++++++++++++++
> 3 files changed, 40 insertions(+)

...

> @@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str)
> }
> __setup("mce", mcheck_enable);
>
> +static void mcheck_intel_ppin_init(void)

So this functionality could all be moved to arch/x86/kernel/cpu/intel.c
where you could set an artificial X86_FEATURE_PPIN and get rid of the
have_ppin var.

> +{
> + unsigned long long msr_ppin_ctl;
> +
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> + return;

Then, that check can go.

> + switch (boot_cpu_data.x86_model) {
> + case INTEL_FAM6_IVYBRIDGE_X:
> + case INTEL_FAM6_HASWELL_X:
> + case INTEL_FAM6_BROADWELL_XEON_D:
> + case INTEL_FAM6_BROADWELL_X:
> + case INTEL_FAM6_SKYLAKE_X:
> + if (rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl))
> + return;

I don't think you need to check models - if the RDMSR fails, you're
done.

> + if (msr_ppin_ctl == 1) {

& BIT_ULL(0)

for future robustness in case those other reserved bits get used.

> + pr_info("PPIN available but disabled\n");

We don't care, do we?

> + return;
> + }
> + /* if PPIN is disabled, but not locked, try to enable */
> + if (msr_ppin_ctl == 0) {

Also, properly masked off. There are [63:2] reserved bits which might be
assigned someday.

> + wrmsrl_safe(MSR_PPIN_CTL, 2);
> + rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl);

Why aren't we programming a number here? Or are users supposed to do
that?

If so, please design a proper sysfs interface and not make them use
msr-tools.
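
(A read-only attribute would be enough if the number turns out to be
fixed. A minimal sketch, assuming a hypothetical per-CPU "ppin"
attribute — registration, includes and error handling omitted; not part
of any posted patch:

static ssize_t ppin_show(struct device *dev,
			 struct device_attribute *attr, char *buf)
{
	u64 val = 0;

	/* for CPU devices, dev->id is the CPU number */
	if (rdmsrl_safe_on_cpu(dev->id, MSR_PPIN, &val))
		return -EIO;

	return sprintf(buf, "%#llx\n", val);
}
static DEVICE_ATTR_RO(ppin);
)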

> + }
> + if (msr_ppin_ctl == 2)
> + have_ppin = 1;

set_cpu_cap(c, X86_FEATURE_PPIN);

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2016-11-18 16:41:50

by Tony Luck

Subject: Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

On Fri, Nov 18, 2016 at 02:00:22PM +0100, Borislav Petkov wrote:
> On Thu, Nov 17, 2016 at 04:35:48PM -0800, Luck, Tony wrote:
> > @@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str)
> > }
> > __setup("mce", mcheck_enable);
> >
> > +static void mcheck_intel_ppin_init(void)
>
> So this functionality could all be moved to arch/x86/kernel/cpu/intel.c
> where you could set an artificial X86_FEATURE_PPIN and get rid of the
> have_ppin var.

Ok - will do.

> > + switch (boot_cpu_data.x86_model) {
> > + case INTEL_FAM6_IVYBRIDGE_X:
> > + case INTEL_FAM6_HASWELL_X:
> > + case INTEL_FAM6_BROADWELL_XEON_D:
> > + case INTEL_FAM6_BROADWELL_X:
> > + case INTEL_FAM6_SKYLAKE_X:
> > + if (rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl))
> > + return;
>
> I don't think you need to check models - if the RDMSR fails, you're
> done.

Other models may use this MSR number for some other purpose. So the
read might succeed, but what I get might be something else entirely.
Technically with the model check I shouldn't have to use the _safe
versions ... but I'm paranoid that some SKUs might not implement this.

> > + if (msr_ppin_ctl == 1) {
>
> & BIT_ULL(0)
>
> for future robustness in case those other reserved bits get used.

Unlikely ... but paranoia is good (see above about using rdmsr_safe).

> > + pr_info("PPIN available but disabled\n");
>
> We don't care, do we?

Probably not ... there might be a BIOS setting, but the user that
finds they aren't getting PPIN in their logs could diagnose by making
their own rdmsr checks ... will delete this pr_info().
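
Such a check is a couple of lines from userspace — "rdmsr -p 0 0x4f"
with msr-tools, or the equivalent via the msr driver. A sketch,
assuming the msr module is loaded and root privileges:

/* Read CPU 0's PPIN through /dev/cpu/0/msr. MSR_PPIN is 0x4f per
 * this patch; if the MSR is disabled, the in-kernel #GP surfaces
 * here as a failed pread(). Illustrative only. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	uint64_t ppin;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/cpu/0/msr");
		return 1;
	}
	/* the msr device uses the MSR address as the file offset */
	if (pread(fd, &ppin, sizeof(ppin), 0x4f) != sizeof(ppin)) {
		perror("rdmsr 0x4f (MSR_PPIN)");
		return 1;
	}
	printf("PPIN: %#llx\n", (unsigned long long)ppin);
	return 0;
}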

> > + return;
> > + }
> > + /* if PPIN is disabled, but not locked, try to enable */
> > + if (msr_ppin_ctl == 0) {
>
> Also, properly masked off. There are [63:2] reserved bits which might be
> assigned someday.

Ok.

> > + wrmsrl_safe(MSR_PPIN_CTL, 2);
> > + rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl);
>
> Why aren't we programming a number here? Or are users supposed to do
> that?
>
> If so, please design a proper sysfs interface and not make them use
> msr-tools.

The PPIN is programmed at the fab. To the user it is just a handy
unique number. I think Intel can decode it back to which fab and
production run this chip came from (useful to us if there are many
chips reporting some error).

> > + }
> > + if (msr_ppin_ctl == 2)
> > + have_ppin = 1;
>
> set_cpu_cap(c, X86_FEATURE_PPIN);

Yes - that looks prettier.

Thanks

-Tony

2016-11-18 17:03:06

by Andi Kleen

Subject: Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

Borislav Petkov <[email protected]> writes:
>
>> @@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str)
>> }
>> __setup("mce", mcheck_enable);
>>
>> +static void mcheck_intel_ppin_init(void)
>
> So this functionality could all be moved to arch/x86/kernel/cpu/intel.c
> where you could set an artificial X86_FEATURE_PPIN and get rid of the
> have_ppin var.


That means that a tiny kernel that compiles out machine check
functionality has this unnecessary code.

In general it doesn't make any sense to define a FEATURE flag for
a single user. It's better to just check it where it is needed.

-Andi

2016-11-18 17:46:00

by Borislav Petkov

Subject: Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available

On Fri, Nov 18, 2016 at 09:02:56AM -0800, Andi Kleen wrote:
> In general it doesn't make any sense to define a FEATURE flag for
> a single user. It's better to just check it where it is needed.

Then the whole thing should go into mce_intel.c.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2016-11-18 17:48:44

by Tony Luck

Subject: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

From: Tony Luck <[email protected]>

Intel Xeons from Ivy Bridge onwards support a processor identification
number set in the factory. To the user this is a handy unique number to
identify a particular cpu. Intel can decode this to the fab/production
run to track errors. On systems that have it, include it in the machine
check record. I'm told that this would be helpful for users that run
large data centers with multi-socket servers to keep track of which
CPUs are seeing errors.

Signed-off-by: Tony Luck <[email protected]>
---

Boris:
Moved feature detection to mce_intel.c
Use feature bit.
Don't spam console if feature is disabled
Program defensively against future bits in MSR_PPIN_CTL
Updated commit comment to note the PPIN is set in factory

Andi:
Dynamic feature bits don't impact tiny kernels (well we
are using one *bit* so this could contribute to NCAPINTS
someday needing to be increased).

arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 3 +++
arch/x86/kernel/cpu/mcheck/mce_intel.c | 29 +++++++++++++++++++++++++++++
5 files changed, 38 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index a39629206864..d625b651e526 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -193,6 +193,7 @@
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

+#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 78f3760ca1f2..710273c617b8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,10 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)

/* Intel MSRs. Some also available on other CPUs */
+
+#define MSR_PPIN_CTL 0x0000004e
+#define MSR_PPIN 0x0000004f
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 69a6e07e3149..eb6247a7009b 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -28,6 +28,7 @@ struct mce {
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
__u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
__u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
};

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a7fdf453d895..cc6d877db88c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -43,6 +43,7 @@
#include <linux/export.h>
#include <linux/jump_label.h>

+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/traps.h>
#include <asm/tlbflush.h>
@@ -135,6 +136,8 @@ void mce_setup(struct mce *m)
m->socketid = cpu_data(m->extcpu).phys_proc_id;
m->apicid = cpu_data(m->extcpu).initial_apicid;
rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+ if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
+ rdmsrl(MSR_PPIN, m->ppin);
}

DEFINE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 1defb8ea882c..b2601c96fc3e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -11,6 +11,8 @@
#include <linux/sched.h>
#include <linux/cpumask.h>
#include <asm/apic.h>
+#include <asm/cpufeature.h>
+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/msr.h>
#include <asm/mce.h>
@@ -464,11 +466,38 @@ static void intel_clear_lmce(void)
wrmsrl(MSR_IA32_MCG_EXT_CTL, val);
}

+static void intel_ppin_init(struct cpuinfo_x86 *c)
+{
+ unsigned long long msr_ppin_ctl;
+
+ switch (c->x86_model) {
+ case INTEL_FAM6_IVYBRIDGE_X:
+ case INTEL_FAM6_HASWELL_X:
+ case INTEL_FAM6_BROADWELL_XEON_D:
+ case INTEL_FAM6_BROADWELL_X:
+ case INTEL_FAM6_SKYLAKE_X:
+ if (rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl))
+ return;
+ if ((msr_ppin_ctl & 3ul) == 1ul) {
+ /* PPIN available but disabled */
+ return;
+ }
+ /* if PPIN is disabled, but not locked, try to enable */
+ if (msr_ppin_ctl == 0) {
+ wrmsrl_safe(MSR_PPIN_CTL, 2ul);
+ rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl);
+ }
+ if ((msr_ppin_ctl & 3ul) == 2ul)
+ set_cpu_cap(c, X86_FEATURE_INTEL_PPIN);
+ }
+}
+
void mce_intel_feature_init(struct cpuinfo_x86 *c)
{
intel_init_thermal(c);
intel_init_cmci();
intel_init_lmce();
+ intel_ppin_init(c);
}

void mce_intel_feature_clear(struct cpuinfo_x86 *c)
--
2.7.4

2016-11-23 11:49:02

by Borislav Petkov

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Fri, Nov 18, 2016 at 09:48:36AM -0800, Luck, Tony wrote:
> From: Tony Luck <[email protected]>
>
> Intel Xeons from Ivy Bridge onwards support a processor identification
> number set in the factory. To the user this is a handy unique number to
> identify a particular cpu. Intel can decode this to the fab/production
> run to track errors. On systems that have it, include it in the machine
> check record. I'm told that this would be helpful for users that run
> large data centers with multi-socket servers to keep track of which
> CPUs are seeing errors.
>
> Signed-off-by: Tony Luck <[email protected]>
> ---
>
> Boris:
> Moved feature detection to mce_intel.c
> Use feature bit.
> Don't spam console if feature is disabled
> Program defensively against future bits in MSR_PPIN_CTL
> Updated commit comment to note the PPIN is set in factory
>
> Andi:
> Dynamic feature bits don't impact tiny kernels (well we
> are using one *bit* so this could contribute to NCAPINTS
> someday needing to be increased).
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/msr-index.h | 4 ++++
> arch/x86/include/uapi/asm/mce.h | 1 +
> arch/x86/kernel/cpu/mcheck/mce.c | 3 +++
> arch/x86/kernel/cpu/mcheck/mce_intel.c | 29 +++++++++++++++++++++++++++++
> 5 files changed, 38 insertions(+)

Applied with some minor fixups:

---
From: Tony Luck <[email protected]>
Date: Fri, 18 Nov 2016 09:48:36 -0800
Subject: [PATCH] x86/mce: Include the PPIN in MCE records when available

Intel Xeons from Ivy Bridge onwards support a processor identification
number set in the factory. To the user this is a handy unique number to
identify a particular CPU. Intel can decode this to the fab/production
run to track errors. On systems that have it, include it in the machine
check record. I'm told that this would be helpful for users that run
large data centers with multi-socket servers to keep track of which CPUs
are seeing errors.

Boris:
* Add some clarifying comments and spacing.
* Mask out [63:2] in the disabled-but-not-locked case
* Call the MSR variable "val" for more readability.

Signed-off-by: Tony Luck <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: linux-edac <[email protected]>
Cc: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 4 ++++
arch/x86/kernel/cpu/mcheck/mce_intel.c | 37 ++++++++++++++++++++++++++++++++++
5 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index a39629206864..d625b651e526 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -193,6 +193,7 @@
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

+#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 78f3760ca1f2..710273c617b8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,10 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)

/* Intel MSRs. Some also available on other CPUs */
+
+#define MSR_PPIN_CTL 0x0000004e
+#define MSR_PPIN 0x0000004f
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 69a6e07e3149..eb6247a7009b 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -28,6 +28,7 @@ struct mce {
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
__u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
__u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
};

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index aab96f8d52b0..a3cb27af4f9b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -43,6 +43,7 @@
#include <linux/export.h>
#include <linux/jump_label.h>

+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/traps.h>
#include <asm/tlbflush.h>
@@ -135,6 +136,9 @@ void mce_setup(struct mce *m)
m->socketid = cpu_data(m->extcpu).phys_proc_id;
m->apicid = cpu_data(m->extcpu).initial_apicid;
rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+
+ if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
+ rdmsrl(MSR_PPIN, m->ppin);
}

DEFINE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index be0b2fad47c5..1faefb696af8 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -11,6 +11,8 @@
#include <linux/sched.h>
#include <linux/cpumask.h>
#include <asm/apic.h>
+#include <asm/cpufeature.h>
+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/msr.h>
#include <asm/mce.h>
@@ -464,11 +466,46 @@ static void intel_clear_lmce(void)
wrmsrl(MSR_IA32_MCG_EXT_CTL, val);
}

+static void intel_ppin_init(struct cpuinfo_x86 *c)
+{
+ unsigned long long val;
+
+ /*
+ * Even if testing the presence of the MSR would be enough, we don't
+ * want to risk the situation where other models reuse this MSR for
+ * other purposes.
+ */
+ switch (c->x86_model) {
+ case INTEL_FAM6_IVYBRIDGE_X:
+ case INTEL_FAM6_HASWELL_X:
+ case INTEL_FAM6_BROADWELL_XEON_D:
+ case INTEL_FAM6_BROADWELL_X:
+ case INTEL_FAM6_SKYLAKE_X:
+ if (rdmsrl_safe(MSR_PPIN_CTL, &val))
+ return;
+
+ if ((val & 3ul) == 1ul) {
+ /* PPIN available but disabled: */
+ return;
+ }
+
+ /* if PPIN is disabled, but not locked, try to enable: */
+ if (!(val & 3ul)) {
+ wrmsrl_safe(MSR_PPIN_CTL, val | 2ul);
+ rdmsrl_safe(MSR_PPIN_CTL, &val);
+ }
+
+ if ((val & 3ul) == 2ul)
+ set_cpu_cap(c, X86_FEATURE_INTEL_PPIN);
+ }
+}
+
void mce_intel_feature_init(struct cpuinfo_x86 *c)
{
intel_init_thermal(c);
intel_init_cmci();
intel_init_lmce();
+ intel_ppin_init(c);
}

void mce_intel_feature_clear(struct cpuinfo_x86 *c)
--
2.10.0

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

by Henrique de Moraes Holschuh

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, 23 Nov 2016, Borislav Petkov wrote:
> + /* if PPIN is disabled, but not locked, try to enable: */
> + if (!(val & 3ul)) {
> + wrmsrl_safe(MSR_PPIN_CTL, val | 2ul);
> + rdmsrl_safe(MSR_PPIN_CTL, &val);
> + }

Actually, since this thing is supposed to be opt-in [through UEFI
config] for a good reason (privacy), IMHO it would make more sense to:

1. Assuming we can do it, always lock it when it is found to be unlocked
at kernel boot.

2. Not attempt to change its state from disabled to enabled *unless*
given a command line parameter authorizing it. A kconfig-based
solution for default+command line override would also work well IMHO,
if it makes more sense.

This would keep the feature opt-in as it is supposed to be, while making
it "safer" on firmware that leaves it unlocked after boot, and would
still allow owners of systems that leave it unlocked to change its
state at boot. Everyone ends up happy...
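
A boot parameter for that opt-in would be small. A sketch, assuming a
hypothetical "ppin=on" parameter (not part of any posted patch):

/* Hypothetical opt-in switch: intel_ppin_init() would attempt the
 * disabled->enabled transition only when this is set. */
static bool ppin_user_enable;

static int __init ppin_setup(char *str)
{
	if (str && !strcmp(str, "on"))
		ppin_user_enable = true;
	return 1;
}
__setup("ppin=", ppin_setup);

With that in place, the default behaviour stays "leave (or lock)
whatever state the firmware chose", and only an explicit "ppin=on"
permits the enable attempt.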

--
Henrique Holschuh

2016-11-23 13:37:30

by Borislav Petkov

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, Nov 23, 2016 at 11:29:51AM -0200, Henrique de Moraes Holschuh wrote:
> 1. Assuming we can do it, always lock it when it is found to be unlocked
> at kernel boot.

Because...?

> 2. Not attempt to change its state from disabled to enabled *unless*
> given a command line parameter authorizing it. A kconfig-based
> solution for default+command line override would also work well IMHO,
> if it makes more sense.

You can't reenable it:

"LockOut (R/WO)
Set 1 to prevent further writes to MSR_PPIN_CTL. Writing 1 to
MSR_PPINCTL[bit 0] is permitted only if MSR_PPIN_CTL[bit 1] is
clear, Default is 0."

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2016-11-23 14:55:27

by Borislav Petkov

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, Nov 23, 2016 at 02:37:23PM +0100, Borislav Petkov wrote:
> You can't reenable it:
>
> "LockOut (R/WO)
> Set 1 to prevent further writes to MSR_PPIN_CTL. Writing 1 to
> MSR_PPINCTL[bit 0] is permitted only if MSR_PPIN_CTL[bit 1] is
> clear, Default is 0."

Well, almost.

"Enable_PPIN (R/W)
If 1, enables MSR_PPIN to be accessible using RDMSR. Once set,
attempt to write 1 to MSR_PPIN_CTL[bit 0] will cause #GP.
If 0, an attempt to read MSR_PPIN will cause #GP. Default is 0."

Frankly, I don't get what the deal behind that locking out is. And it
says that BIOS should provide an opt-in so that agent can read the PPIN
and then that agent should *disable* it again by writing 01b to the CTL
MSR.

But then the first paragraph above says that the write
MSR_PPIN_CTL[0]=1b will #GP because MSR_PPIN_CTL[1] will be 1 for the
agent to read out MSR_PPIN first.

I guess we need to write a 00b first to disable PPIN and then write 01b
to lock it out.

So AFAIU, the steps will be:

* BIOS writes 10b
* agent reads MSR_PPIN
* agent writes 00b to disable MSR_PPIN
* agent writes 01b because bit 1 is clear now and it won't #GP.
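
In code, that sequence would be something like the following fragment
(a sketch against the bit layout quoted above; not from any posted
patch):

/* SDM-style agent flow, assuming the BIOS left MSR_PPIN_CTL = 10b
 * (Enable_PPIN set, LockOut clear). Bit 0 = LockOut, bit 1 = Enable. */
u64 ctl, ppin = 0;

if (!rdmsrl_safe(MSR_PPIN_CTL, &ctl) && (ctl & 3) == 2) {
	rdmsrl_safe(MSR_PPIN, &ppin);	/* read while enabled */
	wrmsrl_safe(MSR_PPIN_CTL, 0);	/* clear Enable_PPIN first */
	wrmsrl_safe(MSR_PPIN_CTL, 1);	/* LockOut now won't #GP */
}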

Meh...

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

Subject: [tip:ras/core] x86/mce: Include the PPIN in MCE records when available

Commit-ID: 3f5a7896a5096fd50030a04d4c3f28a7441e30a5
Gitweb: http://git.kernel.org/tip/3f5a7896a5096fd50030a04d4c3f28a7441e30a5
Author: Tony Luck <[email protected]>
AuthorDate: Fri, 18 Nov 2016 09:48:36 -0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 23 Nov 2016 16:51:52 +0100

x86/mce: Include the PPIN in MCE records when available

Intel Xeons from Ivy Bridge onwards support a processor identification
number set in the factory. To the user this is a handy unique number to
identify a particular CPU. Intel can decode this to the fab/production
run to track errors. On systems that have it, include it in the machine
check record. I'm told that this would be helpful for users that run
large data centers with multi-socket servers to keep track of which CPUs
are seeing errors.

Boris:
* Add some clarifying comments and spacing.
* Mask out [63:2] in the disabled-but-not-locked case
* Call the MSR variable "val" for more readability.

Signed-off-by: Tony Luck <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: linux-edac <[email protected]>
Cc: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 4 ++++
arch/x86/kernel/cpu/mcheck/mce_intel.c | 37 ++++++++++++++++++++++++++++++++++
5 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index a396292..d625b65 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -193,6 +193,7 @@
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

+#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 78f3760..710273c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,10 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)

/* Intel MSRs. Some also available on other CPUs */
+
+#define MSR_PPIN_CTL 0x0000004e
+#define MSR_PPIN 0x0000004f
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 69a6e07..eb6247a 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -28,6 +28,7 @@ struct mce {
__u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
__u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
__u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
};

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index aab96f8..a3cb27a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -43,6 +43,7 @@
#include <linux/export.h>
#include <linux/jump_label.h>

+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/traps.h>
#include <asm/tlbflush.h>
@@ -135,6 +136,9 @@ void mce_setup(struct mce *m)
m->socketid = cpu_data(m->extcpu).phys_proc_id;
m->apicid = cpu_data(m->extcpu).initial_apicid;
rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+
+ if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
+ rdmsrl(MSR_PPIN, m->ppin);
}

DEFINE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index be0b2fa..190b3e6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -11,6 +11,8 @@
#include <linux/sched.h>
#include <linux/cpumask.h>
#include <asm/apic.h>
+#include <asm/cpufeature.h>
+#include <asm/intel-family.h>
#include <asm/processor.h>
#include <asm/msr.h>
#include <asm/mce.h>
@@ -464,11 +466,46 @@ static void intel_clear_lmce(void)
wrmsrl(MSR_IA32_MCG_EXT_CTL, val);
}

+static void intel_ppin_init(struct cpuinfo_x86 *c)
+{
+ unsigned long long val;
+
+ /*
+ * Even if testing the presence of the MSR would be enough, we don't
+ * want to risk the situation where other models reuse this MSR for
+ * other purposes.
+ */
+ switch (c->x86_model) {
+ case INTEL_FAM6_IVYBRIDGE_X:
+ case INTEL_FAM6_HASWELL_X:
+ case INTEL_FAM6_BROADWELL_XEON_D:
+ case INTEL_FAM6_BROADWELL_X:
+ case INTEL_FAM6_SKYLAKE_X:
+ if (rdmsrl_safe(MSR_PPIN_CTL, &val))
+ return;
+
+ if ((val & 3UL) == 1UL) {
+ /* PPIN available but disabled: */
+ return;
+ }
+
+ /* If PPIN is disabled, but not locked, try to enable: */
+ if (!(val & 3UL)) {
+ wrmsrl_safe(MSR_PPIN_CTL, val | 2UL);
+ rdmsrl_safe(MSR_PPIN_CTL, &val);
+ }
+
+ if ((val & 3UL) == 2UL)
+ set_cpu_cap(c, X86_FEATURE_INTEL_PPIN);
+ }
+}
+
void mce_intel_feature_init(struct cpuinfo_x86 *c)
{
intel_init_thermal(c);
intel_init_cmci();
intel_init_lmce();
+ intel_ppin_init(c);
}

void mce_intel_feature_clear(struct cpuinfo_x86 *c)

2016-11-23 16:42:47

by Tony Luck

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

If the BIOS writes 10b, then PPIN is disabled and will remain so until the processor is reset. Bit 1 is a one way trip, it can be set by s/w, but not cleared again.

All this is because of the huge stink last time Intel tried to add a serial number to CPUs a decade and a half ago. The lockout bit is so that this can be turned off in a way that you can be sure that it can't be turned on again.

-Tony

Sent from my iPhone

> On Nov 23, 2016, at 06:05, Borislav Petkov <[email protected]> wrote:
>
>> On Wed, Nov 23, 2016 at 02:37:23PM +0100, Borislav Petkov wrote:
>> You can't reenable it:
>>
>> "LockOut (R/WO)
>> Set 1 to prevent further writes to MSR_PPIN_CTL. Writing 1 to
>> MSR_PPINCTL[bit 0] is permitted only if MSR_PPIN_CTL[bit 1] is
>> clear, Default is 0."
>
> Well, almost.
>
> "Enable_PPIN (R/W)
> If 1, enables MSR_PPIN to be accessible using RDMSR. Once set,
> attempt to write 1 to MSR_PPIN_CTL[bit 0] will cause #GP.
> If 0, an attempt to read MSR_PPIN will cause #GP. Default is 0."
>
> Frankly, I don't get what the deal behind that locking out is. And it
> says that BIOS should provide an opt-in so that an agent can read the PPIN
> and then that agent should *disable* it again by writing 01b to the CTL
> MSR.
>
> But then the first paragraph above says that the write
> MSR_PPIN_CTL[0]=1b will #GP because MSR_PPIN_CTL[1] will be 1 for the
> agent to read out MSR_PPIN first.
>
> I guess we need to write a 00b first to disable PPIN and then write 01b
> to lock it out.
>
> So AFAIU, the steps will be:
>
> * BIOS writes 10b
> * agent reads MSR_PPIN
> * agent writes 00b to disable MSR_PPIN
> * agent writes 01b because bit 1 is clear now and it won't #GP.
>
> Meh...
>
> --
> Regards/Gruss,
> Boris.
>
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> --

2016-11-23 16:55:31

by Borislav Petkov

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, Nov 23, 2016 at 08:42:40AM -0800, Tony Luck wrote:
> If the BIOS writes 10b, then PPIN is disabled and will remain so until
> the processor is reset. Bit 1 is a one way trip, it can be set by s/w,
> but not cleared again.

10b means bit 1, i.e., Enable_PPIN is set, right? Which actually
*enables* PPIN. Or am I confused again?

Otherwise, this explains the "Once set" wording - if Enable_PPIN is 1,
there's no changing until next reboot.

> All this is because of the huge stink last time Intel tried to add
> a serial number to CPUs a decade and a half ago.

It certainly rang a bell when you sent v1. :-)

> The lockout bit is so that this can be turned off in a way that you
> can be sure that it can't be turned on again.

... in order to protect ourselves from root doing wrmsr? Or why are we
doing this?

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

by Henrique de Moraes Holschuh

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, 23 Nov 2016, Borislav Petkov wrote:
> On Wed, Nov 23, 2016 at 11:29:51AM -0200, Henrique de Moraes Holschuh wrote:
> > 1. Assuming we can do it, always lock it when it is found to be unlocked
> > at kernel boot.
>
> Because...?

Privacy, and the fact that /dev/cpu/msr exists and is enabled on
almost all general-use distros.

> > 2. Not attempt to change its state from disabled to enabled *unless*
> > given a command line parameter authorizing it. A kconfig-based
> > solution for default+command line override would also work well IMHO,
> > if it makes more sense.
>
> You can't reenable it:

Yeah, I just found the description for that thing in the IA32 manual.

It can be disabled + unlocked, disabled + locked, or enabled + unlocked.
Once locked, it will stay disabled until the next reboot.

However, the manual makes it clear we are _not_ supposed to leave it
enabled + unlocked. Apparently, we're supposed to do our business and
disable+lock it (i.e. enable, read and store/process, disable+lock).

Looks like it is supposed to be used in a way that protects privacy by
making it very hard for general use software to depend on it existing
and being enabled.

--
Henrique Holschuh

2016-11-23 20:56:23

by Tony Luck

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

IMHO people who really care should find the BIOS option and disable it there.

Having Linux take responsibility seems a little weird. If we do go that route, it should be early in setup_arch(), before any callbacks to other subsystems, to avoid endless games of whack-a-mole.

I also wonder about the level of outrage this time around. The feature has been sitting there for three full generations: Ivybridge (tick), Haswell (tock) and another tick for Broadwell. Do privacy folks not read each new SDM from cover to cover?

Sent from my iPhone

> On Nov 23, 2016, at 09:29, Henrique de Moraes Holschuh <[email protected]> wrote:
>
>> On Wed, 23 Nov 2016, Borislav Petkov wrote:
>>> On Wed, Nov 23, 2016 at 11:29:51AM -0200, Henrique de Moraes Holschuh wrote:
>>> 1. Assuming we can do it, always lock it when it is found to be unlocked
>>> at kernel boot.
>>
>> Because...?
>
> Privacy, and the fact that /dev/cpu/msr exists and is enabled on
> almost all general-use distros.
>
>>> 2. Not attempt to change its state from disabled to enabled *unless*
>>> given a command line parameter authorizing it. A kconfig-based
>>> solution for default+command line override would also work well IMHO,
>>> if it makes more sense.
>>
>> You can't reenable it:
>
> Yeah, I just found the description for that thing in the IA32 manual.
>
> It can be disabled + unlocked, disabled + locked, or enabled + unlocked.
> Once locked, it will stay disabled until the next reboot.
>
> However, the manual makes it clear we are _not_ supposed to leave it
> enabled + unlocked. Apparently, we're supposed to do our business and
> disable+lock it (i.e. enable, read and store/process, disable+lock).
>
> Looks like it is supposed to be used in a way that protects privacy by
> making it very hard for general use software to depend on it existing
> and being enabled.
>
> --
> Henrique Holschuh

by Henrique de Moraes Holschuh

Subject: Re: [PATCH v2] x86/mce: Include the PPIN in machine check records when it is available

On Wed, 23 Nov 2016, Tony Luck wrote:
> IMHO people who really care should find the BIOS option and disable it
> there.

That can also be said about *enabling* it, I think (see below).

> Having Linux take responsibility seems a little weird. If we do go

Not really. The currently proposed patch *enables* PPIN if it is found
to be disabled but unlocked. That pretty much means Linux _would_ take
the responsibility, the blame, and the outrage of privacy advocates (if
any).

If we enable it, it is our fault, plain and simple.

> I also wonder about the level of outrage this time around. The feature
> has been sitting there for three full generations: Ivybridge (tick),
> Haswell (tock) and another tick for Broadwell. Do privacy folks not
> read each new SDM from cover to cover?

I very much doubt so :-)

And it would take a very thorough and careful read of the SDM changes to
find it, if you are not searching for it by name.

But even if the privacy advocates did read the SDM changelogs very
carefully and took notice of it, the PPIN feature clearly looks like it
was designed to protect the privacy of anyone who did not specifically
want it enabled.

1. PPIN is disabled on hard reset (as far as I can tell).

2. BIOS/UEFI ships it disabled by default, as recommended by SDM
("opt-in" feature). Although it should have recommended that it
be *locked* disabled by default, thus *ensuring* opt-in.

3. Opt-in bias is enforced in hardware (the firmware cannot lock the
feature in an enabled state).

4. Access violations (read when disabled, unlock, etc) will raise
a #GP, thus getting the operating system/firmware crash handler
involved immediately.

The expected use case is, as described in the IA32 SDM: a trusted asset
agent will enable, read the PPIN, and lock it disabled afterwards. That
"lock it disabled" would get in the way of general abuse of the feature
by random ISVs.

I think the architecture / hardware / microcode people @intel covered
their angle really well on this. Anyone that raise a ruckus on the fact
that PPIN exists (as described in the SDM) is not going to look very
reasonable.


I recommend that the Linux kernel should take the same stance as the
Intel hardware/microcode team did: don't enable it by default, don't
make it easy for any ISVs to abuse it without positive opt-in action
from the local system admin.

This is why I also recommend that the kernel should always lock it
disabled -- whether we read the PPIN for kernel use (when PPIN was
enabled by the BIOS[1]) or not. It indeed *is* the kernel taking
responsibility for side-stepping the whole "rdmsr is for ring 0"
architectural security model due to unfiltered /dev/cpu/msr.


[1] I personally have nothing against an override, e.g. a kernel
command-line parameter, that allows the kernel to enable PPIN when the
BIOS left it unlocked, as long as it is not done by default.

--
Henrique Holschuh