2018-10-10 16:27:37

by Andi Kleen

[permalink] [raw]
Subject: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

From: Andi Kleen <[email protected]>

For bug workarounds or checks it is useful to check for specific
microcode versions. Add a new table format to check for steppings
with min microcode revisions.

This does not change the existing x86_cpu_id because it's an ABI
shared with modutils, and also has quite difference requirements,
as in no wildcards, but everything has to be matched exactly.

Signed-off-by: Andi Kleen <[email protected]>
---
v2:
Remove all CPU match, only check boot cpu
Move INTEL_MIN_UCODE macro to header.
Minor cleanups.
Remove max ucode and driver data
---
arch/x86/include/asm/cpu_device_id.h | 26 ++++++++++++++++++++++++++
arch/x86/kernel/cpu/match.c | 21 +++++++++++++++++++++
2 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/cpu_device_id.h b/arch/x86/include/asm/cpu_device_id.h
index baeba0567126..1b90bd1d0b95 100644
--- a/arch/x86/include/asm/cpu_device_id.h
+++ b/arch/x86/include/asm/cpu_device_id.h
@@ -11,4 +11,30 @@

extern const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match);

+/*
+ * Match specific microcodes
+ *
+ * vendor/family/model/stepping must be all set.
+ * min_ucode is optional and can be 0.
+ */
+
+struct x86_ucode_id {
+ u8 vendor;
+ u8 family;
+ u16 model;
+ u16 stepping;
+ u32 min_ucode;
+};
+
+#define INTEL_MIN_UCODE(mod, step, rev) { \
+ .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = mod, \
+ .stepping = step, \
+ .min_ucode = rev, \
+}
+
+extern const struct x86_ucode_id *
+x86_match_ucode(const struct x86_ucode_id *match);
+
#endif
diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
index 3fed38812eea..ec8ee31699cd 100644
--- a/arch/x86/kernel/cpu/match.c
+++ b/arch/x86/kernel/cpu/match.c
@@ -48,3 +48,24 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
return NULL;
}
EXPORT_SYMBOL(x86_match_cpu);
+
+const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)
+{
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+ const struct x86_ucode_id *m;
+
+ for (m = match; m->vendor | m->family | m->model; m++) {
+ if (c->x86_vendor != m->vendor)
+ continue;
+ if (c->x86 != m->family)
+ continue;
+ if (c->x86_model != m->model)
+ continue;
+ if (c->x86_stepping != m->stepping)
+ continue;
+ if (c->microcode < m->min_ucode)
+ continue;
+ return m;
+ }
+ return NULL;
+}
--
2.17.1



2018-10-10 16:28:41

by Andi Kleen

[permalink] [raw]
Subject: [PATCH v2 2/2] perf/x86/kvm: Avoid unnecessary work in guest filtering

From: Andi Kleen <[email protected]>

KVM added a workaround for PEBS events leaking
into guests with 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.")
This uses the VT entry/exit list to add an extra disable of the PEBS_ENABLE MSR.

Intel also added a fix for this issue to microcode updates on
Haswell/Broadwell/Skylake.

It turns out using the MSR entry/exit list makes VM exits
significantly slower. The list is only needed for disabling
PEBS, because the GLOBAL_CTRL change gets optimized by
KVM into changing the VMCS.

Check for the microcode updates that have the microcode
fix for leaking PEBS, and disable the extra entry/exit list
entry for PEBS_ENABLE. In addition we always clear the
GLOBAL_CTRL for the PEBS counter while running in the guest,
which is enough to make them never fire at the wrong
side of the host/guest transition.

We see significantly reduced overhead for VM exits with the
filtering active with the patch from 8% to 4%.

Signed-off-by: Andi Kleen <[email protected]>
---
v2:
Use match_ucode, not match_ucode_all
Remove cpu lock
Use INTEL_MIN_UCODE and move to header
Update Table to include skylake clients.
---
arch/x86/events/intel/core.c | 80 ++++++++++++++++++++++++++++++++----
arch/x86/events/perf_event.h | 3 +-
2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ab01ef9ddd77..5e8e76753eea 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -18,6 +18,7 @@
#include <asm/hardirq.h>
#include <asm/intel-family.h>
#include <asm/apic.h>
+#include <asm/cpu_device_id.h>

#include "../perf_event.h"

@@ -3166,16 +3167,27 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
- /*
- * If PMU counter has PEBS enabled it is not enough to disable counter
- * on a guest entry since PEBS memory write can overshoot guest entry
- * and corrupt guest memory. Disabling PEBS solves the problem.
- */
- arr[1].msr = MSR_IA32_PEBS_ENABLE;
- arr[1].host = cpuc->pebs_enabled;
- arr[1].guest = 0;
+ if (x86_pmu.flags & PMU_FL_PEBS_ALL)
+ arr[0].guest &= ~cpuc->pebs_enabled;
+ else
+ arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
+ *nr = 1;
+
+ if (!x86_pmu.pebs_isolated) {
+ /*
+ * If PMU counter has PEBS enabled it is not enough to
+ * disable counter on a guest entry since PEBS memory
+ * write can overshoot guest entry and corrupt guest
+ * memory. Disabling PEBS solves the problem.
+ *
+ * Don't do this if the CPU already enforces it.
+ */
+ arr[1].msr = MSR_IA32_PEBS_ENABLE;
+ arr[1].host = cpuc->pebs_enabled;
+ arr[1].guest = 0;
+ *nr = 2;
+ }

- *nr = 2;
return arr;
}

@@ -3693,6 +3705,45 @@ static __init void intel_clovertown_quirk(void)
x86_pmu.pebs_constraints = NULL;
}

+static const struct x86_ucode_id isolation_ucodes[] = {
+ INTEL_MIN_UCODE(INTEL_FAM6_HASWELL_CORE, 3, 0x0000001f),
+ INTEL_MIN_UCODE(INTEL_FAM6_HASWELL_ULT, 1, 0x0000001e),
+ INTEL_MIN_UCODE(INTEL_FAM6_HASWELL_GT3E, 1, 0x00000015),
+ INTEL_MIN_UCODE(INTEL_FAM6_HASWELL_X, 2, 0x00000037),
+ INTEL_MIN_UCODE(INTEL_FAM6_HASWELL_X, 4, 0x0000000a),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_CORE, 4, 0x00000023),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_GT3E, 1, 0x00000014),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_XEON_D, 2, 0x00000010),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_XEON_D, 3, 0x07000009),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_XEON_D, 4, 0x0f000009),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_XEON_D, 5, 0x0e000002),
+ INTEL_MIN_UCODE(INTEL_FAM6_BROADWELL_X, 2, 0x0b000014),
+ INTEL_MIN_UCODE(INTEL_FAM6_SKYLAKE_X, 3, 0x00000021),
+ INTEL_MIN_UCODE(INTEL_FAM6_SKYLAKE_X, 4, 0x00000000),
+ INTEL_MIN_UCODE(INTEL_FAM6_SKYLAKE_MOBILE, 3, 0x0000007c),
+ INTEL_MIN_UCODE(INTEL_FAM6_SKYLAKE_DESKTOP, 3, 0x0000007c),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_DESKTOP, 9, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_MOBILE, 9, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_MOBILE, 10, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_MOBILE, 11, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_MOBILE, 12, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_DESKTOP, 10, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_DESKTOP, 11, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_DESKTOP, 12, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_KABYLAKE_DESKTOP, 13, 0x0000004e),
+ INTEL_MIN_UCODE(INTEL_FAM6_CANNONLAKE_MOBILE, 3, 0x00000000),
+ {}
+};
+
+static void intel_check_isolation(void)
+{
+ if (!x86_match_ucode(isolation_ucodes)) {
+ x86_pmu.pebs_isolated = 0;
+ return;
+ }
+ x86_pmu.pebs_isolated = 1;
+}
+
static int intel_snb_pebs_broken(int cpu)
{
u32 rev = UINT_MAX; /* default to broken for unknown models */
@@ -3717,6 +3768,8 @@ static void intel_snb_check_microcode(void)
int pebs_broken = 0;
int cpu;

+ intel_check_isolation();
+
for_each_online_cpu(cpu) {
if ((pebs_broken = intel_snb_pebs_broken(cpu)))
break;
@@ -3798,6 +3851,12 @@ static __init void intel_sandybridge_quirk(void)
cpus_read_unlock();
}

+static __init void intel_isolation_quirk(void)
+{
+ x86_pmu.check_microcode = intel_check_isolation;
+ intel_check_isolation();
+}
+
static const struct { int id; char *name; } intel_arch_events_map[] __initconst = {
{ PERF_COUNT_HW_CPU_CYCLES, "cpu cycles" },
{ PERF_COUNT_HW_INSTRUCTIONS, "instructions" },
@@ -4362,6 +4421,7 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_HASWELL_X:
case INTEL_FAM6_HASWELL_ULT:
case INTEL_FAM6_HASWELL_GT3E:
+ x86_add_quirk(intel_isolation_quirk);
x86_add_quirk(intel_ht_bug);
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids));
@@ -4392,6 +4452,7 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_BROADWELL_XEON_D:
case INTEL_FAM6_BROADWELL_GT3E:
case INTEL_FAM6_BROADWELL_X:
+ x86_add_quirk(intel_isolation_quirk);
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, hsw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
@@ -4452,6 +4513,7 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_SKYLAKE_X:
case INTEL_FAM6_KABYLAKE_MOBILE:
case INTEL_FAM6_KABYLAKE_DESKTOP:
+ x86_add_quirk(intel_isolation_quirk);
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, skl_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, skl_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index adae087cecdd..d5745ed62622 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -607,7 +607,8 @@ struct x86_pmu {
pebs_active :1,
pebs_broken :1,
pebs_prec_dist :1,
- pebs_no_tlb :1;
+ pebs_no_tlb :1,
+ pebs_isolated :1;
int pebs_record_size;
int pebs_buffer_size;
void (*drain_pebs)(struct pt_regs *regs);
--
2.17.1


2018-10-10 16:38:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

Lemme paste some of tglx's review comments from last time.

On Wed, Oct 10, 2018 at 09:26:07AM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> For bug workarounds or checks it is useful to check for specific
> microcode versions. Add a new table format to check for steppings

s/versions/revisions/

> with min microcode revisions.
>
> This does not change the existing x86_cpu_id because it's an ABI
> shared with modutils, and also has quite difference requirements,

s/difference/different/

> as in no wildcards, but everything has to be matched exactly.
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> v2:
> Remove all CPU match, only check boot cpu
> Move INTEL_MIN_UCODE macro to header.
> Minor cleanups.
> Remove max ucode and driver data
> ---
> arch/x86/include/asm/cpu_device_id.h | 26 ++++++++++++++++++++++++++
> arch/x86/kernel/cpu/match.c | 21 +++++++++++++++++++++
> 2 files changed, 47 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpu_device_id.h b/arch/x86/include/asm/cpu_device_id.h
> index baeba0567126..1b90bd1d0b95 100644
> --- a/arch/x86/include/asm/cpu_device_id.h
> +++ b/arch/x86/include/asm/cpu_device_id.h
> @@ -11,4 +11,30 @@
>
> extern const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match);
>
> +/*
> + * Match specific microcodes

"What means microcodes or steppings? If you mean microcode revisions then
please spell it out and use it all over the place. steppings is confusing
at best as its associated to the CPU stepping."

> + *
> + * vendor/family/model/stepping must be all set.
> + * min_ucode is optional and can be 0.
> + */
> +
> +struct x86_ucode_id {
> + u8 vendor;
> + u8 family;
> + u16 model;
> + u16 stepping;

"Why u16? The corresponding members in cpuinfo_x86 are 8bit wide so why
wasting memory for these tables?"

> + u32 min_ucode;
> +};
> +
> +#define INTEL_MIN_UCODE(mod, step, rev) { \
> + .vendor = X86_VENDOR_INTEL, \
> + .family = 6, \
> + .model = mod, \
> + .stepping = step, \
> + .min_ucode = rev, \
> +}
> +
> +extern const struct x86_ucode_id *
> +x86_match_ucode(const struct x86_ucode_id *match);
> +
> #endif
> diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
> index 3fed38812eea..ec8ee31699cd 100644
> --- a/arch/x86/kernel/cpu/match.c
> +++ b/arch/x86/kernel/cpu/match.c
> @@ -48,3 +48,24 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
> return NULL;
> }
> EXPORT_SYMBOL(x86_match_cpu);
> +
> +const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)

s/ucode/microcode/

> +{
> + struct cpuinfo_x86 *c = &boot_cpu_data;
> + const struct x86_ucode_id *m;
> +
> + for (m = match; m->vendor | m->family | m->model; m++) {
> + if (c->x86_vendor != m->vendor)
> + continue;
> + if (c->x86 != m->family)
> + continue;
> + if (c->x86_model != m->model)
> + continue;
> + if (c->x86_stepping != m->stepping)
> + continue;
> + if (c->microcode < m->min_ucode)
> + continue;
> + return m;
> + }
> + return NULL;
> +}
> --
> 2.17.1
>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Wed, 10 Oct 2018, Andi Kleen wrote:
> v2:
> Remove all CPU match, only check boot cpu

IMHO, since it looks like a v3 will be necessary anyway, it could
benefit from a comment reminding people about how to use it on older
systems where "mixed CPU stepping" configurations were common.

This is *not* a relevant limitation, and it is easy enough to handle.
But people writing quirks for very old Intel Xeon CPUs *today* (unlikely
as that might be) might well forget the mixed-stepping gotcha...

Note that while mixed-stepping SMP configurations are *not* current
practice, they *were* reasonably common practice more than a decade ago,
officially supported both by Intel (there are Intel documents detailing
the valid stepping combinations) and the server vendors.

Suggestion below.

> +/*
> + * Match specific microcodes
> + *
> + * vendor/family/model/stepping must be all set.
> + * min_ucode is optional and can be 0.

* only checks against the boot cpu. When mixed-stepping configs are
valid for a CPU model, add a quirk for every valid stepping and
do the fine-tuning in the quirk handler.

--
Henrique Holschuh

2018-10-17 10:01:07

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Wed, 10 Oct 2018, Andi Kleen wrote:
> +/*
> + * Match specific microcodes
> + *
> + * vendor/family/model/stepping must be all set.
> + * min_ucode is optional and can be 0.

Stale comment

> + */
> +
> +struct x86_ucode_id {
> + u8 vendor;
> + u8 family;
> + u16 model;
> + u16 stepping;

Still using u16 for no reason. And please make the members aligned in a
tabular fashion.

> + u32 min_ucode;
> +};
> +
> +const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)

What's the point of returning the struct pointer? Shouldn't it be enough to
make it return bool? Also the function name really should reflect that this
checks whether the minimal required microcode revision is active.

> +{
> + struct cpuinfo_x86 *c = &boot_cpu_data;
> + const struct x86_ucode_id *m;
> +
> + for (m = match; m->vendor | m->family | m->model; m++) {

VENDOR_INTEL = 0, so this check is obscure to begin with. Either you chose
a explicit condition to put at the end of the table, e.g. vendor = U8_MAX
or you hand in the array size to the function.

> + if (c->x86_vendor != m->vendor)
> + continue;
> + if (c->x86 != m->family)
> + continue;
> + if (c->x86_model != m->model)
> + continue;
> + if (c->x86_stepping != m->stepping)
> + continue;
> + if (c->microcode < m->min_ucode)
> + continue;

Why would you continue here? If vendor, family, model, stepping match, then
there is no point to continue, really. Assuming that the return type is bool:

return c->microcode >= m->min_ucode;

is sufficient.

Thanks,

tglx

2018-10-19 23:48:46

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

> > + u32 min_ucode;
> > +};
> > +
> > +const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)
>
> What's the point of returning the struct pointer? Shouldn't it be enough to
> make it return bool? Also the function name really should reflect that this
> checks whether the minimal required microcode revision is active.

This allows the user to find the table entry to tie something to it
(e.g. use the index to match some other table)

Same pattern as pci discovery etc. use.

Given the current caller doesn't need it, but we still follow standard
conventions.

>
> > +{
> > + struct cpuinfo_x86 *c = &boot_cpu_data;
> > + const struct x86_ucode_id *m;
> > +
> > + for (m = match; m->vendor | m->family | m->model; m++) {
>
> VENDOR_INTEL = 0, so this check is obscure to begin with. Either you chose
> a explicit condition to put at the end of the table, e.g. vendor = U8_MAX
> or you hand in the array size to the function.

That would both be awkward. It's the same as match_cpu, and 0 terminators
are standard practice in practical all similar code. I removed
the or with the family.

-Andi

2018-10-20 08:20:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Fri, 19 Oct 2018, Andi Kleen wrote:

> > > + u32 min_ucode;
> > > +};
> > > +
> > > +const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)
> >
> > What's the point of returning the struct pointer? Shouldn't it be enough to
> > make it return bool? Also the function name really should reflect that this
> > checks whether the minimal required microcode revision is active.
>
> This allows the user to find the table entry to tie something to it
> (e.g. use the index to match some other table)
>
> Same pattern as pci discovery etc. use.
>
> Given the current caller doesn't need it, but we still follow standard
> conventions.

There is no point to return the pointer because it's not a compound
structure. If you want to provide the possibility to use the index then
return the index and an error code if it does not match.

> >
> > > +{
> > > + struct cpuinfo_x86 *c = &boot_cpu_data;
> > > + const struct x86_ucode_id *m;
> > > +
> > > + for (m = match; m->vendor | m->family | m->model; m++) {
> >
> > VENDOR_INTEL = 0, so this check is obscure to begin with. Either you chose
> > a explicit condition to put at the end of the table, e.g. vendor = U8_MAX
> > or you hand in the array size to the function.
>
> That would both be awkward. It's the same as match_cpu, and 0 terminators
> are standard practice in practical all similar code. I removed
> the or with the family.

That's debatable because it's more easy to miss the terminator than getting
the ARRAY_SIZE() argument wrong. But it doesn't matter much.

Thanks,

tglx



2018-10-20 14:40:03

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Sat, Oct 20, 2018 at 10:19:37AM +0200, Thomas Gleixner wrote:
> On Fri, 19 Oct 2018, Andi Kleen wrote:
>
> > > > + u32 min_ucode;
> > > > +};
> > > > +
> > > > +const struct x86_ucode_id *x86_match_ucode(const struct x86_ucode_id *match)
> > >
> > > What's the point of returning the struct pointer? Shouldn't it be enough to
> > > make it return bool? Also the function name really should reflect that this
> > > checks whether the minimal required microcode revision is active.
> >
> > This allows the user to find the table entry to tie something to it
> > (e.g. use the index to match some other table)
> >
> > Same pattern as pci discovery etc. use.
> >
> > Given the current caller doesn't need it, but we still follow standard
> > conventions.
>
> There is no point to return the pointer because it's not a compound
> structure. If you want to provide the possibility to use the index then
> return the index and an error code if it does not match.

It will be useful with the driver_data pointer, which you short sightedly
forced me to remove, and likely will need to be readded at some point
anyways if this gets more widely used. At least with the pointer not
all callers will need to be changed then.

Also it's symmetric with how the PCI and USB and the existing x86 match
discovery interfaces work.

> > > VENDOR_INTEL = 0, so this check is obscure to begin with. Either you chose
> > > a explicit condition to put at the end of the table, e.g. vendor = U8_MAX
> > > or you hand in the array size to the function.
> >
> > That would both be awkward. It's the same as match_cpu, and 0 terminators
> > are standard practice in practical all similar code. I removed
> > the or with the family.
>
> That's debatable because it's more easy to miss the terminator than getting
> the ARRAY_SIZE() argument wrong. But it doesn't matter much.

Ok then please apply it.

-Andi


2018-10-21 10:25:36

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

Andi,

On Sat, 20 Oct 2018, Andi Kleen wrote:
> On Sat, Oct 20, 2018 at 10:19:37AM +0200, Thomas Gleixner wrote:
> > On Fri, 19 Oct 2018, Andi Kleen wrote:
> > There is no point to return the pointer because it's not a compound
> > structure. If you want to provide the possibility to use the index then
> > return the index and an error code if it does not match.
>
> It will be useful with the driver_data pointer, which you short sightedly
> forced me to remove, and likely will need to be readded at some point
> anyways if this gets more widely used.

It's good and established practice not to add functionality on a 'might be
used' basis. If you'd provide at least one or two patches which demonstrate
how that is useful then that would be convincing.

> At least with the pointer not all callers will need to be changed then.

It doesn't need to be changed at all, when done correctly.

So lets walk through that again:

1) x86_match_microcode() is a misnomer because it's not as the name
suggests a match function. It compares whether the micro code revision
is greater or equal the minimal required micro code revision for the
current CPU.

2) None of the existing implementations needs a pointer return value,
neither does your use case at hand.

3) If this should be extended to a generic cpu id matching facility, then
it can be very well designed so. See below.

Step 1:

struct x86_cpu_check {
u8 vendor;
u8 family;
u8 model;
u8 stepping;
};

struct x86_cpu_check *x86_match_cpu(struct x86_cpu_check *table)
{
// Find matching vendor, family, model, stepping entry
... {
return entry;
}
return NULL;
}

Genuine CPU match function, which can be extended by extending the data
structure.

Step 2:

struct x86_cpu_check {
u8 vendor;
u8 family;
u8 model;
u8 stepping;
u32 microcode_rev;
};

bool x86_cpu_has_min_microcode(struct x86_cpu_check *table)
{
struct x86_cpu_check *res = x86_match_cpu(table);

if (!res)
return false;
return res->microcode_revision >= boot_cpu_data.microcode;
}

Step 3:

struct x86_cpu_check {
u8 vendor;
u8 family;
u8 model;
u8 stepping;
union {
u32 microcode_rev;
void *driver_data;
}
};

Can be used with x86_match_cpu() for all non microcode based matching.

So if you really need something which checks the microcode and provides the
pointer, then it's easy enough to do:

Step 4:

struct x86_cpu_check {
u8 vendor;
u8 family;
u8 model;
u8 stepping;
u32 microcode_rev;
void *driver_data;
};

struct x86_cpu_check *x86_check_min_microcode(struct x86_cpu_check *table)
{
struct x86_cpu_check *res = x86_match_cpu(table);

if (!res || res->microcode_rev < boot_cpu_data.microcode)
return NULL;
return res;
}

static inline bool x86_cpu_has_min_microcode(struct x86_cpu_check *table)
{
return !!x86_check_min_microcode(table);
}

None of these steps requires to change a call site or a table.

But probably I'm too short sighted and missing something crucial. Looking
forward for enlightment.

> Also it's symmetric with how the PCI and USB and the existing x86 match
> discovery interfaces work.

And the point is? That we need to keep everything as we've done it 20 years
ago?

> > > > VENDOR_INTEL = 0, so this check is obscure to begin with. Either you chose
> > > > a explicit condition to put at the end of the table, e.g. vendor = U8_MAX
> > > > or you hand in the array size to the function.
> > >
> > > That would both be awkward. It's the same as match_cpu, and 0 terminators
> > > are standard practice in practical all similar code. I removed
> > > the or with the family.
> >
> > That's debatable because it's more easy to miss the terminator than getting
> > the ARRAY_SIZE() argument wrong. But it doesn't matter much.
>
> Ok then please apply it.

Sure, once this argument is settled and all review comments are addressed.

Thanks,

tglx

2018-10-21 15:14:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Sun, Oct 21, 2018 at 12:20:47PM +0200, Thomas Gleixner wrote:
> struct x86_cpu_check {

Btw, if we have to be precise, this struct is no "check" struct either.

> u8 vendor;
> u8 family;
> u8 model;
> u8 stepping;
> };

If anything, it is a x86_cpu_type or x86_cpu_desc(riptor) or so -
basically a tuple of items which determine a CPU type uniquely.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-10-25 23:25:03

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/cpufeature: Add facility to match microcode revisions

On Sun, Oct 21, 2018 at 12:20:47PM +0200, Thomas Gleixner wrote:
> Andi,
>
> On Sat, 20 Oct 2018, Andi Kleen wrote:
> > On Sat, Oct 20, 2018 at 10:19:37AM +0200, Thomas Gleixner wrote:
> > > On Fri, 19 Oct 2018, Andi Kleen wrote:
> > > There is no point to return the pointer because it's not a compound
> > > structure. If you want to provide the possibility to use the index then
> > > return the index and an error code if it does not match.
> >
> > It will be useful with the driver_data pointer, which you short sightedly
> > forced me to remove, and likely will need to be readded at some point
> > anyways if this gets more widely used.
>
> It's good and established practice not to add functionality on a 'might be
> used' basis. If you'd provide at least one or two patches which demonstrate
> how that is useful then that would be convincing.
>
> > At least with the pointer not all callers will need to be changed then.
>
> It doesn't need to be changed at all, when done correctly.

Thanks.

I opted for the simpler method of returning a boolean now.

-Andi