2013-03-18 11:28:03

by George Dunlap

[permalink] [raw]
Subject: [PATCH v2] perf: Check all MSRs before passing hw check

check_hw_exists has a number of checks which go to two exit paths:
msr_fail and bios_fail. Checks classified as msr_fail will cause
check_hw_exists() to return false, causing the PMU not to be used;
bios_fail checks will only cause a warning to be printed, but will
return true.

The problem is that if there are both msr failures and bios failures,
and the routine hits a bios_fail check first, it will exit early and
return true, not finishing the rest of the msr checks. If those msrs
are in fact broken, it will cause them to be used erroneously.

In the case of a Xen PV VM, the guest OS has read access to all the
MSRs, but write access is white-listed to supported features. Writes
to unsupported MSRs have no effect. The PMU MSRs are not (typically)
supported, because they are expensive to save and restore on a VM
context switch. One of the "msr_fail" checks is supposed to detect
this circumstance (ether for Xen or KVM) and disable the harware PMU.

However, on one of my AMD boxen, there is (apparently) a broken BIOS
which triggers one of the bios_fail checks. In particular,
MSR_K7_EVNTSEL0 has the ARCH_PERFMON_EVENTSEL_ENABLE bit set. The
guest kernel detects this because it has read access to all MSRs, and
causes it to skip the rest of the checks and try to use the
non-existent hardware PMU. This minimally causes a lot of useless
instruction emulation and Xen console spam; it may cause other issues
with the watchdog as well.

This changset causes check_hw_exists() to go through all of the msr
checks, failing and returning false if any of them fail. This makes
sure that a guest running under Xen without a virtual PMU will detect
that there is no functioning PMU and not attempt to use it.

This problem affects kernels as far back as 3.2, and should thus be
considered for backport.

v2:
- Print the warning when the event happens so the reg,val make sense
- But print it only for the first such instance
- Update changelog to include details of failing system

Signed-off-by: George Dunlap <[email protected]>
CC: Konrad Wilk <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: [email protected]
CC: Ian Campbell <[email protected]>
CC: David Vrabel <[email protected]>
CC: Andrew Cooper <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 6774c17..5e2b4d6 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -182,6 +182,19 @@ static bool check_hw_exists(void)
{
u64 val, val_new = ~0;
int i, reg, ret = 0;
+ int bios_fail = 0;
+
+#define BIOS_FAIL(_r, _v) \
+do { \
+ /* \
+ * We still allow the PMU driver to operate: \
+ */ \
+ if (!bios_fail) { \
+ bios_fail = 1; \
+ printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n"); \
+ printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", _r, _v); \
+ } \
+} while(0)

/*
* Check to see if the BIOS enabled any of the counters, if so
@@ -192,8 +205,8 @@ static bool check_hw_exists(void)
ret = rdmsrl_safe(reg, &val);
if (ret)
goto msr_fail;
- if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
- goto bios_fail;
+ if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
+ BIOS_FAIL(reg, val);
}

if (x86_pmu.num_counters_fixed) {
@@ -203,10 +216,12 @@ static bool check_hw_exists(void)
goto msr_fail;
for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
if (val & (0x03 << i*4))
- goto bios_fail;
+ BIOS_FAIL(reg, val);
}
}

+#undef BIOS_FAIL
+
/*
* Read the current value, change it and read it back to see if it
* matches, this is needed to detect certain hardware emulators
@@ -223,15 +238,6 @@ static bool check_hw_exists(void)

return true;

-bios_fail:
- /*
- * We still allow the PMU driver to operate:
- */
- printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
- printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
-
- return true;
-
msr_fail:
printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
printk(KERN_ERR "Failed to access perfctr msr (MSR %x is %Lx)\n", reg, val_new);
--
1.7.9.5


2013-03-25 10:57:18

by George Dunlap

[permalink] [raw]
Subject: Re: [PATCH v2] perf: Check all MSRs before passing hw check

On 18/03/13 11:27, George Dunlap wrote:
> check_hw_exists has a number of checks which go to two exit paths:
> msr_fail and bios_fail. Checks classified as msr_fail will cause
> check_hw_exists() to return false, causing the PMU not to be used;
> bios_fail checks will only cause a warning to be printed, but will
> return true.
>
> The problem is that if there are both msr failures and bios failures,
> and the routine hits a bios_fail check first, it will exit early and
> return true, not finishing the rest of the msr checks. If those msrs
> are in fact broken, it will cause them to be used erroneously.
>
> In the case of a Xen PV VM, the guest OS has read access to all the
> MSRs, but write access is white-listed to supported features. Writes
> to unsupported MSRs have no effect. The PMU MSRs are not (typically)
> supported, because they are expensive to save and restore on a VM
> context switch. One of the "msr_fail" checks is supposed to detect
> this circumstance (ether for Xen or KVM) and disable the harware PMU.
>
> However, on one of my AMD boxen, there is (apparently) a broken BIOS
> which triggers one of the bios_fail checks. In particular,
> MSR_K7_EVNTSEL0 has the ARCH_PERFMON_EVENTSEL_ENABLE bit set. The
> guest kernel detects this because it has read access to all MSRs, and
> causes it to skip the rest of the checks and try to use the
> non-existent hardware PMU. This minimally causes a lot of useless
> instruction emulation and Xen console spam; it may cause other issues
> with the watchdog as well.
>
> This changset causes check_hw_exists() to go through all of the msr
> checks, failing and returning false if any of them fail. This makes
> sure that a guest running under Xen without a virtual PMU will detect
> that there is no functioning PMU and not attempt to use it.
>
> This problem affects kernels as far back as 3.2, and should thus be
> considered for backport.
>
> v2:
> - Print the warning when the event happens so the reg,val make sense
> - But print it only for the first such instance
> - Update changelog to include details of failing system
>
> Signed-off-by: George Dunlap <[email protected]>

Ping?
-George

> CC: Konrad Wilk <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: "H. Peter Anvin" <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: [email protected]
> CC: Ian Campbell <[email protected]>
> CC: David Vrabel <[email protected]>
> CC: Andrew Cooper <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event.c | 30 ++++++++++++++++++------------
> 1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 6774c17..5e2b4d6 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -182,6 +182,19 @@ static bool check_hw_exists(void)
> {
> u64 val, val_new = ~0;
> int i, reg, ret = 0;
> + int bios_fail = 0;
> +
> +#define BIOS_FAIL(_r, _v) \
> +do { \
> + /* \
> + * We still allow the PMU driver to operate: \
> + */ \
> + if (!bios_fail) { \
> + bios_fail = 1; \
> + printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n"); \
> + printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", _r, _v); \
> + } \
> +} while(0)
>
> /*
> * Check to see if the BIOS enabled any of the counters, if so
> @@ -192,8 +205,8 @@ static bool check_hw_exists(void)
> ret = rdmsrl_safe(reg, &val);
> if (ret)
> goto msr_fail;
> - if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> - goto bios_fail;
> + if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> + BIOS_FAIL(reg, val);
> }
>
> if (x86_pmu.num_counters_fixed) {
> @@ -203,10 +216,12 @@ static bool check_hw_exists(void)
> goto msr_fail;
> for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
> if (val & (0x03 << i*4))
> - goto bios_fail;
> + BIOS_FAIL(reg, val);
> }
> }
>
> +#undef BIOS_FAIL
> +
> /*
> * Read the current value, change it and read it back to see if it
> * matches, this is needed to detect certain hardware emulators
> @@ -223,15 +238,6 @@ static bool check_hw_exists(void)
>
> return true;
>
> -bios_fail:
> - /*
> - * We still allow the PMU driver to operate:
> - */
> - printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
> - printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
> -
> - return true;
> -
> msr_fail:
> printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
> printk(KERN_ERR "Failed to access perfctr msr (MSR %x is %Lx)\n", reg, val_new);

2013-04-02 11:25:31

by George Dunlap

[permalink] [raw]
Subject: Re: [PATCH v2] perf: Check all MSRs before passing hw check

Any opinions on this? It's been 2 weeks now.

-George

On 18/03/13 11:27, George Dunlap wrote:
> check_hw_exists has a number of checks which go to two exit paths:
> msr_fail and bios_fail. Checks classified as msr_fail will cause
> check_hw_exists() to return false, causing the PMU not to be used;
> bios_fail checks will only cause a warning to be printed, but will
> return true.
>
> The problem is that if there are both msr failures and bios failures,
> and the routine hits a bios_fail check first, it will exit early and
> return true, not finishing the rest of the msr checks. If those msrs
> are in fact broken, it will cause them to be used erroneously.
>
> In the case of a Xen PV VM, the guest OS has read access to all the
> MSRs, but write access is white-listed to supported features. Writes
> to unsupported MSRs have no effect. The PMU MSRs are not (typically)
> supported, because they are expensive to save and restore on a VM
> context switch. One of the "msr_fail" checks is supposed to detect
> this circumstance (ether for Xen or KVM) and disable the harware PMU.
>
> However, on one of my AMD boxen, there is (apparently) a broken BIOS
> which triggers one of the bios_fail checks. In particular,
> MSR_K7_EVNTSEL0 has the ARCH_PERFMON_EVENTSEL_ENABLE bit set. The
> guest kernel detects this because it has read access to all MSRs, and
> causes it to skip the rest of the checks and try to use the
> non-existent hardware PMU. This minimally causes a lot of useless
> instruction emulation and Xen console spam; it may cause other issues
> with the watchdog as well.
>
> This changset causes check_hw_exists() to go through all of the msr
> checks, failing and returning false if any of them fail. This makes
> sure that a guest running under Xen without a virtual PMU will detect
> that there is no functioning PMU and not attempt to use it.
>
> This problem affects kernels as far back as 3.2, and should thus be
> considered for backport.
>
> v2:
> - Print the warning when the event happens so the reg,val make sense
> - But print it only for the first such instance
> - Update changelog to include details of failing system
>
> Signed-off-by: George Dunlap <[email protected]>
> CC: Konrad Wilk <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: "H. Peter Anvin" <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: [email protected]
> CC: Ian Campbell <[email protected]>
> CC: David Vrabel <[email protected]>
> CC: Andrew Cooper <[email protected]>



> ---
> arch/x86/kernel/cpu/perf_event.c | 30 ++++++++++++++++++------------
> 1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 6774c17..5e2b4d6 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -182,6 +182,19 @@ static bool check_hw_exists(void)
> {
> u64 val, val_new = ~0;
> int i, reg, ret = 0;
> + int bios_fail = 0;
> +
> +#define BIOS_FAIL(_r, _v) \
> +do { \
> + /* \
> + * We still allow the PMU driver to operate: \
> + */ \
> + if (!bios_fail) { \
> + bios_fail = 1; \
> + printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n"); \
> + printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", _r, _v); \
> + } \
> +} while(0)
>
> /*
> * Check to see if the BIOS enabled any of the counters, if so
> @@ -192,8 +205,8 @@ static bool check_hw_exists(void)
> ret = rdmsrl_safe(reg, &val);
> if (ret)
> goto msr_fail;
> - if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> - goto bios_fail;
> + if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> + BIOS_FAIL(reg, val);
> }
>
> if (x86_pmu.num_counters_fixed) {
> @@ -203,10 +216,12 @@ static bool check_hw_exists(void)
> goto msr_fail;
> for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
> if (val & (0x03 << i*4))
> - goto bios_fail;
> + BIOS_FAIL(reg, val);
> }
> }
>
> +#undef BIOS_FAIL
> +
> /*
> * Read the current value, change it and read it back to see if it
> * matches, this is needed to detect certain hardware emulators
> @@ -223,15 +238,6 @@ static bool check_hw_exists(void)
>
> return true;
>
> -bios_fail:
> - /*
> - * We still allow the PMU driver to operate:
> - */
> - printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
> - printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
> -
> - return true;
> -
> msr_fail:
> printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
> printk(KERN_ERR "Failed to access perfctr msr (MSR %x is %Lx)\n", reg, val_new);

2013-04-02 15:50:33

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v2] perf: Check all MSRs before passing hw check

On 04/02/2013 04:25 AM, George Dunlap wrote:
> Any opinions on this? It's been 2 weeks now.
>
> -George

Please just set bios_fail = 1 and save the values in the running code
and do *one* printk() at the end. Otherwise it seems okay.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.