2022-01-23 14:23:12

by Kyle Huey

Subject: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

Beginning in Comet Lake, Intel extended the concept of privilege rings to
SMM.[0] A side effect of this is that events caused by execution of code
in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
set.

rr[1] depends on exact counts of performance events for the user space
tracee, so this change in behavior is fatal for us. It is, however, easily
corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
set freeze_on_smi manually when appropriate, because observing events in
SMM is rarely useful to anyone, we propose to change the default value of
this switch.

In this patch I have assumed that all non-Atom Intel microarchitectures
starting with Comet Lake behave like this but it would be good for someone
at Intel to verify that.

[0] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
end of page 5.

[1] https://rr-project.org/

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/events/intel/core.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index fd9f908debe5..9604e19c8761 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6094,6 +6094,11 @@ __init int intel_pmu_init(void)
x86_pmu.commit_scheduling = intel_tfa_commit_scheduling;
}

+ if (boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE_L ||
+ boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE) {
+ x86_pmu.attr_freeze_on_smi = 1;
+ }
+
pr_cont("Skylake events, ");
name = "skylake";
break;
@@ -6135,6 +6140,7 @@ __init int intel_pmu_init(void)
x86_pmu.num_topdown_events = 4;
x86_pmu.update_topdown_event = icl_update_topdown_event;
x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ x86_pmu.attr_freeze_on_smi = 1;
pr_cont("Icelake events, ");
name = "icelake";
break;
@@ -6172,6 +6178,7 @@ __init int intel_pmu_init(void)
x86_pmu.num_topdown_events = 8;
x86_pmu.update_topdown_event = icl_update_topdown_event;
x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ x86_pmu.attr_freeze_on_smi = 1;
pr_cont("Sapphire Rapids events, ");
name = "sapphire_rapids";
break;
@@ -6217,6 +6224,7 @@ __init int intel_pmu_init(void)
* x86_pmu.rtm_abort_event.
*/
x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
+ x86_pmu.attr_freeze_on_smi = 1;

td_attr = adl_hybrid_events_attrs;
mem_attr = adl_hybrid_mem_attrs;
--
2.34.1


2022-01-24 19:13:20

by Peter Zijlstra

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
> Beginning in Comet Lake, Intel extended the concept of privilege rings to
> SMM.[0] A side effect of this is that events caused by execution of code
> in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
> set.
>
> rr[1] depends on exact counts of performance events for the user space
> tracee, so this change in behavior is fatal for us. It is, however, easily
> corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
> as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
> set freeze_on_smi manually when appropriate, because observing events in
> SMM is rarely useful to anyone, we propose to change the default value of
> this switch.
>
> In this patch I have assumed that all non-Atom Intel microarchitectures
> starting with Comet Lake behave like this but it would be good for someone
> at Intel to verify that.
>

Kan, can you look at that?

> [0] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
> at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
> end of page 5.
>
> [1] https://rr-project.org/
>
> Signed-off-by: Kyle Huey <[email protected]>

Patch seems sensible enough; I'll go queue it up unless Kan comes back
with anything troublesome.

2022-01-24 19:31:39

by Liang, Kan

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.



On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
> On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
>> Beginning in Comet Lake, Intel extended the concept of privilege rings to
>> SMM.[0] A side effect of this is that events caused by execution of code
>> in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
>> set.
>>
>> rr[1] depends on exact counts of performance events for the user space
>> tracee, so this change in behavior is fatal for us. It is, however, easily
>> corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
>> as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
>> set freeze_on_smi manually when appropriate, because observing events in
>> SMM is rarely useful to anyone, we propose to change the default value of
>> this switch.

+ Andi

From what we have heard many times from sophisticated customers, they
really hate blind spots. They want to see everything. That's why we set
freeze_on_smi to 0 by default. I think the patch breaks that principle.

I don't think there is a way to notify all the users that the default
kernel value will be changed. (Yes, the end user can always check
/sys/devices/cpu/freeze_on_smi to get the latest value. But in practice,
no one checks it unless some errors are found.) I think it may bring
trouble to users who rely on the counts in SMM.

The patch only changes the default value for some platforms, not all
platforms, so the default value is no longer consistent across
platforms. That can cause confusion.

All in all, we have already exposed an interface for the end users to
change the value. If some apps, e.g., rr, don't want the default value,
I think they can always change it themselves on all platforms. We
should keep freeze_on_smi at 0 by default, which should benefit more
users.


>>
>> In this patch I have assumed that all non-Atom Intel microarchitectures
>> starting with Comet Lake behave like this but it would be good for someone
>> at Intel to verify that.
>>
>
> Kan, can you look at that?
>

I'm asking internally.

Thanks,
Kan

>> [0] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
>> at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
>> end of page 5.
>>
>> [1] https://rr-project.org/
>>
>> Signed-off-by: Kyle Huey <[email protected]>
>
> Patch seems sensible enough; I'll go queue it up unless Kan comes back
> with anything troublesome.

2022-01-24 19:34:24

by Peter Zijlstra

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Mon, Jan 24, 2022 at 11:00:56AM -0500, Liang, Kan wrote:
>
>
> On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
> > On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
> > > Beginning in Comet Lake, Intel extended the concept of privilege rings to
> > > SMM.[0] A side effect of this is that events caused by execution of code
> > > in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
> > > set.
> > >
> > > rr[1] depends on exact counts of performance events for the user space
> > > tracee, so this change in behavior is fatal for us. It is, however, easily
> > > corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
> > > as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
> > > set freeze_on_smi manually when appropriate, because observing events in
> > > SMM is rarely useful to anyone, we propose to change the default value of
> > > this switch.
>
> + Andi
>
> From what we have heard many times from sophisticated customers, they
> really hate blind spots. They want to see everything. That's why we set
> freeze_on_smi to 0 by default. I think the patch breaks that principle.

Well, USR really, as in *REALLY* should not be counting SMM. That's just
plain broken.

There's maybe an argument to include it in OS, but USR is ring-3.

2022-01-24 19:39:38

by Liang, Kan

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.



On 1/24/2022 11:30 AM, Peter Zijlstra wrote:
> On Mon, Jan 24, 2022 at 11:00:56AM -0500, Liang, Kan wrote:
>>
>>
>> On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
>>> On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
>>>> Beginning in Comet Lake, Intel extended the concept of privilege rings to
>>>> SMM.[0] A side effect of this is that events caused by execution of code
>>>> in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
>>>> set.
>>>>
>>>> rr[1] depends on exact counts of performance events for the user space
>>>> tracee, so this change in behavior is fatal for us. It is, however, easily
>>>> corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
>>>> as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
>>>> set freeze_on_smi manually when appropriate, because observing events in
>>>> SMM is rarely useful to anyone, we propose to change the default value of
>>>> this switch.
>>
>> + Andi
>>
>> From what we have heard many times from sophisticated customers, they
>> really hate blind spots. They want to see everything. That's why we set
>> freeze_on_smi to 0 by default. I think the patch breaks that principle.
>
> Well, USR really, as in *REALLY* should not be counting SMM. That's just
> plain broken.
>

For the USR only case, the bit could be set.

> There's maybe an argument to include it in OS, but USR is ring-3.

But we don't have an option for the USR-only case. The change will
impact the SYS case as well. Personally, maybe it's better to let the
user space app control the bit as needed rather than changing the
default kernel value for all cases.

Thanks,
Kan

2022-01-24 19:40:48

by Peter Zijlstra

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Mon, Jan 24, 2022 at 12:03:10PM -0500, Liang, Kan wrote:
>
>
> On 1/24/2022 11:30 AM, Peter Zijlstra wrote:
> > On Mon, Jan 24, 2022 at 11:00:56AM -0500, Liang, Kan wrote:
> > >
> > >
> > > On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
> > > > On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
> > > > > Beginning in Comet Lake, Intel extended the concept of privilege rings to
> > > > > SMM.[0] A side effect of this is that events caused by execution of code
> > > > > in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
> > > > > set.
> > > > >
> > > > > rr[1] depends on exact counts of performance events for the user space
> > > > > tracee, so this change in behavior is fatal for us. It is, however, easily
> > > > > corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
> > > > > as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
> > > > > set freeze_on_smi manually when appropriate, because observing events in
> > > > > SMM is rarely useful to anyone, we propose to change the default value of
> > > > > this switch.
> > >
> > > + Andi
> > >
> > > From what we have heard many times from sophisticated customers, they
> > > really hate blind spots. They want to see everything. That's why we set
> > > freeze_on_smi to 0 by default. I think the patch breaks that principle.
> >
> > Well, USR really, as in *REALLY* should not be counting SMM. That's just
> > plain broken.
> >
>
> For the USR only case, the bit could be set.
>
> > There's maybe an argument to include it in OS, but USR is ring-3.
>
> But we don't have an option for the USR-only case. The change will
> impact the SYS case as well. Personally, maybe it's better to let the
> user space app control the bit as needed rather than changing the
> default kernel value for all cases.

The bit is system wide, you can't sanely control it per counter. The
change that made USR include SMM is insane and broken.

2022-01-25 08:59:03

by Kyle Huey

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Mon, Jan 24, 2022 at 8:01 AM Liang, Kan <[email protected]> wrote:
>
>
>
> On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
> > On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
> >> Beginning in Comet Lake, Intel extended the concept of privilege rings to
> >> SMM.[0] A side effect of this is that events caused by execution of code
> >> in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
> >> set.
> >>
> >> rr[1] depends on exact counts of performance events for the user space
> >> tracee, so this change in behavior is fatal for us. It is, however, easily
> >> corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
> >> as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
> >> set freeze_on_smi manually when appropriate, because observing events in
> >> SMM is rarely useful to anyone, we propose to change the default value of
> >> this switch.
>
> + Andi
>
> From what we have heard many times from sophisticated customers, they
> really hate blind spots. They want to see everything. That's why we set
> freeze_on_smi to 0 by default. I think the patch breaks that principle.

The default kernel settings for perf events prioritize preventing
information leaks to less privileged code. perf_event_paranoid
defaults to 2, preventing unprivileged users from observing kernel
space. If "sophisticated customers" want to see everything, they have
already needed privileges (or an explicit opt-in through decreasing
perf_event_paranoid) for some time.

The current situation on Comet Lake+ where an unprivileged user
*cannot* observe kernel code due to security concerns but
simultaneously *must* observe SMM code seems rather absurd.

> I don't think there is a way to notify all the users that the default
> kernel value will be changed. (Yes, the end user can always check the
> /sys/devices/cpu/freeze_on_smi to get the latest value. But in practice,
> no one checks it unless some errors found.) I think it may bring
> troubles to the users if they rely on the counts in SMM.

Unfortunately the new hardware has already changed the behavior
without notifying users, no matter what we do here.

> The patch only changes the default values for some platforms, not all
> platforms. The default value is not consistent among platforms anymore.
> It can bring confusion.

I don't personally object to changing freeze_on_smi for all platforms
:) I was merely trying to limit the changes.

> All in all, we have already exposed an interface for the end-users to
> change the value. If some apps, e.g., rr, doesn't want the default
> value, I think they can always change it in the app for all platforms.
> We should still keep the freeze_on_smi to 0 as default, which should
> benefit more users.

I think "people who want to just do userspace profiling like they did
before can just change the value" is an unsatisfying answer,
especially because freeze_on_smi requires root to change.


- Kyle

>
> >>
> >> In this patch I have assumed that all non-Atom Intel microarchitectures
> >> starting with Comet Lake behave like this but it would be good for someone
> >> at Intel to verify that.
> >>
> >
> > Kan, can you look at that?
> >
>
> I'm asking internally.
>
> Thanks,
> Kan
>
> >> [0] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
> >> at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
> >> end of page 5.
> >>
> >> [1] https://rr-project.org/
> >>
> >> Signed-off-by: Kyle Huey <[email protected]>
> >
> > Patch seems sensible enough; I'll go queue it up unless Kan comes back
> > with anything troublesome.

2022-01-25 20:01:10

by Liang, Kan

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.



On 1/24/2022 9:59 PM, Kyle Huey wrote:
> On Mon, Jan 24, 2022 at 8:01 AM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 1/24/2022 7:21 AM, Peter Zijlstra wrote:
>>> On Fri, Jan 21, 2022 at 11:26:44PM -0800, Kyle Huey wrote:
>>>> Beginning in Comet Lake, Intel extended the concept of privilege rings to
>>>> SMM.[0] A side effect of this is that events caused by execution of code
>>>> in SMM are now visible to performance counters with IA32_PERFEVTSELx.USR
>>>> set.
>>>>
>>>> rr[1] depends on exact counts of performance events for the user space
>>>> tracee, so this change in behavior is fatal for us. It is, however, easily
>>>> corrected by setting IA32_DEBUGCTL.FREEZE_WHILE_SMM to 1 (visible in sysfs
>>>> as /sys/devices/cpu/freeze_on_smi). While we can and will tell our users to
>>>> set freeze_on_smi manually when appropriate, because observing events in
>>>> SMM is rarely useful to anyone, we propose to change the default value of
>>>> this switch.
>>
>> + Andi
>>
>> From what we have heard many times from sophisticated customers, they
>> really hate blind spots. They want to see everything. That's why we set
>> freeze_on_smi to 0 by default. I think the patch breaks that principle.
>
> The default kernel settings for perf events prioritize preventing
> information leaks to less privileged code. perf_event_paranoid
> defaults to 2, preventing unprivileged users from observing kernel
> space. If "sophisticated customers" want to see everything they have
> already needed privileges (or an explicit opt-in through decreasing
> perf_event_paranoid) for some time.
>
> The current situation on Comet Lake+ where an unprivileged user
> *cannot* observe kernel code due to security concerns but
> simultaneously *must* observe SMM code seems rather absurd.
>

I see. I had thought that the unprivileged user could observe the SMM
code on previous platforms, and that the CML+ change only makes part of
the SMM code CPL0. It seems I was wrong. The change looks like it turns
previously-CPL0 code into CPL3 code. If so, yes, I think we should
prevent the information leaks for the unprivileged user.

>> I don't think there is a way to notify all the users that the default
>> kernel value will be changed. (Yes, the end user can always check the
>> /sys/devices/cpu/freeze_on_smi to get the latest value. But in practice,
>> no one checks it unless some errors found.) I think it may bring
>> troubles to the users if they rely on the counts in SMM.
>
> Unfortunately the new hardware has already changed the behavior
> without notifying users, no matter what we do here.
>
>> The patch only changes the default values for some platforms, not all
>> platforms. The default value is not consistent among platforms anymore.
>> It can bring confusion.
>
> I don't personally object to changing freeze_on_smi for all platforms
> :) I was merely trying to limit the changes.


Changing it for all platforms seems too big a hammer. I agree we should
limit it to the impacted platforms.

I've contacted the author of the white paper. I was told that the
change is for the client vPro platforms. They are not sure whether it
impacts server platforms or Atom platforms. I'm still working on it. I
will let you and Peter know once I get more information.

Thanks,
Kan

2022-01-26 21:29:34

by Peter Zijlstra

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Tue, Jan 25, 2022 at 08:57:09AM -0500, Liang, Kan wrote:
> I see. I had thought that the unprivileged user could observe the SMM
> code on previous platforms, and that the CML+ change only makes part of
> the SMM code CPL0. It seems I was wrong. The change looks like it turns
> previously-CPL0 code into CPL3 code. If so, yes, I think we should
> prevent the information leaks for the unprivileged user.

Right.

> Changing it for all platforms seems too big a hammer. I agree we should
> limit it to the impacted platforms.
>
> I've contacted the author of the white paper. I was told that the
> change is for the client vPro platforms. They are not sure whether it
> impacts server platforms or Atom platforms. I'm still working on it. I
> will let you and Peter know once I get more information.

For now I've updated the patch as per the below. I'm tempted to simply
apply it as is and let it be.

Having different defaults for vPro vs !vPro chips seems more confusing
than not.

We should also very much get this change reverted for future chips.

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6094,6 +6094,16 @@ __init int intel_pmu_init(void)
x86_pmu.commit_scheduling = intel_tfa_commit_scheduling;
}

+ if (boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE_L ||
+ boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE) {
+ /*
+ * For some idiotic reason SMM is visible to USR
+ * counters. Since this is a privilege issue, default
+ * disable counters in SMM for these chips.
+ */
+ x86_pmu.attr_freeze_on_smi = 1;
+ }
+
pr_cont("Skylake events, ");
name = "skylake";
break;
@@ -6135,6 +6145,8 @@ __init int intel_pmu_init(void)
x86_pmu.num_topdown_events = 4;
x86_pmu.update_topdown_event = icl_update_topdown_event;
x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ /* SMM visible in USR, see above */
+ x86_pmu.attr_freeze_on_smi = 1;
pr_cont("Icelake events, ");
name = "icelake";
break;
@@ -6172,6 +6184,8 @@ __init int intel_pmu_init(void)
x86_pmu.num_topdown_events = 8;
x86_pmu.update_topdown_event = icl_update_topdown_event;
x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ /* SMM visible in USR, see above */
+ x86_pmu.attr_freeze_on_smi = 1;
pr_cont("Sapphire Rapids events, ");
name = "sapphire_rapids";
break;
@@ -6217,6 +6231,8 @@ __init int intel_pmu_init(void)
* x86_pmu.rtm_abort_event.
*/
x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
+ /* SMM visible in USR, see above */
+ x86_pmu.attr_freeze_on_smi = 1;

td_attr = adl_hybrid_events_attrs;
mem_attr = adl_hybrid_mem_attrs;

2022-01-26 21:41:03

by Liang, Kan

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.



On 1/26/2022 9:04 AM, Peter Zijlstra wrote:
> On Tue, Jan 25, 2022 at 08:57:09AM -0500, Liang, Kan wrote:
>> I see. I had thought that the unprivileged user could observe the SMM
>> code on previous platforms, and that the CML+ change only makes part of
>> the SMM code CPL0. It seems I was wrong. The change looks like it turns
>> previously-CPL0 code into CPL3 code. If so, yes, I think we should
>> prevent the information leaks for the unprivileged user.
>
> Right.
>
>> Changing it for all platforms seems too big a hammer. I agree we should
>> limit it to the impacted platforms.
>>
>> I've contacted the author of the white paper. I was told that the
>> change is for the client vPro platforms. They are not sure whether it
>> impacts server platforms or Atom platforms. I'm still working on it. I
>> will let you and Peter know once I get more information.
>
> For now I've updated the patch as per the below. I'm tempted to simply
> apply it as is and let it be.
>
> Having different defaults for vPro vs !vPro chips seems more confusing
> than not.
>

But I think it makes sense to have different defaults for client vs
server, or big core vs Atom.

I'm still working on it and trying to figure out the affected
generations. (Hope I can find the right contacts in the next few days.)

> We should also very much get this change reverted for future chips.
>

Yes. I will discuss it with the contacts once I get the name.

Thanks,
Kan

> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -6094,6 +6094,16 @@ __init int intel_pmu_init(void)
> x86_pmu.commit_scheduling = intel_tfa_commit_scheduling;
> }
>
> + if (boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE_L ||
> + boot_cpu_data.x86_model == INTEL_FAM6_COMETLAKE) {
> + /*
> + * For some idiotic reason SMM is visible to USR
> + * counters. Since this is a privilege issue, default
> + * disable counters in SMM for these chips.
> + */
> + x86_pmu.attr_freeze_on_smi = 1;
> + }
> +
> pr_cont("Skylake events, ");
> name = "skylake";
> break;
> @@ -6135,6 +6145,8 @@ __init int intel_pmu_init(void)
> x86_pmu.num_topdown_events = 4;
> x86_pmu.update_topdown_event = icl_update_topdown_event;
> x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
> + /* SMM visible in USR, see above */
> + x86_pmu.attr_freeze_on_smi = 1;
> pr_cont("Icelake events, ");
> name = "icelake";
> break;
> @@ -6172,6 +6184,8 @@ __init int intel_pmu_init(void)
> x86_pmu.num_topdown_events = 8;
> x86_pmu.update_topdown_event = icl_update_topdown_event;
> x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
> + /* SMM visible in USR, see above */
> + x86_pmu.attr_freeze_on_smi = 1;
> pr_cont("Sapphire Rapids events, ");
> name = "sapphire_rapids";
> break;
> @@ -6217,6 +6231,8 @@ __init int intel_pmu_init(void)
> * x86_pmu.rtm_abort_event.
> */
> x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
> + /* SMM visible in USR, see above */
> + x86_pmu.attr_freeze_on_smi = 1;
>
> td_attr = adl_hybrid_events_attrs;
> mem_attr = adl_hybrid_mem_attrs;

2022-01-27 20:02:44

by Peter Zijlstra

Subject: Re: [PATCH] x86/perf: Default freeze_on_smi on for Comet Lake and later.

On Thu, Jan 27, 2022 at 02:22:23AM +0000, Andrew Cooper wrote:

> Frankly, it is an error that FREEZE_WHILE_SMM is under the kernel's
> control, and not SMM's control. After all, it's SMM handling all the
> UEFI secrets/etc.
>
> Linux ought to set FREEZE_WHILE_SMM unilaterally, because most kernel
> profiling probably won't want interference from SMM. Root can always
> disable FREEZE_WHILE_SMM if profiling is really wanted.
>
> I'm not sure if anything can be done on pre-FREEZE_WHILE_SMM CPUs. Nor
> on AMD CPUs, which are also gaining CPL3 SMM logic and don't appear to
> have any equivalent functionality.

Which suggests something like this?

---
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c91434056c29..5874fa088630 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4703,6 +4703,19 @@ static __initconst const struct x86_pmu intel_pmu = {
.lbr_read = intel_pmu_lbr_read_64,
.lbr_save = intel_pmu_lbr_save,
.lbr_restore = intel_pmu_lbr_restore,
+
+ /*
+ * SMM has access to all 4 rings and while traditionally SMM code only
+ * ran in CPL0, newer firmware is starting to make use of CPL3 in SMM.
+ *
+ * Since the EVENTSEL.{USR,OS} CPL filtering makes no distinction
+ * between SMM or not, this results in what should be pure userspace
+ * counters including SMM data.
+ *
+ * This is a clear privilege issue, therefore globally disable
+ * counting SMM by default.
+ */
+ .attr_freeze_on_smi = 1,
};

static __init void intel_clovertown_quirk(void)

2022-02-03 16:04:42

by tip-bot2 for Haifeng Xu

Subject: [tip: perf/urgent] x86/perf: Default set FREEZE_ON_SMI for all

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: a01994f5e5c79d3a35e5e8cf4252c7f2147323c3
Gitweb: https://git.kernel.org/tip/a01994f5e5c79d3a35e5e8cf4252c7f2147323c3
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu, 27 Jan 2022 12:32:51 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 02 Feb 2022 13:11:39 +01:00

x86/perf: Default set FREEZE_ON_SMI for all

Kyle reported that rr[0] has started to malfunction on Comet Lake and
later CPUs due to EFI starting to make use of CPL3 [1] and the PMU
event filtering not distinguishing between regular CPL3 and SMM CPL3.

Since this is a privilege violation, default disable SMM visibility
where possible.

Administrators wanting to observe SMM cycles can easily change this
using the sysfs attribute while regular users don't have access to
this file.

[0] https://rr-project.org/

[1] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
end of page 5.

Reported-by: Kyle Huey <[email protected]>
Suggested-by: Andrew Cooper <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c914340..a3c7ca8 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4703,6 +4703,19 @@ static __initconst const struct x86_pmu intel_pmu = {
.lbr_read = intel_pmu_lbr_read_64,
.lbr_save = intel_pmu_lbr_save,
.lbr_restore = intel_pmu_lbr_restore,
+
+ /*
+ * SMM has access to all 4 rings and while traditionally SMM code only
+ * ran in CPL0, 2021-era firmware is starting to make use of CPL3 in SMM.
+ *
+ * Since the EVENTSEL.{USR,OS} CPL filtering makes no distinction
+ * between SMM or not, this results in what should be pure userspace
+ * counters including SMM data.
+ *
+ * This is a clear privilege issue, therefore globally disable
+ * counting SMM by default.
+ */
+ .attr_freeze_on_smi = 1,
};

static __init void intel_clovertown_quirk(void)