Subject: Re: [PATCH 08/11] perf tool: precise mode requires exclude_guest

On 26.07.12 10:08:29, Peter Zijlstra wrote:
> On Wed, 2012-07-25 at 23:16 -0600, David Ahern wrote:
>
> > Peter's patch (see https://lkml.org/lkml/2012/7/9/298) changes kernel
> > side to require the use of exclude_guest if the precise modifier is
> > used, returning -EOPNOTSUPP if exclude_guest is not set. This patch goes
> > after the user experience: Today if a user specifies -e <event>:p all
> > other modifiers are reset - including exclude_guest. Going forward we
> > need :p to imply :pH if a user has not specified a GH modifier.
> >
> > We could do nothing and handle the unsupported error and try setting the
> > exclude_guest option - like perf handles other new parameters. But
> > EOPNOTSUPP is not uniquely tied to this error -- e.g., it could be that
> > BTS is not supported (:pp). Also, we have no easy way to discriminate :p
> > from :pG or :pGH. It seems to me perf should not silently undo a user
> > request on the modifier, but inform the user the request is wrong. For
> > example if a user requests -e cycles:pG it should not be silently turned
> > into :pH.
> >
> > And then yesterday, Robert stated that none of the exclude_xxxx
> > modifiers can be set for the AMD if the precise modifier is used, so we
> > cannot blindly set exclude_guest if precise_ip is set.
> >
> > So, seems to me perf needs one action for Intel processors and another
> > for AMD.
>
> No, we just need to teach the IBS code about SVM enter/exit.

I agree that this could be emulated in software by enabling/disabling
the event on a guest/host switch. Even better, we could add this for
every pmu in a generic way. E.g. northbridge counters and, I guess,
Intel uncore events do not support G/H counting in hardware. The same
applies to other pmus imaginable in the future, such as counters for
IOMMUs or other hardware devices.

But some pmus are not related to virtualization or other features, so
they simply do not need to support those attributes, or we want other
defaults for them, e.g. enable it system wide. Detecting features by
checking syscall errors and then falling back to other defaults does
not seem the right approach to me: it may require several syscalls to
check *combinations* of supported attributes, it makes error logging
and detection harder due to noisy log messages, and there is no strict
attribute flag checking in current and older kernels.
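
To illustrate the cost of probing by trial and error, here is a minimal
Python sketch. Nothing in it talks to a real kernel: make_fake_pmu() is
a hypothetical stand-in for perf_event_open() that accepts an attribute
set only if the pmu supports every flag in it, and probe_supported() is
an assumed fallback strategy (largest combinations first), not what perf
actually does.

```python
from itertools import combinations


def make_fake_pmu(supported):
    """Hypothetical stand-in for perf_event_open(): the call succeeds
    only if every requested flag is supported by the pmu."""
    supported = frozenset(supported)
    return lambda flags: flags <= supported


def probe_supported(try_open, wanted):
    """Find an accepted flag combination by trial and error.

    Returns (accepted_flags, attempts). Larger combinations are tried
    first, so the result is the biggest accepted subset of `wanted`.
    """
    attempts = 0
    wanted = sorted(wanted)
    for size in range(len(wanted), -1, -1):
        for combo in combinations(wanted, size):
            attempts += 1
            if try_open(frozenset(combo)):
                return frozenset(combo), attempts
    return None, attempts


if __name__ == "__main__":
    pmu = make_fake_pmu({"exclude_guest"})
    wanted = {"exclude_guest", "exclude_host", "precise_ip"}
    print(probe_supported(pmu, wanted))
```

With three optional flags and a fake pmu that accepts only
exclude_guest, this probe needs five attempts before one succeeds. In
the worst case the number of attempts grows as 2^n for n flags, and each
failed attempt is indistinguishable from a genuine error.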

I would rather see a pmu feature flag in the same style as
with /proc/cpuinfo, e.g.:

$ cat /sys/bus/event_source/devices/cpu/flags
exclude_host exclude_guest
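
A minimal Python sketch of how userland might consume such a file. The
"flags" file is only the proposal made here and is assumed not to exist
in current kernels; the helper names parse_pmu_flags() and
read_pmu_flags() are hypothetical. A missing file is treated as "no
optional attributes advertised".

```python
from pathlib import Path


def parse_pmu_flags(text):
    """Split the whitespace-separated flag list into a set of names."""
    return frozenset(text.split())


def read_pmu_flags(pmu, sysfs_root="/sys/bus/event_source/devices"):
    """Read the proposed (hypothetical) per-pmu flags file.

    Returns an empty set if the file is absent or unreadable, e.g. on
    a kernel that does not implement the proposal.
    """
    path = Path(sysfs_root) / pmu / "flags"
    try:
        return parse_pmu_flags(path.read_text())
    except OSError:
        return frozenset()


if __name__ == "__main__":
    print(sorted(read_pmu_flags("cpu")))
```

One read per pmu would replace the repeated open attempts, and the
result can be cached for the lifetime of the tool.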

We also need stricter attribute flag checking, esp. of reserved flags
and of features that some pmus do not support (I am already working on
patches for this). Userland then checks the flags and sets up syscalls
according to what is reported. The goal should be to avoid syscall
errors altogether. That would also let us improve dmesg logging for
errors; currently we do not see any message if a syscall fails.
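
A sketch of how userland could set up attributes from the reported
flags and reject a bad request before any syscall is made.
resolve_modifiers() is a hypothetical helper, not perf code. It assumes
that a pmu advertising exclude_guest also requires it in precise mode
(the rule from Peter's patch), that H means "count host only" and G
means "count guest only", and it ignores all other modifiers such as u
and k.

```python
def resolve_modifiers(modifiers, pmu_flags):
    """Map a modifier string such as "p" or "pG" to attribute values.

    Raises ValueError for requests the pmu cannot honour, so the user
    is informed instead of having the request silently changed.
    """
    precise_ip = modifiers.count("p")
    guest = "G" in modifiers
    host = "H" in modifiers
    needs_exclude_guest = precise_ip and "exclude_guest" in pmu_flags

    # Assumption: precise mode on such a pmu cannot count guest events.
    if needs_exclude_guest and guest:
        raise ValueError("precise mode cannot be combined with G on this pmu")
    # A bare :p is treated as :pH when the pmu supports exclude_guest.
    if needs_exclude_guest and not host:
        host = True

    attr = {
        "precise_ip": precise_ip,
        "exclude_guest": int(host and not guest),
        "exclude_host": int(guest and not host),
    }
    # Never pass an exclude_* attribute the pmu did not advertise.
    for name in ("exclude_guest", "exclude_host"):
        if attr[name] and name not in pmu_flags:
            raise ValueError("pmu does not support " + name)
    return attr


if __name__ == "__main__":
    print(resolve_modifiers("p", {"exclude_host", "exclude_guest"}))
    print(resolve_modifiers("p", set()))
```

On a pmu that advertises no exclude_* flags, a bare :p leaves both
exclude bits clear, while an explicit :pH is reported as an error
rather than passed to the kernel.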

And finally, if a feature could be emulated, we could provide this
emulation of an attr flag to all pmus.

Does this make sense?

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center


2012-08-15 15:22:19

by David Ahern

Subject: Re: [PATCH 08/11] perf tool: precise mode requires exclude_guest

On 8/3/12 7:51 AM, Robert Richter wrote:
> On 26.07.12 10:08:29, Peter Zijlstra wrote:
>> On Wed, 2012-07-25 at 23:16 -0600, David Ahern wrote:
>>
>>> Peter's patch (see https://lkml.org/lkml/2012/7/9/298) changes kernel
>>> side to require the use of exclude_guest if the precise modifier is
>>> used, returning -EOPNOTSUPP if exclude_guest is not set. This patch goes
>>> after the user experience: Today if a user specifies -e <event>:p all
>>> other modifiers are reset - including exclude_guest. Going forward we
>>> need :p to imply :pH if a user has not specified a GH modifier.
>>>
>>> We could do nothing and handle the unsupported error and try setting the
>>> exclude_guest option - like perf handles other new parameters. But
>>> EOPNOTSUPP is not uniquely tied to this error -- e.g., it could be that
>>> BTS is not supported (:pp). Also, we have no easy way to discriminate :p
>>> from :pG or :pGH. It seems to me perf should not silently undo a user
>>> request on the modifier, but inform the user the request is wrong. For
>>> example if a user requests -e cycles:pG it should not be silently turned
>>> into :pH.
>>>
>>> And then yesterday, Robert stated that none of the exclude_xxxx
>>> modifiers can be set for the AMD if the precise modifier is used, so we
>>> cannot blindly set exclude_guest if precise_ip is set.
>>>
>>> So, seems to me perf needs one action for Intel processors and another
>>> for AMD.
>>
>> No, we just need to teach the IBS code about SVM enter/exit.
>
> I agree that this could be emulated in software by enabling/disabling
> the event on a guest/host switch. Even better, we could add this for
> every pmu in a generic way. E.g. northbridge counters and, I guess,
> Intel uncore events do not support G/H counting in hardware. The same
> applies to other pmus imaginable in the future, such as counters for
> IOMMUs or other hardware devices.
>
> But some pmus are not related to virtualization or other features, so
> they simply do not need to support those attributes, or we want other
> defaults for them, e.g. enable it system wide. Detecting features by
> checking syscall errors and then falling back to other defaults does
> not seem the right approach to me: it may require several syscalls to
> check *combinations* of supported attributes, it makes error logging
> and detection harder due to noisy log messages, and there is no strict
> attribute flag checking in current and older kernels.
>
> I would rather see a pmu feature flag in the same style as
> with /proc/cpuinfo, e.g.:
>
> $ cat /sys/bus/event_source/devices/cpu/flags
> exclude_host exclude_guest
>
> We also need stricter attribute flag checking, esp. of reserved flags
> and of features that some pmus do not support (I am already working on
> patches for this). Userland then checks the flags and sets up syscalls
> according to what is reported. The goal should be to avoid syscall
> errors altogether. That would also let us improve dmesg logging for
> errors; currently we do not see any message if a syscall fails.
>
> And finally, if a feature could be emulated, we could provide this
> emulation of an attr flag to all pmus.
>
> Does this make sense?

Any updates on how to handle exclude_guest on AMD?

Really need to get this finished. Kind of nasty to accidentally kill all
running VMs with a perf modifier.