Since commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis"),
a dummy event is added to capture mmaps.
But if we run perf-record as,
# perf record -e cycles:p -IXMM0 -a -- sleep 1
Error:
dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
The issue is, if we enable the extended regs (-IXMM0), but the
pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
will return -EOPNOTSUPP error.
See following code:
/* in kernel/events/core.c */
static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
{
....
if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
has_extended_regs(event))
ret = -EOPNOTSUPP;
....
}
For software dummy event, the PMU should not be set with
PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
event has possibility to be set with PERF_REG_EXTENDED_MASK bit.
In evsel__config, /* tools/perf/util/evsel.c */
if (opts->sample_intr_regs) {
attr->sample_regs_intr = opts->sample_intr_regs;
}
If we use -IXMM0, the attr>sample_regs_intr will be set with
PERF_REG_EXTENDED_MASK bit.
It doesn't make sense to set attr->sample_regs_intr for a
software dummy event.
This patch adds dummy event checking before setting
attr->sample_regs_intr and attr->sample_regs_user.
After:
# ./perf record -e cycles:p -IXMM0 -a -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
v2:
---
Rebase to perf/core
Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/evsel.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9aa51a65593d..11794d3b7879 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (callchain && callchain->enabled && !evsel->no_aux_samples)
evsel__config_callchain(evsel, opts, callchain);
- if (opts->sample_intr_regs && !evsel->no_aux_samples) {
+ if (opts->sample_intr_regs && !evsel->no_aux_samples &&
+ !evsel__is_dummy_event(evsel)) {
attr->sample_regs_intr = opts->sample_intr_regs;
evsel__set_sample_bit(evsel, REGS_INTR);
}
- if (opts->sample_user_regs && !evsel->no_aux_samples) {
+ if (opts->sample_user_regs && !evsel->no_aux_samples &&
+ !evsel__is_dummy_event(evsel)) {
attr->sample_regs_user |= opts->sample_user_regs;
evsel__set_sample_bit(evsel, REGS_USER);
}
--
2.17.1
On Mon, Jul 20, 2020 at 09:00:13AM +0800, Jin Yao wrote:
> Since commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis"),
> a dummy event is added to capture mmaps.
>
> But if we run perf-record as,
>
> # perf record -e cycles:p -IXMM0 -a -- sleep 1
> Error:
> dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
>
> The issue is, if we enable the extended regs (-IXMM0), but the
> pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
> will return -EOPNOTSUPP error.
>
> See following code:
>
> /* in kernel/events/core.c */
> static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
>
> {
> ....
> if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
> has_extended_regs(event))
> ret = -EOPNOTSUPP;
> ....
> }
>
> For software dummy event, the PMU should not be set with
> PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
> event has possibility to be set with PERF_REG_EXTENDED_MASK bit.
>
> In evsel__config, /* tools/perf/util/evsel.c */
>
> if (opts->sample_intr_regs) {
> attr->sample_regs_intr = opts->sample_intr_regs;
> }
>
> If we use -IXMM0, the attr>sample_regs_intr will be set with
> PERF_REG_EXTENDED_MASK bit.
>
> It doesn't make sense to set attr->sample_regs_intr for a
> software dummy event.
>
> This patch adds dummy event checking before setting
> attr->sample_regs_intr and attr->sample_regs_user.
>
> After:
> # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
>
> v2:
> ---
> Rebase to perf/core
>
> Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/evsel.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 9aa51a65593d..11794d3b7879 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
> if (callchain && callchain->enabled && !evsel->no_aux_samples)
> evsel__config_callchain(evsel, opts, callchain);
>
> - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
> + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
> + !evsel__is_dummy_event(evsel)) {
hum, I thought it'd look something like this:
if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel))
but I'm not sure how no_aux_samples flag works exactly.. so it might be
correct.. just making sure ;-)
cc-ing Adrian
jirka
> attr->sample_regs_intr = opts->sample_intr_regs;
> evsel__set_sample_bit(evsel, REGS_INTR);
> }
>
> - if (opts->sample_user_regs && !evsel->no_aux_samples) {
> + if (opts->sample_user_regs && !evsel->no_aux_samples &&
> + !evsel__is_dummy_event(evsel)) {
> attr->sample_regs_user |= opts->sample_user_regs;
> evsel__set_sample_bit(evsel, REGS_USER);
> }
> --
> 2.17.1
>
Hi Jiri,
On 7/20/2020 5:17 PM, Jiri Olsa wrote:
> On Mon, Jul 20, 2020 at 09:00:13AM +0800, Jin Yao wrote:
>> Since commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis"),
>> a dummy event is added to capture mmaps.
>>
>> But if we run perf-record as,
>>
>> # perf record -e cycles:p -IXMM0 -a -- sleep 1
>> Error:
>> dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
>>
>> The issue is, if we enable the extended regs (-IXMM0), but the
>> pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
>> will return -EOPNOTSUPP error.
>>
>> See following code:
>>
>> /* in kernel/events/core.c */
>> static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
>>
>> {
>> ....
>> if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
>> has_extended_regs(event))
>> ret = -EOPNOTSUPP;
>> ....
>> }
>>
>> For software dummy event, the PMU should not be set with
>> PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
>> event has possibility to be set with PERF_REG_EXTENDED_MASK bit.
>>
>> In evsel__config, /* tools/perf/util/evsel.c */
>>
>> if (opts->sample_intr_regs) {
>> attr->sample_regs_intr = opts->sample_intr_regs;
>> }
>>
>> If we use -IXMM0, the attr>sample_regs_intr will be set with
>> PERF_REG_EXTENDED_MASK bit.
>>
>> It doesn't make sense to set attr->sample_regs_intr for a
>> software dummy event.
>>
>> This patch adds dummy event checking before setting
>> attr->sample_regs_intr and attr->sample_regs_user.
>>
>> After:
>> # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
>>
>> v2:
>> ---
>> Rebase to perf/core
>>
>> Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/evsel.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 9aa51a65593d..11794d3b7879 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
>> if (callchain && callchain->enabled && !evsel->no_aux_samples)
>> evsel__config_callchain(evsel, opts, callchain);
>>
>> - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
>> + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
>> + !evsel__is_dummy_event(evsel)) {
>
> hum, I thought it'd look something like this:
>
> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel))
>
> but I'm not sure how no_aux_samples flag works exactly.. so it might be
> correct.. just making sure ;-)
>
> cc-ing Adrian
>
> jirka
>
>
no_aux_samples is set to false by default and it's only set to true by pt, right?
So most of the time, !evsel->no_aux_samples is always true.
if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel)) {
attr->sample_regs_intr = opts->sample_intr_regs;
evsel__set_sample_bit(evsel, REGS_INTR);
}
So even if the evsel is dummy event, the condition check is true. :(
Or maybe I misunderstand anything?
Thanks
Jin Yao
>> attr->sample_regs_intr = opts->sample_intr_regs;
>> evsel__set_sample_bit(evsel, REGS_INTR);
>> }
>>
>> - if (opts->sample_user_regs && !evsel->no_aux_samples) {
>> + if (opts->sample_user_regs && !evsel->no_aux_samples &&
>> + !evsel__is_dummy_event(evsel)) {
>> attr->sample_regs_user |= opts->sample_user_regs;
>> evsel__set_sample_bit(evsel, REGS_USER);
>> }
>> --
>> 2.17.1
>>
>
On Wed, Jul 22, 2020 at 01:00:03PM +0800, Jin, Yao wrote:
SNIP
> > >
> > > If we use -IXMM0, the attr>sample_regs_intr will be set with
> > > PERF_REG_EXTENDED_MASK bit.
> > >
> > > It doesn't make sense to set attr->sample_regs_intr for a
> > > software dummy event.
> > >
> > > This patch adds dummy event checking before setting
> > > attr->sample_regs_intr and attr->sample_regs_user.
> > >
> > > After:
> > > # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
> > >
> > > v2:
> > > ---
> > > Rebase to perf/core
> > >
> > > Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
> > > Signed-off-by: Jin Yao <[email protected]>
> > > ---
> > > tools/perf/util/evsel.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > > index 9aa51a65593d..11794d3b7879 100644
> > > --- a/tools/perf/util/evsel.c
> > > +++ b/tools/perf/util/evsel.c
> > > @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
> > > if (callchain && callchain->enabled && !evsel->no_aux_samples)
> > > evsel__config_callchain(evsel, opts, callchain);
> > > - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
> > > + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
> > > + !evsel__is_dummy_event(evsel)) {
> >
> > hum, I thought it'd look something like this:
> >
> > if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel))
> >
> > but I'm not sure how no_aux_samples flag works exactly.. so it might be
> > correct.. just making sure ;-)
> >
> > cc-ing Adrian
> >
> > jirka
> >
> >
>
> no_aux_samples is set to false by default and it's only set to true by pt, right?
>
> So most of the time, !evsel->no_aux_samples is always true.
>
> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel)) {
> attr->sample_regs_intr = opts->sample_intr_regs;
> evsel__set_sample_bit(evsel, REGS_INTR);
> }
>
> So even if the evsel is dummy event, the condition check is true. :(
>
> Or maybe I misunderstand anything?
I was just curious, because I did not follow the no_aux_samples
usage in detail.. so how about a case where:
evsel->no_aux_samples == true and evsel__is_dummy_event(evsel) = false
then the original condition will be false for non dummy event
(opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel))
is that ok?
jirka
Hi Jiri, Adrian,
On 7/22/2020 7:08 PM, Jiri Olsa wrote:
> On Wed, Jul 22, 2020 at 01:00:03PM +0800, Jin, Yao wrote:
>
> SNIP
>
>>>>
>>>> If we use -IXMM0, the attr>sample_regs_intr will be set with
>>>> PERF_REG_EXTENDED_MASK bit.
>>>>
>>>> It doesn't make sense to set attr->sample_regs_intr for a
>>>> software dummy event.
>>>>
>>>> This patch adds dummy event checking before setting
>>>> attr->sample_regs_intr and attr->sample_regs_user.
>>>>
>>>> After:
>>>> # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
>>>> [ perf record: Woken up 1 times to write data ]
>>>> [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
>>>>
>>>> v2:
>>>> ---
>>>> Rebase to perf/core
>>>>
>>>> Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
>>>> Signed-off-by: Jin Yao <[email protected]>
>>>> ---
>>>> tools/perf/util/evsel.c | 6 ++++--
>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>>> index 9aa51a65593d..11794d3b7879 100644
>>>> --- a/tools/perf/util/evsel.c
>>>> +++ b/tools/perf/util/evsel.c
>>>> @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
>>>> if (callchain && callchain->enabled && !evsel->no_aux_samples)
>>>> evsel__config_callchain(evsel, opts, callchain);
>>>> - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
>>>> + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
>>>> + !evsel__is_dummy_event(evsel)) {
>>>
>>> hum, I thought it'd look something like this:
>>>
>>> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel))
>>>
>>> but I'm not sure how no_aux_samples flag works exactly.. so it might be
>>> correct.. just making sure ;-)
>>>
>>> cc-ing Adrian
>>>
>>> jirka
>>>
>>>
>>
>> no_aux_samples is set to false by default and it's only set to true by pt, right?
>>
>> So most of the time, !evsel->no_aux_samples is always true.
>>
>> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel)) {
>> attr->sample_regs_intr = opts->sample_intr_regs;
>> evsel__set_sample_bit(evsel, REGS_INTR);
>> }
>>
>> So even if the evsel is dummy event, the condition check is true. :(
>>
>> Or maybe I misunderstand anything?
>
> I was just curious, because I did not follow the no_aux_samples
> usage in detail.. so how about a case where:
>
> evsel->no_aux_samples == true and evsel__is_dummy_event(evsel) = false
>
> then the original condition will be false for non dummy event
>
> (opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel))
>
> is that ok?
>
I searched the perf source and found the no_aux_samples was only set to true in intel-pt.c. So I
assume for the non-pt usage, the no_aux_samples is always false.
For non-pt usage,
(opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is equal to
(opts->sample_intr_regs && !evsel__is_dummy_event(evsel))
For pt usage, we need to consider the case that evsel__is_dummy_event(evsel) is true or false.
If evsel__is_dummy_event(evsel) is true:
(opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is false.
It's expected.
If evsel__is_dummy_event(evsel) is false:
(opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is equal to
(opts->sample_intr_regs && !evsel->no_aux_samples)
That's the current code logic.
So I think the condition "(opts->sample_intr_regs && !evsel->no_aux_samples &&
!evsel__is_dummy_event(evsel))" looks reasonable.
Adrian, please correct me if I'm wrong here.
Thanks
Jin Yao
> jirka
>
Hi Adrian,
Could you help to check if following condition will break PT?
"(opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel))"
Thanks
Jin Yao
On 7/23/2020 9:01 AM, Jin, Yao wrote:
> Hi Jiri, Adrian,
>
> On 7/22/2020 7:08 PM, Jiri Olsa wrote:
>> On Wed, Jul 22, 2020 at 01:00:03PM +0800, Jin, Yao wrote:
>>
>> SNIP
>>
>>>>>
>>>>> If we use -IXMM0, the attr>sample_regs_intr will be set with
>>>>> PERF_REG_EXTENDED_MASK bit.
>>>>>
>>>>> It doesn't make sense to set attr->sample_regs_intr for a
>>>>> software dummy event.
>>>>>
>>>>> This patch adds dummy event checking before setting
>>>>> attr->sample_regs_intr and attr->sample_regs_user.
>>>>>
>>>>> After:
>>>>> # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
>>>>> [ perf record: Woken up 1 times to write data ]
>>>>> [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
>>>>>
>>>>> v2:
>>>>> ---
>>>>> Rebase to perf/core
>>>>>
>>>>> Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
>>>>> Signed-off-by: Jin Yao <[email protected]>
>>>>> ---
>>>>> tools/perf/util/evsel.c | 6 ++++--
>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>>>> index 9aa51a65593d..11794d3b7879 100644
>>>>> --- a/tools/perf/util/evsel.c
>>>>> +++ b/tools/perf/util/evsel.c
>>>>> @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
>>>>> if (callchain && callchain->enabled && !evsel->no_aux_samples)
>>>>> evsel__config_callchain(evsel, opts, callchain);
>>>>> - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
>>>>> + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
>>>>> + !evsel__is_dummy_event(evsel)) {
>>>>
>>>> hum, I thought it'd look something like this:
>>>>
>>>> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel))
>>>>
>>>> but I'm not sure how no_aux_samples flag works exactly.. so it might be
>>>> correct.. just making sure ;-)
>>>>
>>>> cc-ing Adrian
>>>>
>>>> jirka
>>>>
>>>>
>>>
>>> no_aux_samples is set to false by default and it's only set to true by pt, right?
>>>
>>> So most of the time, !evsel->no_aux_samples is always true.
>>>
>>> if (opts->sample_intr_regs && (!evsel->no_aux_samples || !evsel__is_dummy_event(evsel)) {
>>> attr->sample_regs_intr = opts->sample_intr_regs;
>>> evsel__set_sample_bit(evsel, REGS_INTR);
>>> }
>>>
>>> So even if the evsel is dummy event, the condition check is true. :(
>>>
>>> Or maybe I misunderstand anything?
>>
>> I was just curious, because I did not follow the no_aux_samples
>> usage in detail.. so how about a case where:
>>
>> evsel->no_aux_samples == true and evsel__is_dummy_event(evsel) = false
>>
>> then the original condition will be false for non dummy event
>>
>> (opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel))
>>
>> is that ok?
>>
>
> I searched the perf source and found the no_aux_samples was only set to true in intel-pt.c. So I
> assume for the non-pt usage, the no_aux_samples is always false.
>
> For non-pt usage,
> (opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is equal to
> (opts->sample_intr_regs && !evsel__is_dummy_event(evsel))
>
> For pt usage, we need to consider the case that evsel__is_dummy_event(evsel) is true or false.
>
> If evsel__is_dummy_event(evsel) is true:
> (opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is false.
> It's expected.
>
> If evsel__is_dummy_event(evsel) is false:
> (opts->sample_intr_regs && !evsel->no_aux_samples && !evsel__is_dummy_event(evsel)) is equal to
> (opts->sample_intr_regs && !evsel->no_aux_samples)
> That's the current code logic.
>
> So I think the condition "(opts->sample_intr_regs && !evsel->no_aux_samples &&
> !evsel__is_dummy_event(evsel))" looks reasonable.
>
> Adrian, please correct me if I'm wrong here.
>
> Thanks
> Jin Yao
>
>> jirka
>>
On 29/07/20 10:23 am, Jin, Yao wrote:
> Hi Adrian,
>
> Could you help to check if following condition will break PT?
>
> "(opts->sample_intr_regs && !evsel->no_aux_samples &&
> !evsel__is_dummy_event(evsel))"
Sorry for slow response - I've been away.
This is fine. It will not break PT.
no_aux_samples is useful for evsels that have been added by the code rather
than requested by the user. For old kernels PT adds sched_switch tracepoint
to track context switches (before the current context switch event was
added) and having auxiliary sample information unnecessarily uses up space
in the perf buffer.
Acked-by: Adrian Hunter <[email protected]>
>
> Thanks
> Jin Yao
>
> On 7/23/2020 9:01 AM, Jin, Yao wrote:
>> Hi Jiri, Adrian,
>>
>> On 7/22/2020 7:08 PM, Jiri Olsa wrote:
>>> On Wed, Jul 22, 2020 at 01:00:03PM +0800, Jin, Yao wrote:
>>>
>>> SNIP
>>>
>>>>>>
>>>>>> If we use -IXMM0, the attr>sample_regs_intr will be set with
>>>>>> PERF_REG_EXTENDED_MASK bit.
>>>>>>
>>>>>> It doesn't make sense to set attr->sample_regs_intr for a
>>>>>> software dummy event.
>>>>>>
>>>>>> This patch adds dummy event checking before setting
>>>>>> attr->sample_regs_intr and attr->sample_regs_user.
>>>>>>
>>>>>> After:
>>>>>> # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
>>>>>> [ perf record: Woken up 1 times to write data ]
>>>>>> [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
>>>>>>
>>>>>> v2:
>>>>>> ---
>>>>>> Rebase to perf/core
>>>>>>
>>>>>> Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide
>>>>>> synthesis")
>>>>>> Signed-off-by: Jin Yao <[email protected]>
>>>>>> ---
>>>>>> tools/perf/util/evsel.c | 6 ++++--
>>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>>>>> index 9aa51a65593d..11794d3b7879 100644
>>>>>> --- a/tools/perf/util/evsel.c
>>>>>> +++ b/tools/perf/util/evsel.c
>>>>>> @@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct
>>>>>> record_opts *opts,
>>>>>> if (callchain && callchain->enabled && !evsel->no_aux_samples)
>>>>>> evsel__config_callchain(evsel, opts, callchain);
>>>>>> - if (opts->sample_intr_regs && !evsel->no_aux_samples) {
>>>>>> + if (opts->sample_intr_regs && !evsel->no_aux_samples &&
>>>>>> + !evsel__is_dummy_event(evsel)) {
>>>>>
>>>>> hum, I thought it'd look something like this:
>>>>>
>>>>> if (opts->sample_intr_regs && (!evsel->no_aux_samples ||
>>>>> !evsel__is_dummy_event(evsel))
>>>>>
>>>>> but I'm not sure how no_aux_samples flag works exactly.. so it might be
>>>>> correct.. just making sure ;-)
>>>>>
>>>>> cc-ing Adrian
>>>>>
>>>>> jirka
>>>>>
>>>>>
>>>>
>>>> no_aux_samples is set to false by default and it's only set to true by
>>>> pt, right?
>>>>
>>>> So most of the time, !evsel->no_aux_samples is always true.
>>>>
>>>> if (opts->sample_intr_regs && (!evsel->no_aux_samples ||
>>>> !evsel__is_dummy_event(evsel)) {
>>>> attr->sample_regs_intr = opts->sample_intr_regs;
>>>> evsel__set_sample_bit(evsel, REGS_INTR);
>>>> }
>>>>
>>>> So even if the evsel is dummy event, the condition check is true. :(
>>>>
>>>> Or maybe I misunderstand anything?
>>>
>>> I was just curious, because I did not follow the no_aux_samples
>>> usage in detail.. so how about a case where:
>>>
>>> evsel->no_aux_samples == true and evsel__is_dummy_event(evsel) = false
>>>
>>> then the original condition will be false for non dummy event
>>>
>>> (opts->sample_intr_regs && !evsel->no_aux_samples &&
>>> !evsel__is_dummy_event(evsel))
>>>
>>> is that ok?
>>>
>>
>> I searched the perf source and found the no_aux_samples was only set to
>> true in intel-pt.c. So I assume for the non-pt usage, the no_aux_samples
>> is always false.
>>
>> For non-pt usage,
>> (opts->sample_intr_regs && !evsel->no_aux_samples &&
>> !evsel__is_dummy_event(evsel)) is equal to
>> (opts->sample_intr_regs && !evsel__is_dummy_event(evsel))
>>
>> For pt usage, we need to consider the case that
>> evsel__is_dummy_event(evsel) is true or false.
>>
>> If evsel__is_dummy_event(evsel) is true:
>> (opts->sample_intr_regs && !evsel->no_aux_samples &&
>> !evsel__is_dummy_event(evsel)) is false.
>> It's expected.
>>
>> If evsel__is_dummy_event(evsel) is false:
>> (opts->sample_intr_regs && !evsel->no_aux_samples &&
>> !evsel__is_dummy_event(evsel)) is equal to
>> (opts->sample_intr_regs && !evsel->no_aux_samples)
>> That's the current code logic.
>>
>> So I think the condition "(opts->sample_intr_regs &&
>> !evsel->no_aux_samples && !evsel__is_dummy_event(evsel))" looks reasonable.
>>
>> Adrian, please correct me if I'm wrong here.
>>
>> Thanks
>> Jin Yao
>>
>>> jirka
>>>
Em Tue, Aug 04, 2020 at 10:06:56AM +0300, Adrian Hunter escreveu:
> On 29/07/20 10:23 am, Jin, Yao wrote:
> > Hi Adrian,
> >
> > Could you help to check if following condition will break PT?
> >
> > "(opts->sample_intr_regs && !evsel->no_aux_samples &&
> > !evsel__is_dummy_event(evsel))"
>
> Sorry for slow response - I've been away.
>
> This is fine. It will not break PT.
>
> no_aux_samples is useful for evsels that have been added by the code rather
> than requested by the user. For old kernels PT adds sched_switch tracepoint
> to track context switches (before the current context switch event was
> added) and having auxiliary sample information unnecessarily uses up space
> in the perf buffer.
>
> Acked-by: Adrian Hunter <[email protected]>
Thanks for checking and providing the comment, that I added as a
committer note together with your Acked-by, appreciated.
- Arnaldo