2022-05-09 04:28:03

by Leo Yan

Subject: Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu

Hi Adrian,

On Fri, May 06, 2022 at 03:25:38PM +0300, Adrian Hunter wrote:
> Hi
>
> Here are V2 patches to support capturing Intel PT sideband events such as
> mmap, task, context switch, text poke etc, on every CPU even when tracing
> selected user_requested_cpus. That is, when using the perf record -C or
> --cpu option.
>
> This is needed for:
> 1. text poke: a text poke on any CPU affects all CPUs
> 2. tracing user space: a user space process can migrate between CPUs so
> mmap events that happen on a different CPU can be needed to decode a
> user_requested_cpus CPU.
>
> For example:
>
> Trace on CPU 1:
>
> perf record --kcore -C 1 -e intel_pt// &
>
> Start a task on CPU 0:
>
> taskset 0x1 testprog &
>
> Migrate it to CPU 1:
>
> taskset -p 0x2 <testprog pid>
>
> Stop tracing:
>
> kill %1
>
> Prior to these changes there will be errors decoding testprog
> in userspace because the comm and mmap events for testprog will not
> have been captured.

Thanks a lot for this patch set. I believe this is a common issue for
AUX trace in general (not only for Intel PT), so I tested the patch set
with both Arm CoreSight and SPE. Unfortunately, neither of them shows
MMAP events for the migrated task. I used the commands below:

# perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
# perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
0


# perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
# perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
0

I didn't dig into the details of this patch set, so I cannot say
whether the failure is caused by an issue in it. But we definitely need
to look into this on Arm platforms and find the root cause of why MMAP
events are not recorded properly when tasks migrate. I have looped in
James and German for this reason.
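
For repeated runs, the check above could be wrapped in a small helper.
This is only a sketch: the function names count_mmap_lines and
mmap_events_on_cpu are made up for illustration, and it assumes that
perf.data is in the current directory. The perf script options are the
same ones used in the commands above.

```shell
#!/bin/sh
# count_mmap_lines: hypothetical helper, counts the stdin lines that
# mention MMAP (same result as "grep MMAP | wc -l").
count_mmap_lines() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit status.
    grep -c MMAP || true
}

# mmap_events_on_cpu: hypothetical wrapper around the perf script
# invocation used above; $1 is the CPU number to filter on.
mmap_events_on_cpu() {
    perf script --no-itrace --show-mmap-events -C "$1" 2>/dev/null |
        count_mmap_lines
}
```

Calling "mmap_events_on_cpu 1" after each perf record run prints the
number of MMAP events seen for CPU 1; a count of 0, as in the runs
above, means none were recorded.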

Thanks,
Leo


2022-05-09 10:15:32

by Adrian Hunter

Subject: Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu

On 8/05/22 18:08, Leo Yan wrote:
> Hi Adrian,
>
> On Fri, May 06, 2022 at 03:25:38PM +0300, Adrian Hunter wrote:
>> Hi
>>
>> Here are V2 patches to support capturing Intel PT sideband events such as
>> mmap, task, context switch, text poke etc, on every CPU even when tracing
>> selected user_requested_cpus. That is, when using the perf record -C or
>> --cpu option.
>>
>> This is needed for:
>> 1. text poke: a text poke on any CPU affects all CPUs
>> 2. tracing user space: a user space process can migrate between CPUs so
>> mmap events that happen on a different CPU can be needed to decode a
>> user_requested_cpus CPU.
>>
>> For example:
>>
>> Trace on CPU 1:
>>
>> perf record --kcore -C 1 -e intel_pt// &
>>
>> Start a task on CPU 0:
>>
>> taskset 0x1 testprog &
>>
>> Migrate it to CPU 1:
>>
>> taskset -p 0x2 <testprog pid>
>>
>> Stop tracing:
>>
>> kill %1
>>
>> Prior to these changes there will be errors decoding testprog
>> in userspace because the comm and mmap events for testprog will not
>> have been captured.
>
> Thanks a lot for this patch set, I believe this is a common issue for
> AUX trace (not only for Intel-PT), so I verified this patch set for both
> Arm CoreSight and SPE; unfortunately both cannot see MMAP events for
> migrated task. I used below commands:
>
> # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
> # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> 0
>
>
> # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
> # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> 0
>
> I didn't dive into details for this patch set, so I cannot say the
> failure is caused by any issue in this patch set. But it's definitely
> we need to look into for Arm platforms to root cause what's the reason
> it cannot record MMAP events properly when migrate tasks. Loop James
> and German for this reason.

You would need the equivalent of patch "perf intel-pt: Track sideband
system-wide when needed", which makes use of the new helper
evlist__add_aux_dummy() to set up the dummy event with the option to
make it "system wide".

cs_etm_recording_options() and arm_spe_recording_options() have similar
code.

You will need to decide if it is worth the extra sideband. I decided
that, if it became an issue, it could be made optional in the future.

2022-05-09 10:43:19

by Leo Yan

Subject: Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu

On Mon, May 09, 2022 at 08:44:02AM +0300, Adrian Hunter wrote:

[...]

> > Thanks a lot for this patch set, I believe this is a common issue for
> > AUX trace (not only for Intel-PT), so I verified this patch set for both
> > Arm CoreSight and SPE; unfortunately both cannot see MMAP events for
> > migrated task. I used below commands:
> >
> > # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
> > # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> > 0
> >
> >
> > # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
> > # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> > 0
> >
> > I didn't dive into details for this patch set, so I cannot say the
> > failure is caused by any issue in this patch set. But it's definitely
> > we need to look into for Arm platforms to root cause what's the reason
> > it cannot record MMAP events properly when migrate tasks. Loop James
> > and German for this reason.
>
> You would need the equivalent of patch "perf intel-pt: Track sideband
> system-wide when needed" which makes use of new helper
> evlist__add_aux_dummy() to set up the dummy event with the option to
> make it "system wide".
>
> cs_etm_recording_options() and arm_spe_recording_options() have similar
> code.

Thanks a lot for the guidance.

I applied a similar change to cs_etm_recording_options() and
arm_spe_recording_options(), and both now pass the tests below:

# perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
# perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
4

# perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
# perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
4

I also tested a more complex case that migrates a test program,
'sysbench', in the middle of a perf session. It still fails to parse any
samples for 'sysbench'. I need to do more homework on this part, but any
suggestions are welcome, thanks! The test script is:

---8<---

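# Use the locally built perf (path is specific to this test setup).
export PATH=/mnt/export/arm-linux-kernel/tools/perf/:$PATH

# Trace CPU 1 only, in the background.
perf record --kcore -C 1 -e cs_etm// &
PERF_PID=$!
echo "Perf PID ${PERF_PID}"

sleep 2

# Start the test program pinned to CPU 0 (mask 0x1).
taskset 0x1 ./sysbench --test=memory --max-requests=1000000000 run &
TEST_PROG_PID=$!
echo "Test Prog PID ${TEST_PROG_PID}"

sleep 1

# Migrate the test program to CPU 1 (mask 0x2), the traced CPU.
taskset -p 0x2 "$TEST_PROG_PID"

sleep 1

# Stop tracing.
kill "$PERF_PID"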
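
For reference, the taskset arguments in the script are plain CPU
bitmasks: bit N selects CPU N, so 0x1 is CPU 0 and 0x2 is CPU 1. A
minimal sketch of that mapping, where cpu_to_mask is a made-up helper
name and not part of the script:

```shell
#!/bin/sh
# cpu_to_mask: hypothetical helper, prints the taskset hex mask that
# selects a single CPU. Bit N of the mask corresponds to CPU N.
cpu_to_mask() {
    printf '0x%x\n' "$((1 << $1))"
}

# Example: print the masks for the two CPUs used in the script.
cpu_to_mask 0
cpu_to_mask 1
```

With it, the pinning step could be written as
taskset "$(cpu_to_mask 0)" ./sysbench ... and the migration step as
taskset -p "$(cpu_to_mask 1)" "$TEST_PROG_PID".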

> You will need to decide if it is worth the extra sideband. I decided
> if it became an issue, it could be made optional in the future.

Yeah, the condition check for system-wide tracking in patch 16/23
looks good to me.

Thanks,
Leo