2020-02-13 09:43:59

by Leo Yan

Subject: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

This patch series addresses issues with synthesizing instruction
samples. In particular, when the instruction sample period is small,
the current logic cannot synthesize multiple instruction samples within
one instruction range packet.

Patch 0001 swaps packets for instruction samples, so that the option
'--itrace=iNNN' works correctly.

Patch 0002 avoids resetting the last branches for every instruction
sample; if the last branches are reset each time a sample is generated,
later samples in the same range packet can no longer use them.

Patch 0003 fixes the handling of different instruction periods,
especially small sample periods.

Patch 0004 is an optimization for copying last branches; it only copies
last branches once if the instruction samples share the same last
branches.

Patch 0005 is a minor fix for unsigned variable comparison to zero.

This patch set has been rebased on the latest perf/core branch and
verified on a Juno board with the commands below:

# perf script --itrace=i2
# perf script --itrace=i2il16
# perf inject --itrace=i2il16 -i perf.data -o perf.data.new
# perf inject --itrace=i100il16 -i perf.data -o perf.data.new

Changes from v3:
* Refactored patch 0001 with new function cs_etm__packet_swap() (Mike);
* Refined the instruction sample generation flow into a single while loop,
which follows Mike's suggestions (Mike);
* Added Mike's review tags for patches 01/02/04/05.

Changes from v2:
* Added patch 0001, which fixes swapping packets for instruction
samples;
* Refined minor commit logs and comments;
* Rebased on the latest perf/core branch.

Changes from v1:
* Rebased patch set on perf/core branch with latest commit 9fec3cd5fa4a
("perf map: Check if the map still has some refcounts on exit").



Leo Yan (5):
perf cs-etm: Swap packets for instruction samples
perf cs-etm: Continuously record last branch
perf cs-etm: Correct synthesizing instruction samples
perf cs-etm: Optimize copying last branches
perf cs-etm: Fix unsigned variable comparison to zero

tools/perf/util/cs-etm.c | 157 +++++++++++++++++++++++++++------------
1 file changed, 111 insertions(+), 46 deletions(-)

--
2.17.1
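
For illustration only, the sampling flow described in this cover letter can
be sketched as standalone C. This is a minimal sketch under stated
assumptions, not the code in cs-etm.c: struct sample_state and
synth_samples() are hypothetical names, and where the real code would
synthesize a sample the sketch only records the in-packet index of the
sampled instruction. It shows how a single while loop can emit several
samples from one range packet when the period is small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-thread decoder state. */
struct sample_state {
	uint64_t period_instructions;	/* instructions counted since the last sample */
};

/*
 * Account for one range packet holding instrs_in_packet instructions and
 * record the 0-based in-packet index of every instruction to be sampled.
 * Returns the number of samples; at most max_out indexes are stored.
 */
static size_t synth_samples(struct sample_state *st, uint64_t instrs_in_packet,
			    uint64_t period, uint64_t *out, size_t max_out)
{
	uint64_t instrs_prev = st->period_instructions;
	uint64_t offset;
	size_t n = 0;

	assert(period > 0);
	assert(instrs_prev < period);	/* invariant kept by the loop below */

	st->period_instructions += instrs_in_packet;

	/* 1-based position in this packet of the first instruction to sample. */
	offset = period - instrs_prev;

	while (st->period_instructions >= period) {
		if (n < max_out)
			out[n] = offset - 1;
		n++;
		offset += period;
		st->period_instructions -= period;
	}

	return n;
}
```

With one instruction carried over from the previous packet, a packet of
five instructions and a period of two, the loop runs three times and
leaves no remainder for the next packet.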


2020-02-13 09:44:06

by Leo Yan

Subject: [PATCH v4 4/5] perf cs-etm: Optimize copying last branches

If an instruction range packet can generate multiple instruction
samples, these samples share the same last branches; it's not necessary
to copy the same last branches repeatedly for these samples within the
same packet.

This patch moves the copying of last branches out of function
cs_etm__synth_instruction_sample() and performs it before generating
instruction samples.

Signed-off-by: Leo Yan <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
---
tools/perf/util/cs-etm.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 4b7d6c36ce3c..aa4b6d060ebb 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1151,10 +1151,8 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,

 	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);

-	if (etm->synth_opts.last_branch) {
-		cs_etm__copy_last_branch_rb(etmq, tidq);
+	if (etm->synth_opts.last_branch)
 		sample.branch_stack = tidq->last_branch;
-	}

 	if (etm->synth_opts.inject) {
 		ret = cs_etm__inject_event(event, &sample,
@@ -1429,6 +1427,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		u64 offset = etm->instructions_sample_period - instrs_prev;
 		u64 addr;

+		/* Prepare last branches for instruction sample */
+		if (etm->synth_opts.last_branch)
+			cs_etm__copy_last_branch_rb(etmq, tidq);
+
 		while (tidq->period_instructions >=
 				etm->instructions_sample_period) {
 			/*
@@ -1506,6 +1508,11 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,

 	if (etmq->etm->synth_opts.last_branch &&
 	    tidq->prev_packet->sample_type == CS_ETM_RANGE) {
+		u64 addr;
+
+		/* Prepare last branches for instruction sample */
+		cs_etm__copy_last_branch_rb(etmq, tidq);
+
 		/*
 		 * Generate a last branch event for the branches left in the
 		 * circular buffer at the end of the trace.
@@ -1513,7 +1520,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
 		 * Use the address of the end of the last reported execution
 		 * range
 		 */
-		u64 addr = cs_etm__last_executed_instr(tidq->prev_packet);
+		addr = cs_etm__last_executed_instr(tidq->prev_packet);

 		err = cs_etm__synth_instruction_sample(
 			etmq, tidq, addr,
@@ -1558,11 +1565,16 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq,
 	 */
 	if (etmq->etm->synth_opts.last_branch &&
 	    tidq->prev_packet->sample_type == CS_ETM_RANGE) {
+		u64 addr;
+
+		/* Prepare last branches for instruction sample */
+		cs_etm__copy_last_branch_rb(etmq, tidq);
+
 		/*
 		 * Use the address of the end of the last reported execution
 		 * range.
 		 */
-		u64 addr = cs_etm__last_executed_instr(tidq->prev_packet);
+		addr = cs_etm__last_executed_instr(tidq->prev_packet);

 		err = cs_etm__synth_instruction_sample(
 			etmq, tidq, addr,
--
2.17.1
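
As a rough illustration of the optimization in this patch, the standalone
sketch below takes one linear snapshot of a circular branch buffer per
packet and lets every sample in that packet point at the same snapshot.
It is a sketch under assumptions, not the cs-etm.c code: struct queue,
copy_last_branch_rb() and emit_samples() are hypothetical, and the buffer
layout (rb_pos is the newest entry, older entries follow at increasing
indexes with wrap-around) is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BRANCHES 4

/* Hypothetical, simplified state; not the real per-thread queue structure. */
struct queue {
	int last_branch_rb[NR_BRANCHES];	/* circular buffer filled during decode */
	size_t rb_pos;				/* assumed: index of the newest entry */
	int last_branch[NR_BRANCHES];		/* linear snapshot attached to samples */
	unsigned int copies;			/* number of snapshots taken */
};

/* Linearize the circular buffer into the snapshot, newest entry first. */
static void copy_last_branch_rb(struct queue *q)
{
	assert(q != NULL);

	for (size_t i = 0; i < NR_BRANCHES; i++)
		q->last_branch[i] =
			q->last_branch_rb[(q->rb_pos + i) % NR_BRANCHES];
	q->copies++;
}

/*
 * Emit nr_samples samples for one packet. The snapshot is taken once,
 * before the loop; each sample only stores a pointer to it.
 * Returns the branch stack of the last sample, or NULL if none was emitted.
 */
static const int *emit_samples(struct queue *q, unsigned int nr_samples)
{
	const int *branch_stack = NULL;

	if (!nr_samples)
		return NULL;

	copy_last_branch_rb(q);

	for (unsigned int i = 0; i < nr_samples; i++)
		branch_stack = q->last_branch;

	return branch_stack;
}
```

The copy cost is paid once per packet rather than once per sample, which
matters most when a small period produces many samples per packet.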

2020-02-13 09:45:11

by Leo Yan

Subject: [PATCH v4 5/5] perf cs-etm: Fix unsigned variable comparison to zero

The variable 'offset' in function cs_etm__instr_addr() is of type u64, so
checking it with 'while (offset > 0)' is not appropriate: an unsigned
value can never be negative. This patch changes the check to
'while (offset)'.

Signed-off-by: Leo Yan <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
---
tools/perf/util/cs-etm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index aa4b6d060ebb..bba969d48076 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -962,7 +962,7 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
 	if (packet->isa == CS_ETM_ISA_T32) {
 		u64 addr = packet->start_addr;

-		while (offset > 0) {
+		while (offset) {
 			addr += cs_etm__t32_instr_size(etmq,
 						       trace_chan_id, addr);
 			offset--;
--
2.17.1
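
To make the point of this patch concrete, here is a small standalone
sketch of the same loop shape. It is only an illustration: instr_addr()
and the sizes[] array are hypothetical stand-ins for the real address walk
and size lookup, with each T32 instruction assumed to be 2 or 4 bytes.
Because offset is unsigned, 'offset > 0' can only ever mean
'offset != 0', so the shorter test expresses the same condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the T32 address walk: sizes[i] is the size in
 * bytes (2 or 4) of the i-th instruction starting at address start.
 * Returns the address of the instruction that is offset instructions in.
 */
static uint64_t instr_addr(uint64_t start, const uint8_t *sizes, uint64_t offset)
{
	uint64_t addr = start;

	assert(sizes != NULL || offset == 0);

	/* offset is unsigned: the loop ends exactly when it reaches zero. */
	while (offset) {
		addr += *sizes++;
		offset--;
	}

	return addr;
}
```

The loop body runs exactly offset times, so an offset of zero returns the
start address unchanged.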

2020-02-15 03:25:52

by Leo Yan

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

On Thu, Feb 13, 2020 at 05:41:59PM +0800, Leo Yan wrote:
> This patch series is to address issues for synthesizing instruction
> samples, especially when the instruction sample period is small enough,
> the current logic cannot synthesize multiple instruction samples within
> one instruction range packet.

Thanks a lot for Mike's review.

Hi Mathieu/Suzuki, I'd like to get your green light before we ask
Arnaldo to help merge this patch set. Thanks!

Leo

2020-02-17 15:31:42

by Mathieu Poirier

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

On Fri, 14 Feb 2020 at 20:23, Leo Yan <[email protected]> wrote:
>
> On Thu, Feb 13, 2020 at 05:41:59PM +0800, Leo Yan wrote:
> > This patch series is to address issues for synthesizing instruction
> > samples, especially when the instruction sample period is small enough,
> > the current logic cannot synthesize multiple instruction samples within
> > one instruction range packet.
>
> Thanks a lot for Mike's review.
>
> Hi Mathieu/Suzuki, I'd like get your green light before we can ask
> Arnaldo to help merge this patch set. Thanks!

At the very least, please wait 10 days before pinging maintainers
about patch reviews. I have never failed to review coresight patches
and this time is no different.

>
> Leo

2020-02-17 15:45:07

by Leo Yan

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

Hi Mathieu,

On Mon, Feb 17, 2020 at 08:30:37AM -0700, Mathieu Poirier wrote:
> On Fri, 14 Feb 2020 at 20:23, Leo Yan <[email protected]> wrote:
> >
> > On Thu, Feb 13, 2020 at 05:41:59PM +0800, Leo Yan wrote:
> > > This patch series is to address issues for synthesizing instruction
> > > samples, especially when the instruction sample period is small enough,
> > > the current logic cannot synthesize multiple instruction samples within
> > > one instruction range packet.
> >
> > Thanks a lot for Mike's review.
> >
> > Hi Mathieu/Suzuki, I'd like get your green light before we can ask
> > Arnaldo to help merge this patch set. Thanks!
>
> At the very least, please wait 10 days before pinging maintainers
> about patch reviews. I have never failed to review coresight patches
> and this time is no different.

Understood, and sorry for pushing in a hurry. Take your time reviewing;
there is no rush :)

Thanks,
Leo

2020-02-18 18:50:08

by Mathieu Poirier

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

On Thu, Feb 13, 2020 at 05:41:59PM +0800, Leo Yan wrote:
> [...]
>
> Leo Yan (5):
> perf cs-etm: Swap packets for instruction samples
> perf cs-etm: Continuously record last branch
> perf cs-etm: Correct synthesizing instruction samples
> perf cs-etm: Optimize copying last branches
> perf cs-etm: Fix unsigned variable comparison to zero

For all the patches in this set:

Reviewed-by: Mathieu Poirier <[email protected]>

Unless Arnaldo says otherwise, I suggest you send a new V5 with Mike's RB for
patch 3/5 and mine for all of them. That way he doesn't have to edit the
patches when applying them.

Thanks,
Mathieu

2020-02-18 19:30:43

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

Em Tue, Feb 18, 2020 at 11:49:34AM -0700, Mathieu Poirier escreveu:
> On Thu, Feb 13, 2020 at 05:41:59PM +0800, Leo Yan wrote:
> > [...]
> >
> > Leo Yan (5):
> > perf cs-etm: Swap packets for instruction samples
> > perf cs-etm: Continuously record last branch
> > perf cs-etm: Correct synthesizing instruction samples
> > perf cs-etm: Optimize copying last branches
> > perf cs-etm: Fix unsigned variable comparison to zero
>
> For all the patches in this set:
>
> Reviewed-by: Mathieu Poirier <[email protected]>
>
> Unless Arnaldo says otherwise, I suggest you send a new V5 with Mike's RB for
> patch 3/5 and mine for all of them. That way he doesn't have to edit the
> patches when applying them.

Yeah, that would make things easier for me, always appreciated.

- Arnaldo

2020-02-19 01:34:12

by Leo Yan

Subject: Re: [PATCH v4 0/5] perf cs-etm: Fix synthesizing instruction samples

Hi Mathieu, Arnaldo,

On Tue, Feb 18, 2020 at 04:30:11PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > > Leo Yan (5):
> > > perf cs-etm: Swap packets for instruction samples
> > > perf cs-etm: Continuously record last branch
> > > perf cs-etm: Correct synthesizing instruction samples
> > > perf cs-etm: Optimize copying last branches
> > > perf cs-etm: Fix unsigned variable comparison to zero
> >
> > For all the patches in this set:
> >
> > Reviewed-by: Mathieu Poirier <[email protected]>
> >
> > Unless Arnaldo says otherwise, I suggest you send a new V5 with Mike's RB for
> > patch 3/5 and mine for all of them. That way he doesn't have to edit the
> > patches when applying them.

Thanks for the review and suggestions.

> Yeah, that would make things easier for me, always appreciated.

Sure, will send out patch set V5.

Thanks,
Leo