2018-11-11 05:01:12

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 0/5] perf cs-etm: Correct packets handling

perf cs-etm module converts decoder elements to packets and then we have
more context crossing packets to generate synthenize samples, finally
perf tool can faciliate samples for statistics and report the results.

This patch series is to address several issues found related with
packets handling and samples generation when worked firstly on branch
sample flags support for Arm CoreSight trace data, so this patch series
also is dependency for another patch series for sample flags.

The first two patches are mainly to fix issues in cs_etm__flush():
Patch 0001 corrects packets swapping in cs_etm__flush() and this can fix
the wrong branch sample caused by the missed packets swapping; patch
0002 is to fix the wrong samples generation with stale packets at the
end of every trace buffer.

Patch 0003 is used to support NO_SYNC packet, otherwise the trace
decoding cannot reflect the tracing discontinuity caused by NO_SYNC
packet.

Patch 0004/0005 has been published in the patch series 'perf cs-etm: Add
support for sample flags' before but at this time I move them into this
patch series due these two patches are more relative with packets
handling. Patch 0004 is used to generate branch sample for exception
packets; and patch 0005 is to track the exception number.

This patch series is applied on the acme's perf core branch [1] with the
latest commit f1d23afaf677 ("perf bpf: Reduce the hardcoded .max_entries
for pid_maps") and has one prerequisite from Rob's patch 'perf: Support
for Arm A32/T32 instruction sets in CoreSight trace' [2].

With applying the dependency patch, this patch series has been tested
for branch samples dumping with below command on Juno board:

# perf script -F,-time,+ip,+sym,+dso,+addr,+symoff -k vmlinux

[1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/core
[2] http://archive.armlinux.org.uk/lurker/message/20181109.091126.9d69489d.en.html


Leo Yan (5):
perf cs-etm: Correct packets swapping in cs_etm__flush()
perf cs-etm: Avoid stale branch samples when flush packet
perf cs-etm: Support for NO_SYNC packet
perf cs-etm: Generate branch sample for exception packet
perf cs-etm: Track exception number

tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 91 ++++++++++++++++++++++---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 11 +--
tools/perf/util/cs-etm.c | 65 +++++++++++++++---
3 files changed, 146 insertions(+), 21 deletions(-)

--
2.7.4



2018-11-11 05:01:12

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 1/5] perf cs-etm: Correct packets swapping in cs_etm__flush()

The structure cs_etm_queue uses 'prev_packet' to point to previous
packet, this can be used to combine with new coming packet to generate
samples.

In function cs_etm__flush() it swaps packets only when the flag
'etm->synth_opts.last_branch' is true, this means that it will not
swap packets if without option '--itrace=il' to generate last branch
entries; thus for this case the 'prev_packet' doesn't point to the
correct previous packet and the stale packet still will be used to
generate sequential sample. Thus if dump trace with 'perf script'
command we can see the incorrect flow with the stale packet's address
info.

This patch corrects packets swapping in cs_etm__flush(); except using
the flag 'etm->synth_opts.last_branch' it also checks the another flag
'etm->sample_branches', if any flag is true then it swaps packets so
can save correct content to 'prev_packet'. Finally this can fix the
wrong program flow dumping issue.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/cs-etm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 48ad217..fe18d7b 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -997,7 +997,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
}

swap_packet:
- if (etmq->etm->synth_opts.last_branch) {
+ if (etm->sample_branches || etmq->etm->synth_opts.last_branch) {
/*
* Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
* the next incoming packet.
--
2.7.4


2018-11-11 05:01:12

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 2/5] perf cs-etm: Avoid stale branch samples when flush packet

At the end of trace buffer handling, function cs_etm__flush() is invoked
to flush any remaining branch stack entries. As a side effect, it also
generates branch sample, because the 'etmq->packet' doesn't contains any
new coming packet but point to one stale packet after packets swapping,
so it wrongly makes synthesize branch samples with stale packet info.

We could review below detailed flow which causes issue:

Packet1: start_addr=0xffff000008b1fbf0 end_addr=0xffff000008b1fbfc
Packet2: start_addr=0xffff000008b1fb5c end_addr=0xffff000008b1fb6c

step 1: cs_etm__sample():
sample: ip=(0xffff000008b1fbfc-4) addr=0xffff000008b1fb5c

step 2: flush packet in cs_etm__run_decoder():
cs_etm__run_decoder()
`-> err = cs_etm__flush(etmq, false);
sample: ip=(0xffff000008b1fb6c-4) addr=0xffff000008b1fbf0

Packet1 and packet2 are two continuous packets, when packet2 is the new
coming packet, cs_etm__sample() generates branch sample for these two
packets and use [packet1::end_addr - 4 => packet2::start_addr] as branch
jump flow, thus we can see the first generated branch sample in step 1.
At the end of cs_etm__sample() it swaps packets so 'etm->prev_packet'=
packet2 and 'etm->packet'=packet1, so far it's okay for branch sample.

If packet2 is the last one packet in trace buffer, even there have no
any new coming packet, cs_etm__run_decoder() invokes cs_etm__flush() to
flush branch stack entries as expected, but it also generates branch
samples by taking 'etm->packet' as a new coming packet, thus the branch
jump flow is as [packet2::end_addr - 4 => packet1::start_addr]; this
is the second sample which is generated in step 2. So actually the
second sample is a stale sample and we should not generate it.

This patch is to add new argument 'new_packet' for cs_etm__flush(), we
can pass 'true' for this argument if there have a new packet, otherwise
it will pass 'false' for the purpose of only flushing branch stack
entries and avoid to generate sample for stale packet.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/cs-etm.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index fe18d7b..f4fa877 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -955,7 +955,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
return 0;
}

-static int cs_etm__flush(struct cs_etm_queue *etmq)
+static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet)
{
int err = 0;
struct cs_etm_auxtrace *etm = etmq->etm;
@@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)

}

+ /*
+ * If 'new_packet' is false, this time call has no a new packet
+ * coming and 'etmq->packet' contains the stale packet which is
+ * set at the previous time with packets swapping. In this case
+ * this function is invoked only for flushing branch stack at
+ * the end of buffer handling.
+ *
+ * Simply to say, branch samples should be generated when every
+ * time receive one new packet; otherwise, directly bail out to
+ * avoid generate branch sample with stale packet.
+ */
+ if (!new_packet)
+ return 0;
+
if (etm->sample_branches &&
etmq->prev_packet->sample_type == CS_ETM_RANGE) {
err = cs_etm__synth_branch_sample(etmq);
@@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
* Discontinuity in trace, flush
* previous branch stack
*/
- cs_etm__flush(etmq);
+ cs_etm__flush(etmq, true);
break;
case CS_ETM_EMPTY:
/*
@@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)

if (err == 0)
/* Flush any remaining branch stack entries */
- err = cs_etm__flush(etmq);
+ err = cs_etm__flush(etmq, false);
}

return err;
--
2.7.4


2018-11-11 05:01:12

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 3/5] perf cs-etm: Support for NO_SYNC packet

As described in OpenCSD (CoreSight decoder lib), in the decoding stream
it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the
element indicates 'either at start of decode, or after overflow / bad
packet', we should take it as a signal for the tracing off and this will
cause tracing discontinuity. From the trace dump with 'perf script',
sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with
element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been
turned off and on, in this case the cs-etm code has handled TRACE_ON
packet well so we observe the tracing discontinuity; but in another case
it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions
packets, we miss to handle the case if has only standalone NO_SYNC
element and users cannot receive the info for tracing discontinuity.

This patch introduces new type CS_ETM_TRACE_OFF to generate packet for
receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate
sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate
samples for the previous instructions packet, and in cs_etm__sample() it
also needs to generate samples if TRACE_OFF packet is followed by one
sequential instructions packet. This patch also converts the address to
0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++---
tools/perf/util/cs-etm.c | 15 +++++++++++----
3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 5efb616..9d52727 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
}

static ocsd_datapath_resp_t
+cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
+ const uint8_t trace_chan_id)
+{
+ return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_TRACE_OFF);
+}
+
+static ocsd_datapath_resp_t
cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
{
@@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
case OCSD_GEN_TRC_ELEM_UNKNOWN:
break;
case OCSD_GEN_TRC_ELEM_NO_SYNC:
+ resp = cs_etm_decoder__buffer_trace_off(decoder,
+ trace_chan_id);
decoder->trace_on = false;
break;
case OCSD_GEN_TRC_ELEM_TRACE_ON:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 9351bd1..a38c97c 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -23,9 +23,10 @@ struct cs_etm_buffer {
};

enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
+ CS_ETM_EMPTY = 0,
+ CS_ETM_RANGE = 1 << 0,
+ CS_ETM_TRACE_ON = 1 << 1,
+ CS_ETM_TRACE_OFF = 1 << 2,
};

enum cs_etm_isa {
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index f4fa877..2a0cef9 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,

static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
{
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
+ /* Returns 0 for TRACE_ON and TRACE_OFF packets */
+ if (packet->sample_type == CS_ETM_TRACE_ON ||
+ packet->sample_type == CS_ETM_TRACE_OFF)
return 0;

return packet->start_addr;
@@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
static inline
u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet)
{
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
+ /* Returns 0 for TRACE_ON and TRACE_OFF packets */
+ if (packet->sample_type == CS_ETM_TRACE_ON ||
+ packet->sample_type == CS_ETM_TRACE_OFF)
return 0;

return packet->end_addr - packet->last_instr_size;
@@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
generate_sample = true;

+ /* Generate sample for tracing off packet */
+ if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
+ generate_sample = true;
+
/* Generate sample for branch taken packet */
if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
etmq->prev_packet->last_instr_taken_branch)
@@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
cs_etm__sample(etmq);
break;
case CS_ETM_TRACE_ON:
+ case CS_ETM_TRACE_OFF:
/*
* Discontinuity in trace, flush
* previous branch stack
--
2.7.4


2018-11-11 05:03:02

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 5/5] perf cs-etm: Track exception number

When an exception packet comes, it contains the info for exception
number; the exception number indicates the exception types, so from it
we can know if the exception is taken for interrupt, system call or
other traps, etc. But because the exception return packet cannot
delivery exception number correctly by decoder thus when prepare sample
flags we cannot know what's type for exception return.

This patch adds a new 'exc_num' array in decoder structure to record
exception number per CPU, the exception number is recorded in the array
when the exception packet comes and this exception number can be used by
exception return packet. If detect there have discontinuous trace with
TRACE_ON or TRACE_OFF packet, the exception number is set to invalid
value.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++---
1 file changed, 59 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index b8cb7a3e..d1a6cbc 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -43,6 +43,7 @@ struct cs_etm_decoder {
u32 packet_count;
u32 head;
u32 tail;
+ u32 *exc_num;
struct cs_etm_packet packet_buffer[MAX_BUFFER];
};

@@ -368,24 +369,64 @@ static ocsd_datapath_resp_t
cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
- CS_ETM_TRACE_OFF);
+ int ret;
+ struct cs_etm_packet *packet;
+
+ ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_TRACE_OFF);
+ if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+ return ret;
+
+ packet = &decoder->packet_buffer[decoder->tail];
+
+ /* Clear execption number for discontinuous trace */
+ decoder->exc_num[packet->cpu] = UINT32_MAX;
+
+ return ret;
}

static ocsd_datapath_resp_t
cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
- CS_ETM_TRACE_ON);
+ int ret;
+ struct cs_etm_packet *packet;
+
+ ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_TRACE_ON);
+ if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+ return ret;
+
+ packet = &decoder->packet_buffer[decoder->tail];
+
+ /* Clear execption number for discontinuous trace */
+ decoder->exc_num[packet->cpu] = UINT32_MAX;
+
+ return ret;
}

static ocsd_datapath_resp_t
cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
+ const ocsd_generic_trace_elem *elem,
const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
- CS_ETM_EXCEPTION);
+ int ret;
+ struct cs_etm_packet *packet;
+
+ ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_EXCEPTION);
+ if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
+ return ret;
+
+ packet = &decoder->packet_buffer[decoder->tail];
+
+ /*
+ * Exception number is recorded per CPU and later can be used
+ * for exception return instruction analysis.
+ */
+ decoder->exc_num[packet->cpu] = elem->exception_number;
+
+ return ret;
}

static ocsd_datapath_resp_t
@@ -423,7 +464,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION:
- resp = cs_etm_decoder__buffer_exception(decoder,
+ resp = cs_etm_decoder__buffer_exception(decoder, elem,
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
@@ -511,6 +552,10 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
if (!decoder)
return NULL;

+ decoder->exc_num = zalloc(sizeof(*decoder->exc_num) * num_cpu);
+ if (!decoder->exc_num)
+ goto err_free_decoder;
+
decoder->data = d_params->data;
decoder->prev_return = OCSD_RESP_CONT;
cs_etm_decoder__clear_buffer(decoder);
@@ -531,7 +576,7 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
decoder->dcd_tree = ocsd_create_dcd_tree(format, flags);

if (decoder->dcd_tree == 0)
- goto err_free_decoder;
+ goto err_free_decoder_exc_num;

/* init library print logging support */
ret = cs_etm_decoder__init_def_logger_printing(d_params, decoder);
@@ -542,6 +587,9 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
cs_etm_decoder__init_raw_frame_logging(d_params, decoder);

for (i = 0; i < num_cpu; i++) {
+ /* init expcetion number to an invalid value */
+ decoder->exc_num[i] = UINT32_MAX;
+
ret = cs_etm_decoder__create_etm_decoder(d_params,
&t_params[i],
decoder);
@@ -553,6 +601,8 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,

err_free_decoder_tree:
ocsd_destroy_dcd_tree(decoder->dcd_tree);
+err_free_decoder_exc_num:
+ free(decoder->exc_num);
err_free_decoder:
free(decoder);
return NULL;
@@ -613,5 +663,6 @@ void cs_etm_decoder__free(struct cs_etm_decoder *decoder)

ocsd_destroy_dcd_tree(decoder->dcd_tree);
decoder->dcd_tree = NULL;
+ free(decoder->exc_num);
free(decoder);
}
--
2.7.4


2018-11-11 05:04:10

by Leo Yan

[permalink] [raw]
Subject: [PATCH v1 4/5] perf cs-etm: Generate branch sample for exception packet

The exception packet appears as one element with 'elem_type' ==
OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET,
which present for exception entry and exit respectively. The decoder
set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the
exception packets; but exception packets don't have dedicated sample
type and shares the same sample type CS_ETM_RANGE with normal
instruction packets.

As result, the exception packets are taken as normal instruction packets
and this introduces confusion to mix different packet types.
Furthermore, these instruction range packets will be processed for
branch sample only when 'packet->last_instr_taken_branch' is true,
otherwise they will be omitted, this can introduce mess for exception
and exception returning due we don't have complete address range info
for context switching.

To process exception packets properly, this patch introduce two new
sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two
kind packets, they will be handled by cs_etm__exception(). The func
cs_etm__exception() forces to set previous CS_ETM_RANGE packet flag
'prev_packet->last_instr_taken_branch' to true, this matches well with
the program flow when the exception is trapped from user space to kernel
space, no matter if the most recent flow has branch taken or not; this
is also safe for returning to user space after exception handling.

After exception packets have their own sample type, the packet fields
'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove
them.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 26 +++++++++++++++++------
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 4 ++--
tools/perf/util/cs-etm.c | 28 +++++++++++++++++++++++++
3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 9d52727..b8cb7a3e 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -269,8 +269,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
decoder->packet_buffer[i].instr_count = 0;
decoder->packet_buffer[i].last_instr_taken_branch = false;
decoder->packet_buffer[i].last_instr_size = 0;
- decoder->packet_buffer[i].exc = false;
- decoder->packet_buffer[i].exc_ret = false;
decoder->packet_buffer[i].cpu = INT_MIN;
}
}
@@ -298,8 +296,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,

decoder->packet_buffer[et].sample_type = sample_type;
decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN;
- decoder->packet_buffer[et].exc = false;
- decoder->packet_buffer[et].exc_ret = false;
decoder->packet_buffer[et].cpu = *((int *)inode->priv);
decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR;
decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR;
@@ -384,6 +380,22 @@ cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
CS_ETM_TRACE_ON);
}

+static ocsd_datapath_resp_t
+cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
+ const uint8_t trace_chan_id)
+{
+ return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_EXCEPTION);
+}
+
+static ocsd_datapath_resp_t
+cs_etm_decoder__buffer_exception_ret(struct cs_etm_decoder *decoder,
+ const uint8_t trace_chan_id)
+{
+ return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
+ CS_ETM_EXCEPTION_RET);
+}
+
static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
const void *context,
const ocsd_trc_index_t indx __maybe_unused,
@@ -411,10 +423,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION:
- decoder->packet_buffer[decoder->tail].exc = true;
+ resp = cs_etm_decoder__buffer_exception(decoder,
+ trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
- decoder->packet_buffer[decoder->tail].exc_ret = true;
+ resp = cs_etm_decoder__buffer_exception_ret(decoder,
+ trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_PE_CONTEXT:
case OCSD_GEN_TRC_ELEM_EO_TRACE:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index a38c97c..0d1c18d 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -27,6 +27,8 @@ enum cs_etm_sample_type {
CS_ETM_RANGE = 1 << 0,
CS_ETM_TRACE_ON = 1 << 1,
CS_ETM_TRACE_OFF = 1 << 2,
+ CS_ETM_EXCEPTION = 1 << 3,
+ CS_ETM_EXCEPTION_RET = 1 << 4,
};

enum cs_etm_isa {
@@ -44,8 +46,6 @@ struct cs_etm_packet {
u32 instr_count;
u8 last_instr_taken_branch;
u8 last_instr_size;
- u8 exc;
- u8 exc_ret;
int cpu;
};

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2a0cef9..455f132 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -961,6 +961,25 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
return 0;
}

+static int cs_etm__exception(struct cs_etm_queue *etmq)
+{
+ /*
+ * When the exception packet is inserted, whether the last instruction
+ * in previous range packet is taken branch or not, we need to force
+ * to set 'prev_packet->last_instr_taken_branch' to true. This ensures
+ * to generate branch sample for the instruction range before the
+ * exception is trapped to kernel or before the exception returning.
+ *
+ * The exception packet includes the dummy address values, so don't
+ * swap PACKET with PREV_PACKET. This keeps PREV_PACKET to be useful
+ * for generating instruction and branch samples.
+ */
+ if (etmq->prev_packet->sample_type == CS_ETM_RANGE)
+ etmq->prev_packet->last_instr_taken_branch = true;
+
+ return 0;
+}
+
static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet)
{
int err = 0;
@@ -1090,6 +1109,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
*/
cs_etm__sample(etmq);
break;
+ case CS_ETM_EXCEPTION:
+ case CS_ETM_EXCEPTION_RET:
+ /*
+ * If the exception packet is coming,
+ * make sure the previous instruction
+ * range packet to be handled properly.
+ */
+ cs_etm__exception(etmq);
+ break;
case CS_ETM_TRACE_ON:
case CS_ETM_TRACE_OFF:
/*
--
2.7.4


2018-11-16 22:59:56

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 1/5] perf cs-etm: Correct packets swapping in cs_etm__flush()

On Sun, Nov 11, 2018 at 12:59:39PM +0800, Leo Yan wrote:
> The structure cs_etm_queue uses 'prev_packet' to point to previous
> packet, this can be used to combine with new coming packet to generate
> samples.
>
> In function cs_etm__flush() it swaps packets only when the flag
> 'etm->synth_opts.last_branch' is true, this means that it will not
> swap packets if without option '--itrace=il' to generate last branch
> entries; thus for this case the 'prev_packet' doesn't point to the
> correct previous packet and the stale packet still will be used to
> generate sequential sample. Thus if dump trace with 'perf script'
> command we can see the incorrect flow with the stale packet's address
> info.
>
> This patch corrects packets swapping in cs_etm__flush(); except using
> the flag 'etm->synth_opts.last_branch' it also checks the another flag
> 'etm->sample_branches', if any flag is true then it swaps packets so
> can save correct content to 'prev_packet'. Finally this can fix the
> wrong program flow dumping issue.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/util/cs-etm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 48ad217..fe18d7b 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -997,7 +997,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
> }
>
> swap_packet:
> - if (etmq->etm->synth_opts.last_branch) {
> + if (etm->sample_branches || etmq->etm->synth_opts.last_branch) {

This seems like the right thing to do, if only to be consistent with that is
done in cs_etm__sample().

Reviewed-by: Mathieu Poirier <[email protected]>

> /*
> * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
> * the next incoming packet.
> --
> 2.7.4
>

2018-11-16 23:07:38

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] perf cs-etm: Avoid stale branch samples when flush packet

On Sun, Nov 11, 2018 at 12:59:40PM +0800, Leo Yan wrote:
> At the end of trace buffer handling, function cs_etm__flush() is invoked
> to flush any remaining branch stack entries. As a side effect, it also
> generates branch sample, because the 'etmq->packet' doesn't contains any
> new coming packet but point to one stale packet after packets swapping,
> so it wrongly makes synthesize branch samples with stale packet info.
>
> We could review below detailed flow which causes issue:
>
> Packet1: start_addr=0xffff000008b1fbf0 end_addr=0xffff000008b1fbfc
> Packet2: start_addr=0xffff000008b1fb5c end_addr=0xffff000008b1fb6c
>
> step 1: cs_etm__sample():
> sample: ip=(0xffff000008b1fbfc-4) addr=0xffff000008b1fb5c
>
> step 2: flush packet in cs_etm__run_decoder():
> cs_etm__run_decoder()
> `-> err = cs_etm__flush(etmq, false);
> sample: ip=(0xffff000008b1fb6c-4) addr=0xffff000008b1fbf0
>
> Packet1 and packet2 are two continuous packets, when packet2 is the new
> coming packet, cs_etm__sample() generates branch sample for these two
> packets and use [packet1::end_addr - 4 => packet2::start_addr] as branch
> jump flow, thus we can see the first generated branch sample in step 1.
> At the end of cs_etm__sample() it swaps packets so 'etm->prev_packet'=
> packet2 and 'etm->packet'=packet1, so far it's okay for branch sample.
>
> If packet2 is the last one packet in trace buffer, even there have no
> any new coming packet, cs_etm__run_decoder() invokes cs_etm__flush() to
> flush branch stack entries as expected, but it also generates branch
> samples by taking 'etm->packet' as a new coming packet, thus the branch
> jump flow is as [packet2::end_addr - 4 => packet1::start_addr]; this
> is the second sample which is generated in step 2. So actually the
> second sample is a stale sample and we should not generate it.
>
> This patch is to add new argument 'new_packet' for cs_etm__flush(), we
> can pass 'true' for this argument if there have a new packet, otherwise
> it will pass 'false' for the purpose of only flushing branch stack
> entries and avoid to generate sample for stale packet.

Very good explanation, thanks for taking the time to write this.

>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/util/cs-etm.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index fe18d7b..f4fa877 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -955,7 +955,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
> return 0;
> }
>
> -static int cs_etm__flush(struct cs_etm_queue *etmq)
> +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet)
> {
> int err = 0;
> struct cs_etm_auxtrace *etm = etmq->etm;
> @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
>
> }
>
> + /*
> + * If 'new_packet' is false, this time call has no a new packet
> + * coming and 'etmq->packet' contains the stale packet which is
> + * set at the previous time with packets swapping. In this case
> + * this function is invoked only for flushing branch stack at
> + * the end of buffer handling.
> + *
> + * Simply to say, branch samples should be generated when every
> + * time receive one new packet; otherwise, directly bail out to
> + * avoid generate branch sample with stale packet.
> + */
> + if (!new_packet)
> + return 0;
> +
> if (etm->sample_branches &&
> etmq->prev_packet->sample_type == CS_ETM_RANGE) {
> err = cs_etm__synth_branch_sample(etmq);
> @@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> * Discontinuity in trace, flush
> * previous branch stack
> */
> - cs_etm__flush(etmq);
> + cs_etm__flush(etmq, true);
> break;
> case CS_ETM_EMPTY:
> /*
> @@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
>
> if (err == 0)
> /* Flush any remaining branch stack entries */
> - err = cs_etm__flush(etmq);
> + err = cs_etm__flush(etmq, false);

I understand what you're doing and it will yield the correct results. What I'm
not sure about is if we wouldn't be better off splitting cs_etm__flush()
in order to reduce the complexity of the main decoding loop. That is rename
cs_etm__flush() to something like cs_etm__trace_on() and spin off a new
cs_etm__end_block().

It does introduce a little bit of code duplication but I think we'd win in terms
of readability and flexibility.

Thanks,
Mathieu


> }
>
> return err;
> --
> 2.7.4
>

2018-11-18 06:41:35

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] perf cs-etm: Avoid stale branch samples when flush packet

On Fri, Nov 16, 2018 at 04:05:11PM -0700, Mathieu Poirier wrote:

[...]

> > -static int cs_etm__flush(struct cs_etm_queue *etmq)
> > +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet)
> > {
> > int err = 0;
> > struct cs_etm_auxtrace *etm = etmq->etm;
> > @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
> >
> > }
> >
> > + /*
> > + * If 'new_packet' is false, this time call has no a new packet
> > + * coming and 'etmq->packet' contains the stale packet which is
> > + * set at the previous time with packets swapping. In this case
> > + * this function is invoked only for flushing branch stack at
> > + * the end of buffer handling.
> > + *
> > + * Simply to say, branch samples should be generated when every
> > + * time receive one new packet; otherwise, directly bail out to
> > + * avoid generate branch sample with stale packet.
> > + */
> > + if (!new_packet)
> > + return 0;
> > +
> > if (etm->sample_branches &&
> > etmq->prev_packet->sample_type == CS_ETM_RANGE) {
> > err = cs_etm__synth_branch_sample(etmq);
> > @@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> > * Discontinuity in trace, flush
> > * previous branch stack
> > */
> > - cs_etm__flush(etmq);
> > + cs_etm__flush(etmq, true);
> > break;
> > case CS_ETM_EMPTY:
> > /*
> > @@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> >
> > if (err == 0)
> > /* Flush any remaining branch stack entries */
> > - err = cs_etm__flush(etmq);
> > + err = cs_etm__flush(etmq, false);
>
> I understand what you're doing and it will yield the correct results. What I'm
> not sure about is if we wouldn't be better off splitting cs_etm__flush()
> in order to reduce the complexity of the main decoding loop. That is rename
> cs_etm__flush() to something like cs_etm__trace_on() and spin off a new
> cs_etm__end_block().
>
> It does introduce a little bit of code duplication but I think we'd win in terms
> of readability and flexibility.

Thanks for reviewing, Mathieu.

I agree with your suggestion to split cs_etm__flush() into two
functions, will spin this patch with the suggestion in next
series for reviewing.

Thanks,
Leo Yan

2018-11-19 18:29:50

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 3/5] perf cs-etm: Support for NO_SYNC packet

On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
> As described in OpenCSD (CoreSight decoder lib), in the decoding stream
> it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the
> element indicates 'either at start of decode, or after overflow / bad
> packet', we should take it as a signal for the tracing off and this will
> cause tracing discontinuity. From the trace dump with 'perf script',
> sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with
> element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been
> turned off and on, in this case the cs-etm code has handled TRACE_ON
> packet well so we observe the tracing discontinuity; but in another case
> it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions
> packets, we miss to handle the case if has only standalone NO_SYNC
> element and users cannot receive the info for tracing discontinuity.
>
> This patch introduces new type CS_ETM_TRACE_OFF to generate packet for
> receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate
> sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with

Here you have used the word "almost", leading me to beleive there are cases
where the handling of TRACE_ON and NO_SYNC packets differ, but the
implementation enacts the same behavior for both.

Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be
treated the same way?

Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should
treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply
call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c
doesn't change. Otherwise see my comments below.

> CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate
> samples for the previous instructions packet, and in cs_etm__sample() it
> also needs to generate samples if TRACE_OFF packet is followed by one
> sequential instructions packet. This patch also converts the address to
> 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++
> tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++---
> tools/perf/util/cs-etm.c | 15 +++++++++++----
> 3 files changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> index 5efb616..9d52727 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
> }
>
> static ocsd_datapath_resp_t
> +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> + const uint8_t trace_chan_id)
> +{
> + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> + CS_ETM_TRACE_OFF);
> +}
> +
> +static ocsd_datapath_resp_t
> cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> const uint8_t trace_chan_id)
> {
> @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
> case OCSD_GEN_TRC_ELEM_UNKNOWN:
> break;
> case OCSD_GEN_TRC_ELEM_NO_SYNC:
> + resp = cs_etm_decoder__buffer_trace_off(decoder,
> + trace_chan_id);

If we want to handle NO_SYNC element types, why introduce a "trace_off"
function? Wouldn't it make more sense to call it
cs_etm_decoder__buffer_no_sync() ?

> decoder->trace_on = false;
> break;
> case OCSD_GEN_TRC_ELEM_TRACE_ON:
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> index 9351bd1..a38c97c 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> @@ -23,9 +23,10 @@ struct cs_etm_buffer {
> };
>
> enum cs_etm_sample_type {
> - CS_ETM_EMPTY = 0,
> - CS_ETM_RANGE = 1 << 0,
> - CS_ETM_TRACE_ON = 1 << 1,
> + CS_ETM_EMPTY = 0,
> + CS_ETM_RANGE = 1 << 0,
> + CS_ETM_TRACE_ON = 1 << 1,
> + CS_ETM_TRACE_OFF = 1 << 2,

CS_ETM_NO_SYNC, see above comment. And please don't use indentation.


> };
>
> enum cs_etm_isa {
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index f4fa877..2a0cef9 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
>
> static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> {
> - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> - if (packet->sample_type == CS_ETM_TRACE_ON)
> + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> + if (packet->sample_type == CS_ETM_TRACE_ON ||
> + packet->sample_type == CS_ETM_TRACE_OFF)
> return 0;
>
> return packet->start_addr;
> @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> static inline
> u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet)
> {
> - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> - if (packet->sample_type == CS_ETM_TRACE_ON)
> + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> + if (packet->sample_type == CS_ETM_TRACE_ON ||
> + packet->sample_type == CS_ETM_TRACE_OFF)
> return 0;
>
> return packet->end_addr - packet->last_instr_size;
> @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
> if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
> generate_sample = true;
>
> + /* Generate sample for tracing off packet */
> + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
> + generate_sample = true;
> +
> /* Generate sample for branch taken packet */
> if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
> etmq->prev_packet->last_instr_taken_branch)
> @@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> cs_etm__sample(etmq);
> break;
> case CS_ETM_TRACE_ON:
> + case CS_ETM_TRACE_OFF:
> /*
> * Discontinuity in trace, flush
> * previous branch stack
> --
> 2.7.4
>

2018-11-19 20:50:01

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] perf cs-etm: Track exception number

On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
> When an exception packet comes, it contains the info for exception
> number; the exception number indicates the exception types, so from it
> we can know if the exception is taken for interrupt, system call or
> other traps, etc. But because the exception return packet cannot
> delivery exception number correctly by decoder thus when prepare sample
> flags we cannot know what's type for exception return.
>
> This patch adds a new 'exc_num' array in decoder structure to record
> exception number per CPU, the exception number is recorded in the array
> when the exception packet comes and this exception number can be used by
> exception return packet. If detect there have discontinuous trace with
> TRACE_ON or TRACE_OFF packet, the exception number is set to invalid
> value.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++---
> 1 file changed, 59 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> index b8cb7a3e..d1a6cbc 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> @@ -43,6 +43,7 @@ struct cs_etm_decoder {
> u32 packet_count;
> u32 head;
> u32 tail;
> + u32 *exc_num;
> struct cs_etm_packet packet_buffer[MAX_BUFFER];
> };
>
> @@ -368,24 +369,64 @@ static ocsd_datapath_resp_t
> cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> const uint8_t trace_chan_id)
> {
> - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> - CS_ETM_TRACE_OFF);
> + int ret;
> + struct cs_etm_packet *packet;
> +
> + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> + CS_ETM_TRACE_OFF);
> + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> + return ret;
> +
> + packet = &decoder->packet_buffer[decoder->tail];
> +
> + /* Clear execption number for discontinuous trace */
> + decoder->exc_num[packet->cpu] = UINT32_MAX;
> +
> + return ret;
> }
>
> static ocsd_datapath_resp_t
> cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> const uint8_t trace_chan_id)
> {
> - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> - CS_ETM_TRACE_ON);
> + int ret;
> + struct cs_etm_packet *packet;
> +
> + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> + CS_ETM_TRACE_ON);
> + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> + return ret;
> +
> + packet = &decoder->packet_buffer[decoder->tail];
> +
> + /* Clear execption number for discontinuous trace */
> + decoder->exc_num[packet->cpu] = UINT32_MAX;
> +
> + return ret;
> }
>
> static ocsd_datapath_resp_t
> cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
> + const ocsd_generic_trace_elem *elem,
> const uint8_t trace_chan_id)
> {
> - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> - CS_ETM_EXCEPTION);
> + int ret;
> + struct cs_etm_packet *packet;
> +
> + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> + CS_ETM_EXCEPTION);
> + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> + return ret;
> +
> + packet = &decoder->packet_buffer[decoder->tail];
> +
> + /*
> + * Exception number is recorded per CPU and later can be used
> + * for exception return instruction analysis.
> + */
> + decoder->exc_num[packet->cpu] = elem->exception_number;

Am I missing something or the information about the exception number that is
recorded here isn't used anywhere? If you want to use this in perf report/script,
the exception number will have to be added to the cs_etm_packet struct.

I am done with the revision of this set.

Thanks,
Mathieu

> +
> + return ret;
> }
>
> static ocsd_datapath_resp_t
> @@ -423,7 +464,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
> trace_chan_id);
> break;
> case OCSD_GEN_TRC_ELEM_EXCEPTION:
> - resp = cs_etm_decoder__buffer_exception(decoder,
> + resp = cs_etm_decoder__buffer_exception(decoder, elem,
> trace_chan_id);
> break;
> case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
> @@ -511,6 +552,10 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
> if (!decoder)
> return NULL;
>
> + decoder->exc_num = zalloc(sizeof(*decoder->exc_num) * num_cpu);
> + if (!decoder->exc_num)
> + goto err_free_decoder;
> +
> decoder->data = d_params->data;
> decoder->prev_return = OCSD_RESP_CONT;
> cs_etm_decoder__clear_buffer(decoder);
> @@ -531,7 +576,7 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
> decoder->dcd_tree = ocsd_create_dcd_tree(format, flags);
>
> if (decoder->dcd_tree == 0)
> - goto err_free_decoder;
> + goto err_free_decoder_exc_num;
>
> /* init library print logging support */
> ret = cs_etm_decoder__init_def_logger_printing(d_params, decoder);
> @@ -542,6 +587,9 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
> cs_etm_decoder__init_raw_frame_logging(d_params, decoder);
>
> for (i = 0; i < num_cpu; i++) {
> + /* init expcetion number to an invalid value */
> + decoder->exc_num[i] = UINT32_MAX;
> +
> ret = cs_etm_decoder__create_etm_decoder(d_params,
> &t_params[i],
> decoder);
> @@ -553,6 +601,8 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
>
> err_free_decoder_tree:
> ocsd_destroy_dcd_tree(decoder->dcd_tree);
> +err_free_decoder_exc_num:
> + free(decoder->exc_num);
> err_free_decoder:
> free(decoder);
> return NULL;
> @@ -613,5 +663,6 @@ void cs_etm_decoder__free(struct cs_etm_decoder *decoder)
>
> ocsd_destroy_dcd_tree(decoder->dcd_tree);
> decoder->dcd_tree = NULL;
> + free(decoder->exc_num);
> free(decoder);
> }
> --
> 2.7.4
>

2018-12-05 03:00:45

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] perf cs-etm: Avoid stale branch samples when flush packet

On Fri, Nov 16, 2018 at 04:05:11PM -0700, Mathieu Poirier wrote:

[...]

> > -static int cs_etm__flush(struct cs_etm_queue *etmq)
> > +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet)
> > {
> > int err = 0;
> > struct cs_etm_auxtrace *etm = etmq->etm;
> > @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
> >
> > }
> >
> > + /*
> > + * If 'new_packet' is false, this time call has no a new packet
> > + * coming and 'etmq->packet' contains the stale packet which is
> > + * set at the previous time with packets swapping. In this case
> > + * this function is invoked only for flushing branch stack at
> > + * the end of buffer handling.
> > + *
> > + * Simply to say, branch samples should be generated when every
> > + * time receive one new packet; otherwise, directly bail out to
> > + * avoid generate branch sample with stale packet.
> > + */
> > + if (!new_packet)
> > + return 0;
> > +
> > if (etm->sample_branches &&
> > etmq->prev_packet->sample_type == CS_ETM_RANGE) {
> > err = cs_etm__synth_branch_sample(etmq);
> > @@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> > * Discontinuity in trace, flush
> > * previous branch stack
> > */
> > - cs_etm__flush(etmq);
> > + cs_etm__flush(etmq, true);
> > break;
> > case CS_ETM_EMPTY:
> > /*
> > @@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> >
> > if (err == 0)
> > /* Flush any remaining branch stack entries */
> > - err = cs_etm__flush(etmq);
> > + err = cs_etm__flush(etmq, false);
>
> I understand what you're doing and it will yield the correct results. What I'm
> not sure about is if we wouldn't be better off splitting cs_etm__flush()
> in order to reduce the complexity of the main decoding loop. That is rename
> cs_etm__flush() to something like cs_etm__trace_on() and spin off a new
> cs_etm__end_block().
>
> It does introduce a little bit of code duplication but I think we'd win in terms
> of readability and flexibility.

Sorry for long delay, Mathieu.

Agree with the idea of splitting cs_etm__flush() into two functions.
Will spin patch for new version.

Thanks,
Leo Yan

> > }
> >
> > return err;
> > --
> > 2.7.4
> >

2018-12-05 03:32:48

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 3/5] perf cs-etm: Support for NO_SYNC packet

On Mon, Nov 19, 2018 at 11:27:59AM -0700, Mathieu Poirier wrote:
> On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
> > As described in OpenCSD (CoreSight decoder lib), in the decoding stream
> > it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the
> > element indicates 'either at start of decode, or after overflow / bad
> > packet', we should take it as a signal for the tracing off and this will
> > cause tracing discontinuity. From the trace dump with 'perf script',
> > sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with
> > element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been
> > turned off and on, in this case the cs-etm code has handled TRACE_ON
> > packet well so we observe the tracing discontinuity; but in another case
> > it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions
> > packets, we miss to handle the case if has only standalone NO_SYNC
> > element and users cannot receive the info for tracing discontinuity.
> >
> > This patch introduces new type CS_ETM_TRACE_OFF to generate packet for
> > receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate
> > sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
>
> Here you have used the word "almost", leading me to beleive there are cases
> where the handling of TRACE_ON and NO_SYNC packets differ, but the
> implementation enacts the same behavior for both.
>
> Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be
> treated the same way?

I also would like to get suggestions/comments from Mike and Rob.

For NO_SYNC packet, Mike before has given some explination for it:

"looking at the decoder flow, when the decoder is reset, then it is
returned to an unsynchronised state and a NO_SYNC will be output.
With perf, it can concatenate multiple trace buffers into a single
section of the perf.data file. To ensure that the decode process can
see the start of a new buffer, the drivers will insert a barrier
packet between different buffers so that the decoder can spot the
boundaries. When the decoder hits a barrier packet it will
automatically reset so that it correctly decodes the next trace
buffer. This could be what you are seeing here."

So I think there still have some difference between TRACE_ON and
NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's
also possible caused by tracing discontinuity; NO_SYNC packets usually
caused by an unsynchronised state. But both of them presents
discontinuity in perf trace data.

I prefer to use two different packet types to present them, this can
let the code to be more readable and easier to be maintained later.

If you agree with this, I also will rephrase the commit log to avoid
confusion that these two packets are the same thing.

> Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should
> treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply
> call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c
> doesn't change. Otherwise see my comments below.
>
> > CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate
> > samples for the previous instructions packet, and in cs_etm__sample() it
> > also needs to generate samples if TRACE_OFF packet is followed by one
> > sequential instructions packet. This patch also converts the address to
> > 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
> >
> > Signed-off-by: Leo Yan <[email protected]>
> > ---
> > tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++
> > tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++---
> > tools/perf/util/cs-etm.c | 15 +++++++++++----
> > 3 files changed, 25 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > index 5efb616..9d52727 100644
> > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
> > }
> >
> > static ocsd_datapath_resp_t
> > +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> > + const uint8_t trace_chan_id)
> > +{
> > + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > + CS_ETM_TRACE_OFF);
> > +}
> > +
> > +static ocsd_datapath_resp_t
> > cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> > const uint8_t trace_chan_id)
> > {
> > @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
> > case OCSD_GEN_TRC_ELEM_UNKNOWN:
> > break;
> > case OCSD_GEN_TRC_ELEM_NO_SYNC:
> > + resp = cs_etm_decoder__buffer_trace_off(decoder,
> > + trace_chan_id);
>
> If we want to handle NO_SYNC element types, why introduce a "trace_off"
> function? Wouldn't it make more sense to call it
> cs_etm_decoder__buffer_no_sync() ?

Will change to cs_etm_decoder__buffer_no_sync().

> > decoder->trace_on = false;
> > break;
> > case OCSD_GEN_TRC_ELEM_TRACE_ON:
> > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > index 9351bd1..a38c97c 100644
> > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > @@ -23,9 +23,10 @@ struct cs_etm_buffer {
> > };
> >
> > enum cs_etm_sample_type {
> > - CS_ETM_EMPTY = 0,
> > - CS_ETM_RANGE = 1 << 0,
> > - CS_ETM_TRACE_ON = 1 << 1,
> > + CS_ETM_EMPTY = 0,
> > + CS_ETM_RANGE = 1 << 0,
> > + CS_ETM_TRACE_ON = 1 << 1,
> > + CS_ETM_TRACE_OFF = 1 << 2,
>
> CS_ETM_NO_SYNC, see above comment. And please don't use indentation.

Will do this.

> > };
> >
> > enum cs_etm_isa {
> > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > index f4fa877..2a0cef9 100644
> > --- a/tools/perf/util/cs-etm.c
> > +++ b/tools/perf/util/cs-etm.c
> > @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> >
> > static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > {
> > - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> > - if (packet->sample_type == CS_ETM_TRACE_ON)
> > + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> > + if (packet->sample_type == CS_ETM_TRACE_ON ||
> > + packet->sample_type == CS_ETM_TRACE_OFF)
> > return 0;
> >
> > return packet->start_addr;
> > @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > static inline
> > u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet)
> > {
> > - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> > - if (packet->sample_type == CS_ETM_TRACE_ON)
> > + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> > + if (packet->sample_type == CS_ETM_TRACE_ON ||
> > + packet->sample_type == CS_ETM_TRACE_OFF)
> > return 0;
> >
> > return packet->end_addr - packet->last_instr_size;
> > @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
> > if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
> > generate_sample = true;
> >
> > + /* Generate sample for tracing off packet */
> > + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
> > + generate_sample = true;
> > +
> > /* Generate sample for branch taken packet */
> > if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
> > etmq->prev_packet->last_instr_taken_branch)
> > @@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> > cs_etm__sample(etmq);
> > break;
> > case CS_ETM_TRACE_ON:
> > + case CS_ETM_TRACE_OFF:
> > /*
> > * Discontinuity in trace, flush
> > * previous branch stack
> > --
> > 2.7.4
> >

2018-12-05 03:50:56

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] perf cs-etm: Track exception number

On Mon, Nov 19, 2018 at 01:47:49PM -0700, Mathieu Poirier wrote:
> On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
> > When an exception packet comes, it contains the info for exception
> > number; the exception number indicates the exception types, so from it
> > we can know if the exception is taken for interrupt, system call or
> > other traps, etc. But because the exception return packet cannot
> > delivery exception number correctly by decoder thus when prepare sample
> > flags we cannot know what's type for exception return.
> >
> > This patch adds a new 'exc_num' array in decoder structure to record
> > exception number per CPU, the exception number is recorded in the array
> > when the exception packet comes and this exception number can be used by
> > exception return packet. If detect there have discontinuous trace with
> > TRACE_ON or TRACE_OFF packet, the exception number is set to invalid
> > value.
> >
> > Signed-off-by: Leo Yan <[email protected]>
> > ---
> > tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++---
> > 1 file changed, 59 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > index b8cb7a3e..d1a6cbc 100644
> > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > @@ -43,6 +43,7 @@ struct cs_etm_decoder {
> > u32 packet_count;
> > u32 head;
> > u32 tail;
> > + u32 *exc_num;
> > struct cs_etm_packet packet_buffer[MAX_BUFFER];
> > };
> >
> > @@ -368,24 +369,64 @@ static ocsd_datapath_resp_t
> > cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> > const uint8_t trace_chan_id)
> > {
> > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > - CS_ETM_TRACE_OFF);
> > + int ret;
> > + struct cs_etm_packet *packet;
> > +
> > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > + CS_ETM_TRACE_OFF);
> > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > + return ret;
> > +
> > + packet = &decoder->packet_buffer[decoder->tail];
> > +
> > + /* Clear execption number for discontinuous trace */
> > + decoder->exc_num[packet->cpu] = UINT32_MAX;
> > +
> > + return ret;
> > }
> >
> > static ocsd_datapath_resp_t
> > cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> > const uint8_t trace_chan_id)
> > {
> > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > - CS_ETM_TRACE_ON);
> > + int ret;
> > + struct cs_etm_packet *packet;
> > +
> > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > + CS_ETM_TRACE_ON);
> > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > + return ret;
> > +
> > + packet = &decoder->packet_buffer[decoder->tail];
> > +
> > + /* Clear execption number for discontinuous trace */
> > + decoder->exc_num[packet->cpu] = UINT32_MAX;
> > +
> > + return ret;
> > }
> >
> > static ocsd_datapath_resp_t
> > cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
> > + const ocsd_generic_trace_elem *elem,
> > const uint8_t trace_chan_id)
> > {
> > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > - CS_ETM_EXCEPTION);
> > + int ret;
> > + struct cs_etm_packet *packet;
> > +
> > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > + CS_ETM_EXCEPTION);
> > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > + return ret;
> > +
> > + packet = &decoder->packet_buffer[decoder->tail];
> > +
> > + /*
> > + * Exception number is recorded per CPU and later can be used
> > + * for exception return instruction analysis.
> > + */
> > + decoder->exc_num[packet->cpu] = elem->exception_number;
>
> Am I missing something or the information about the exception number that is
> recorded here isn't used anywhere?

The exception number will be used to set branch flag patch [1].
According to exception number we can know it's for system call,
interrupt or other traps.

[1] http://archive.armlinux.org.uk/lurker/message/20181111.050755.d1c1b257.en.html

> If you want to use this in perf report/script,
> the exception number will have to be added to the cs_etm_packet struct.

Actually before has discussed this with Mike but found it's hard to
save the exception number in cs_etm_packet struct. The reason is the
exception packet contains the correct exception number, but the
exception return packet doesn't contain exception number. Thus this
patch uses cs_etm_decoder struct to save exception number per CPU
context when receive exception packet, and later the saved exception
number will be used by exception return packet.

Please see related discussion at the end of page [2].

[2] https://lists.linaro.org/pipermail/coresight/2018-October/001832.html

> I am done with the revision of this set.

Thanks a lot for reviewing.

[...]

Thanks,
Leo Yan

2018-12-05 17:50:26

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 3/5] perf cs-etm: Support for NO_SYNC packet

On Tue, 4 Dec 2018 at 20:31, <[email protected]> wrote:
>
> On Mon, Nov 19, 2018 at 11:27:59AM -0700, Mathieu Poirier wrote:
> > On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
> > > As described in OpenCSD (CoreSight decoder lib), in the decoding stream
> > > it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the
> > > element indicates 'either at start of decode, or after overflow / bad
> > > packet', we should take it as a signal for the tracing off and this will
> > > cause tracing discontinuity. From the trace dump with 'perf script',
> > > sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with
> > > element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been
> > > turned off and on, in this case the cs-etm code has handled TRACE_ON
> > > packet well so we observe the tracing discontinuity; but in another case
> > > it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions
> > > packets, we miss to handle the case if has only standalone NO_SYNC
> > > element and users cannot receive the info for tracing discontinuity.
> > >
> > > This patch introduces new type CS_ETM_TRACE_OFF to generate packet for
> > > receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate
> > > sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
> >
> > Here you have used the word "almost", leading me to beleive there are cases
> > where the handling of TRACE_ON and NO_SYNC packets differ, but the
> > implementation enacts the same behavior for both.
> >
> > Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be
> > treated the same way?
>
> I also would like to get suggestions/comments from Mike and Rob.
>
> For NO_SYNC packet, Mike before has given some explination for it:
>
> "looking at the decoder flow, when the decoder is reset, then it is
> returned to an unsynchronised state and a NO_SYNC will be output.
> With perf, it can concatenate multiple trace buffers into a single
> section of the perf.data file. To ensure that the decode process can
> see the start of a new buffer, the drivers will insert a barrier
> packet between different buffers so that the decoder can spot the
> boundaries. When the decoder hits a barrier packet it will
> automatically reset so that it correctly decodes the next trace
> buffer. This could be what you are seeing here."
>
> So I think there still have some difference between TRACE_ON and
> NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's
> also possible caused by tracing discontinuity; NO_SYNC packets usually
> caused by an unsynchronised state. But both of them presents
> discontinuity in perf trace data.
>
> I prefer to use two different packet types to present them, this can
> let the code to be more readable and easier to be maintained later.

Absolutely. Make sure to get to the bottom of the story with Mike
and/or Robert before sending your next patchset.

>
> If you agree with this, I also will rephrase the commit log to avoid
> confusion that these two packets are the same thing.

Yes please.

>
> > Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should
> > treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply
> > call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c
> > doesn't change. Otherwise see my comments below.
> >
> > > CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate
> > > samples for the previous instructions packet, and in cs_etm__sample() it
> > > also needs to generate samples if TRACE_OFF packet is followed by one
> > > sequential instructions packet. This patch also converts the address to
> > > 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
> > >
> > > Signed-off-by: Leo Yan <[email protected]>
> > > ---
> > > tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++
> > > tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++---
> > > tools/perf/util/cs-etm.c | 15 +++++++++++----
> > > 3 files changed, 25 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > index 5efb616..9d52727 100644
> > > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
> > > }
> > >
> > > static ocsd_datapath_resp_t
> > > +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> > > + const uint8_t trace_chan_id)
> > > +{
> > > + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > + CS_ETM_TRACE_OFF);
> > > +}
> > > +
> > > +static ocsd_datapath_resp_t
> > > cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> > > const uint8_t trace_chan_id)
> > > {
> > > @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
> > > case OCSD_GEN_TRC_ELEM_UNKNOWN:
> > > break;
> > > case OCSD_GEN_TRC_ELEM_NO_SYNC:
> > > + resp = cs_etm_decoder__buffer_trace_off(decoder,
> > > + trace_chan_id);
> >
> > If we want to handle NO_SYNC element types, why introduce a "trace_off"
> > function? Wouldn't it make more sense to call it
> > cs_etm_decoder__buffer_no_sync() ?
>
> Will change to cs_etm_decoder__buffer_no_sync().
>
> > > decoder->trace_on = false;
> > > break;
> > > case OCSD_GEN_TRC_ELEM_TRACE_ON:
> > > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > > index 9351bd1..a38c97c 100644
> > > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
> > > @@ -23,9 +23,10 @@ struct cs_etm_buffer {
> > > };
> > >
> > > enum cs_etm_sample_type {
> > > - CS_ETM_EMPTY = 0,
> > > - CS_ETM_RANGE = 1 << 0,
> > > - CS_ETM_TRACE_ON = 1 << 1,
> > > + CS_ETM_EMPTY = 0,
> > > + CS_ETM_RANGE = 1 << 0,
> > > + CS_ETM_TRACE_ON = 1 << 1,
> > > + CS_ETM_TRACE_OFF = 1 << 2,
> >
> > CS_ETM_NO_SYNC, see above comment. And please don't use indentation.
>
> Will do this.
>
> > > };
> > >
> > > enum cs_etm_isa {
> > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > > index f4fa877..2a0cef9 100644
> > > --- a/tools/perf/util/cs-etm.c
> > > +++ b/tools/perf/util/cs-etm.c
> > > @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> > >
> > > static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > > {
> > > - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> > > - if (packet->sample_type == CS_ETM_TRACE_ON)
> > > + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> > > + if (packet->sample_type == CS_ETM_TRACE_ON ||
> > > + packet->sample_type == CS_ETM_TRACE_OFF)
> > > return 0;
> > >
> > > return packet->start_addr;
> > > @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > > static inline
> > > u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet)
> > > {
> > > - /* Returns 0 for the CS_ETM_TRACE_ON packet */
> > > - if (packet->sample_type == CS_ETM_TRACE_ON)
> > > + /* Returns 0 for TRACE_ON and TRACE_OFF packets */
> > > + if (packet->sample_type == CS_ETM_TRACE_ON ||
> > > + packet->sample_type == CS_ETM_TRACE_OFF)
> > > return 0;
> > >
> > > return packet->end_addr - packet->last_instr_size;
> > > @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
> > > if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
> > > generate_sample = true;
> > >
> > > + /* Generate sample for tracing off packet */
> > > + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
> > > + generate_sample = true;
> > > +
> > > /* Generate sample for branch taken packet */
> > > if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
> > > etmq->prev_packet->last_instr_taken_branch)
> > > @@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
> > > cs_etm__sample(etmq);
> > > break;
> > > case CS_ETM_TRACE_ON:
> > > + case CS_ETM_TRACE_OFF:
> > > /*
> > > * Discontinuity in trace, flush
> > > * previous branch stack
> > > --
> > > 2.7.4
> > >

2018-12-05 18:05:01

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] perf cs-etm: Track exception number

On Tue, 4 Dec 2018 at 20:49, <[email protected]> wrote:
>
> On Mon, Nov 19, 2018 at 01:47:49PM -0700, Mathieu Poirier wrote:
> > On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
> > > When an exception packet comes, it contains the info for exception
> > > number; the exception number indicates the exception types, so from it
> > > we can know if the exception is taken for interrupt, system call or
> > > other traps, etc. But because the exception return packet cannot
> > > delivery exception number correctly by decoder thus when prepare sample
> > > flags we cannot know what's type for exception return.
> > >
> > > This patch adds a new 'exc_num' array in decoder structure to record
> > > exception number per CPU, the exception number is recorded in the array
> > > when the exception packet comes and this exception number can be used by
> > > exception return packet. If detect there have discontinuous trace with
> > > TRACE_ON or TRACE_OFF packet, the exception number is set to invalid
> > > value.
> > >
> > > Signed-off-by: Leo Yan <[email protected]>
> > > ---
> > > tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++---
> > > 1 file changed, 59 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > index b8cb7a3e..d1a6cbc 100644
> > > --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> > > @@ -43,6 +43,7 @@ struct cs_etm_decoder {
> > > u32 packet_count;
> > > u32 head;
> > > u32 tail;
> > > + u32 *exc_num;
> > > struct cs_etm_packet packet_buffer[MAX_BUFFER];
> > > };
> > >
> > > @@ -368,24 +369,64 @@ static ocsd_datapath_resp_t
> > > cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
> > > const uint8_t trace_chan_id)
> > > {
> > > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > - CS_ETM_TRACE_OFF);
> > > + int ret;
> > > + struct cs_etm_packet *packet;
> > > +
> > > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > + CS_ETM_TRACE_OFF);
> > > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > > + return ret;
> > > +
> > > + packet = &decoder->packet_buffer[decoder->tail];
> > > +
> > > + /* Clear execption number for discontinuous trace */
> > > + decoder->exc_num[packet->cpu] = UINT32_MAX;
> > > +
> > > + return ret;
> > > }
> > >
> > > static ocsd_datapath_resp_t
> > > cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder,
> > > const uint8_t trace_chan_id)
> > > {
> > > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > - CS_ETM_TRACE_ON);
> > > + int ret;
> > > + struct cs_etm_packet *packet;
> > > +
> > > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > + CS_ETM_TRACE_ON);
> > > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > > + return ret;
> > > +
> > > + packet = &decoder->packet_buffer[decoder->tail];
> > > +
> > > + /* Clear execption number for discontinuous trace */
> > > + decoder->exc_num[packet->cpu] = UINT32_MAX;
> > > +
> > > + return ret;
> > > }
> > >
> > > static ocsd_datapath_resp_t
> > > cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
> > > + const ocsd_generic_trace_elem *elem,
> > > const uint8_t trace_chan_id)
> > > {
> > > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > - CS_ETM_EXCEPTION);
> > > + int ret;
> > > + struct cs_etm_packet *packet;
> > > +
> > > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > + CS_ETM_EXCEPTION);
> > > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > > + return ret;
> > > +
> > > + packet = &decoder->packet_buffer[decoder->tail];
> > > +
> > > + /*
> > > + * Exception number is recorded per CPU and later can be used
> > > + * for exception return instruction analysis.
> > > + */
> > > + decoder->exc_num[packet->cpu] = elem->exception_number;
> >
> > Am I missing something or the information about the exception number that is
> > recorded here isn't used anywhere?
>
> The exception number will be used to set branch flag patch [1].

Right, I realised that when I started reviewing that set. The rule of
thumb here is to introduce code in the same patchset it is used so
that we avoid adding needless code to the kernel.

> According to exception number we can know it's for system call,
> interrupt or other traps.
>
> [1] http://archive.armlinux.org.uk/lurker/message/20181111.050755.d1c1b257.en.html
>
> > If you want to use this in perf report/script,
> > the exception number will have to be added to the cs_etm_packet struct.
>
> Actually before has discussed this with Mike but found it's hard to
> save the exception number in cs_etm_packet struct. The reason is the
> exception packet contains the correct exception number, but the
> exception return packet doesn't contain exception number. Thus this
> patch uses cs_etm_decoder struct to save exception number per CPU
> context when receive exception packet, and later the saved exception
> number will be used by exception return packet.
>
> Please see related discussion at the end of page [2].

I find Mike's point about the possibility of seeing exception returns
without having a prior exception due to various factors very
interesting. I will make sure to keep an eye out for that in the next
revision.

>
> [2] https://lists.linaro.org/pipermail/coresight/2018-October/001832.html
>
> > I am done with the revision of this set.
>
> Thanks a lot for reviewing.
>
> [...]
>
> Thanks,
> Leo Yan

2018-12-06 05:47:06

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 3/5] perf cs-etm: Support for NO_SYNC packet

On Wed, Dec 05, 2018 at 10:48:45AM -0700, Mathieu Poirier wrote:

[...]

> > So I think there still have some difference between TRACE_ON and
> > NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's
> > also possible caused by tracing discontinuity; NO_SYNC packets usually
> > caused by an unsynchronised state. But both of them presents
> > discontinuity in perf trace data.
> >
> > I prefer to use two different packet types to present them, this can
> > let the code to be more readable and easier to be maintained later.
>
> Absolutely. Make sure to get to the bottom of the story with Mike
> and/or Robert before sending your next patchset.

Sure, will sync with Mike/Rob.

Thanks,
Leo Yan

2018-12-06 05:50:47

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] perf cs-etm: Track exception number

On Wed, Dec 05, 2018 at 11:03:29AM -0700, Mathieu Poirier wrote:

[...]

> > > > static ocsd_datapath_resp_t
> > > > cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
> > > > + const ocsd_generic_trace_elem *elem,
> > > > const uint8_t trace_chan_id)
> > > > {
> > > > - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > > - CS_ETM_EXCEPTION);
> > > > + int ret;
> > > > + struct cs_etm_packet *packet;
> > > > +
> > > > + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
> > > > + CS_ETM_EXCEPTION);
> > > > + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
> > > > + return ret;
> > > > +
> > > > + packet = &decoder->packet_buffer[decoder->tail];
> > > > +
> > > > + /*
> > > > + * Exception number is recorded per CPU and later can be used
> > > > + * for exception return instruction analysis.
> > > > + */
> > > > + decoder->exc_num[packet->cpu] = elem->exception_number;
> > >
> > > Am I missing something or the information about the exception number that is
> > > recorded here isn't used anywhere?
> >
> > The exception number will be used to set branch flag patch [1].
>
> Right, I realised that when I started reviewing that set. The rule of
> thumb here is to introduce code in the same patchset it is used so
> that we avoid adding needless code to the kernel.

Will move this patch into sample flag series.

Thanks,
Leo Yan