2012-05-04 17:11:49

by Jiri Olsa

[permalink] [raw]
Subject: [PATCHv2 0/4] perf, tool: Fix endian issues

hi,
sending fixies to properly handle perf.data endians.

v2 changes:
- added patches 3 and 4 to handle sample_id_all header endianity

Attached patches:
1/4 perf, tool: Handle different endians properly during symbol load
2/4 perf, tool: Carry perf_event_attr bitfield throught different endians
3/4 perf, tool: Handle endianity swap on sample_id_all header data
4/4 perf, tool: Fix 32 bit values endianity swap for sample_id_all header

thanks,
jirka
---
tools/perf/util/evsel.c | 32 +++++++++++---
tools/perf/util/session.c | 101 +++++++++++++++++++++++++++++++++++++++-----
tools/perf/util/symbol.c | 33 ++++++++++++++-
tools/perf/util/symbol.h | 30 +++++++++++++
4 files changed, 176 insertions(+), 20 deletions(-)


2012-05-04 17:11:53

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH 2/4] perf, tool: Carry perf_event_attr bitfield throught different endians

When the perf data file is read cross architectures, the perf_event__attr_swap
function takes care about endianness of all the struct fields except the
bitfield flags.

The bitfield flags need to be transformed as well, since the bitfield
binary storage differs for both endians.

ABI says:
Bit-fields are allocated from right to left (least to most significant)
on little-endian implementations and from left to right (most to least
significant) on big-endian implementations.

The above seems to be byte specific, so we need to reverse each
byte of the bitfield. 'Internet' also says this might be implementation
specific and we probably need proper fix and carry perf_event_attr
bitfield flags in separate data file FEAT_ section. Thought this seems
to work for now.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/session.c | 34 ++++++++++++++++++++++++++++++++++
1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1efd3be..07fda7c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -481,6 +481,38 @@ static void perf_event__read_swap(union perf_event *event)
event->read.id = bswap_64(event->read.id);
}

+static u8 revbyte(u8 b)
+{
+ int rev = (b >> 4) | ((b & 0xf) << 4);
+ rev = ((rev & 0xcc) >> 2) | ((rev & 0x33) << 2);
+ rev = ((rev & 0xaa) >> 1) | ((rev & 0x55) << 1);
+ return (u8) rev;
+}
+
+/*
+ * XXX this is hack in attempt to carry flags bitfield
+ * throught endian village. ABI says:
+ *
+ * Bit-fields are allocated from right to left (least to most significant)
+ * on little-endian implementations and from left to right (most to least
+ * significant) on big-endian implementations.
+ *
+ * The above seems to be byte specific, so we need to reverse each
+ * byte of the bitfield. 'Internet' also says this might be implementation
+ * specific and we probably need proper fix and carry perf_event_attr
+ * bitfield flags in separate data file FEAT_ section. Thought this seems
+ * to work for now.
+ */
+static void swap_bitfield(u8 *p, unsigned len)
+{
+ unsigned i;
+
+ for (i = 0; i < len; i++) {
+ *p = revbyte(*p);
+ p++;
+ }
+}
+
/* exported for swapping attributes in file header */
void perf_event__attr_swap(struct perf_event_attr *attr)
{
@@ -494,6 +526,8 @@ void perf_event__attr_swap(struct perf_event_attr *attr)
attr->bp_type = bswap_32(attr->bp_type);
attr->bp_addr = bswap_64(attr->bp_addr);
attr->bp_len = bswap_64(attr->bp_len);
+
+ swap_bitfield((u8 *) (&attr->read_format + 1), sizeof(u64));
}

static void perf_event__hdr_attr_swap(union perf_event *event)
--
1.7.7.6

2012-05-04 17:11:58

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH 4/4] perf, tool: Fix 32 bit values endianity swap for sample_id_all header

We swap the sample_id_all header by u64 pointers. Some members
of the header happen to be 32 bit values. We need to handle them
separatelly.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/evsel.c | 32 +++++++++++++++++++++++++-------
1 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8c13dbc..ea053b7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -403,16 +403,27 @@ int perf_evsel__open_per_thread(struct perf_evsel *evsel,
}

static int perf_event__parse_id_sample(const union perf_event *event, u64 type,
- struct perf_sample *sample)
+ struct perf_sample *sample,
+ bool swapped)
{
const u64 *array = event->sample.array;
+ union {
+ u64 val64;
+ u32 val32[2];
+ } u;

array += ((event->header.size -
sizeof(event->header)) / sizeof(u64)) - 1;

if (type & PERF_SAMPLE_CPU) {
- u32 *p = (u32 *)array;
- sample->cpu = *p;
+ u.val64 = *array;
+ if (swapped) {
+ /* undo swap of u64, then swap on individual u32s */
+ u.val64 = bswap_64(u.val64);
+ u.val32[0] = bswap_32(u.val32[0]);
+ }
+
+ sample->cpu = u.val32[0];
array--;
}

@@ -432,9 +443,16 @@ static int perf_event__parse_id_sample(const union perf_event *event, u64 type,
}

if (type & PERF_SAMPLE_TID) {
- u32 *p = (u32 *)array;
- sample->pid = p[0];
- sample->tid = p[1];
+ u.val64 = *array;
+ if (swapped) {
+ /* undo swap of u64, then swap on individual u32s */
+ u.val64 = bswap_64(u.val64);
+ u.val32[0] = bswap_32(u.val32[0]);
+ u.val32[1] = bswap_32(u.val32[1]);
+ }
+
+ sample->pid = u.val32[0];
+ sample->tid = u.val32[1];
}

return 0;
@@ -474,7 +492,7 @@ int perf_event__parse_sample(const union perf_event *event, u64 type,
if (event->header.type != PERF_RECORD_SAMPLE) {
if (!sample_id_all)
return 0;
- return perf_event__parse_id_sample(event, type, data);
+ return perf_event__parse_id_sample(event, type, data, swapped);
}

array = event->sample.array;
--
1.7.7.6

2012-05-04 17:12:17

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH 1/4] perf, tool: Handle different endians properly during symbol load

Currently we dont care about the file object's endianness. It's possible
we read buildid file object from different architecture than we are
currentlly running on. So we need to care about properly reading such
object's data - handle different endianness properly.

Adding:
needs_swap DSO field
dso__swap_init function to initialize DSO's needs_swap
DSO__READ to read the data with proper swaps

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/symbol.c | 33 ++++++++++++++++++++++++++++++++-
tools/perf/util/symbol.h | 30 ++++++++++++++++++++++++++++++
2 files changed, 62 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index ab9867b..a63f15e 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -323,6 +323,7 @@ struct dso *dso__new(const char *name)
dso->sorted_by_name = 0;
dso->has_build_id = 0;
dso->kernel = DSO_TYPE_USER;
+ dso->needs_swap = DSO_SWAP__UNSET;
INIT_LIST_HEAD(&dso->node);
}

@@ -1156,6 +1157,33 @@ static size_t elf_addr_to_index(Elf *elf, GElf_Addr addr)
return -1;
}

+static int dso__swap_init(struct dso *dso, unsigned char eidata)
+{
+ static unsigned int const endian = 1;
+
+ dso->needs_swap = DSO_SWAP__NO;
+
+ switch (eidata) {
+ case ELFDATA2LSB:
+ /* We are big endian, DSO is little endian. */
+ if (*(unsigned char const *)&endian != 1)
+ dso->needs_swap = DSO_SWAP__YES;
+ break;
+
+ case ELFDATA2MSB:
+ /* We are little endian, DSO is big endian. */
+ if (*(unsigned char const *)&endian != 0)
+ dso->needs_swap = DSO_SWAP__YES;
+ break;
+
+ default:
+ pr_err("unrecognized DSO data encoding %d\n", eidata);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int dso__load_sym(struct dso *dso, struct map *map, const char *name,
int fd, symbol_filter_t filter, int kmodule,
int want_symtab)
@@ -1187,6 +1215,9 @@ static int dso__load_sym(struct dso *dso, struct map *map, const char *name,
goto out_elf_end;
}

+ if (dso__swap_init(dso, ehdr.e_ident[EI_DATA]))
+ goto out_elf_end;
+
/* Always reject images with a mismatched build-id: */
if (dso->has_build_id) {
u8 build_id[BUILD_ID_SIZE];
@@ -1272,7 +1303,7 @@ static int dso__load_sym(struct dso *dso, struct map *map, const char *name,
if (opdsec && sym.st_shndx == opdidx) {
u32 offset = sym.st_value - opdshdr.sh_addr;
u64 *opd = opddata->d_buf + offset;
- sym.st_value = *opd;
+ sym.st_value = DSO__READ(dso, u64, *opd);
sym.st_shndx = elf_addr_to_index(elf, sym.st_value);
}

diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1f00388..40a3254 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -9,6 +9,7 @@
#include <linux/list.h>
#include <linux/rbtree.h>
#include <stdio.h>
+#include <byteswap.h>

#ifdef HAVE_CPLUS_DEMANGLE
extern char *cplus_demangle(const char *, int);
@@ -160,11 +161,18 @@ enum dso_kernel_type {
DSO_TYPE_GUEST_KERNEL
};

+enum dso_swap_type {
+ DSO_SWAP__UNSET,
+ DSO_SWAP__NO,
+ DSO_SWAP__YES,
+};
+
struct dso {
struct list_head node;
struct rb_root symbols[MAP__NR_TYPES];
struct rb_root symbol_names[MAP__NR_TYPES];
enum dso_kernel_type kernel;
+ enum dso_swap_type needs_swap;
u8 adjust_symbols:1;
u8 has_build_id:1;
u8 hit:1;
@@ -182,6 +190,28 @@ struct dso {
char name[0];
};

+#define DSO__READ(dso, type, val) \
+({ \
+ type ____r = val; \
+ BUG_ON(dso->needs_swap == DSO_SWAP__UNSET); \
+ if (dso->needs_swap == DSO_SWAP__YES) { \
+ switch (sizeof(____r)) { \
+ case 2: \
+ ____r = bswap_16(val); \
+ break; \
+ case 4: \
+ ____r = bswap_32(val); \
+ break; \
+ case 8: \
+ ____r = bswap_64(val); \
+ break; \
+ default: \
+ BUG_ON(1); \
+ } \
+ } \
+ ____r; \
+})
+
struct dso *dso__new(const char *name);
void dso__delete(struct dso *dso);

--
1.7.7.6

2012-05-04 17:12:42

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH 3/4] perf, tool: Handle endianity swap on sample_id_all header data

Adding endianity swapping for event header attached via sample_id_all.

Currently we dont do that and it's causing wrong data to be read when
running report on architecture with different endianity than the record.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/session.c | 67 +++++++++++++++++++++++++++++++++++++--------
1 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 07fda7c..2df5e0c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -441,37 +441,65 @@ void mem_bswap_64(void *src, int byte_size)
}
}

-static void perf_event__all64_swap(union perf_event *event)
+static void swap_sample_id_all(union perf_event *event, void *data)
+{
+ void *end = (void *) event + event->header.size;
+ int size = end - data;
+
+ BUG_ON(size % sizeof(u64));
+ mem_bswap_64(data, size);
+}
+
+static void perf_event__all64_swap(union perf_event *event,
+ bool sample_id_all __used)
{
struct perf_event_header *hdr = &event->header;
mem_bswap_64(hdr + 1, event->header.size - sizeof(*hdr));
}

-static void perf_event__comm_swap(union perf_event *event)
+static void perf_event__comm_swap(union perf_event *event, bool sample_id_all)
{
event->comm.pid = bswap_32(event->comm.pid);
event->comm.tid = bswap_32(event->comm.tid);
+
+ if (sample_id_all) {
+ void *data = &event->comm.comm;
+
+ data += ALIGN(strlen(data), sizeof(u64));
+ swap_sample_id_all(event, data);
+ }
}

-static void perf_event__mmap_swap(union perf_event *event)
+static void perf_event__mmap_swap(union perf_event *event,
+ bool sample_id_all)
{
event->mmap.pid = bswap_32(event->mmap.pid);
event->mmap.tid = bswap_32(event->mmap.tid);
event->mmap.start = bswap_64(event->mmap.start);
event->mmap.len = bswap_64(event->mmap.len);
event->mmap.pgoff = bswap_64(event->mmap.pgoff);
+
+ if (sample_id_all) {
+ void *data = &event->mmap.filename;
+
+ data += ALIGN(strlen(data), sizeof(u64));
+ swap_sample_id_all(event, data);
+ }
}

-static void perf_event__task_swap(union perf_event *event)
+static void perf_event__task_swap(union perf_event *event, bool sample_id_all)
{
event->fork.pid = bswap_32(event->fork.pid);
event->fork.tid = bswap_32(event->fork.tid);
event->fork.ppid = bswap_32(event->fork.ppid);
event->fork.ptid = bswap_32(event->fork.ptid);
event->fork.time = bswap_64(event->fork.time);
+
+ if (sample_id_all)
+ swap_sample_id_all(event, &event->fork + 1);
}

-static void perf_event__read_swap(union perf_event *event)
+static void perf_event__read_swap(union perf_event *event, bool sample_id_all)
{
event->read.pid = bswap_32(event->read.pid);
event->read.tid = bswap_32(event->read.tid);
@@ -479,6 +507,9 @@ static void perf_event__read_swap(union perf_event *event)
event->read.time_enabled = bswap_64(event->read.time_enabled);
event->read.time_running = bswap_64(event->read.time_running);
event->read.id = bswap_64(event->read.id);
+
+ if (sample_id_all)
+ swap_sample_id_all(event, &event->read + 1);
}

static u8 revbyte(u8 b)
@@ -530,7 +561,8 @@ void perf_event__attr_swap(struct perf_event_attr *attr)
swap_bitfield((u8 *) (&attr->read_format + 1), sizeof(u64));
}

-static void perf_event__hdr_attr_swap(union perf_event *event)
+static void perf_event__hdr_attr_swap(union perf_event *event,
+ bool sample_id_all __used)
{
size_t size;

@@ -541,18 +573,21 @@ static void perf_event__hdr_attr_swap(union perf_event *event)
mem_bswap_64(event->attr.id, size);
}

-static void perf_event__event_type_swap(union perf_event *event)
+static void perf_event__event_type_swap(union perf_event *event,
+ bool sample_id_all __used)
{
event->event_type.event_type.event_id =
bswap_64(event->event_type.event_type.event_id);
}

-static void perf_event__tracing_data_swap(union perf_event *event)
+static void perf_event__tracing_data_swap(union perf_event *event,
+ bool sample_id_all __used)
{
event->tracing_data.size = bswap_32(event->tracing_data.size);
}

-typedef void (*perf_event__swap_op)(union perf_event *event);
+typedef void (*perf_event__swap_op)(union perf_event *event,
+ bool sample_id_all);

static perf_event__swap_op perf_event__swap_ops[] = {
[PERF_RECORD_MMAP] = perf_event__mmap_swap,
@@ -986,6 +1021,15 @@ static int perf_session__process_user_event(struct perf_session *session, union
}
}

+static void event_swap(union perf_event *event, bool sample_id_all)
+{
+ perf_event__swap_op swap;
+
+ swap = perf_event__swap_ops[event->header.type];
+ if (swap)
+ swap(event, sample_id_all);
+}
+
static int perf_session__process_event(struct perf_session *session,
union perf_event *event,
struct perf_tool *tool,
@@ -994,9 +1038,8 @@ static int perf_session__process_event(struct perf_session *session,
struct perf_sample sample;
int ret;

- if (session->header.needs_swap &&
- perf_event__swap_ops[event->header.type])
- perf_event__swap_ops[event->header.type](event);
+ if (session->header.needs_swap)
+ event_swap(event, session->sample_id_all);

if (event->header.type >= PERF_RECORD_HEADER_MAX)
return -EINVAL;
--
1.7.7.6

2012-05-04 17:40:23

by David Ahern

[permalink] [raw]
Subject: Re: [PATCHv2 0/4] perf, tool: Fix endian issues

On 5/4/12 11:11 AM, Jiri Olsa wrote:
> hi,
> sending fixies to properly handle perf.data endians.
>
> v2 changes:
> - added patches 3 and 4 to handle sample_id_all header endianity
>
> Attached patches:
> 1/4 perf, tool: Handle different endians properly during symbol load
> 2/4 perf, tool: Carry perf_event_attr bitfield throught different endians
> 3/4 perf, tool: Handle endianity swap on sample_id_all header data
> 4/4 perf, tool: Fix 32 bit values endianity swap for sample_id_all header

Have you tested these patches by collecting data on PPC (or other big
endian) and analyzing on Intel x86 and vice versa?

David

2012-05-04 17:59:58

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCHv2 0/4] perf, tool: Fix endian issues

Em Fri, May 04, 2012 at 11:40:17AM -0600, David Ahern escreveu:
> On 5/4/12 11:11 AM, Jiri Olsa wrote:
> >hi,
> >sending fixies to properly handle perf.data endians.
> >
> >v2 changes:
> > - added patches 3 and 4 to handle sample_id_all header endianity
> >
> >Attached patches:
> > 1/4 perf, tool: Handle different endians properly during symbol load
> > 2/4 perf, tool: Carry perf_event_attr bitfield throught different endians
> > 3/4 perf, tool: Handle endianity swap on sample_id_all header data
> > 4/4 perf, tool: Fix 32 bit values endianity swap for sample_id_all header
>
> Have you tested these patches by collecting data on PPC (or other
> big endian) and analyzing on Intel x86 and vice versa?

I think so, but indeed it is good practice to show the sequence made to
test the changes, with sample, short command output before and after.

- Arnaldo

2012-05-04 18:08:15

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCHv2 0/4] perf, tool: Fix endian issues

On Fri, May 04, 2012 at 02:59:40PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 04, 2012 at 11:40:17AM -0600, David Ahern escreveu:
> > On 5/4/12 11:11 AM, Jiri Olsa wrote:
> > >hi,
> > >sending fixies to properly handle perf.data endians.
> > >
> > >v2 changes:
> > > - added patches 3 and 4 to handle sample_id_all header endianity
> > >
> > >Attached patches:
> > > 1/4 perf, tool: Handle different endians properly during symbol load
> > > 2/4 perf, tool: Carry perf_event_attr bitfield throught different endians
> > > 3/4 perf, tool: Handle endianity swap on sample_id_all header data
> > > 4/4 perf, tool: Fix 32 bit values endianity swap for sample_id_all header
> >
> > Have you tested these patches by collecting data on PPC (or other
> > big endian) and analyzing on Intel x86 and vice versa?
>
> I think so, but indeed it is good practice to show the sequence made to
> test the changes, with sample, short command output before and after.
>
> - Arnaldo

yep, I should have mentioned that..

I did following testing with x86_64 and ppc64:

1. perf record -a -- sleep 10
2. perf report > report.origin
3. perf archive perf.data
4. copy the perf.data.tar.bz2 to a target system
5. tar xjvf perf.data.tar.bz2 -C ~/.debug
6. perf report > report.target
7. diff -u report.origin report.target

the diff should produce no output

hm, I did jsut x86_64 record and ppc64 report.. will do also
the other way around and resend with updated changelog

thanks,
jirka