Hi,
We are seeing an issue with doing Coresight decode off target where
initially the correct dso from ~/.debug is used, but after a new thread
in the perf.data file is passed with its mmap record, the local version
of the dso is picked up instead. This happens if the binary exists in the
same path on both devices, for example /bin/ls.
Initially when parsing the build-ids in the header, the dso for /bin/ls
will be created, and the file will correctly point to
~/.debug/bin/ls/2f15ad836be3339dec0e2e6a3c637e08e48aacbd/elf, but for any
new threads or mmaps that are also for /bin/ls, they will not have a
build-id set so they point to /bin/ls on the local machine rather than the
debug folder.
To fix this I made it possible to look up which existing dsos have
build id's set that originate from the header and then copy that build-id
onto the new dso if the name matches. Another way to do it would
be to stop comparing the mmap id so it matches on filename only, but I
think we do want to differentiate between different mmaps, even if they
have the same name, which is how it works in this version.
Applies to perf/core 859f7e4554
Thanks
James
James Clark (1):
perf: Set build-id using build-id header on new mmap records
tools/perf/util/dso.h | 1 +
tools/perf/util/header.c | 1 +
tools/perf/util/map.c | 16 ++++++++++++++--
3 files changed, 16 insertions(+), 2 deletions(-)
--
2.28.0
MMAP records that occur after the build-id header is parsed do not have
their build-id set even if the filename matches an entry from the
header. Set the build-id on these dsos as long as the MMAP record
doesn't have its own build-id set.
This fixes an issue with off target analysis where the local version of
a dso is loaded rather than one from ~/.debug via a build-id.
Reported-by: Denis Nikitin <[email protected]>
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/dso.h | 1 +
tools/perf/util/header.c | 1 +
tools/perf/util/map.c | 16 ++++++++++++++--
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 011da3924fc1..3a9fd4d389b5 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -167,6 +167,7 @@ struct dso {
enum dso_load_errno load_errno;
u8 adjust_symbols:1;
u8 has_build_id:1;
+ u8 header_build_id:1;
u8 has_srcline:1;
u8 hit:1;
u8 annotate_warned:1;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6da12e522edc..571d73d4f976 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
build_id__init(&bid, bev->data, size);
dso__set_build_id(dso, &bid);
+ dso->header_build_id = 1;
if (dso_space != DSO_SPACE__USER) {
struct kmod_path m = { .name = NULL, };
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 1803d3887afe..4ae91e491e23 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
if (map != NULL) {
char newfilename[PATH_MAX];
- struct dso *dso;
+ struct dso *dso, *header_bid_dso;
int anon, no_dso, vdso, android;
android = is_android_lib(filename);
@@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
if (build_id__is_defined(bid))
dso__set_build_id(dso, bid);
-
+ else {
+ /*
+ * If the mmap event had no build ID, search for an existing dso from the
+ * build ID header by name. Otherwise only the dso loaded at the time of
+ * reading the header will have the build ID set and all future mmaps will
+ * have it missing.
+ */
+ header_bid_dso = __dsos__find(&machine->dsos, filename, false);
+ if (header_bid_dso && header_bid_dso->header_build_id) {
+ dso__set_build_id(dso, &header_bid_dso->bid);
+ dso->header_build_id = 1;
+ }
+ }
dso__put(dso);
}
return map;
--
2.28.0
On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
> MMAP records that occur after the build-id header is parsed do not have
> their build-id set even if the filename matches an entry from the
> header. Set the build-id on these dsos as long as the MMAP record
> doesn't have its own build-id set.
>
> This fixes an issue with off target analysis where the local version of
> a dso is loaded rather than one from ~/.debug via a build-id.
nice catch :)
>
> Reported-by: Denis Nikitin <[email protected]>
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/util/dso.h | 1 +
> tools/perf/util/header.c | 1 +
> tools/perf/util/map.c | 16 ++++++++++++++--
> 3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 011da3924fc1..3a9fd4d389b5 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -167,6 +167,7 @@ struct dso {
> enum dso_load_errno load_errno;
> u8 adjust_symbols:1;
> u8 has_build_id:1;
> + u8 header_build_id:1;
> u8 has_srcline:1;
> u8 hit:1;
> u8 annotate_warned:1;
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 6da12e522edc..571d73d4f976 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>
> build_id__init(&bid, bev->data, size);
> dso__set_build_id(dso, &bid);
> + dso->header_build_id = 1;
>
> if (dso_space != DSO_SPACE__USER) {
> struct kmod_path m = { .name = NULL, };
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 1803d3887afe..4ae91e491e23 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>
> if (map != NULL) {
> char newfilename[PATH_MAX];
> - struct dso *dso;
> + struct dso *dso, *header_bid_dso;
> int anon, no_dso, vdso, android;
>
> android = is_android_lib(filename);
> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>
> if (build_id__is_defined(bid))
> dso__set_build_id(dso, bid);
> -
> + else {
nit please add { } to the if clause as well
> + /*
> + * If the mmap event had no build ID, search for an existing dso from the
> + * build ID header by name. Otherwise only the dso loaded at the time of
> + * reading the header will have the build ID set and all future mmaps will
> + * have it missing.
> + */
> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
is this 'perf top' safe? I think dso should be added in the
same thread, but please check and add comment why we don't
need locking in here
thanks,
jirka
> + if (header_bid_dso && header_bid_dso->header_build_id) {
> + dso__set_build_id(dso, &header_bid_dso->bid);
> + dso->header_build_id = 1;
> + }
> + }
> dso__put(dso);
> }
> return map;
> --
> 2.28.0
>
On 27/02/2022 22:50, Jiri Olsa wrote:
> On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
>> MMAP records that occur after the build-id header is parsed do not have
>> their build-id set even if the filename matches an entry from the
>> header. Set the build-id on these dsos as long as the MMAP record
>> doesn't have its own build-id set.
>>
>> This fixes an issue with off target analysis where the local version of
>> a dso is loaded rather than one from ~/.debug via a build-id.
>
> nice catch :)
>
>>
>> Reported-by: Denis Nikitin <[email protected]>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> tools/perf/util/dso.h | 1 +
>> tools/perf/util/header.c | 1 +
>> tools/perf/util/map.c | 16 ++++++++++++++--
>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 011da3924fc1..3a9fd4d389b5 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -167,6 +167,7 @@ struct dso {
>> enum dso_load_errno load_errno;
>> u8 adjust_symbols:1;
>> u8 has_build_id:1;
>> + u8 header_build_id:1;
>> u8 has_srcline:1;
>> u8 hit:1;
>> u8 annotate_warned:1;
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index 6da12e522edc..571d73d4f976 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>>
>> build_id__init(&bid, bev->data, size);
>> dso__set_build_id(dso, &bid);
>> + dso->header_build_id = 1;
>>
>> if (dso_space != DSO_SPACE__USER) {
>> struct kmod_path m = { .name = NULL, };
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index 1803d3887afe..4ae91e491e23 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (map != NULL) {
>> char newfilename[PATH_MAX];
>> - struct dso *dso;
>> + struct dso *dso, *header_bid_dso;
>> int anon, no_dso, vdso, android;
>>
>> android = is_android_lib(filename);
>> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (build_id__is_defined(bid))
>> dso__set_build_id(dso, bid);
>> -
>> + else {
>
> nit please add { } to the if clause as well
>
>> + /*
>> + * If the mmap event had no build ID, search for an existing dso from the
>> + * build ID header by name. Otherwise only the dso loaded at the time of
>> + * reading the header will have the build ID set and all future mmaps will
>> + * have it missing.
>> + */
>> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>
> is this 'perf top' safe? I think dso should be added in the
> same thread, but please check and add comment why we don't
> need locking in here
Seems like there are multiple synthesize_threads_workers using the same machine->dsos object so
I think locking is needed.
At first I thought of doing this:
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4ae91e491e23..b87b81e3d41c 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
* reading the header will have the build ID set and all future mmaps will
* have it missing.
*/
+ down_read(&machine->dsos.lock);
header_bid_dso = __dsos__find(&machine->dsos, filename, false);
+ up_read(&machine->dsos.lock);
if (header_bid_dso && header_bid_dso->header_build_id) {
dso__set_build_id(dso, &header_bid_dso->bid);
dso->header_build_id = 1;
But then I was wondering why it doesn't need a write lock all the way from machine__findnew_dso_id() to
dso__put()? At the moment there are writes to the dso like dso__set_loaded(), dso->nsinfo = nsi and
dso__set_build_id(), so another thread could find the dso in a partially constructed state.
Not sure if this is an issue currently without my patch, but at least with it they would have to be found
with header_build_id already set to 1 otherwise it will mess things up.
Extending the write lock outside of machine__findnew_dso_id() is difficult because it already
releases it before it returns. Does it need to be changed so that machine__findnew_dso_id() takes all the
arguments needed to construct it inside the lock?
James
>
> thanks,
> jirka
>
>> + if (header_bid_dso && header_bid_dso->header_build_id) {
>> + dso__set_build_id(dso, &header_bid_dso->bid);
>> + dso->header_build_id = 1;
>> + }
>> + }
>> dso__put(dso);
>> }
>> return map;
>> --
>> 2.28.0
>>
On Wed, Mar 02, 2022 at 04:20:00PM +0000, James Clark wrote:
>
>
> On 27/02/2022 22:50, Jiri Olsa wrote:
> > On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
> >> MMAP records that occur after the build-id header is parsed do not have
> >> their build-id set even if the filename matches an entry from the
> >> header. Set the build-id on these dsos as long as the MMAP record
> >> doesn't have its own build-id set.
> >>
> >> This fixes an issue with off target analysis where the local version of
> >> a dso is loaded rather than one from ~/.debug via a build-id.
> >
> > nice catch :)
> >
> >>
> >> Reported-by: Denis Nikitin <[email protected]>
> >> Signed-off-by: James Clark <[email protected]>
> >> ---
> >> tools/perf/util/dso.h | 1 +
> >> tools/perf/util/header.c | 1 +
> >> tools/perf/util/map.c | 16 ++++++++++++++--
> >> 3 files changed, 16 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> >> index 011da3924fc1..3a9fd4d389b5 100644
> >> --- a/tools/perf/util/dso.h
> >> +++ b/tools/perf/util/dso.h
> >> @@ -167,6 +167,7 @@ struct dso {
> >> enum dso_load_errno load_errno;
> >> u8 adjust_symbols:1;
> >> u8 has_build_id:1;
> >> + u8 header_build_id:1;
> >> u8 has_srcline:1;
> >> u8 hit:1;
> >> u8 annotate_warned:1;
> >> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> >> index 6da12e522edc..571d73d4f976 100644
> >> --- a/tools/perf/util/header.c
> >> +++ b/tools/perf/util/header.c
> >> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
> >>
> >> build_id__init(&bid, bev->data, size);
> >> dso__set_build_id(dso, &bid);
> >> + dso->header_build_id = 1;
> >>
> >> if (dso_space != DSO_SPACE__USER) {
> >> struct kmod_path m = { .name = NULL, };
> >> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> >> index 1803d3887afe..4ae91e491e23 100644
> >> --- a/tools/perf/util/map.c
> >> +++ b/tools/perf/util/map.c
> >> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
> >>
> >> if (map != NULL) {
> >> char newfilename[PATH_MAX];
> >> - struct dso *dso;
> >> + struct dso *dso, *header_bid_dso;
> >> int anon, no_dso, vdso, android;
> >>
> >> android = is_android_lib(filename);
> >> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
> >>
> >> if (build_id__is_defined(bid))
> >> dso__set_build_id(dso, bid);
> >> -
> >> + else {
> >
> > nit please add { } to the if clause as well
> >
> >> + /*
> >> + * If the mmap event had no build ID, search for an existing dso from the
> >> + * build ID header by name. Otherwise only the dso loaded at the time of
> >> + * reading the header will have the build ID set and all future mmaps will
> >> + * have it missing.
> >> + */
> >> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
> >
> > is this 'perf top' safe? I think dso should be added in the
> > same thread, but please check and add comment why we don't
> > need locking in here
>
> Seems like there are multiple synthesize_threads_workers using the same machine->dsos object so
> I think locking is needed.
>
> At first I thought of doing this:
>
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 4ae91e491e23..b87b81e3d41c 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
> * reading the header will have the build ID set and all future mmaps will
> * have it missing.
> */
> + down_read(&machine->dsos.lock);
> header_bid_dso = __dsos__find(&machine->dsos, filename, false);
> + up_read(&machine->dsos.lock);
> if (header_bid_dso && header_bid_dso->header_build_id) {
> dso__set_build_id(dso, &header_bid_dso->bid);
> dso->header_build_id = 1;
>
> But then I was wondering why it doesn't need a write lock all the way from machine__findnew_dso_id() to
> dso__put()? At the moment there are writes to the dso like dso__set_loaded(), dso->nsinfo = nsi and
> dso__set_build_id(), so another thread could find the dso in a partially constructed state.
I think that's fine, machine__findnew_dso_id takes the dso ref so
the real 'release' is when the machine object goes down at dsos__purge
>
> Not sure if this is an issue currently without my patch, but at least with it they would have to be found
> with header_build_id already set to 1 otherwise it will mess things up.
as for the partial changes I think it's also fine, because it happens
at separate places.. we'd need to investigate specific example to see
if there's a problem
>
> Extending the write lock outside of machine__findnew_dso_id() is difficult because it already
> releases it before it returns. Does it need to be changed so that machine__findnew_dso_id() takes all the
> arguments needed to construct it inside the lock?
the change above should do it.. as for top threads I don't think
it actually matters because there's events processing and hists
processing, and I did not find them clashing in this.. but there's
some setup session code that touches this, so it's better to be safe
thanks,
jirka
On 03/03/2022 11:36, Jiri Olsa wrote:
> On Wed, Mar 02, 2022 at 04:20:00PM +0000, James Clark wrote:
>>
>>
>> On 27/02/2022 22:50, Jiri Olsa wrote:
>>> On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
>>>> MMAP records that occur after the build-id header is parsed do not have
>>>> their build-id set even if the filename matches an entry from the
>>>> header. Set the build-id on these dsos as long as the MMAP record
>>>> doesn't have its own build-id set.
>>>>
>>>> This fixes an issue with off target analysis where the local version of
>>>> a dso is loaded rather than one from ~/.debug via a build-id.
>>>
>>> nice catch :)
>>>
>>>>
>>>> Reported-by: Denis Nikitin <[email protected]>
>>>> Signed-off-by: James Clark <[email protected]>
>>>> ---
>>>> tools/perf/util/dso.h | 1 +
>>>> tools/perf/util/header.c | 1 +
>>>> tools/perf/util/map.c | 16 ++++++++++++++--
>>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>>>> index 011da3924fc1..3a9fd4d389b5 100644
>>>> --- a/tools/perf/util/dso.h
>>>> +++ b/tools/perf/util/dso.h
>>>> @@ -167,6 +167,7 @@ struct dso {
>>>> enum dso_load_errno load_errno;
>>>> u8 adjust_symbols:1;
>>>> u8 has_build_id:1;
>>>> + u8 header_build_id:1;
>>>> u8 has_srcline:1;
>>>> u8 hit:1;
>>>> u8 annotate_warned:1;
>>>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>>>> index 6da12e522edc..571d73d4f976 100644
>>>> --- a/tools/perf/util/header.c
>>>> +++ b/tools/perf/util/header.c
>>>> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>>>>
>>>> build_id__init(&bid, bev->data, size);
>>>> dso__set_build_id(dso, &bid);
>>>> + dso->header_build_id = 1;
>>>>
>>>> if (dso_space != DSO_SPACE__USER) {
>>>> struct kmod_path m = { .name = NULL, };
>>>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>>>> index 1803d3887afe..4ae91e491e23 100644
>>>> --- a/tools/perf/util/map.c
>>>> +++ b/tools/perf/util/map.c
>>>> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>>>
>>>> if (map != NULL) {
>>>> char newfilename[PATH_MAX];
>>>> - struct dso *dso;
>>>> + struct dso *dso, *header_bid_dso;
>>>> int anon, no_dso, vdso, android;
>>>>
>>>> android = is_android_lib(filename);
>>>> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>>>
>>>> if (build_id__is_defined(bid))
>>>> dso__set_build_id(dso, bid);
>>>> -
>>>> + else {
>>>
>>> nit please add { } to the if clause as well
>>>
>>>> + /*
>>>> + * If the mmap event had no build ID, search for an existing dso from the
>>>> + * build ID header by name. Otherwise only the dso loaded at the time of
>>>> + * reading the header will have the build ID set and all future mmaps will
>>>> + * have it missing.
>>>> + */
>>>> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>>>
>>> is this 'perf top' safe? I think dso should be added in the
>>> same thread, but please check and add comment why we don't
>>> need locking in here
>>
>> Seems like there are multiple synthesize_threads_workers using the same machine->dsos object so
>> I think locking is needed.
>>
>> At first I thought of doing this:
>>
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index 4ae91e491e23..b87b81e3d41c 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>> * reading the header will have the build ID set and all future mmaps will
>> * have it missing.
>> */
>> + down_read(&machine->dsos.lock);
>> header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>> + up_read(&machine->dsos.lock);
>> if (header_bid_dso && header_bid_dso->header_build_id) {
>> dso__set_build_id(dso, &header_bid_dso->bid);
>> dso->header_build_id = 1;
>>
>> But then I was wondering why it doesn't need a write lock all the way from machine__findnew_dso_id() to
>> dso__put()? At the moment there are writes to the dso like dso__set_loaded(), dso->nsinfo = nsi and
>> dso__set_build_id(), so another thread could find the dso in a partially constructed state.
>
> I think that's fine, machine__findnew_dso_id takes the dso ref so
> the real 'release' is when the machine object goes down at dsos__purge
>
>>
>> Not sure if this is an issue currently without my patch, but at least with it they would have to be found
>> with header_build_id already set to 1 otherwise it will mess things up.
>
> as for the partial changes I think it's also fine, because it happens
> at separate places.. we'd need to investigate specific example to see
> if there's a problem
>
>>
>> Extending the write lock outside of machine__findnew_dso_id() is difficult because it already
>> releases it before it returns. Does it need to be changed so that machine__findnew_dso_id() takes all the
>> arguments needed to construct it inside the lock?
>
> the change above should do it.. as for top threads I don't think
> it actually matters because there's events processing and hists
> processing, and I did not find them clashing in this.. but there's
> some setup session code that touches this, so it's better to be safe
Ok yes I think I agree. The header flag isn't important for perf top so having it all
constructed in place isn't necessary.
I've submitted v2 with this change.
Thanks for the review.
>
> thanks,
> jirka