In dso__split_kallsyms_for_kcore(), the current code adjusts a symbol's
address but only reinserts it into the rbtree if the symbol belongs to
another map. However, the expression used for the adjustment (pos->start -=
curr_map->start - curr_map->pgoff) can change the relative order of
two symbols (in the kcore case, symbols in different maps may still
share the same dso), which corrupts the
rbtree.
For example:
When using kcore:
# readelf -a /proc/kcore
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  ...
  LOAD           0x0000000000002000 0xffffffc000000000 0x0000000000000000  <-- kernel
                 0x000000007fc00000 0x000000007fc00000  RWE    1000
  LOAD           0xfffffffffc002000 0xffffffbffc000000 0x0000000000000000  <-- module
                 0x0000000004000000 0x0000000004000000  RWE    1000
For modules memory area:
map->start = 0xffffffbffc000000, map->pgoff = 0xfffffffffc002000
For normal kernel memory area:
map->start = 0xffffffc000000000, map->pgoff = 0x0000000000002000
Function A is a normal kernel function at: 0xffffffc00021b428.
Function B is a function in module at: 0xffffffbffc000000.
&A > &B before calling dso__split_kallsyms_for_kcore(), and they are
already in the rbtree.
During dso__split_kallsyms_for_kcore(), when adjusting symbols using
pos->start -= curr_map->start - curr_map->pgoff
pos->start for A becomes: (0xffffffc00021b428 - 0xffffffc000000000 + 0x0000000000002000) = 0x21d428
pos->start for B becomes: (0xffffffbffc000000 - 0xffffffbffc000000 + 0xfffffffffc002000) = 0xfffffffffc002000
Now &A < &B: the order has changed.
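The arithmetic above can be checked with a small standalone sketch. The struct and function names below (kcore_map, adjust_symbol, order_flips) are made up for illustration and are not perf code; only the subtraction itself and the addresses are taken from the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct map fields used by the adjustment. */
struct kcore_map {
	uint64_t start;	/* virtual address where the segment starts */
	uint64_t pgoff;	/* file offset of the segment in /proc/kcore */
};

/* Same arithmetic as: pos->start -= curr_map->start - curr_map->pgoff
 * (unsigned 64-bit, so the module case wraps around as in the example). */
static uint64_t adjust_symbol(uint64_t addr, const struct kcore_map *m)
{
	return addr - (m->start - m->pgoff);
}

/* Returns 1 if adjusting A and B flips their relative order. */
static int order_flips(void)
{
	const struct kcore_map kernel = { 0xffffffc000000000ULL, 0x0000000000002000ULL };
	const struct kcore_map module = { 0xffffffbffc000000ULL, 0xfffffffffc002000ULL };
	const uint64_t a = 0xffffffc00021b428ULL;	/* function A, kernel */
	const uint64_t b = 0xffffffbffc000000ULL;	/* function B, module */

	assert(a > b);	/* order before the adjustment */

	/* order after the adjustment */
	return adjust_symbol(a, &kernel) < adjust_symbol(b, &module);
}
```

Calling order_flips() from a test program should return 1 for these two addresses, which is the situation that leaves the rbtree unsorted.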
This patch rebuilds the rbtree unconditionally to ensure that it
always stays consistent.
Signed-off-by: Wang Nan <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
Here is my test result on my aarch64 system:
*Step 1: create kprobes*
[root@localhost ~]# ./perf_arm64 probe -m /tmp/kernel_module.ko my_func
Added new event:
probe:my_func (on my_func in kernel_module)
You can now use it in all perf tools, such as:
perf record -e probe:my_func -aR sleep 1
[root@localhost ~]# ./perf_arm64 probe sys_write
Added new event:
probe:sys_write (on sys_write)
You can now use it in all perf tools, such as:
perf record -e probe:sys_write -aR sleep 1
[root@localhost ~]# cat /sys/kernel/debug/kprobes/list
ffffffbffc000000 k my_func+0x0 kernel_module [DISABLED]
ffffffc00021b428 k SyS_write+0x0 [DISABLED]
*Step 2: rebuild perf without commit 98d3b25*
$ git log --oneline
3321d2b Revert "perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success"
e054731 perf stat: Make stat options global
0014de1 perf sched latency: Fix thread pid reuse issue
98d3b25 perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success
956959f perf trace: Fix documentation for -i
*Step 3: test and get the buggy result*
[root@localhost ~]# PAGER=cat ./perf_arm64 probe -l
Error: Failed to show event list.
[root@localhost ~]# PAGER=cat ./perf_arm64 probe -v -l
map_groups__set_modules_path_dir: cannot open /lib/modules/4.1.12+ dir
Problems setting modules path maps, continuing anyway...
Opening /sys/kernel/debug/tracing//kprobe_events write=0
Opening /sys/kernel/debug/tracing//uprobe_events write=0
Parsing probe_events: p:probe/my_func kernel_module:my_func
Group:probe Event:my_func probe:p
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
try to find information at 3ffc000000 in kernel_module
Failed to find module kernel_module.
Failed to find the path for kernel_module: [kernel_module]
Failed to find corresponding probes from debuginfo.
Failed to synthesize perf probe point: 0
Error: Failed to show event list. Reason: Invalid argument (Code: -22)
*Step 4: Introduce this patch*
$ git log --oneline
36a8201 perf tools: Rebuild rbtree when adjusting symbols for kcore
3321d2b Revert "perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success"
e054731 perf stat: Make stat options global
0014de1 perf sched latency: Fix thread pid reuse issue
98d3b25 perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success
*Step 5: Try again*
[root@localhost ~]# PAGER=cat ./perf_arm64 probe -l
probe:my_func (on my_func in kernel_module)
probe:sys_write (on sys_write)
[root@localhost ~]# PAGER=cat ./perf_arm64 probe -v -l
map_groups__set_modules_path_dir: cannot open /lib/modules/4.1.12+ dir
Problems setting modules path maps, continuing anyway...
Opening /sys/kernel/debug/tracing//kprobe_events write=0
Opening /sys/kernel/debug/tracing//uprobe_events write=0
Parsing probe_events: p:probe/my_func kernel_module:my_func
Group:probe Event:my_func probe:p
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
Failed to find corresponding probes from debuginfo.
Failed to find probe point from both of dwarf and map.
probe:my_func (on my_func in kernel_module)
Parsing probe_events: p:probe/sys_write _text+1684520
Group:probe Event:sys_write probe:p
try to find information at 19b428 in kernel
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Failed to find the path for kernel: Invalid ELF file
Failed to find corresponding probes from debuginfo.
probe:sys_write (on sys_write)
---
tools/perf/util/symbol.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4cc766..09bb6e8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -654,7 +654,7 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 	struct map_groups *kmaps = map__kmaps(map);
 	struct map *curr_map;
 	struct symbol *pos;
-	int count = 0, moved = 0;
+	int count = 0;
 	struct rb_root *root = &dso->symbols[map->type];
 	struct rb_node *next = rb_first(root);
 
@@ -677,25 +677,23 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 			rb_erase_init(&pos->rb_node, root);
 			symbol__delete(pos);
 		} else {
+			rb_erase_init(&pos->rb_node, root);
+
 			pos->start -= curr_map->start - curr_map->pgoff;
 			if (pos->end)
 				pos->end -= curr_map->start - curr_map->pgoff;
-			if (curr_map->dso != map->dso) {
-				rb_erase_init(&pos->rb_node, root);
-				symbols__insert(
-					&curr_map->dso->symbols[curr_map->type],
-					pos);
-				++moved;
-			} else {
-				++count;
-			}
+
+			symbols__insert(
+				&curr_map->dso->symbols[curr_map->type],
+				pos);
+			++count;
 		}
 	}
 
 	/* Symbols have been adjusted */
 	dso->adjust_symbols = 1;
 
-	return count + moved;
+	return count;
 }
 
 /*
--
1.8.3.4
On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
> address but only reinsert it into rbtree if the symbol belongs to
> another map. However, the expression for adjusting symbol (pos->start -=
> curr_map->start - curr_map->pgoff) can change the relative order between
> two symbols (even if the affected symbols are in different maps, in
> kcore case they are possible to share one same dso), which damages the
> rbtree.
Right, some code does change the symbol values it gets from whatever
symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
->reloc, members for that :-\
I.e. 'struct dso' should be just what comes from the symtab, while
'struct map' should be about where that DSO is in memory.
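As a rough illustration of that separation, the sketch below leaves symbol addresses untouched and translates only the address being looked up. The sketch_* names are hypothetical and the struct is a cut-down stand-in, not the real struct map; the arithmetic mirrors the start/pgoff relationship from the commit message, and the map bounds are derived from the readelf output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down version of the struct map fields used here. */
struct sketch_map {
	uint64_t start;	/* first memory address covered by the map */
	uint64_t end;	/* one past the last covered address */
	uint64_t pgoff;	/* offset of the mapping inside the DSO */
};

/* Memory address -> DSO-relative address. */
static uint64_t sketch_map_ip(const struct sketch_map *m, uint64_t ip)
{
	return ip - m->start + m->pgoff;
}

/* DSO-relative address -> memory address (inverse of the above). */
static uint64_t sketch_unmap_ip(const struct sketch_map *m, uint64_t rip)
{
	return rip + m->start - m->pgoff;
}

/* Pick the map covering ip, or NULL if none does. */
static const struct sketch_map *sketch_find_map(const struct sketch_map *maps,
						size_t n, uint64_t ip)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ip >= maps[i].start && ip < maps[i].end)
			return &maps[i];
	return NULL;
}

/* Returns 1 if both example addresses survive a map/unmap round trip. */
static int sketch_round_trip(void)
{
	static const struct sketch_map maps[] = {
		{ 0xffffffbffc000000ULL, 0xffffffc000000000ULL, 0xfffffffffc002000ULL }, /* modules */
		{ 0xffffffc000000000ULL, 0xffffffc07fc00000ULL, 0x0000000000002000ULL }, /* kernel */
	};
	const uint64_t addrs[] = { 0xffffffc00021b428ULL, 0xffffffbffc000000ULL };
	size_t i;

	for (i = 0; i < 2; i++) {
		const struct sketch_map *m = sketch_find_map(maps, 2, addrs[i]);

		assert(m != NULL);
		if (sketch_unmap_ip(m, sketch_map_ip(m, addrs[i])) != addrs[i])
			return 0;
	}
	return 1;
}
```

Because nothing stored in the symbol table is modified, the tree's sort order cannot be disturbed by the translation; whether this fits the kcore case is the open question in this thread.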
With that in mind, do you still think your fix is the correct one?
Adrian?
- Arnaldo
On 2015/11/6 21:19, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>> address but only reinsert it into rbtree if the symbol belongs to
>> another map. However, the expression for adjusting symbol (pos->start -=
>> curr_map->start - curr_map->pgoff) can change the relative order between
>> two symbols (even if the affected symbols are in different maps, in
>> kcore case they are possible to share one same dso), which damages the
>> rbtree.
> Right, some code does change the symbol values it gets from whatever
> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> ->reloc, members for that :-\
>
> I.e. 'struct dso' should be just what comes from the symtab, while
> 'struct map' should be about where that DSO is in memory.
>
> With that in mind, do you still think your fix is the correct one?
I'm not very sure. I'm not familiar with this part of the code, and to be
honest I don't understand the relationship between what you said
and what I found...
I spent a whole day answering Masami's question about why
kernel_get_symbol_address_by_name succeeds but __find_kernel_function()
fails on my platform, and described the result in the commit message.
This patch is the best one I can find. It solves my problem but may be
incorrect. I just want you and others to know my result. Please let
me know if you or others want further information. Its priority is now
low because patch 98d3b25 and Masami's update are already enough
for me.
I'll go back to the BPF stuff. There is still much work to do :-)
Thank you.
> Adrian?
>
> - Arnaldo
>
On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>> address but only reinsert it into rbtree if the symbol belongs to
>> another map. However, the expression for adjusting symbol (pos->start -=
>> curr_map->start - curr_map->pgoff) can change the relative order between
>> two symbols (even if the affected symbols are in different maps, in
>> kcore case they are possible to share one same dso), which damages the
>> rbtree.
>
> Right, some code does change the symbol values it gets from whatever
> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> ->reloc, members for that :-\
>
> I.e. 'struct dso' should be just what comes from the symtab, while
> 'struct map' should be about where that DSO is in memory.
>
> With that in mind, do you still think your fix is the correct one?
>
> Adrian?
The problem arises when the order in memory (in kallsyms) is different
from the order in the dso (kcore).
I think that, to make it more general, the symbols need to be inserted
into a new tree, e.g.:
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4cc7662677e..09343a880c0b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 	struct map_groups *kmaps = map__kmaps(map);
 	struct map *curr_map;
 	struct symbol *pos;
-	int count = 0, moved = 0;
+	int count = 0;
+	struct rb_root old_root = dso->symbols[map->type];
 	struct rb_root *root = &dso->symbols[map->type];
 	struct rb_node *next = rb_first(root);
 
 	if (!kmaps)
 		return -1;
 
+	*root = RB_ROOT;
+
 	while (next) {
 		char *module;
 
 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);
 
+		rb_erase_init(&pos->rb_node, &old_root);
+
 		module = strchr(pos->name, '\t');
 		if (module)
 			*module = '\0';
@@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 		curr_map = map_groups__find(kmaps, map->type, pos->start);
 
 		if (!curr_map || (filter && filter(curr_map, pos))) {
-			rb_erase_init(&pos->rb_node, root);
 			symbol__delete(pos);
-		} else {
-			pos->start -= curr_map->start - curr_map->pgoff;
-			if (pos->end)
-				pos->end -= curr_map->start - curr_map->pgoff;
-			if (curr_map->dso != map->dso) {
-				rb_erase_init(&pos->rb_node, root);
-				symbols__insert(
-					&curr_map->dso->symbols[curr_map->type],
-					pos);
-				++moved;
-			} else {
-				++count;
-			}
+			continue;
 		}
+
+		pos->start -= curr_map->start - curr_map->pgoff;
+		if (pos->end)
+			pos->end -= curr_map->start - curr_map->pgoff;
+		symbols__insert(&curr_map->dso->symbols[curr_map->type], pos);
+		++count;
 	}
 
 	/* Symbols have been adjusted */
 	dso->adjust_symbols = 1;
 
-	return count + moved;
+	return count;
 }
 
 /*
On Fri, Nov 06, 2015 at 09:34:55PM +0800, Wangnan (F) wrote:
> On 2015/11/6 21:19, Arnaldo Carvalho de Melo wrote:
> >On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
> >>In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
> >>address but only reinsert it into rbtree if the symbol belongs to
> >>another map. However, the expression for adjusting symbol (pos->start -=
> >>curr_map->start - curr_map->pgoff) can change the relative order between
> >>two symbols (even if the affected symbols are in different maps, in
> >>kcore case they are possible to share one same dso), which damages the
> >>rbtree.
> >Right, some code does change the symbol values it gets from whatever
> >symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> >map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> >->reloc, members for that :-\
> >
> >I.e. 'struct dso' should be just what comes from the symtab, while
> >'struct map' should be about where that DSO is in memory.
> >
> >With that in mind, do you still think your fix is the correct one?
>
> Not very sure. I'm not familiar with this part of code. Actually
> speaking I don't understand the relationship between what you said
> and what I found...
What I said is that no code should, how did you state it? Here it is:
"In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
address"
It should not, not before adding it to the rbtree, and especially not
_after_. Any adjustments should be done to 'struct map'.
> I spent a whole day to answer Masami's question that why
> kernel_get_symbol_address_by_name success but __find_kernel_function()
> fail in my platform, and described it in commit message.
Well, and that was a good exercise, I think, even one I wouldn't have
done, being as busy as you.
Your fix was perfectly fine; there was no strict need to figure out when
that would result in a problem. At that point, if sym was NULL it should
return -ENOENT, and since 'ret' was being overwritten...
> This patch is the best one I can find.
And I thank you for that: the investigation + the patch uncovered a bug.
We now need to find a fix, though you don't necessarily need to be the one to do it.
> It solves my problem but may be incorrect. Just want you and other
> know my result. Please let me know if you and other want further
> information. Now its priority is low because patch 98d3b25 and
> Masami's update are already enough for me.
Sure, let's move forward.
> I'll go back to BPF stuff. There are still much work to do :-)
Indeed, thank you for doing all this work!
- Arnaldo
Sent from my iPhone
> On Nov 6, 2015, at 9:59 PM, Adrian Hunter <[email protected]> wrote:
>
>> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>> address but only reinsert it into rbtree if the symbol belongs to
>>> another map. However, the expression for adjusting symbol (pos->start -=
>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>> two symbols (even if the affected symbols are in different maps, in
>>> kcore case they are possible to share one same dso), which damages the
>>> rbtree.
>>
>> Right, some code does change the symbol values it gets from whatever
>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>> ->reloc, members for that :-\
>>
>> I.e. 'struct dso' should be just what comes from the symtab, while
>> 'struct map' should be about where that DSO is in memory.
>>
>> With that in mind, do you still think your fix is the correct one?
>>
>> Adrian?
>
> The problem is when the order in memory (in kallsyms) is different
> to the order on the dso (kcore).
>
> I think to make it more general it needs to insert to a new tree.
> e.g.
>
Thanks for your quick reply, but I have left the office and won't have the time
or the environment to test your patch until next Wednesday. Would it be possible
for you to test it yourself?
Thank you.
Sent from my iPhone
> On Nov 6, 2015, at 10:03 PM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> On Fri, Nov 06, 2015 at 09:34:55PM +0800, Wangnan (F) wrote:
>> On 2015/11/6 21:19, Arnaldo Carvalho de Melo wrote:
>>> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>>> address but only reinsert it into rbtree if the symbol belongs to
>>>> another map. However, the expression for adjusting symbol (pos->start -=
>>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>>> two symbols (even if the affected symbols are in different maps, in
>>>> kcore case they are possible to share one same dso), which damages the
>>>> rbtree.
>>> Right, some code does change the symbol values it gets from whatever
>>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>>> ->reloc, members for that :-\
>>>
>>> I.e. 'struct dso' should be just what comes from the symtab, while
>>> 'struct map' should be about where that DSO is in memory.
>>>
>>> With that in mind, do you still think your fix is the correct one?
>>
>> Not very sure. I'm not familiar with this part of code. Actually
>> speaking I don't understand the relationship between what you said
>> and what I found...
>
> What I said is that no code should, how did you state it? Here it is:
>
> "In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
> address"
>
> It should not, not before adding it to the rbtree, and specially not
> _after, any adjustments should be done to 'struct map'.
>
>> I spent a whole day to answer Masami's question that why
>> kernel_get_symbol_address_by_name success but __find_kernel_function()
>> fail in my platform, and described it in commit message.
>
> Well, and that was a good exercise, I think, even one I wouldn't have
> done, being as busy as you.
>
> Your fix was perfectly fine, there was no strict need to figure out when
> that would result in problem, at that point, if sym was NULL it should
> return -ENOENT and since 'ret' was being overwritten...
>
>> This patch is the best one I can find.
>
> And I thank you for that, the investigation + the patch uncovered a bug.
> We now need to find a fix, but not necessarily you need to do that tho.
Thanks also to our great testing team. They found this bug and pushed me to
solve it.
Thank you.
>
>> It solves my problem but may be incorrect. Just want you and other
>> know my result. Please let me know if you and other want further
>> information. Now its priority is low because patch 98d3b25 and
>> Masami's update are already enough for me.
>
> Sure, lets move forward.
>
>> I'll go back to BPF stuff. There are still much work to do :-)
>
> Indeed, thank you for doing all this work!
>
> - Arnaldo
On Fri, Nov 06, 2015 at 03:59:29PM +0200, Adrian Hunter wrote:
> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
> > On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
> >> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
> >> address but only reinsert it into rbtree if the symbol belongs to
> >> another map. However, the expression for adjusting symbol (pos->start -=
> >> curr_map->start - curr_map->pgoff) can change the relative order between
> >> two symbols (even if the affected symbols are in different maps, in
> >> kcore case they are possible to share one same dso), which damages the
> >> rbtree.
> >
> > Right, some code does change the symbol values it gets from whatever
> > symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> > map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> > ->reloc, members for that :-\
> >
> > I.e. 'struct dso' should be just what comes from the symtab, while
> > 'struct map' should be about where that DSO is in memory.
> >
> > With that in mind, do you still think your fix is the correct one?
> >
> > Adrian?
>
> The problem is when the order in memory (in kallsyms) is different
> to the order on the dso (kcore).
What order? Can you elaborate a bit more? I was thinking more about keeping
whatever address is in the symtab from which we read the symbols, and
then creating one map per kernel module, all pointing to the same DSO, which
would be the one loaded from kallsyms.
Any adjustments would be done in the map, not the DSO.
I.e. we wouldn't be splitting anything, just creating struct map
instances pointing to the same DSO.
- Arnaldo
On 06/11/15 20:51, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 06, 2015 at 03:59:29PM +0200, Adrian Hunter wrote:
>> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>>> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>>> address but only reinsert it into rbtree if the symbol belongs to
>>>> another map. However, the expression for adjusting symbol (pos->start -=
>>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>>> two symbols (even if the affected symbols are in different maps, in
>>>> kcore case they are possible to share one same dso), which damages the
>>>> rbtree.
>>>
>>> Right, some code does change the symbol values it gets from whatever
>>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>>> ->reloc, members for that :-\
>>>
>>> I.e. 'struct dso' should be just what comes from the symtab, while
>>> 'struct map' should be about where that DSO is in memory.
>>>
>>> With that in mind, do you still think your fix is the correct one?
>>>
>>> Adrian?
>>
>> The problem is when the order in memory (in kallsyms) is different
>> to the order on the dso (kcore).
>
> What order? Can you elaborate a bit more?
Normally symbols are read from the DSO and adjusted, if need be, so that the
symbol start matches the file offset in the DSO file (we want the file
offset because that is what we know from MMAP events). That is done by
dso__load_sym() which inserts the symbols *after* adjusting them.
In the case of kcore, the symbols have been read from kallsyms and the
symbol start is the memory address. The symbols have to be adjusted to match
the kcore file offsets. dso__split_kallsyms_for_kcore() does that, but now
the adjustment is being done *after* the symbols have been inserted. It
appears dso__split_kallsyms_for_kcore() was assuming that changing the
symbol start would not change the order in the rbtree - which is, of course,
not guaranteed.
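
To make the reordering concrete, here is a small sketch in C that applies the same arithmetic to the two example symbols from the original report (a kernel function at 0xffffffc00021b428 and a module function at 0xffffffbffc000000). The names `kcore_map`, `adjust_to_file_offset` and `sorts_before` are made up for illustration and are not perf data structures or functions; only the subtraction mirrors `pos->start -= curr_map->start - curr_map->pgoff`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields of struct map used here. */
struct kcore_map {
	uint64_t start;	/* memory address where the map begins */
	uint64_t pgoff;	/* file offset of the map within kcore */
};

/* Same arithmetic as: pos->start -= curr_map->start - curr_map->pgoff */
static uint64_t adjust_to_file_offset(uint64_t addr, const struct kcore_map *m)
{
	return addr - (m->start - m->pgoff);
}

/* Returns 1 if key a sorts before key b, 0 otherwise. */
static int sorts_before(uint64_t a, uint64_t b)
{
	return a < b;
}
```

With the kernel map (start 0xffffffc000000000, pgoff 0x2000) and the module map (start 0xffffffbffc000000, pgoff 0xfffffffffc002000), the kernel function moves to 0x21d428 and the module function to 0xfffffffffc002000, so the symbol that sorted second before the adjustment sorts first afterwards.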
> I thought more about keeping
> whatever address is in the symtab from where we read the symbols, and
> then create one map per kernel module all pointing to the same DSO, that
> would be the one loaded from kallsyms.
>
> Any adjustments would be done in the map, not the DSO.
>
> I.e. we wouldn't be splitting anything, just creating struct map
> instances pointing to the same DSO.
>
> - Arnaldo
>
Em Mon, Nov 09, 2015 at 10:26:13AM +0200, Adrian Hunter escreveu:
> On 06/11/15 20:51, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Nov 06, 2015 at 03:59:29PM +0200, Adrian Hunter escreveu:
> >> The problem is when the order in memory (in kallsyms) is different
> >> to the order on the dso (kcore).
> > > What order? Can you elaborate a bit more?
> Normally symbols are read from the DSO and adjusted, if need be, so that the
> symbol start matches the file offset in the DSO file (we want the file
> offset because that is what we know from MMAP events). That is done by
> dso__load_sym() which inserts the symbols *after* adjusting them.
> In the case of kcore, the symbols have been read from kallsyms and the
> symbol start is the memory address. The symbols have to be adjusted to match
> the kcore file offsets. dso__split_kallsyms_for_kcore() does that, but now
So you're saying that some symbols get adjusted, by say X bytes, while
some other symbols are adjusted by a different, Y value, or are _all_
the symbols adjusted by the same value, i.e. one that could be adjusted
in 'struct map' instead?
> the adjustment is being done *after* the symbols have been inserted. It
> appears dso__split_kallsyms_for_kcore() was assuming that changing the
> symbol start would not change the order in the rbtree - which is, of course,
> not guaranteed.
Sure, the minimal fix should be not to change the key (sym->start/end)
after you add it to an rbtree that uses that key.
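
As a rough illustration of that rule, the sketch below detaches every node from the old tree, changes the key only while the node is detached, and then inserts it into a fresh tree. It uses a plain unbalanced binary search tree rather than the kernel rbtree, and `struct sym`, `sym_insert`, `sym_pop_first` and `sym_rebuild` are hypothetical names, not perf APIs. A single `delta` is used for brevity; in perf the amount depends on the map the symbol falls in.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol node in a plain (unbalanced) binary search tree. */
struct sym {
	uint64_t start;			/* sort key */
	struct sym *left, *right;
};

/* Insert by key; the node's child links are reset first. */
static void sym_insert(struct sym **root, struct sym *s)
{
	s->left = s->right = NULL;
	while (*root)
		root = s->start < (*root)->start ? &(*root)->left
						 : &(*root)->right;
	*root = s;
}

/* Detach and return the smallest node, or NULL when the tree is empty. */
static struct sym *sym_pop_first(struct sym **root)
{
	struct sym *s;

	if (!*root)
		return NULL;
	while ((*root)->left)
		root = &(*root)->left;
	s = *root;
	*root = s->right;	/* the smallest node has no left child */
	return s;
}

/*
 * Move every node from *root into a fresh tree. The key is modified
 * only while the node belongs to no tree, so neither tree ever holds
 * a node whose key changed underneath it.
 */
static void sym_rebuild(struct sym **root, uint64_t delta)
{
	struct sym *old_root = *root;
	struct sym *s;

	*root = NULL;
	while ((s = sym_pop_first(&old_root)) != NULL) {
		s->start -= delta;
		sym_insert(root, s);
	}
}
```

The design choice is the same one the proposed patch makes: rather than tracking which nodes might have moved relative to their neighbours, every node is reinserted, which costs more work but leaves no stale ordering behind.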
> > I thought more about keeping
> > whatever address is in the symtab from where we read the symbols, and
> > then create one map per kernel module all pointing to the same DSO, that
> > would be the one loaded from kallsyms.
> >
> > > Any adjustments would be done in the map, not the DSO.
> >
> > I.e. we wouldn't be splitting anything, just creating struct map
> > instances pointing to the same DSO.
> >
> > - Arnaldo
> >
On 09/11/15 16:56, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 09, 2015 at 10:26:13AM +0200, Adrian Hunter escreveu:
>> On 06/11/15 20:51, Arnaldo Carvalho de Melo wrote:
>>> Em Fri, Nov 06, 2015 at 03:59:29PM +0200, Adrian Hunter escreveu:
>>>> The problem is when the order in memory (in kallsyms) is different
>>>> to the order on the dso (kcore).
>
>>> What order? Can you elaborate a bit more?
>
>> Normally symbols are read from the DSO and adjusted, if need be, so that the
>> symbol start matches the file offset in the DSO file (we want the file
>> offset because that is what we know from MMAP events). That is done by
>> dso__load_sym() which inserts the symbols *after* adjusting them.
>
>> In the case of kcore, the symbols have been read from kallsyms and the
>> symbol start is the memory address. The symbols have to be adjusted to match
>> the kcore file offsets. dso__split_kallsyms_for_kcore() does that, but now
>
> So you're saying that some symbols get adjusted, by say X bytes, while
> some other symbols are adjusted by a different, Y value, or are _all_
> the symbols adjusted by the same value, i.e. one that could be adjusted
> in 'struct map' instead?
Yes X != Y. The maps are correct - just the symbols need adjusting.
>
>> the adjustment is being done *after* the symbols have been inserted. It
>> appears dso__split_kallsyms_for_kcore() was assuming that changing the
>> symbol start would not change the order in the rbtree - which is, of course,
>> not guaranteed.
>
> Sure, the minimal fix should be not to change the key (sym->start/end)
> after you add it to an rbtree that uses that key.
>
>>> I thought more about keeping
>>> whatever address is in the symtab from where we read the symbols, and
>>> then create one map per kernel module all pointing to the same DSO, that
>>> would be the one loaded from kallsyms.
>>>
>>> Any adjustments would be done in the map, not the DSO.
>>>
>>> I.e. we wouldn't be splitting anything, just creating struct map
>>> instances pointing to the same DSO.
>>>
>>> - Arnaldo
>>>
On 2015/11/6 21:59, Adrian Hunter wrote:
> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>> Em Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan escreveu:
>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>> address but only reinsert it into rbtree if the symbol belongs to
>>> another map. However, the expression for adjusting symbol (pos->start -=
>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>> two symbols (even if the affected symbols are in different maps, in
>>> kcore case they are possible to share one same dso), which damages the
>>> rbtree.
>> Right, some code does change the symbol values it gets from whatever
>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>> ->reloc, members for that :-\
>>
>> I.e. 'struct dso' should be just what comes from the symtab, while
>> 'struct map' should be about where that DSO is in memory.
>>
>> With that in mind, do you still think your fix is the correct one?
>>
>> Adrian?
> The problem is when the order in memory (in kallsyms) is different
> to the order on the dso (kcore).
>
> I think to make it more general it needs to insert to a new tree.
> e.g.
>
I have tested this patch and it works for me.
Thank you.
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index b4cc7662677e..09343a880c0b 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
> struct map_groups *kmaps = map__kmaps(map);
> struct map *curr_map;
> struct symbol *pos;
> - int count = 0, moved = 0;
> + int count = 0;
> + struct rb_root old_root = dso->symbols[map->type];
> struct rb_root *root = &dso->symbols[map->type];
> struct rb_node *next = rb_first(root);
>
> if (!kmaps)
> return -1;
>
> + *root = RB_ROOT;
> +
> while (next) {
> char *module;
>
> pos = rb_entry(next, struct symbol, rb_node);
> next = rb_next(&pos->rb_node);
>
> + rb_erase_init(&pos->rb_node, &old_root);
> +
> module = strchr(pos->name, '\t');
> if (module)
> *module = '\0';
> @@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
> curr_map = map_groups__find(kmaps, map->type, pos->start);
>
> if (!curr_map || (filter && filter(curr_map, pos))) {
> - rb_erase_init(&pos->rb_node, root);
> symbol__delete(pos);
> - } else {
> - pos->start -= curr_map->start - curr_map->pgoff;
> - if (pos->end)
> - pos->end -= curr_map->start - curr_map->pgoff;
> - if (curr_map->dso != map->dso) {
> - rb_erase_init(&pos->rb_node, root);
> - symbols__insert(
> - &curr_map->dso->symbols[curr_map->type],
> - pos);
> - ++moved;
> - } else {
> - ++count;
> - }
> + continue;
> }
> +
> + pos->start -= curr_map->start - curr_map->pgoff;
> + if (pos->end)
> + pos->end -= curr_map->start - curr_map->pgoff;
> + symbols__insert(&curr_map->dso->symbols[curr_map->type], pos);
> + ++count;
> }
>
> /* Symbols have been adjusted */
> dso->adjust_symbols = 1;
>
> - return count + moved;
> + return count;
> }
>
> /*
>
Em Wed, Nov 11, 2015 at 03:02:35PM +0800, Wangnan (F) escreveu:
>
>
> On 2015/11/6 21:59, Adrian Hunter wrote:
> >On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
> >>Em Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan escreveu:
> >>>In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
> >>>address but only reinsert it into rbtree if the symbol belongs to
> >>>another map. However, the expression for adjusting symbol (pos->start -=
> >>>curr_map->start - curr_map->pgoff) can change the relative order between
> >>>two symbols (even if the affected symbols are in different maps, in
> >>>kcore case they are possible to share one same dso), which damages the
> >>>rbtree.
> >>Right, some code does change the symbol values it gets from whatever
> >>symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> >>map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> >>->reloc, members for that :-\
> >>
> >>I.e. 'struct dso' should be just what comes from the symtab, while
> >>'struct map' should be about where that DSO is in memory.
> >>
> >>With that in mind, do you still think your fix is the correct one?
> >>
> >>Adrian?
> >The problem is when the order in memory (in kallsyms) is different
> >to the order on the dso (kcore).
> >
> >I think to make it more general it needs to insert to a new tree.
> >e.g.
> >
>
> I have tested this patch and it works for me.
>
> Thank you.
Adrian, I took your explanation as the commit log and added your S-o-B, which
you had not provided so far. Is that OK with you? Can I have your S-o-B?
From 500fe7dbd2c6cebc3638196352439490e1e3a8a4 Mon Sep 17 00:00:00 2001
From: Adrian Hunter <[email protected]>
Date: Fri, 6 Nov 2015 15:59:29 +0200
Subject: [PATCH 1/1] perf symbols: Rebuild rbtree when adjusting symbols for
 kcore

Normally symbols are read from the DSO and adjusted, if need be, so that
the symbol start matches the file offset in the DSO file (we want the
file offset because that is what we know from MMAP events). That is done
by dso__load_sym() which inserts the symbols *after* adjusting them.

In the case of kcore, the symbols have been read from kallsyms and the
symbol start is the memory address. The symbols have to be adjusted to
match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
but now the adjustment is being done *after* the symbols have been
inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
changing the symbol start would not change the order in the rbtree -
which is, of course, not guaranteed.

Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
 tools/perf/util/symbol.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4cc7662677e..09343a880c0b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 	struct map_groups *kmaps = map__kmaps(map);
 	struct map *curr_map;
 	struct symbol *pos;
-	int count = 0, moved = 0;
+	int count = 0;
+	struct rb_root old_root = dso->symbols[map->type];
 	struct rb_root *root = &dso->symbols[map->type];
 	struct rb_node *next = rb_first(root);

 	if (!kmaps)
 		return -1;

+	*root = RB_ROOT;
+
 	while (next) {
 		char *module;

 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);

+		rb_erase_init(&pos->rb_node, &old_root);
+
 		module = strchr(pos->name, '\t');
 		if (module)
 			*module = '\0';
@@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 		curr_map = map_groups__find(kmaps, map->type, pos->start);

 		if (!curr_map || (filter && filter(curr_map, pos))) {
-			rb_erase_init(&pos->rb_node, root);
 			symbol__delete(pos);
-		} else {
-			pos->start -= curr_map->start - curr_map->pgoff;
-			if (pos->end)
-				pos->end -= curr_map->start - curr_map->pgoff;
-			if (curr_map->dso != map->dso) {
-				rb_erase_init(&pos->rb_node, root);
-				symbols__insert(
-					&curr_map->dso->symbols[curr_map->type],
-					pos);
-				++moved;
-			} else {
-				++count;
-			}
+			continue;
 		}
+
+		pos->start -= curr_map->start - curr_map->pgoff;
+		if (pos->end)
+			pos->end -= curr_map->start - curr_map->pgoff;
+		symbols__insert(&curr_map->dso->symbols[curr_map->type], pos);
+		++count;
 	}

 	/* Symbols have been adjusted */
 	dso->adjust_symbols = 1;

-	return count + moved;
+	return count;
 }

 /*
--
2.1.0
On 11/11/15 22:44, Arnaldo Carvalho de Melo wrote:
> Em Wed, Nov 11, 2015 at 03:02:35PM +0800, Wangnan (F) escreveu:
>>
>>
>> On 2015/11/6 21:59, Adrian Hunter wrote:
>>> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>>>> Em Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan escreveu:
>>>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>>>> address but only reinsert it into rbtree if the symbol belongs to
>>>>> another map. However, the expression for adjusting symbol (pos->start -=
>>>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>>>> two symbols (even if the affected symbols are in different maps, in
>>>>> kcore case they are possible to share one same dso), which damages the
>>>>> rbtree.
>>>> Right, some code does change the symbol values it gets from whatever
>>>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>>>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>>>> ->reloc, members for that :-\
>>>>
>>>> I.e. 'struct dso' should be just what comes from the symtab, while
>>>> 'struct map' should be about where that DSO is in memory.
>>>>
>>>> With that in mind, do you still think your fix is the correct one?
>>>>
>>>> Adrian?
>>> The problem is when the order in memory (in kallsyms) is different
>>> to the order on the dso (kcore).
>>>
>>> I think to make it more general it needs to insert to a new tree.
>>> e.g.
>>>
>>
>> I have tested this patch and it works for me.
>>
>> Thank you.
>
> Adrian, I took your explanation as the commit log and added your S-o-B, which
> you had not provided so far. Is that OK with you? Can I have your S-o-B?
Yes. Thank you!
>
Commit-ID:  866548dd6e22c3795ae5146a9746a5cf659698f1
Gitweb:     http://git.kernel.org/tip/866548dd6e22c3795ae5146a9746a5cf659698f1
Author:     Adrian Hunter <[email protected]>
AuthorDate: Fri, 6 Nov 2015 15:59:29 +0200
Committer:  Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 12 Nov 2015 18:58:17 -0300

perf symbols: Rebuild rbtree when adjusting symbols for kcore

Normally symbols are read from the DSO and adjusted, if need be, so that
the symbol start matches the file offset in the DSO file (we want the
file offset because that is what we know from MMAP events). That is done
by dso__load_sym() which inserts the symbols *after* adjusting them.

In the case of kcore, the symbols have been read from kallsyms and the
symbol start is the memory address. The symbols have to be adjusted to
match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
but now the adjustment is being done *after* the symbols have been
inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
changing the symbol start would not change the order in the rbtree -
which is, of course, not guaranteed.

Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
 tools/perf/util/symbol.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4cc766..09343a8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 	struct map_groups *kmaps = map__kmaps(map);
 	struct map *curr_map;
 	struct symbol *pos;
-	int count = 0, moved = 0;
+	int count = 0;
+	struct rb_root old_root = dso->symbols[map->type];
 	struct rb_root *root = &dso->symbols[map->type];
 	struct rb_node *next = rb_first(root);

 	if (!kmaps)
 		return -1;

+	*root = RB_ROOT;
+
 	while (next) {
 		char *module;

 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);

+		rb_erase_init(&pos->rb_node, &old_root);
+
 		module = strchr(pos->name, '\t');
 		if (module)
 			*module = '\0';
@@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 		curr_map = map_groups__find(kmaps, map->type, pos->start);

 		if (!curr_map || (filter && filter(curr_map, pos))) {
-			rb_erase_init(&pos->rb_node, root);
 			symbol__delete(pos);
-		} else {
-			pos->start -= curr_map->start - curr_map->pgoff;
-			if (pos->end)
-				pos->end -= curr_map->start - curr_map->pgoff;
-			if (curr_map->dso != map->dso) {
-				rb_erase_init(&pos->rb_node, root);
-				symbols__insert(
-					&curr_map->dso->symbols[curr_map->type],
-					pos);
-				++moved;
-			} else {
-				++count;
-			}
+			continue;
 		}
+
+		pos->start -= curr_map->start - curr_map->pgoff;
+		if (pos->end)
+			pos->end -= curr_map->start - curr_map->pgoff;
+		symbols__insert(&curr_map->dso->symbols[curr_map->type], pos);
+		++count;
 	}

 	/* Symbols have been adjusted */
 	dso->adjust_symbols = 1;

-	return count + moved;
+	return count;
 }

 /*