2018-08-23 16:19:41

by Martin Liška

[permalink] [raw]
Subject: [PATCH] Properly interpret indirect call in perf annotate.

The patch changes interpretation of:
callq *0x8(%rbx)

from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)

in this can an address is followed by a register, thus
one can't parse only address.

Signed-off-by: Martin Liška <[email protected]>
---
tools/perf/util/annotate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)



Attachments:
0001-Properly-interpret-indirect-call-in-perf-annotate.patch (690.00 B)

2018-08-23 16:26:40

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.

Please mention one or two functions where such sequence appears, so that
others can reproduce your before/after more quickly,

- Arnaldo

> Signed-off-by: Martin Liška <[email protected]>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>


2018-08-23 23:02:18

by Kim Phillips

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <[email protected]> wrote:

> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <[email protected]>
> ---

Tested this doesn't incur any grievances parsing aarch64 code:

Tested-by: Kim Phillips <[email protected]>

Thanks,

Kim

2018-08-27 09:08:12

by Martin Liška

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Please mention one or two functions where such sequence appears, so that
> others can reproduce your before/after more quickly,

Sure, there's self-contained example on can compile (-O2) and test.
It's following call in test function:

test:
.LFB1:
.cfi_startproc
movq %rdi, %rax
subq $8, %rsp
.cfi_def_cfa_offset 16
movq %rsi, %rdi
movq %rdx, %rsi
call *8(%rax) <---- here
cmpl $1, %eax
adcl $-1, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc

Martin

>
> - Arnaldo
>
>> Signed-off-by: Martin Liška <[email protected]>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>> + }
>> goto find_target;
>> }
>>
>>
>


Attachments:
perf-callq.c (1.30 kB)

2018-08-27 10:38:39

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

Hello,

On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.

Also there's a case with no offset like: callq *%rbx


>
> Signed-off-by: Martin Liška <[email protected]>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);

It seems too x86-specific, what about this? (not tested)


indirect_call:
tok = strchr(endptr, '*');
if (tok != NULL) {
endptr++;
if (!isdigit(*endptr))
return 0;

addr = strtoull(endptr, &endptr, 0);
if (*endptr != '('))
ops->target.addr = addr;


Thanks,
Namhyung


> + }
> goto find_target;
> }
>
>


2018-08-27 14:29:48

by Martin Liška

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

On 08/27/2018 12:37 PM, Namhyung Kim wrote:
> Hello,
>
> On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Also there's a case with no offset like: callq *%rbx

Yes. But this case is fine as strtoull returns 0 for that:
'If there were no digits at all, strtoul() stores the original value of nptr in *endptr (and returns 0).'
So ops->target.addr is then 0 and it's fine.

>
>
>>
>> Signed-off-by: Martin Liška <[email protected]>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>
> It seems too x86-specific, what about this? (not tested)

It is, I'm fine with that. I've just tested that for the callq *0x8(%rbx) example.
I'm sending patch for that version.

Martin

>
>
> indirect_call:
> tok = strchr(endptr, '*');
> if (tok != NULL) {
> endptr++;
> if (!isdigit(*endptr))
> return 0;
>
> addr = strtoull(endptr, &endptr, 0);
> if (*endptr != '('))
> ops->target.addr = addr;
>
>
> Thanks,
> Namhyung
>
>
>> + }
>> goto find_target;
>> }
>>
>>
>


Attachments:
0001-Properly-interpret-indirect-call-in-perf-annotate.patch (1.52 kB)

2018-08-28 14:11:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <[email protected]>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)

Please don't send the patch as an attachment, use 'git-format-patch',
I'm fixing this up this time.

- Arnaldo

>

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>


2018-08-28 14:13:12

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> >> The patch changes interpretation of:
> >> callq *0x8(%rbx)
> >>
> >> from:
> >> 0.26 │ → callq *8
> >> to:
> >> 0.26 │ → callq *0x8(%rbx)
> >>
> >> in this can an address is followed by a register, thus
> >> one can't parse only address.
> >
> > Please mention one or two functions where such sequence appears, so that
> > others can reproduce your before/after more quickly,
>
> Sure, there's self-contained example on can compile (-O2) and test.
> It's following call in test function:
>
> test:
> .LFB1:
> .cfi_startproc
> movq %rdi, %rax
> subq $8, %rsp
> .cfi_def_cfa_offset 16
> movq %rsi, %rdi
> movq %rdx, %rsi
> call *8(%rax) <---- here
> cmpl $1, %eax
> adcl $-1, %eax
> addq $8, %rsp
> .cfi_def_cfa_offset 8
> ret
> .cfi_endproc

Here I'm getting:

Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
test /home/acme/c/perf-callq [Percent: local period]
0.17 │ mov %rdx,-0x28(%rbp)
0.58 │ mov -0x18(%rbp),%rax
7.90 │ mov 0x8(%rax),%rax
8.67 │ mov -0x28(%rbp),%rcx
│ mov -0x20(%rbp),%rdx
0.08 │ mov %rcx,%rsi
6.28 │ mov %rdx,%rdi
10.50 │ → callq *%rax
1.67 │ mov %eax,-0x4(%rbp)
11.95 │ cmpl $0x0,-0x4(%rbp)
8.14 │ ↓ je 3d
│ mov -0x4(%rbp),%eax
│ sub $0x1,%eax
│ ↓ jmp 42
│3d: mov $0x0,%eax
7.84 │42: leaveq
│ ← retq

Without the patch, will check if something changes with it.

- Arnaldo



2018-08-28 14:21:39

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> > On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> > >> The patch changes interpretation of:
> > >> callq *0x8(%rbx)
> > >>
> > >> from:
> > >> 0.26 │ → callq *8
> > >> to:
> > >> 0.26 │ → callq *0x8(%rbx)

<SNIP>

> > > Please mention one or two functions where such sequence appears, so that
> > > others can reproduce your before/after more quickly,

> > Sure, there's self-contained example on can compile (-O2) and test.
> > It's following call in test function:

> > test:
> > .LFB1:
> > .cfi_startproc
> > movq %rdi, %rax
> > subq $8, %rsp
> > .cfi_def_cfa_offset 16
> > movq %rsi, %rdi
> > movq %rdx, %rsi
> > call *8(%rax) <---- here
> > cmpl $1, %eax
> > adcl $-1, %eax
> > addq $8, %rsp
> > .cfi_def_cfa_offset 8
> > ret
> > .cfi_endproc
>
> Here I'm getting:
>
> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
> test /home/acme/c/perf-callq [Percent: local period]
> 0.17 │ mov %rdx,-0x28(%rbp)
> 0.58 │ mov -0x18(%rbp),%rax
> 7.90 │ mov 0x8(%rax),%rax
> 8.67 │ mov -0x28(%rbp),%rcx
> │ mov -0x20(%rbp),%rdx
> 0.08 │ mov %rcx,%rsi
> 6.28 │ mov %rdx,%rdi
> 10.50 │ → callq *%rax
> 1.67 │ mov %eax,-0x4(%rbp)
> 11.95 │ cmpl $0x0,-0x4(%rbp)
> 8.14 │ ↓ je 3d
> │ mov -0x4(%rbp),%eax
> │ sub $0x1,%eax
> │ ↓ jmp 42
> │3d: mov $0x0,%eax
> 7.84 │42: leaveq
> │ ← retq
>
> Without the patch, will check if something changes with it.

No changes with the patch, but then I did another test, ran a system
wide record for a while, then tested without/with your patch, with
--stdio2 redirecting to /tmp/{before,after} and got the expected
results, see below.

Thanks, applying,

- Arnaldo

--- /tmp/before 2018-08-28 11:16:03.238384143 -0300
+++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
@@ -13274,7 +13274,7 @@
↓ jle 128
hash_value = hash_table->hash_func (key);
mov 0x8(%rsp),%rdi
- 0.91 → callq *30
+ 0.91 → callq *0x30(%r12)
mov $0x2,%r8d
cmp $0x2,%eax
node_hash = hash_table->hashes[node_index];
@@ -13848,7 +13848,7 @@
mov %r14,%rdi
sub %rbx,%r13
mov %r13,%rdx
- → callq *38
+ → callq *0x38(%r15)
cmp %rax,%r13
1.91 ↓ je 240
1b4: mov $0xffffffff,%r13d
@@ -14026,7 +14026,7 @@
mov %rcx,-0x500(%rbp)
mov %r15,%rsi
mov %r14,%rdi
- → callq *38
+ → callq *0x38(%rax)
mov -0x500(%rbp),%rcx
cmp %rax,%rcx
↓ jne 9b0
<SNIP tons of other such cases>

2018-08-28 17:57:49

by Martin Liška

[permalink] [raw]
Subject: Re: [PATCH] Properly interpret indirect call in perf annotate.

On 08/28/2018 04:18 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
>>> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>>>>> The patch changes interpretation of:
>>>>> callq *0x8(%rbx)
>>>>>
>>>>> from:
>>>>> 0.26 │ → callq *8
>>>>> to:
>>>>> 0.26 │ → callq *0x8(%rbx)
>
> <SNIP>
>
>>>> Please mention one or two functions where such sequence appears, so that
>>>> others can reproduce your before/after more quickly,
>
>>> Sure, there's self-contained example on can compile (-O2) and test.
>>> It's following call in test function:
>
>>> test:
>>> .LFB1:
>>> .cfi_startproc
>>> movq %rdi, %rax
>>> subq $8, %rsp
>>> .cfi_def_cfa_offset 16
>>> movq %rsi, %rdi
>>> movq %rdx, %rsi
>>> call *8(%rax) <---- here
>>> cmpl $1, %eax
>>> adcl $-1, %eax
>>> addq $8, %rsp
>>> .cfi_def_cfa_offset 8
>>> ret
>>> .cfi_endproc
>>
>> Here I'm getting:
>>
>> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
>> test /home/acme/c/perf-callq [Percent: local period]
>> 0.17 │ mov %rdx,-0x28(%rbp)
>> 0.58 │ mov -0x18(%rbp),%rax
>> 7.90 │ mov 0x8(%rax),%rax
>> 8.67 │ mov -0x28(%rbp),%rcx
>> │ mov -0x20(%rbp),%rdx
>> 0.08 │ mov %rcx,%rsi
>> 6.28 │ mov %rdx,%rdi
>> 10.50 │ → callq *%rax
>> 1.67 │ mov %eax,-0x4(%rbp)
>> 11.95 │ cmpl $0x0,-0x4(%rbp)
>> 8.14 │ ↓ je 3d
>> │ mov -0x4(%rbp),%eax
>> │ sub $0x1,%eax
>> │ ↓ jmp 42
>> │3d: mov $0x0,%eax
>> 7.84 │42: leaveq
>> │ ← retq
>>
>> Without the patch, will check if something changes with it.

Hi Arnaldo.

Thanks for re-sending of the patch and for the testing. The example I sent
is dependent on version of GCC compiler.

>
> No changes with the patch, but then I did another test, ran a system
> wide record for a while, then tested without/with your patch, with
> --stdio2 redirecting to /tmp/{before,after} and got the expected
> results, see below.
>
> Thanks, applying,

Good!
Martin

>
> - Arnaldo
>
> --- /tmp/before 2018-08-28 11:16:03.238384143 -0300
> +++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
> @@ -13274,7 +13274,7 @@
> ↓ jle 128
> hash_value = hash_table->hash_func (key);
> mov 0x8(%rsp),%rdi
> - 0.91 → callq *30
> + 0.91 → callq *0x30(%r12)
> mov $0x2,%r8d
> cmp $0x2,%eax
> node_hash = hash_table->hashes[node_index];
> @@ -13848,7 +13848,7 @@
> mov %r14,%rdi
> sub %rbx,%r13
> mov %r13,%rdx
> - → callq *38
> + → callq *0x38(%r15)
> cmp %rax,%r13
> 1.91 ↓ je 240
> 1b4: mov $0xffffffff,%r13d
> @@ -14026,7 +14026,7 @@
> mov %rcx,-0x500(%rbp)
> mov %r15,%rsi
> mov %r14,%rdi
> - → callq *38
> + → callq *0x38(%rax)
> mov -0x500(%rbp),%rcx
> cmp %rax,%rcx
> ↓ jne 9b0
> <SNIP tons of other such cases>
>