2017-11-29 18:17:05

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH] perf annotate: Fix objdump comment parsing for Intel mov dissassembly



On 11/28/2017 01:26 PM, Thomas Richter wrote:
> The command 'perf annotate' parses the output of objdump and also
> investigates the comments produced by objdump. For example the
> output of objdump produces (on x86):
>
> 23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
> # 234008 <stderr@@GLIBC_2.2.5+0x9a8>
>
> and the function mov__parse() is called to investigate the complete
> line. Mov__parse() breaks this line into several parts and finally
> calls function comment__symbol() to parse the data after the comment
> character '#'. Comment__symbol() expects a hexadecimal address followed
> by a symbol in '<' and '>' brackets.
>
> However the 2nd parameter given to function comment__symbol()
> always points to the comment character '#'. The address parsing
> always returns 0 because the character '#' is not a digit and
> strtoull() fails without being noticed.
>
> Fix this by advancing the second parameter to function comment__symbol()
> by one byte before invocation and add an error check after strtoull()
> has been called.

Yeah, looks like it fails to get correct value in 'addrp'.

Can you please show the difference in perf annotate output before
and after patch.

Thanks,
Ravi


From 1585413937346233957@xxx Wed Nov 29 15:15:52 +0000 2017
X-GM-THRID: 1585295770389440484
X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread


2017-11-29 15:15:52

by Thomas-Mich Richter

[permalink] [raw]
Subject: Re: [PATCH] perf annotate: Fix objdump comment parsing for Intel mov dissassembly

On 11/29/2017 02:24 PM, Ravi Bangoria wrote:
>
>
> On 11/28/2017 01:26 PM, Thomas Richter wrote:
>> The command 'perf annotate' parses the output of objdump and also
>> investigates the comments produced by objdump. For example the
>> output of objdump produces (on x86):
>>
>> 23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
>> # 234008 <stderr@@GLIBC_2.2.5+0x9a8>
>>
>> and the function mov__parse() is called to investigate the complete
>> line. Mov__parse() breaks this line into several parts and finally
>> calls function comment__symbol() to parse the data after the comment
>> character '#'. Comment__symbol() expects a hexadecimal address followed
>> by a symbol in '<' and '>' brackets.
>>
>> However the 2nd parameter given to function comment__symbol()
>> always points to the comment character '#'. The address parsing
>> always returns 0 because the character '#' is not a digit and
>> strtoull() fails without being noticed.
>>
>> Fix this by advancing the second parameter to function comment__symbol()
>> by one byte before invocation and add an error check after strtoull()
>> has been called.
>
> Yeah, looks like it fails to get correct value in 'addrp'.
>
> Can you please show the difference in perf annotate output before
> and after patch.
>
> Thanks,
> Ravi
>


There is no difference in output of --stdio. The adress value is not
read and remains 0x0 in ops->source.addr or ops->target.addr.
That is not visible because in function mov__scnprintf() that wrong
address is not printed:

static int mov__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops)
{
return scnprintf(bf, size, "%-6.6s %s,%s", ins->name,
ops->source.name ?: ops->source.raw,
ops->target.name ?: ops->target.raw);
}


--
Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
--
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


From 1585295770389440484@xxx Tue Nov 28 07:57:40 +0000 2017
X-GM-THRID: 1585295770389440484
X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread