Peter & Stephane,

We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
struct perf_branch_entry.

Sometimes on POWER8 we may not be able to fill out the "to" address. We
initially thought of just making this 0, but it's feasible that this
could be a valid address to branch to.

The other logical value to indicate an invalid entry would be all 1s
which is not possible (on POWER at least).

Do you guys have a preference as to what we should use as an invalid
entry? This would have some consequences for the userspace tool also.

The alternative would be to add a flag alongside mispred/predicted to
indicate the validity of the "to" address.

Mikey
On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> Peter & Stephane,
>
> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> struct perf_branch_entry.
>
> Sometimes on POWER8 we may not be able to fill out the "to" address.
Just because I'm curious.. however does that happen? Surely the CPU knows where
next to fetch instructions?
> We
> initially thought of just making this 0, but it's feasible that this
> could be a valid address to branch to.
Right, while highly unlikely, x86 actually has some cases where 0 address is
valid *shudder*..
> The other logical value to indicate an invalid entry would be all 1s
> which is not possible (on POWER at least).
>
> Do you guys have a preference as to what we should use as an invalid
> entry? This would have some consequences for the userspace tool also.
>
> The alternative would be to add a flag alongside mispred/predicted to
> indicate the validity of the "to" address.
Either would work with me I suppose.. Stephane do you have any preference?
On Wed, May 8, 2013 at 5:59 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
>> Peter & Stephane,
>>
>> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
>> struct perf_branch_entry.
>>
>> Sometimes on POWER8 we may not be able to fill out the "to" address.
>
> Just because I'm curious.. however does that happen? Surely the CPU knows where
> next to fetch instructions?
>
>> We
>> initially thought of just making this 0, but it's feasible that this
>> could be a valid address to branch to.
>
> Right, while highly unlikely, x86 actually has some cases where 0 address is
> valid *shudder*..
>
>> The other logical value to indicate an invalid entry would be all 1s
>> which is not possible (on POWER at least).
>>
>> Do you guys have a preference as to what we should use as an invalid
>> entry? This would have some consequences for the userspace tool also.
>>
>> The alternative would be to add a flag alongside mispred/predicted to
>> indicate the validity of the "to" address.
>
> Either would work with me I suppose.. Stephane do you have any preference?
But if the 'to' is bogus, why not just drop the sample?
That happens on x86 if the HW captured branches which do not correspond to
user filter settings (due to bug).
Peter Zijlstra <[email protected]> wrote:
> On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> > Peter & Stephane,
> >
> > We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> > struct perf_branch_entry.
> >
> > Sometimes on POWER8 we may not be able to fill out the "to" address.
>
> Just because I'm curious.. however does that happen? Surely the CPU
> knows where next to fetch instructions?
For computed gotos (ie. branch to a register value), the hardware gives
you the from and to address in the branch history buffer.
For branches where the branch target address is an immediate encoded in
the instruction, the hardware only logs the from address. It assumes
that software (perf irq handler in this case) can read this branch
instruction, calculate the corresponding offset and hence the
to/target address.
It's entirely possible that when the perf IRQ handler happens, the
instruction in question is not readable or is no longer a branch (self
modifying code). Hence we aren't able to calculate a valid to address.
Mikey
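
[Illustration] The software step described above (read the branch
instruction at the "from" address and work out the target from its
immediate field) could look roughly like the sketch below. It is not the
actual perf handler: sketch_branch_target() is a hypothetical name, and
the sketch assumes the instruction word has already been read
successfully. It only decodes the two immediate branch forms (primary
opcode 18 for b/bl and 16 for bc) and reports failure for anything else,
which covers the "no longer a branch" case.

```c
#include <stdbool.h>
#include <stdint.h>

/* The primary opcode is the top 6 bits of a 32-bit POWER instruction. */
static inline uint32_t sketch_primary_opcode(uint32_t instr)
{
    return instr >> 26;
}

/*
 * Hypothetical helper: compute the target of an immediate branch located
 * at address `from`. Returns true and stores the target in *to when
 * `instr` is an I-form (opcode 18) or B-form (opcode 16) branch; returns
 * false when the word is not an immediate branch at all.
 */
static bool sketch_branch_target(uint64_t from, uint32_t instr, uint64_t *to)
{
    int64_t disp;

    switch (sketch_primary_opcode(instr)) {
    case 18: /* b, bl, ba, bla: 24-bit LI field, low two bits implied zero */
        disp = instr & 0x03fffffc;
        if (disp & 0x02000000)      /* sign bit of the 26-bit displacement */
            disp -= 0x04000000;
        break;
    case 16: /* bc and friends: 14-bit BD field, low two bits implied zero */
        disp = instr & 0x0000fffc;
        if (disp & 0x8000)          /* sign bit of the 16-bit displacement */
            disp -= 0x10000;
        break;
    default:
        return false;               /* not an immediate branch (any more) */
    }

    /* AA bit (0x2) set: displacement is an absolute address, else relative. */
    *to = (instr & 0x2) ? (uint64_t)disp : from + (uint64_t)disp;
    return true;
}
```

A caller would treat a false return the same way as an unreadable
instruction: there is no valid "to" address for that entry.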
>
> > We
> > initially thought of just making this 0, but it's feasible that this
> > could be a valid address to branch to.
>
> Right, while highly unlikely, x86 actually has some cases where 0 address is
> valid *shudder*..
>
> > The other logical value to indicate an invalid entry would be all 1s
> > which is not possible (on POWER at least).
> >
> > Do you guys have a preference as to what we should use as an invalid
> > entry? This would have some consequences for the userspace tool also.
> >
> > The alternative would be to add a flag alongside mispred/predicted to
> > indicate the validity of the "to" address.
>
> Either would work with me I suppose.. Stephane do you have any preference?
>
Stephane Eranian <[email protected]> wrote:
> On Wed, May 8, 2013 at 5:59 PM, Peter Zijlstra <[email protected]> wrote:
> > On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> >> Peter & Stephane,
> >>
> >> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> >> struct perf_branch_entry.
> >>
> >> Sometimes on POWER8 we may not be able to fill out the "to" address.
> >
> > Just because I'm curious.. however does that happen? Surely the CPU knows where
> > next to fetch instructions?
> >
> >> We
> >> initially thought of just making this 0, but it's feasible that this
> >> could be a valid address to branch to.
> >
> > Right, while highly unlikely, x86 actually has some cases where 0 address is
> > valid *shudder*..
> >
> >> The other logical value to indicate an invalid entry would be all 1s
> >> which is not possible (on POWER at least).
> >>
> >> Do you guys have a preference as to what we should use as an invalid
> >> entry? This would have some consequences for the userspace tool also.
> >>
> >> The alternative would be to add a flag alongside mispred/predicted to
> >> indicate the validity of the "to" address.
> >
> > Either would work with me I suppose.. Stephane do you have any preference?
>
> But if the 'to' is bogus, why not just drop the sample?
> That happens on x86 if the HW captured branches which do not correspond to
> user filter settings (due to bug).
We can I guess but it seems useful to log the from address when
possible.
Can we log it and userspace tools can ignore it if it's not useful?
Mikey
On Thu, 2013-05-09 at 08:45 +1000, Michael Neuling wrote:
> Stephane Eranian <[email protected]> wrote:
>
> > On Wed, May 8, 2013 at 5:59 PM, Peter Zijlstra <[email protected]> wrote:
> > > On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> > >> Peter & Stephane,
> > >>
> > >> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> > >> struct perf_branch_entry.
> > >>
> > >> Sometimes on POWER8 we may not be able to fill out the "to" address.
> > >
> > > Just because I'm curious.. however does that happen? Surely the CPU knows where
> > > next to fetch instructions?
> > >
> > >> We
> > >> initially thought of just making this 0, but it's feasible that this
> > >> could be a valid address to branch to.
> > >
> > > Right, while highly unlikely, x86 actually has some cases where 0 address is
> > > valid *shudder*..
> > >
> > >> The other logical value to indicate an invalid entry would be all 1s
> > >> which is not possible (on POWER at least).
> > >>
> > >> Do you guys have a preference as to what we should use as an invalid
> > >> entry? This would have some consequences for the userspace tool also.
> > >>
> > >> The alternative would be to add a flag alongside mispred/predicted to
> > >> indicate the validity of the "to" address.
> > >
> > > Either would work with me I suppose.. Stephane do you have any preference?
> >
> > But if the 'to' is bogus, why not just drop the sample?
> > That happens on x86 if the HW captured branches which do not correspond to
> > user filter settings (due to bug).
>
> We can I guess but it seems useful to log the from address when
> possible.
Yeah I think it is useful. Knowing that you were there, but the to
address is invalid, is better than wondering why you never hit the code
at all.
cheers
On Thu, May 09, 2013 at 08:39:15AM +1000, Michael Neuling wrote:
> > Just because I'm curious.. however does that happen? Surely the CPU
> > knows where next to fetch instructions?
>
> For computed gotos (ie. branch to a register value), the hardware gives
> you the from and to address in the branch history buffer.
>
> For branches where the branch target address is an immediate encoded in
> the instruction, the hardware only logs the from address. It assumes
> that software (perf irq handler in this case) can read this branch
> instruction, calculate the corresponding offset and hence the
> to/target address.
>
> It's entirely possible that when the perf IRQ handler happens, the
> instruction in question is not readable or is no longer a branch (self
> modifying code). Hence we aren't able to calculate a valid to address.
Ohh how cute! You've gotta love lazy hardware :-)
On Fri, May 10, 2013 at 8:43 PM, Peter Zijlstra <[email protected]> wrote:
> On Thu, May 09, 2013 at 08:39:15AM +1000, Michael Neuling wrote:
>> > Just because I'm curious.. however does that happen? Surely the CPU
>> > knows where next to fetch instructions?
>>
>> For computed gotos (ie. branch to a register value), the hardware gives
>> you the from and to address in the branch history buffer.
>>
>> For branches where the branch target address is an immediate encoded in
>> the instruction, the hardware only logs the from address. It assumes
>> that software (perf irq handler in this case) can read this branch
>> instruction, calculate the corresponding offset and hence the
>> to/target address.
>>
>> It's entirely possible that when the perf IRQ handler happens, the
>> instruction in question is not readable or is no longer a branch (self
>> modifying code). Hence we aren't able to calculate a valid to address.
>
> Ohh how cute! You've gotta love lazy hardware :-)
The buffer is in the core (not main memory) and hence only has limited
entries. So skipping entries that can hopefully be determined in
other ways means we can log more branches.
That being said, it's a PITA for the kernel ;-)
Mikey
On Fri, 2013-05-10 at 20:50 +1000, Michael Neuling wrote:
> The buffer is in the core (not main memory) and hence only has limited
> entries. So skipping entries that can hopefully be determined in
> other ways means we can log more branches.
>
> That being said, it's a PITA for the kernel ;-)
I would suggest flagging them. As you mention, the code might have been
modified since the sample was taken. Even if it still looks like a
branch and you can compute the "To" address it might not be the right
one ... at least userspace should be notified that this specific sample
should be handled with care.
And if you just can't read the instruction or it's not a branch anymore,
then stick a -1 in there, no way it can be a valid branch address :-)
Cheers,
Ben.
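
[Illustration] A minimal sketch of how the two suggestions above might
combine: a flag for targets that software computed after the fact, plus
an all-1s value when no target could be computed. Everything here is an
assumption for illustration: struct sketch_branch_entry, the to_computed
bit and SKETCH_TO_UNKNOWN are hypothetical names, not the real
struct perf_branch_entry layout or any agreed ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: all 1s, assumed never to be a valid branch target. */
#define SKETCH_TO_UNKNOWN UINT64_MAX

/* Hypothetical entry layout, loosely modelled on from/to plus flag bits. */
struct sketch_branch_entry {
    uint64_t from;
    uint64_t to;
    uint64_t mispred:1,
             predicted:1,
             to_computed:1,   /* hypothetical: "to" derived by software */
             reserved:61;
};

/* Entry whose target was logged directly by the hardware. */
static struct sketch_branch_entry sketch_entry_hw(uint64_t from, uint64_t to)
{
    struct sketch_branch_entry e = { .from = from, .to = to };
    return e;
}

/*
 * Entry whose target had to be derived from the instruction. Pass NULL
 * when the instruction was unreadable or no longer a branch.
 */
static struct sketch_branch_entry sketch_entry_sw(uint64_t from,
                                                  const uint64_t *computed_to)
{
    struct sketch_branch_entry e = { .from = from, .to = SKETCH_TO_UNKNOWN };

    if (computed_to) {
        e.to = *computed_to;
        e.to_computed = 1;    /* tell userspace to handle with care */
    }
    return e;
}

/* Userspace side: count entries whose "to" address is usable at all. */
static size_t sketch_count_known_targets(const struct sketch_branch_entry *e,
                                         size_t n)
{
    size_t known = 0;

    for (size_t i = 0; i < n; i++)
        if (e[i].to != SKETCH_TO_UNKNOWN)
            known++;
    return known;
}
```

With this shape the "from" address is always kept, so a tool can still
see that the branch executed even when it has to ignore the target.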