Hi Ingo,
Andi asked me this week when we should expect to see the "immediate
values" make it into mainline. I remember you pulled them at one point.
He would like to use them to encode some very hot-path variables into
the instruction stream.
How should I proceed to get that upstream ? Would a repost be
appropriate ?
I have support for powerpc, x86 and sparc64 currently.
Thanks,
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Mathieu Desnoyers <[email protected]> wrote:
> Hi Ingo,
>
> Andi asked me this week when we should expect to see the "immediate
> values" make it into mainline. I remember you pulled them at one
> point. He would like to use them to encode some very hot-path
> variables into the instruction stream.
>
> How should I proceed to get that upstream ? Would a repost be
> appropriate ?
Would have to see it in full context i guess, with before/after
measurements, etc.
Ingo
On Thu, Sep 24, 2009 at 02:34:28PM +0200, Ingo Molnar wrote:
> * Mathieu Desnoyers <[email protected]> wrote:
>
> > Hi Ingo,
> >
> > Andi asked me this week when we should expect to see the "immediate
> > values" make it into mainline. I remember you pulled them at one
> > point. He would like to use them to encode some very hot-path
> > variables into the instruction stream.
> >
> > How should I proceed to get that upstream ? Would a repost be
> > appropriate ?
>
> Would have to see it in full context i guess, with before/after
> measurements, etc.
>
> Ingo
right we've proposed an alternative to the immediate values, which I've
been calling 'jump label', here:
http://marc.info/?l=linux-kernel&m=125200966226921&w=2
The basic idea is that gcc 4.5 will have support for an 'asm goto'
construct which can refer to C code labels. Thus, we can replace a nop
in the code stream with a 'jmp' instruction to various branch targets.
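
To make the idea concrete, here is a minimal user-space sketch of an
'asm goto' site. It is an illustration only, not the code from the patch
linked above: the name tracepoint_site() and the SITE_NOP macro are made up
for this example, and the run-time patching step (rewriting the nop into a
'jmp' to the label) is left out, so the function always takes the
fall-through path.

```c
/* Sketch only: tracepoint_site() and SITE_NOP are hypothetical names. */
#if defined(__x86_64__) || defined(__i386__)
/* 5-byte nop: the same size as the 'jmp rel32' that could replace it */
#define SITE_NOP ".byte 0x0f, 0x1f, 0x44, 0x00, 0x00"
#else
/* other architectures: emit nothing, just show the construct */
#define SITE_NOP ""
#endif

static int tracepoint_site(void)
{
	/*
	 * 'asm goto' (gcc >= 4.5) lets the asm statement name a C label
	 * as a possible jump target. Nothing jumps here yet; a patcher
	 * would overwrite the nop with a jmp to 'enabled'.
	 */
	__asm__ goto(SITE_NOP : : : : enabled);
	return 0;	/* tracing disabled: straight-line path */
enabled:
	return 1;	/* tracing enabled: reached only after patching */
}
```

Because the label is listed in the asm statement, the compiler has to keep
the 'enabled' block alive even though no C-level branch reaches it.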
In terms of a comparison between the two, IMO, I think that the syntax
for the immediate variables can be more readable, since it just looks
like a conditional expression.
The immediate values do a 'mov', 'test' and then a jump, whereas jump
label can just do a jump. So in this respect, I believe jump label can
be more optimal. Additionally, if we want to mark sections 'cold' so they
don't impact the instruction cache, the jump label already has the labels
for doing so. Obviously, a performance comparison would be interesting
as well.
thanks,
-Jason
Jason Baron wrote:
>
> right we've proposed an alternative to the immediate values, which I've
> been calling 'jump label', here:
>
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
>
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
>
> In terms of a comparison between the two, IMO, I think that the syntax
> for the immediate variables can be more readable, since it just looks
> like a conditional expression.
>
> The immediate values do a 'mov', 'test' and then a jump, whereas jump
> label can just do a jump. So in this respect, I believe jump label can
> be more optimal. Additionally, if we want to mark sections 'cold' so they
> don't impact the instruction cache, the jump label already has the labels
> for doing so. Obviously, a performance comparison would be interesting
> as well.
>
Direct jumps should at least theoretically be able to have better
performance, but it would still be nice to have measurements of both.
-hpa
* Jason Baron ([email protected]) wrote:
> On Thu, Sep 24, 2009 at 02:34:28PM +0200, Ingo Molnar wrote:
> > * Mathieu Desnoyers <[email protected]> wrote:
> >
> > > Hi Ingo,
> > >
> > > Andi asked me this week when we should expect to see the "immediate
> > > values" make it into mainline. I remember you pulled them at one
> > > point. He would like to use them to encode some very hot-path
> > > variables into the instruction stream.
> > >
> > > How should I proceed to get that upstream ? Would a repost be
> > > appropriate ?
> >
> > Would have to see it in full context i guess, with before/after
> > measurements, etc.
> >
> > Ingo
>
> right we've proposed an alternative to the immediate values, which I've
> been calling 'jump label', here:
>
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
>
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
>
> In terms of a comparison between the two, IMO, I think that the syntax
> for the immediate variables can be more readable, since it just looks
> like a conditional expression.
>
> The immediate values do a 'mov', 'test' and then a jump, whereas jump
> label can just do a jump. So in this respect, I believe jump label can
> be more optimal. Additionally, if we want to mark sections 'cold' so they
> don't impact the instruction cache, the jump label already has the labels
> for doing so. Obviously, a performance comparison would be interesting
> as well.
>
For branches, I'm convinced that a "static jump" approach will beat
immediate values anytime, because you save the BPB hit completely.
However, there are other use-cases involving a variable read, and in
that case immediate values are useful. Andi has been bugging me for a
while to re-post this patchset, I'm pretty sure he has precise ideas
about how he would like to use it.
Until we get the static jump support mainlined in gcc, immediate values
at least save the d-cache hit. So it would be a step in the right
direction.
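
As a rough illustration of the variable-read case, the sketch below loads a
value that is encoded as an immediate operand in the instruction stream, so
reading it does not touch the data cache. This is not the API of the
immediate values patchset: imv_read_sketch() and the constant 42 are
invented for the example, the x86 branch assumes AT&T assembler syntax, and
the code that would patch the immediate at run time is omitted. Other
architectures fall back to an ordinary memory load.

```c
#include <stdint.h>

/* Hypothetical helper, not the patchset's real interface. */
static uint32_t imv_read_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t v;

	/*
	 * The value lives in the 4 immediate bytes of this 'movl'.
	 * Updating the "variable" would mean rewriting those bytes.
	 */
	__asm__ volatile("movl $42, %0" : "=r"(v));
	return v;
#else
	/* Fallback: a normal load from memory (goes through the d-cache). */
	static volatile uint32_t imv_fallback = 42;

	return imv_fallback;
#endif
}
```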
Thanks,
Mathieu
> thanks,
>
> -Jason
>
>
--
Mathieu Desnoyers
Jason Baron wrote:
>
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
>
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
>
Looking at the above, I'm a bit unclear on the need for a NOP5. We
obviously need a *total* of 5 bytes, but at least I don't seem to
understand why we need 7 bytes per tracepoint.
-hpa
On Thu, Sep 24, 2009 at 07:16:56AM -0700, H. Peter Anvin wrote:
> Jason Baron wrote:
>>
>> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
>>
>> The basic idea is that gcc 4.5 will have support for an 'asm goto'
>> construct which can refer to C code labels. Thus, we can replace a nop
>> in the code stream with a 'jmp' instruction to various branch targets.
>>
>
> Looking at the above, I'm a bit unclear on the need for a NOP5. We
> obviously need a *total* of 5 bytes, but at least I don't seem to
> understand why we need 7 bytes per tracepoint.
>
> -hpa
that's right. The optimal solution doesn't require the NOP5 at all,
and I've been playing around with an implementation that doesn't have
it. The problem I've been running into is that sometimes the compiler
will put in a short jump - '0xeb', with a 1 byte offset, but the jump
target is further away. Thus, I need to either ensure the target is
close, or somehow force a longer jump '0xe9' into the code so I always
have the space. The other advantage of not including the nop is easier
support for all x86 implementations, since I'm not sure a 5 byte atomic
nop is always available, whereas a jump is always atomic. I'm pretty
sure we can come up with a patch that avoids the nop...I'll keep working
on it.
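
The size problem described above comes from the two x86 encodings of an
unconditional jump: 'eb' takes a signed 8-bit displacement (2 bytes in
total) and 'e9' a signed 32-bit displacement (5 bytes in total), both
measured from the end of the jump instruction. The helpers below are a small
sketch for illustration (the function names are invented here, nothing is
taken from the patch) that checks whether a target is reachable with the
short form and builds the 5-byte long form.

```c
#include <stdint.h>

/* Does 'target' fit the 2-byte short jump (0xeb, rel8) placed at 'site'? */
static int fits_short_jmp(uint64_t site, uint64_t target)
{
	/* rel8 is measured from the end of the 2-byte instruction */
	int64_t disp = (int64_t)(target - (site + 2));

	return disp >= -128 && disp <= 127;
}

/* Build the 5-byte long jump (0xe9, rel32) from 'site' to 'target'. */
static void encode_long_jmp(uint8_t insn[5], uint64_t site, uint64_t target)
{
	/* rel32 is measured from the end of the 5-byte instruction */
	uint32_t disp = (uint32_t)(target - (site + 5));

	insn[0] = 0xe9;
	insn[1] = (uint8_t)(disp & 0xff);	/* little-endian rel32 */
	insn[2] = (uint8_t)((disp >> 8) & 0xff);
	insn[3] = (uint8_t)((disp >> 16) & 0xff);
	insn[4] = (uint8_t)((disp >> 24) & 0xff);
}
```

A site that only has room for the 2-byte form cannot later be patched to
reach a target for which fits_short_jmp() is false, which is why the space
for the 5-byte form has to be reserved up front.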
thanks,
-Jason
Jason Baron wrote:
>
> that's right. The optimal solution doesn't require the NOP5 at all,
> and I've been playing around with an implementation that doesn't have
> it. The problem I've been running into is that sometimes the compiler
> will put in a short jump - '0xeb', with a 1 byte offset, but the jump
> target is further away. Thus, I need to either ensure the target is
> close, or somehow force a longer jump '0xe9' into the code so I always
> have the space. The other advantage of not including the nop is easier
> support for all x86 implementations, since I'm not sure a 5 byte atomic
> nop is always available, whereas a jump is always atomic. I'm pretty
> sure we can come up with a patch that avoids the nop...I'll keep working
> on it.
>
Unfortunately gas doesn't have any equivalent of the NASM "strict"
operand modifier, which would be ideal here. The following *seems* to
work on binutils-2.18.50.0.9-8.fc10.x86_64 at least:
	.byte 0xe9		# opcode for jmp rel32
	.long %0-1f		# displacement, relative to the end of the jump
1:
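
For context, this is roughly how the fragment above could sit inside an
'asm goto' statement so that the jump is always emitted in its 5-byte form.
It is a sketch under assumptions: forced_long_jump() is a made-up name, the
label operand is written as %l[taken] instead of the %0 used above, and the
x86-only branch relies on gcc 4.5 or later (or a compatible compiler). On
other architectures the example degrades to a plain goto.

```c
/* Hypothetical example function, not taken from any posted patch. */
static int forced_long_jump(void)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * Hand-assemble 'jmp rel32': opcode 0xe9 followed by the distance
	 * from the end of the instruction (local label 1) to the C label.
	 */
	__asm__ goto(".byte 0xe9\n\t"
		     ".long %l[taken]-1f\n"
		     "1:"
		     : : : : taken);
	return 0;	/* not reached: the jump above is unconditional */
#else
	goto taken;	/* no hand-encoded jump on other architectures */
#endif
taken:
	return 1;
}
```

Since the opcode is emitted with .byte, the assembler has no opportunity to
shrink the instruction to the short 'eb' form.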
-hpa
* Mathieu Desnoyers <[email protected]> wrote:
> * Jason Baron ([email protected]) wrote:
> >
> > right we've proposed an alternative to the immediate values, which
> > I've been calling 'jump label', here:
> >
> > http://marc.info/?l=linux-kernel&m=125200966226921&w=2
> >
> > The basic idea is that gcc 4.5 will have support for an 'asm goto'
> > construct which can refer to C code labels. Thus, we can replace a
> > nop in the code stream with a 'jmp' instruction to various branch
> > targets.
> >
> > In terms of a comparison between the two, IMO, I think that the
> > syntax for the immediate variables can be more readable, since it
> > just looks like a conditional expression.
> >
> > The immediate values do a 'mov', 'test' and then a jump, whereas
> > jump label can just do a jump. So in this respect, I believe jump
> > label can be more optimal. Additionally, if we want to mark sections
> > 'cold' so they don't impact the instruction cache, the jump label
> > already has the labels for doing so. Obviously, a performance
> > comparison would be interesting as well.
>
> For branches, I'm convinced that a "static jump" approach will beat
> immediate values anytime, because you save the BPB hit completely.
>
> However, there are other use-cases involving a variable read, and in
> that case immediate values are useful. Andi has been bugging me for a
> while to re-post this patchset, I'm pretty sure he has precise ideas
> about how he would like to use it.
It depends on how significant that usecase is.
Tracepoints used to be the biggest use-case for immediate values, and
without that the thing becomes rather complex to maintain, for probably
very little benefit.
Ingo
* H. Peter Anvin <[email protected]> wrote:
> I would like to get an official ACK or NAK for this patching technique
> from inside Intel, and preferably from AMD as well. If it does work
> as described it would provide a very clean way to do one-shot
> alternative functions, which probably would be higher value than
> immediate data values.
Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?
But ... since it's patched under stopmachine, is there any reason why
this wouldn't work?
Ingo
On Thu, 24 Sep 2009 21:34:22 +0200
Ingo Molnar <[email protected]> wrote:
>
> * H. Peter Anvin <[email protected]> wrote:
>
> > I would like to get an official ACK or NAK for this patching
> > technique from inside Intel, and preferably from AMD as well. If
> > it does work as described it would provide a very clean way to do
> > one-shot alternative functions, which probably would be higher
> > value than immediate data values.
>
> Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?
>
> But ... since it's patched under stopmachine, is there any reason why
> this wouldn't work?
>
stopmachine is fine.
more aggressive tricks are rather dicey.
(cross modifying code that's being executed in ring 0 is ... not
something CPU designers had in mind)
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
* Arjan van de Ven ([email protected]) wrote:
> On Thu, 24 Sep 2009 21:34:22 +0200
> Ingo Molnar <[email protected]> wrote:
>
[context for people CCed: see
http://lkml.org/lkml/2009/9/24/262]
> >
> > * H. Peter Anvin <[email protected]> wrote:
> >
> > > I would like to get an official ACK or NAK for this patching
> > > technique from inside Intel, and preferably from AMD as well. If
> > > it does work as described it would provide a very clean way to do
> > > one-shot alternative functions, which probably would be higher
> > > value than immediate data values.
> >
> > Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?
> >
> > But ... since it's patched under stopmachine, is there any reason why
> > this wouldn't work?
> >
>
> stopmachine is fine.
>
> more aggressive tricks are rather dicey.
>
> (cross modifying code that's being executed in ring 0 is ... not
> something CPU designers had in mind)
>
Then, following your advice, kprobes should be re-designed to do a
stop_machine around the int3 breakpoint insertion ? And gdb
should be stopping all threads of a target process before inserting a
breakpoint. Therefore, I do not seem to be the only one confused about
Intel's statement on this issue.
Mathieu
> --
> Arjan van de Ven Intel Open Source Technology Centre
--
Mathieu Desnoyers
On Fri, 25 Sep 2009 03:35:13 -0400
Mathieu Desnoyers <[email protected]> wrote:
> * Arjan van de Ven ([email protected]) wrote:
> > On Thu, 24 Sep 2009 21:34:22 +0200
> > Ingo Molnar <[email protected]> wrote:
> >
> [context for people CCed: see
> http://lkml.org/lkml/2009/9/24/262]
>
> > >
> >
> > stopmachine is fine.
> >
> > more aggressive tricks are rather dicey.
> >
> > (cross modifying code that's being executed in ring 0 is ... not
> > something CPU designers had in mind)
> >
>
> Then, following your advice, kprobes should be re-designed to do a
> stop_machine around the int3 breakpoint insertion ? And gdb
> should be stopping all threads of a target process before inserting a
> breakpoint. Therefore, I do not seem to be the only one confused about
> Intel's statement on this issue.
you are oversimplifying what you are trying to do, and overstating what
a ring 3 app and others do.
But I'm not the one whom you'd need to convince, I don't design the
CPU. The people who do frown heavily on cross-modifying code,
and Peter and I need to sit down with people who did many generations
of CPUs to figure out if your scheme is actually safe. And on the AMD
side someone will need to do the same.
--
Arjan van de Ven Intel Open Source Technology Centre
> Then, following your advice, kprobes should be re-designed to do a
> stop_machine around the int3 breakpoint insertion ? And gdb
> should be stopping all threads of a target process before inserting a
> breakpoint. Therefore, I do not seem to be the only one confused about
> Intel's statement on this issue.
There was considerable discussion about this when the kprobe stuff went
in. If I remember rightly it was stated by someone @intel.com then that
int3 was ok (even though it's not strictly documented as such). The same
is not true for all instructions on all x86 processors unfortunately.
Alan
On Fri, 25 Sep 2009 11:02:06 +0100
Alan Cox <[email protected]> wrote:
> > Then, following your advice, kprobes should be re-designed to do a
> > stop_machine around the int3 breakpoint insertion ? And gdb
> > should be stopping all threads of a target process before inserting
> > a breakpoint. Therefore, I do not seem to be the only one confused
> > about Intel's statement on this issue.
>
> There was considerable discussion about this when the kprobe stuff went
> in. If I remember rightly it was stated by someone @intel.com then
> that int3 was ok (even though it's not strictly documented as such).
> The same is not true for all instructions on all x86 processors
> unfortunately.
specifically, using int3 *and then going back to the old value*.
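
The kprobes-style use of int3 mentioned here boils down to saving the first
byte of an instruction, overwriting it with the one-byte breakpoint opcode
0xcc, and later writing the saved byte back. The sketch below shows only
that byte-level bookkeeping on an ordinary array. It is not kprobes code,
the struct and function names are invented for illustration, and it says
nothing about whether doing this on live code is safe on a given CPU, which
is the open question in this thread.

```c
#include <stdint.h>

#define INT3_OPCODE 0xcc	/* one-byte x86 breakpoint instruction */

/* Hypothetical bookkeeping for one probe site. */
struct probe_sketch {
	uint8_t *addr;		/* first byte of the probed instruction */
	uint8_t saved;		/* original byte, kept for restoring */
};

/* Insert the breakpoint: remember the old byte, then write int3. */
static void probe_arm(struct probe_sketch *p, uint8_t *addr)
{
	p->addr = addr;
	p->saved = *addr;
	*addr = INT3_OPCODE;
}

/* Remove the breakpoint: go back to the old value. */
static void probe_disarm(struct probe_sketch *p)
{
	*p->addr = p->saved;
}
```

Only a single byte changes in each direction, and after probe_disarm() the
instruction bytes are identical to what they were before probe_arm().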
--
Arjan van de Ven Intel Open Source Technology Centre
Alan Cox <[email protected]> wrote on 25/09/2009 11:02:06:
>
> There was considerable discussion about this when the kprobe stuff went
> in. If I remember rightly it was stated by someone @intel.com then that
> int3 was ok (even though it's not strictly documented as such). The same
> is not true for all instructions on all x86 processors unfortunately.
>
> Alan
Alan, I had that discussion with Intel, and yes int3 is a special case
because of the interrupt processing associated with it. The discussion
went along these lines: int3 is practically useless in an MP environment
if it's troubled by the cross-modifying erratum.
I suppose it is possible the more recent microarchitectures have
changed things. And yes, we might need to have that conversation again.
Richard
Richard J Moore wrote:
>
>
> Alan Cox <[email protected]> wrote on 25/09/2009 11:02:06:
>
>
>>
>> There was considerable discussion about this when the kprobe stuff went
>> in. If I remember rightly it was stated by someone @intel.com then that
>> int3 was ok (even though it's not strictly documented as such). The same
>> is not true for all instructions on all x86 processors unfortunately.
>>
>> Alan
>
> Alan, I had that discussion with Intel, and yes int3 is a special case
> because of the interrupt processing associated with it. The discussion
> went along these lines: int3 is practically useless in an MP environment
> if it's troubled by the cross-modifying erratum.
>
> I suppose it is possible the more recent microarchitectures have
> changed things. And yes, we might need to have that conversation again.
Hi,
I'm also very interested in this topic, since I'd like to replace
kprobe's int3 with a jump instruction by using bypass code, which
Mathieu's new imv is using.
http://lkml.org/lkml/2009/9/14/549
Actually, it is OK even if I need to use stop_machine(), because
the main goal is reducing the overhead of probing, not reducing
the replacement time. :)
Thank you,
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division
e-mail: [email protected]
Arjan van de Ven wrote:
> On Fri, 25 Sep 2009 11:02:06 +0100
> Alan Cox <[email protected]> wrote:
>
>>> Then, following your advice, kprobes should be re-designed to do a
>>> stop_machine around the int3 breakpoint insertion ? And gdb
>>> should be stopping all threads of a target process before inserting
>>> a breakpoint. Therefore, I do not seem to be the only one confused
>>> about Intel's statement on this issue.
>> There was considerable discussion about this when the kprobe stuff went
>> in. If I remember rightly it was stated by someone @intel.com then
>> that int3 was ok (even though it's not strictly documented as such).
>> The same is not true for all instructions on all x86 processors
>> unfortunately.
>
> specifically, using int3 *and then going back to the old value*.
>
As I told Mathieu in person yesterday:
1. We have no information if this is safe or not. It is most certainly
not documented as safe, and trying to play language lawyer with the
errata text is pointless, as it's trying to interpret something that
isn't there.
2. There are some reasons to believe there might be a safe technique
somewhere in here (the one he described is a possibility, but not the
only one.)
3. Being able to patch code without stopping all cores has other uses,
and so spending some time doing legwork on it is probably worth it.
4. "Someone at Intel" isn't a reference... we need to track down actual
CPU architects with real names who can give us a thumbs up or down.
-hpa
On Fri, 25 Sep 2009 09:19:32 -0700
"H. Peter Anvin" <[email protected]> wrote:
> Arjan van de Ven wrote:
> > On Fri, 25 Sep 2009 11:02:06 +0100
> > Alan Cox <[email protected]> wrote:
> >
> >>> Then, following your advice, kprobes should be re-designed to do a
> >>> stop_machine around the int3 breakpoint insertion ? And gdb
> >>> should be stopping all threads of a target process before
> >>> inserting a breakpoint. Therefore, I do not seem to be the only
> >>> one confused about Intel's statement on this issue.
> >> There was considerable discussion about this when the kprobe stuff
> >> went in. If I remember rightly it was stated by someone @intel.com
> >> then that int3 was ok (even though it's not strictly documented as
> >> such). The same is not true for all instructions on all x86
> >> processors unfortunately.
> >
> > specifically, using int3 *and then going back to the old value*.
> >
>
> As I told Mathieu in person yesterday:
>
> 1. We have no information if this is safe or not. It is most
> certainly not documented as safe, and trying to play language lawyer
> with the errata text is pointless, as it's trying to interpret
> something that isn't there.
>
> 2. There are some reasons to believe there might be a safe technique
> somewhere in here (the one he described is a possibility, but not the
> only one.)
>
> 3. Being able to patch code without stopping all cores has other
> uses, and so spending some time doing legwork on it is probably worth
> it.
>
> 4. "Someone at Intel" isn't a reference... we need to track down
> actual CPU architects with real names who can give us a thumbs up or
> down.
and we need to talk to about 5 or so generations at least.
We know whom to talk to, it just will take time
(and first indication is frowned faces)
AMD will also do the same, and VIA (I think they have dual core/smp as
well)
--
Arjan van de Ven Intel Open Source Technology Centre
Arjan van de Ven wrote:
>
> and we need to talk to about 5 or so generations at least.
> We know whom to talk to, it just will take time
> (and first indication is frowned faces)
>
> AMD will also do the same, and VIA (I think they have dual core/smp as
> well)
>
Of course.
-hpa