> Summary
> -------
>
> Although the extreme case shows a nice improvement, I'm skeptical if it
> is worth doing for real world applications.

You did the experiment, and credit to you for not going "I did the work,
now include it" but rather for publishing the results so we can learn
from them.

It *does* make me wonder if we can leverage RTM for a significant subset
of these (as an interrupt will abort a transaction); that should be
substantially cheaper and less complex.

-hpa

"H. Peter Anvin" <[email protected]> writes:
>> Summary
>> -------
>>
>> Although the extreme case shows a nice improvement, I'm skeptical if it
>> is worth doing for real world applications.
>
> You did the experiment, and credit to you for not going "I did the work,
> now include it" but rather for publishing the results so we can learn
> from them.
>
> It *does* make me wonder if we can leverage RTM for a significant subset
> of these (as an interrupt will abort a transaction); that should be
> substantially cheaper and less complex.

I miss the original context and can't find the original patchkit, but:

- If the goal is to lower interrupt latency then RTM would still
need to use a fallback, so the worst case would be the fallback, thus
not be better.

- If the goal is to make CLI/STI faster:
I'm not sure RTM is any faster than a PUSHF/CLI/POPF pair. It may
well be slightly slower in fact (guessing here, haven't benchmarked)

- Also when you abort you would need to reexecute of course.

- My TSX patchkit actually elides CLI/STI inside transactions
(no need to do them, as any interrupt would abort anyways)
but the main motivation was to avoid extra aborts.

- That said, I think a software CLI/STI is somewhat useful for profiling,
as it can allow to measure how long interrupts are delayed
by CLI/STI. I heard use cases of this, but I'm not
sure how common it really is

[I presume a slightly modified RT kernel could also give the same
profiling results]

-Andi

--
[email protected] -- Speaking for myself only

On Wed, 09 Oct 2013 15:25:23 -0700
Andi Kleen <[email protected]> wrote:
> "H. Peter Anvin" <[email protected]> writes:
>
> >> Summary
> >> -------
> >>
> >> Although the extreme case shows a nice improvement, I'm skeptical if it
> >> is worth doing for real world applications.
> >
> > You did the experiment, and credit to you for not going "I did the work,
> > now include it" but rather for publishing the results so we can learn
> > from them.
> >
> > It *does* make me wonder if we can leverage RTM for a significant subset
> > of these (as an interrupt will abort a transaction); that should be
> > substantially cheaper and less complex.
>
> I miss the original context and can't find the original patchkit, but:

Yeah, for some reason, the original email didn't make it to LKML.

Dave,

I don't know why my email never reached LKML, was there something about
it that prevented it from going? The total character length was 46,972,
well below the 100,000 limit. Also the Cc list wasn't that big. Did my
ISP get flagged as a spam bot or something?

I can bounce it to you to see what was wrong with it.

-- Steve

>
> - If the goal is to lower interrupt latency then RTM would still
> need to use a fallback, so the worst case would be the fallback, thus
> not be better.
>
> - If the goal is to make CLI/STI faster:
> I'm not sure RTM is any faster than a PUSHF/CLI/POPF pair. It may
> well be slightly slower in fact (guessing here, haven't benchmarked)
>
> - Also when you abort you would need to reexecute of course.
>
> - My TSX patchkit actually elides CLI/STI inside transactions
> (no need to do them, as any interrupt would abort anyways)
> but the main motivation was to avoid extra aborts.
>
> - That said, I think a software CLI/STI is somewhat useful for profiling,
> as it can allow to measure how long interrupts are delayed
> by CLI/STI. I heard use cases of this, but I'm not
> sure how common it really is
>
> [I presume a slightly modified RT kernel could also give the same
> profiling results]
>
> -Andi
>
From: Steven Rostedt <[email protected]>
Date: Wed, 9 Oct 2013 20:36:27 -0400
> I don't know why my email never reached LKML, was there something about
> it that prevented it from going? The total character length was 46,972,
> well below the 100,000 limit. Also the Cc list wasn't that big. Did my
> ISP get flagged as a spam bot or something?
>
> I can bounce it to you to see what was wrong with it.

That's odd, can you try sending it to the list again? I'll watch
very carefully for a bounce.

Thanks.

On Wed, 09 Oct 2013 23:39:28 -0400 (EDT)
David Miller <[email protected]> wrote:
> From: Steven Rostedt <[email protected]>
> Date: Wed, 9 Oct 2013 20:36:27 -0400
>
> > I don't know why my email never reached LKML, was there something about
> > it that prevented it from going? The total character length was 46,972,
> > well below the 100,000 limit. Also the Cc list wasn't that big. Did my
> > ISP get flagged as a spam bot or something?
> >
> > I can bounce it to you to see what was wrong with it.
>
> That's odd, can you try sending it to the list again? I'll watch
> very carefully for a bounce.
>

I just sent it as a redirect to LKML. Hopefully that worked.

Thanks,

-- Steve

* Andi Kleen <[email protected]> wrote:
> - That said, I think a software CLI/STI is somewhat useful for
> profiling, as it can allow to measure how long interrupts are delayed by
> CLI/STI. [...]

That could be measured directly in a simpler way, without disrupting
CLI/STI: by turning all IRQs into NMIs and resending them from a special
NMI handler. (and of course timestamping the NMI arrival time and the IRQ
entry time so that instrumentation can recover it.)

If indirect, statistical measurement suffices then IRQ delivery latencies
can also be estimated statistically without any kernel changes: by
profiling IRQ disable/enable sections (there's a counter for that) and
calculating the average IRQ-disable section length from that. The average
IRQ delay will be 50% of that value, assuming IRQs arrive uniformly at
random within those sections.

This should be good enough for most cases.

Thanks,

Ingo