* Peter Zijlstra <[email protected]> [2012-04-24 19:09:14]:
> On Tue, 2012-04-24 at 18:58 +0200, Peter Zijlstra wrote:
> > On Tue, 2012-04-24 at 22:26 +0530, Srivatsa Vaddagiri wrote:
> > > Steer a waking task towards a cpu where its cgroup has zero tasks (in
> > > order to provide it better sleeper credits and hence reduce its wakeup
> > > latency).
> >
> > That's just vile.. pjt could you post your global vruntime stuff so
> > vatsa can have a go at that?
>
> That is, you're playing a deficiency we should fix, not exploit.
>
> Also, you do a second loop over all those cpus right after we've already
> iterated them..
>
> furthermore, that 100%+ gain is still way insane, what else is broken?
> Did you try those paravirt tlb-flush patches and other such weirdness?
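
As a reminder of what the balance-on-wake patch does at wakeup time, here
is a toy sketch (hypothetical names and structure, not the actual patch):
scan the CPUs the task is allowed on and prefer one where the task's cgroup
currently has no runnable tasks, so the task gets full sleeper credit there;
otherwise fall back to the scheduler's usual choice.

/* Toy illustration only; names and layout are made up for this sketch. */
struct toy_cpu {
	int id;
	int cgroup_nr_running;	/* runnable tasks of the waking task's cgroup */
};

static int pick_wake_cpu(const struct toy_cpu *cpus, int nr_cpus, int default_cpu)
{
	int i;

	for (i = 0; i < nr_cpus; i++)
		if (cpus[i].cgroup_nr_running == 0)
			return cpus[i].id;	/* cgroup sees this cpu as idle */

	return default_cpu;			/* nothing better found */
}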
I got around to trying the pv-tlb-flush patches, and they showed a >100%
improvement for sysbench (without the balance-on-wake patch on the host).
This is what readprofile showed when the pv-tlb-flush patches were absent
in the guest:
 1135704 total                          0.3265
  636737 native_cpuid               18192.4857
  283201 __bitmap_empty              2832.0100
  137853 flush_tlb_others_ipi         569.6405
I will try out how much they help the Trade workload (which is what got me
started on this originally) and report back. Part of the problem in trying
them out is that the pv-tlb-flush patches are throwing up weird problems,
which Nikunj is helping investigate.
In any case, we can't expect users to easily upgrade their guest VM
kernels, so I am still looking for a solution that works with older guest
VM kernels. Paul, I hope I can get the global vruntime patches from you
soon, to test how much they help!
- vatsa
On Wed, 2 May 2012 19:31:17 +0530, Srivatsa Vaddagiri <[email protected]> wrote:
> * Peter Zijlstra <[email protected]> [2012-04-24 19:09:14]:
>
> > On Tue, 2012-04-24 at 18:58 +0200, Peter Zijlstra wrote:
> > > On Tue, 2012-04-24 at 22:26 +0530, Srivatsa Vaddagiri wrote:
> > > > Steer a waking task towards a cpu where its cgroup has zero tasks (in
> > > > order to provide it better sleeper credits and hence reduce its wakeup
> > > > latency).
> > >
> > > That's just vile.. pjt could you post your global vruntime stuff so
> > > vatsa can have a go at that?
> >
> > That is, you're playing a deficiency we should fix, not exploit.
> >
> > Also, you do a second loop over all those cpus right after we've already
> > iterated them..
> >
> > furthermore, that 100%+ gain is still way insane, what else is broken?
> > Did you try those paravirt tlb-flush patches and other such weirdness?
>
> I got around to trying the pv-tlb-flush patches, and they showed a >100%
> improvement for sysbench (without the balance-on-wake patch on the host).
> This is what readprofile showed when the pv-tlb-flush patches were absent
> in the guest:
>
>  1135704 total                          0.3265
>   636737 native_cpuid               18192.4857
>   283201 __bitmap_empty              2832.0100
>   137853 flush_tlb_others_ipi         569.6405
>
> I will try out how much they help the Trade workload (which is what got me
> started on this originally) and report back. Part of the problem in trying
> them out is that the pv-tlb-flush patches are throwing up weird problems,
> which Nikunj is helping investigate.
>
That is a bug; the patch below should cure it. kvm_mmu_flush_tlb() only
queues a flush request, but at this point we have already passed the phase
where requests are checked. I will fold this into my next version.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c42056..b114411 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1550,7 +1550,7 @@ static void kvm_set_vcpu_state(struct kvm_vcpu *vcpu)
 	vs->state = 1;
 	if (vs->flush_on_enter) {
-		kvm_mmu_flush_tlb(vcpu);
+		kvm_x86_ops->tlb_flush(vcpu);
 		vs->flush_on_enter = 0;
 	}
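
To make the ordering concrete, here is a rough sketch paraphrased from the
code of that time (not verbatim kernel source): kvm_mmu_flush_tlb() only
queues a request, which vcpu_enter_guest() services before the pv vcpu-state
hook runs, so a request queued from kvm_set_vcpu_state() would only take
effect on the *next* guest entry.

/* Rough sketch, paraphrased - not verbatim kernel code. */
void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu)
{
	++vcpu->stat.tlb_flush;
	kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);	/* deferred to request processing */
}

/*
 * vcpu_enter_guest() handles pending requests early on, roughly:
 *
 *	if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
 *		kvm_x86_ops->tlb_flush(vcpu);		<- immediate flush
 *
 * With these patches kvm_set_vcpu_state() runs after that phase, so calling
 * kvm_x86_ops->tlb_flush(vcpu) directly is what actually flushes before this
 * entry, which is what flush_on_enter asks for.
 */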