2018-10-11 00:12:06

by Subhra Mazumdar

[permalink] [raw]
Subject: Gang scheduling

Hi,

I was following the Coscheduling patch discussion on lkml and Peter
mentioned he had a patch series. I found the following on github.

https://github.com/pdxChen/gang/commits/sched_1.23-loadbal

I would like to test this with KVMs. Are the commits from 38d5acb to
f019876 sufficient? Also, is there any documentation on how to use it (any
knobs I need to turn on for gang scheduling to happen?), or is it enabled
by default for KVMs?

Thanks,
Subhra



2018-10-12 18:02:36

by Tim Chen

[permalink] [raw]
Subject: Re: Gang scheduling

On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
> Hi,
>
> I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.
>
> https://github.com/pdxChen/gang/commits/sched_1.23-loadbal
>
> I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?
>
> Thanks,
> Subhra
>

I would suggest you try
https://github.com/pdxChen/gang/tree/sched_1.23-base
without the load balancing part of gang scheduling.
It is enabled by default for KVMs.

Due to the constant change in the gang scheduling status of the QEMU thread,
depending on whether the vcpu is loaded or unloaded,
the load balancing part of the code doesn't work very well.

The current version of the code needs to be optimized further. Right now
the QEMU thread constantly does vcpu load and unload during VM enter and exit.
We gang schedule only after vcpu load and register the thread to be gang
scheduled. When we do vcpu unload, the thread is removed from the set
to be gang scheduled. Each time, there's an expensive synchronization with
the sibling thread.

However, for QEMU, there's a one-to-one correspondence between the QEMU
thread and the vcpu. So we don't have to change the gang scheduling status
for such a thread, which avoids the churn and the sync with the sibling.
That should be helpful for VMs with lots of I/O causing constant VM exits.
We're still working on this optimization, and the load balancing should be
better after this change.
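For illustration only, here's a toy userspace model of the bookkeeping described above. The names and structures (gang_set, the one_to_one flag, the model_* helpers) are hypothetical, not the identifiers in the actual patch set; the point is just the sync-count difference between unregistering on every vcpu unload versus staying registered when the thread/vcpu mapping is 1:1.

```c
#include <assert.h>

/*
 * Toy model of the gang-set bookkeeping; illustrative names only,
 * not the structures used in the real patch set.
 */
struct gang_set {
	int registered;	/* is this thread currently in the gang-scheduled set? */
	int syncs;	/* expensive synchronizations with the sibling thread */
};

/* vcpu load: register the thread for gang scheduling, syncing once. */
static void model_vcpu_load(struct gang_set *gs)
{
	if (gs->registered)
		return;		/* already registered: nothing to do */
	gs->registered = 1;
	gs->syncs++;		/* sync with the sibling */
}

/*
 * vcpu unload: in the current scheme, drop out of the set (another sync).
 * With the planned 1:1 thread<->vcpu optimization, stay registered and
 * skip the sync entirely.
 */
static void model_vcpu_put(struct gang_set *gs, int one_to_one)
{
	if (one_to_one)
		return;		/* keep gang status across the VM exit */
	gs->registered = 0;
	gs->syncs++;		/* sync with the sibling */
}
```

In this model, three VM exit/enter cycles cost six sibling syncs under the current scheme but only the initial one with the 1:1 optimization, which is why I/O-heavy guests with constant VM exits should benefit most.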

Tim

2018-10-15 22:50:50

by Subhra Mazumdar

[permalink] [raw]
Subject: Re: Gang scheduling



On 10/12/2018 11:01 AM, Tim Chen wrote:
> On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
>> Hi,
>>
>> I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.
>>
>> https://github.com/pdxChen/gang/commits/sched_1.23-loadbal
>>
>> I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?
>>
>> Thanks,
>> Subhra
>>
> I would suggest you try
> https://github.com/pdxChen/gang/tree/sched_1.23-base
> without the load balancing part of gang scheduling.
> It is enabled by default for KVMs.
>
> Due to the constant change in the gang scheduling status of the QEMU thread,
> depending on whether the vcpu is loaded or unloaded,
> the load balancing part of the code doesn't work very well.
Thanks. Does this mean each vcpu thread needs to be affinitized to a CPU?
>
> The current version of the code needs to be optimized further. Right now
> the QEMU thread constantly does vcpu load and unload during VM enter and exit.
> We gang schedule only after vcpu load and register the thread to be gang
> scheduled. When we do vcpu unload, the thread is removed from the set
> to be gang scheduled. Each time, there's an expensive synchronization with
> the sibling thread.
>
> However, for QEMU, there's a one-to-one correspondence between the QEMU
> thread and the vcpu. So we don't have to change the gang scheduling status
> for such a thread, which avoids the churn and the sync with the sibling.
> That should be helpful for VMs with lots of I/O causing constant VM exits.
> We're still working on this optimization, and the load balancing should be
> better after this change.
>
> Tim
>

Also FYI I get the following error while building sched_1.23-base:

ERROR: "sched_ttwu_pending" [arch/x86/kvm/kvm-intel.ko] undefined!
scripts/Makefile.modpost:92: recipe for target '__modpost' failed

Adding the following fixed it:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 46807dc..302b77d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -21,6 +21,7 @@
 #include <trace/events/sched.h>

 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+EXPORT_SYMBOL_GPL(sched_ttwu_pending);

 #if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL)
 /*

2019-02-13 03:00:58

by Subhra Mazumdar

[permalink] [raw]
Subject: Re: Gang scheduling

Hi Tim,

On 10/12/18 11:01 AM, Tim Chen wrote:
> On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
>> Hi,
>>
>> I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.
>>
>> https://github.com/pdxChen/gang/commits/sched_1.23-loadbal
>>
>> I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?
>>
>> Thanks,
>> Subhra
>>
> I would suggest you try
> https://github.com/pdxChen/gang/tree/sched_1.23-base
> without the load balancing part of gang scheduling.
> It is enabled by default for KVMs.
I applied the following 3 patches on 4.19 and tried to install a KVM guest
(with virt-install), but the kernel hangs with the following error:

kernel:watchdog: BUG: soft lockup - CPU#21 stuck for 23s! [kworker/21:1:573]

kvm,sched: Track VCPU threads
x86/kvm,sched: Add fast path for reschedule interrupt
sched: Optimize scheduler_ipi()

The track VCPU patch seems to be the culprit.

Thanks,
Subhra
>
> Due to the constant change in the gang scheduling status of the QEMU thread,
> depending on whether the vcpu is loaded or unloaded,
> the load balancing part of the code doesn't work very well.
>
> The current version of the code needs to be optimized further. Right now
> the QEMU thread constantly does vcpu load and unload during VM enter and exit.
> We gang schedule only after vcpu load and register the thread to be gang
> scheduled. When we do vcpu unload, the thread is removed from the set
> to be gang scheduled. Each time, there's an expensive synchronization with
> the sibling thread.
>
> However, for QEMU, there's a one-to-one correspondence between the QEMU
> thread and the vcpu. So we don't have to change the gang scheduling status
> for such a thread, which avoids the churn and the sync with the sibling.
> That should be helpful for VMs with lots of I/O causing constant VM exits.
> We're still working on this optimization, and the load balancing should be
> better after this change.
>
> Tim

2019-02-13 23:31:06

by Tim Chen

[permalink] [raw]
Subject: Re: Gang scheduling

On 2/12/19 6:57 PM, Subhra Mazumdar wrote:
> Hi Tim,
>
> On 10/12/18 11:01 AM, Tim Chen wrote:
>> On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
>>> Hi,
>>>
>>> I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.
>>>
>>> https://github.com/pdxChen/gang/commits/sched_1.23-loadbal
>>>
>>> I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?
>>>
>>> Thanks,
>>> Subhra
>>>
>> I would suggest you try
>> https://github.com/pdxChen/gang/tree/sched_1.23-base
>> without the load balancing part of gang scheduling.
>> It is enabled by default for KVMs.
> I applied the following 3 patches on 4.19 and tried to install a KVM guest
> (with virt-install), but the kernel hangs with the following error:
>
> kernel:watchdog: BUG: soft lockup - CPU#21 stuck for 23s! [kworker/21:1:573]
>
> kvm,sched: Track VCPU threads
> x86/kvm,sched: Add fast path for reschedule interrupt
> sched: Optimize scheduler_ipi()
>
> The track VCPU patch seems to be the culprit.
>
> Thanks,
> Subhra

Thanks for giving it a try. Peter is working on a new version. So I'll not try
to debug this patchset for now.

Tim